[PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3

* [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3
@ 2026-04-15 10:55 Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 01/31] backends/iommufd: Update iommufd_backend_get_device_info Shameer Kolothum
                   ` (30 more replies)
  0 siblings, 31 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Hi,

Changes from v3:
 https://lore.kernel.org/qemu-devel/20260226105056.897-1-skolothumtho@nvidia.com/

 - Addressed v3 feedback and picked up Reviewed-by tags.
 - Folded veventq alloc/free into alloc_viommu/free_viommu, removing
   the separate ops callback (patch 13).
 - Reworked register and macro names based on feedback.
 - Improved documentation around VCMDQ aperture usage, which was a
   source of confusion in v3. See patches 15, 16, 17, 19 and 20.
   Patch 20 in particular explains the cached register vs hardware-backed
   MMIO model for VCMDQ apertures. Hopefully this is now clearer and
   correct.
 - Added patch 21 to skip IOMMU mappings for RAM device regions,
   eliminating spurious "IOMMU_IOAS_MAP failed: Bad address" warnings
   for the VINTF page0 guest mapping.
 - Updated SMMUv3 identifier property to accommodate the ITS node id
   (patch 27).
 - Removed qtest bios-tables blob patches; node id changes are now
   handled in patch 27.
 - Based on top of Nathan's "Resolve AUTO properties" series [0].
 - Added patch 30 to enforce viommu association stability when CMDQV
   is active.

Please find the complete branch here:
https://github.com/shamiali2008/qemu-master/tree/master-vcmdq-v4-ext

Sanity tested on NVIDIA Grace. Further testing in progress.

Feedback and testing are very welcome.

Thanks,
Shameer
[0] https://lore.kernel.org/qemu-devel/20260401010231.4166776-1-nathanc@nvidia.com

---
Background (from RFCv1):
https://lore.kernel.org/qemu-devel/20251210133737.78257-1-skolothumtho@nvidia.com/

Thanks to Nicolin for the initial patches and testing on which this
is based.

Tegra241 CMDQV extends SMMUv3 by allocating per-VM "virtual interfaces"
(VINTFs), each hosting up to 128 VCMDQs.

Each VINTF exposes two 64KB MMIO pages:
 - Page0 – guest-owned control and status registers (directly mapped
           into the VM)
 - Page1 – queue configuration registers (trapped/emulated by QEMU)
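
As a rough illustration of the two-page split above, the sketch below
classifies an offset within a VINTF aperture. All names are hypothetical,
and it assumes Page1 immediately follows Page0; only the 64KB page size
and the mapped-versus-trapped roles come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the 64KB-per-page layout is from the text. */
#define SKETCH_VINTF_PAGE_SIZE 0x10000u /* 64KB */

typedef enum {
    SKETCH_VINTF_PAGE0,   /* guest-owned control/status, directly mapped */
    SKETCH_VINTF_PAGE1,   /* queue configuration, trapped/emulated */
    SKETCH_VINTF_INVALID, /* outside the two-page aperture */
} SketchVintfPage;

/* Assumption: Page1 starts right after Page0 within the aperture. */
static SketchVintfPage sketch_vintf_classify(uint64_t offset)
{
    if (offset < SKETCH_VINTF_PAGE_SIZE) {
        return SKETCH_VINTF_PAGE0;
    }
    if (offset < 2 * (uint64_t)SKETCH_VINTF_PAGE_SIZE) {
        return SKETCH_VINTF_PAGE1;
    }
    return SKETCH_VINTF_INVALID;
}

/* Only Page1 accesses would need to exit to the emulator. */
static bool sketch_vintf_needs_trap(uint64_t offset)
{
    return sketch_vintf_classify(offset) == SKETCH_VINTF_PAGE1;
}
```

In this model a Page0 access never reaches the emulator, which is where
the acceleration would come from.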

Unlike the standard SMMU CMDQ, a guest-owned Tegra241 VCMDQ does not
support the full command set. The CMDQV hardware accepts only a subset,
primarily invalidation-related commands. For this reason, a distinct
CMDQV device must be exposed to the guest, and the guest OS must include
a Tegra241 CMDQV-aware driver to take advantage of the hardware
acceleration.

VCMDQ support is integrated via the IOMMU_HW_QUEUE_ALLOC mechanism,
allowing QEMU to attach guest-configured VCMDQ buffers to the
underlying CMDQV hardware through IOMMUFD. The Linux kernel already
supports the full CMDQV virtualisation model via IOMMUFD[0].
---

Nicolin Chen (15):
  backends/iommufd: Update iommufd_backend_get_device_info
  backends/iommufd: Update iommufd_backend_alloc_viommu to allow user
    ptr
  backends/iommufd: Introduce iommufd_backend_alloc_hw_queue
  backends/iommufd: Introduce iommufd_backend_viommu_mmap
  hw/arm/tegra241-cmdqv: Implement CMDQV init
  hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free
  hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region
  hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads
  hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes
  hw/arm/tegra241-cmdqv: mmap VINTF Page0 for CMDQV
  hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space
  hw/arm/tegra241-cmdqv: Add reset handler
  hw/arm/tegra241-cmdqv: Limit queue size based on backend page size
  hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT

Shameer Kolothum (16):
  system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq
  hw/arm/smmuv3-accel: Introduce CMDQV ops interface
  hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub
  hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle
  hw/arm/virt: Use stored SMMUv3 device list for IORT build
  hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support
  hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus
  system/physmem: Add address_space_is_ram() helper
  hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing
  memory: Allow RAM device regions to skip IOMMU mapping
  hw/arm/smmuv3-accel: Introduce common helper for veventq read
  hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors
  hw/arm/smmuv3: Add per-device identifier property
  hw/arm/smmuv3-accel: Introduce helper to query CMDQV type
  hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active
  hw/arm/smmuv3: Add cmdqv property for SMMUv3 device

 hw/arm/smmuv3-accel.h         |  29 ++
 hw/arm/tegra241-cmdqv.h       | 367 +++++++++++++++
 include/hw/arm/smmuv3.h       |   3 +
 include/hw/arm/virt.h         |   1 +
 include/system/iommufd.h      |  17 +-
 include/system/memory.h       |  12 +
 backends/iommufd.c            |  64 +++
 hw/arm/smmuv3-accel-stubs.c   |  16 +
 hw/arm/smmuv3-accel.c         | 187 ++++++--
 hw/arm/smmuv3.c               |  15 +
 hw/arm/tegra241-cmdqv-stubs.c |  16 +
 hw/arm/tegra241-cmdqv.c       | 817 ++++++++++++++++++++++++++++++++++
 hw/arm/virt-acpi-build.c      | 127 ++++--
 hw/arm/virt.c                 |  37 ++
 hw/vfio/iommufd.c             |   4 +-
 hw/vfio/listener.c            |   5 +
 system/physmem.c              |  11 +
 backends/trace-events         |   4 +-
 hw/arm/Kconfig                |   5 +
 hw/arm/meson.build            |   2 +
 hw/arm/trace-events           |   7 +
 21 files changed, 1666 insertions(+), 80 deletions(-)
 create mode 100644 hw/arm/tegra241-cmdqv.h
 create mode 100644 hw/arm/tegra241-cmdqv-stubs.c
 create mode 100644 hw/arm/tegra241-cmdqv.c

-- 
2.43.0




* [PATCH v4 01/31] backends/iommufd: Update iommufd_backend_get_device_info
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 02/31] backends/iommufd: Update iommufd_backend_alloc_viommu to allow user ptr Shameer Kolothum
                   ` (29 subsequent siblings)
  30 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

The updated IOMMUFD uAPI introduces the ability for userspace to request
a specific hardware info data type via IOMMU_GET_HW_INFO. Update
iommufd_backend_get_device_info() to set IOMMU_HW_INFO_FLAG_INPUT_TYPE
when a non-zero type is supplied, and adjust all callers to pass a type
value explicitly initialised to zero (IOMMU_HW_INFO_TYPE_DEFAULT) when
no specific type is requested.
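
The flag selection described above can be sketched in isolation as
follows. The struct and the SKETCH_* constants are stand-ins invented
for this example (their values are assumptions, not copied from the
uapi headers); the real code fills struct iommu_hw_info instead.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the uapi constants; the values are assumptions. */
#define SKETCH_HW_INFO_TYPE_DEFAULT    0u
#define SKETCH_HW_INFO_FLAG_INPUT_TYPE (1u << 0)

/* Hypothetical reduced request holding only the fields of interest. */
struct sketch_hw_info_req {
    uint32_t flags;
    uint32_t in_data_type;
};

/*
 * A zero type means "let the kernel pick the default", so no input
 * flag is set; any non-zero type is passed down as an explicit request.
 */
static struct sketch_hw_info_req sketch_build_hw_info_req(uint32_t type)
{
    struct sketch_hw_info_req req = {
        .flags = type ? SKETCH_HW_INFO_FLAG_INPUT_TYPE : 0,
        .in_data_type = type,
    };
    return req;
}
```

This is also why callers must initialise the type: an uninitialised
value would be read as an explicit request for an arbitrary type.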

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 backends/iommufd.c    | 7 +++++++
 hw/arm/smmuv3-accel.c | 2 +-
 hw/vfio/iommufd.c     | 4 ++--
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/backends/iommufd.c b/backends/iommufd.c
index 52cb060454..20d4186f29 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -397,16 +397,23 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
     return true;
 }
 
+/*
+ * @type can carry a desired HW info type defined in the uapi headers. If caller
+ * doesn't have one, indicating it wants the default type, then @type should be
+ * zeroed (i.e. IOMMU_HW_INFO_TYPE_DEFAULT).
+ */
 bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
                                      uint32_t *type, void *data, uint32_t len,
                                      uint64_t *caps, uint8_t *max_pasid_log2,
                                      Error **errp)
 {
     struct iommu_hw_info info = {
+        .flags = (*type) ? IOMMU_HW_INFO_FLAG_INPUT_TYPE : 0,
         .size = sizeof(info),
         .dev_id = devid,
         .data_len = len,
         .data_uptr = (uintptr_t)data,
+        .in_data_type = *type,
     };
 
     if (ioctl(be->fd, IOMMU_GET_HW_INFO, &info)) {
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 48f8017262..d68d4141a0 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -172,7 +172,7 @@ smmuv3_accel_hw_compatible(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
                            Error **errp)
 {
     struct iommu_hw_info_arm_smmuv3 info;
-    uint32_t data_type;
+    uint32_t data_type = IOMMU_HW_INFO_TYPE_DEFAULT;
     uint64_t caps;
 
     if (!iommufd_backend_get_device_info(idev->iommufd, idev->devid, &data_type,
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 3e33dfbb35..4043111667 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -352,7 +352,7 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
     ERRP_GUARD();
     IOMMUFDBackend *iommufd = vbasedev->iommufd;
     VFIOContainer *bcontainer = VFIO_IOMMU(container);
-    uint32_t type, flags = 0;
+    uint32_t type = IOMMU_HW_INFO_TYPE_DEFAULT, flags = 0;
     uint64_t hw_caps;
     VendorCaps caps;
     VFIOIOASHwpt *hwpt;
@@ -941,7 +941,7 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
     HostIOMMUDeviceIOMMUFD *idev;
     HostIOMMUDeviceCaps *caps = &hiod->caps;
     VendorCaps *vendor_caps = &caps->vendor_caps;
-    enum iommu_hw_info_type type;
+    uint32_t type = IOMMU_HW_INFO_TYPE_DEFAULT;
     uint8_t max_pasid_log2;
     uint64_t hw_caps;
 
-- 
2.43.0




* [PATCH v4 02/31] backends/iommufd: Update iommufd_backend_alloc_viommu to allow user ptr
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 01/31] backends/iommufd: Update iommufd_backend_get_device_info Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 03/31] backends/iommufd: Introduce iommufd_backend_alloc_hw_queue Shameer Kolothum
                   ` (28 subsequent siblings)
  30 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

The updated IOMMUFD VIOMMU_ALLOC uAPI allows userspace to provide a data
buffer when creating a vIOMMU (e.g. for Tegra241 CMDQV). Extend
iommufd_backend_alloc_viommu() to pass a user pointer and size to the
kernel.

Update the caller accordingly.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 include/system/iommufd.h | 1 +
 backends/iommufd.c       | 4 ++++
 hw/arm/smmuv3-accel.c    | 4 ++--
 backends/trace-events    | 2 +-
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index 7062944fe6..e027800c91 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -89,6 +89,7 @@ bool iommufd_backend_alloc_hwpt(IOMMUFDBackend *be, uint32_t dev_id,
                                 Error **errp);
 bool iommufd_backend_alloc_viommu(IOMMUFDBackend *be, uint32_t dev_id,
                                   uint32_t viommu_type, uint32_t hwpt_id,
+                                  void *data_ptr, uint32_t data_len,
                                   uint32_t *out_hwpt, Error **errp);
 
 bool iommufd_backend_alloc_vdev(IOMMUFDBackend *be, uint32_t dev_id,
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 20d4186f29..9b07ac19c2 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -470,6 +470,7 @@ bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
 
 bool iommufd_backend_alloc_viommu(IOMMUFDBackend *be, uint32_t dev_id,
                                   uint32_t viommu_type, uint32_t hwpt_id,
+                                  void *data_ptr, uint32_t data_len,
                                   uint32_t *out_viommu_id, Error **errp)
 {
     int ret;
@@ -478,11 +479,14 @@ bool iommufd_backend_alloc_viommu(IOMMUFDBackend *be, uint32_t dev_id,
         .type = viommu_type,
         .dev_id = dev_id,
         .hwpt_id = hwpt_id,
+        .data_len = data_len,
+        .data_uptr = (uintptr_t)data_ptr,
     };
 
     ret = ioctl(be->fd, IOMMU_VIOMMU_ALLOC, &alloc_viommu);
 
     trace_iommufd_backend_alloc_viommu(be->fd, dev_id, viommu_type, hwpt_id,
+                                       (uintptr_t)data_ptr, data_len,
                                        alloc_viommu.out_viommu_id, ret);
     if (ret) {
         error_setg_errno(errp, errno, "IOMMU_VIOMMU_ALLOC failed");
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index d68d4141a0..c356ff9708 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -578,8 +578,8 @@ smmuv3_accel_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
     IOMMUFDViommu *viommu;
 
     if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
-                                      IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
-                                      s2_hwpt_id, &viommu_id, errp)) {
+                                      IOMMU_VIOMMU_TYPE_ARM_SMMUV3, s2_hwpt_id,
+                                      NULL, 0, &viommu_id, errp)) {
         return false;
     }
 
diff --git a/backends/trace-events b/backends/trace-events
index b9365113e7..3ba0c3503c 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -21,7 +21,7 @@ iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%
 iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
 iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t flags, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" flags=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
 iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
-iommufd_backend_alloc_viommu(int iommufd, uint32_t dev_id, uint32_t type, uint32_t hwpt_id, uint32_t viommu_id, int ret) " iommufd=%d type=%u dev_id=%u hwpt_id=%u viommu_id=%u (%d)"
+iommufd_backend_alloc_viommu(int iommufd, uint32_t dev_id, uint32_t type, uint32_t hwpt_id, uint64_t data_ptr, uint32_t data_len, uint32_t viommu_id, int ret) " iommufd=%d type=%u dev_id=%u hwpt_id=%u data_ptr=0x%"PRIx64" data_len=0x%x viommu_id=%u (%d)"
 iommufd_backend_alloc_vdev(int iommufd, uint32_t dev_id, uint32_t viommu_id, uint64_t virt_id, uint32_t vdev_id, int ret) " iommufd=%d dev_id=%u viommu_id=%u virt_id=0x%"PRIx64" vdev_id=%u (%d)"
 iommufd_viommu_alloc_eventq(int iommufd, uint32_t viommu_id, uint32_t type, uint32_t veventq_id, uint32_t veventq_fd, int ret) " iommufd=%d viommu_id=%u type=%u veventq_id=%u veventq_fd=%u (%d)"
 
-- 
2.43.0




* [PATCH v4 03/31] backends/iommufd: Introduce iommufd_backend_alloc_hw_queue
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 01/31] backends/iommufd: Update iommufd_backend_get_device_info Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 02/31] backends/iommufd: Update iommufd_backend_alloc_viommu to allow user ptr Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 04/31] backends/iommufd: Introduce iommufd_backend_viommu_mmap Shameer Kolothum
                   ` (27 subsequent siblings)
  30 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

Add a helper to allocate an iommufd-backed HW queue for a vIOMMU.

While at it, define a struct IOMMUFDHWqueue for use by vendor
implementations.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 include/system/iommufd.h | 11 +++++++++++
 backends/iommufd.c       | 31 +++++++++++++++++++++++++++++++
 backends/trace-events    |  1 +
 3 files changed, 43 insertions(+)

diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index e027800c91..8009ce3d31 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -65,6 +65,12 @@ typedef struct IOMMUFDVeventq {
     bool event_start; /* True after first valid event; cleared on overflow */
 } IOMMUFDVeventq;
 
+/* HW queue object for a vIOMMU-specific HW-accelerated queue */
+typedef struct IOMMUFDHWqueue {
+    IOMMUFDViommu *viommu;
+    uint32_t hw_queue_id;
+} IOMMUFDHWqueue;
+
 bool iommufd_backend_connect(IOMMUFDBackend *be, Error **errp);
 void iommufd_backend_disconnect(IOMMUFDBackend *be);
 
@@ -101,6 +107,11 @@ bool iommufd_backend_alloc_veventq(IOMMUFDBackend *be, uint32_t viommu_id,
                                    uint32_t *out_veventq_id,
                                    uint32_t *out_veventq_fd, Error **errp);
 
+bool iommufd_backend_alloc_hw_queue(IOMMUFDBackend *be, uint32_t viommu_id,
+                                    uint32_t queue_type, uint32_t index,
+                                    uint64_t addr, uint64_t length,
+                                    uint32_t *out_hw_queue_id, Error **errp);
+
 bool iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t hwpt_id,
                                         bool start, Error **errp);
 bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 9b07ac19c2..3be7b07eec 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -556,6 +556,37 @@ bool iommufd_backend_alloc_veventq(IOMMUFDBackend *be, uint32_t viommu_id,
     return true;
 }
 
+bool iommufd_backend_alloc_hw_queue(IOMMUFDBackend *be, uint32_t viommu_id,
+                                    uint32_t queue_type, uint32_t index,
+                                    uint64_t addr, uint64_t length,
+                                    uint32_t *out_hw_queue_id, Error **errp)
+{
+    int ret;
+    struct iommu_hw_queue_alloc alloc_hw_queue = {
+        .size = sizeof(alloc_hw_queue),
+        .flags = 0,
+        .viommu_id = viommu_id,
+        .type = queue_type,
+        .index = index,
+        .nesting_parent_iova = addr,
+        .length = length,
+    };
+
+    ret = ioctl(be->fd, IOMMU_HW_QUEUE_ALLOC, &alloc_hw_queue);
+
+    trace_iommufd_backend_alloc_hw_queue(be->fd, viommu_id, queue_type,
+                                         index, addr, length,
+                                         alloc_hw_queue.out_hw_queue_id, ret);
+    if (ret) {
+        error_setg_errno(errp, errno, "IOMMU_HW_QUEUE_ALLOC failed");
+        return false;
+    }
+
+    g_assert(out_hw_queue_id);
+    *out_hw_queue_id = alloc_hw_queue.out_hw_queue_id;
+    return true;
+}
+
 bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
                                            uint32_t hwpt_id, Error **errp)
 {
diff --git a/backends/trace-events b/backends/trace-events
index 3ba0c3503c..c5c1d95aad 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -24,6 +24,7 @@ iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, u
 iommufd_backend_alloc_viommu(int iommufd, uint32_t dev_id, uint32_t type, uint32_t hwpt_id, uint64_t data_ptr, uint32_t data_len, uint32_t viommu_id, int ret) " iommufd=%d type=%u dev_id=%u hwpt_id=%u data_ptr=0x%"PRIx64" data_len=0x%x viommu_id=%u (%d)"
 iommufd_backend_alloc_vdev(int iommufd, uint32_t dev_id, uint32_t viommu_id, uint64_t virt_id, uint32_t vdev_id, int ret) " iommufd=%d dev_id=%u viommu_id=%u virt_id=0x%"PRIx64" vdev_id=%u (%d)"
 iommufd_viommu_alloc_eventq(int iommufd, uint32_t viommu_id, uint32_t type, uint32_t veventq_id, uint32_t veventq_fd, int ret) " iommufd=%d viommu_id=%u type=%u veventq_id=%u veventq_fd=%u (%d)"
+iommufd_backend_alloc_hw_queue(int iommufd, uint32_t viommu_id, uint32_t queue_type, uint32_t index, uint64_t addr, uint64_t size, uint32_t queue_id, int ret) " iommufd=%d viommu_id=%u queue_type=%u index=%u addr=0x%"PRIx64" size=0x%"PRIx64" queue_id=%u (%d)"
 
 # igvm-cfg.c
 igvm_reset_enter(int type) "type=%u"
-- 
2.43.0




* [PATCH v4 04/31] backends/iommufd: Introduce iommufd_backend_viommu_mmap
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (2 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 03/31] backends/iommufd: Introduce iommufd_backend_alloc_hw_queue Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 05/31] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq Shameer Kolothum
                   ` (26 subsequent siblings)
  30 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

Add a backend helper to mmap hardware MMIO regions exposed via iommufd for
a vIOMMU instance. This allows user space to access HW-accelerated MMIO
pages provided by the vIOMMU.

The caller is responsible for unmapping the returned region.
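
The ownership rule above (the helper maps, the caller unmaps) might look
like this from the caller's side. This is only a sketch: it uses an
anonymous mapping in place of the iommufd file descriptor so that it
runs anywhere, and both function names are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS on strict C standards */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Stand-in for the backend helper: maps 'size' bytes and hands the
 * pointer to the caller. Anonymous memory replaces the iommufd fd.
 */
static int sketch_map_region(size_t size, void **out_ptr)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    *out_ptr = p;
    return 0;
}

/* Caller side: use the mapping, then release it with the same size. */
static int sketch_use_and_release(size_t size)
{
    void *ptr = NULL;

    if (sketch_map_region(size, &ptr) != 0) {
        return -1;
    }

    volatile uint32_t *reg = ptr;
    reg[0] = 0x1;
    int ok = (reg[0] == 0x1);

    /* The helper does not track the mapping; unmapping is on the caller. */
    if (munmap(ptr, size) != 0) {
        return -1;
    }
    return ok ? 0 : -1;
}
```

Keeping the size next to the pointer matters here, because munmap()
needs the same length that was passed to mmap().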

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 include/system/iommufd.h |  4 ++++
 backends/iommufd.c       | 22 ++++++++++++++++++++++
 backends/trace-events    |  1 +
 3 files changed, 27 insertions(+)

diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index 8009ce3d31..38cfceca84 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -112,6 +112,10 @@ bool iommufd_backend_alloc_hw_queue(IOMMUFDBackend *be, uint32_t viommu_id,
                                     uint64_t addr, uint64_t length,
                                     uint32_t *out_hw_queue_id, Error **errp);
 
+bool iommufd_backend_viommu_mmap(IOMMUFDBackend *be, uint32_t viommu_id,
+                                 uint64_t size, off_t offset, void **out_ptr,
+                                 Error **errp);
+
 bool iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t hwpt_id,
                                         bool start, Error **errp);
 bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 3be7b07eec..e26675990e 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -587,6 +587,28 @@ bool iommufd_backend_alloc_hw_queue(IOMMUFDBackend *be, uint32_t viommu_id,
     return true;
 }
 
+/*
+ * Helper to mmap HW MMIO regions exposed via iommufd for a vIOMMU instance.
+ * The caller is responsible for unmapping the mapped region.
+ */
+bool iommufd_backend_viommu_mmap(IOMMUFDBackend *be, uint32_t viommu_id,
+                                 uint64_t size, off_t offset, void **out_ptr,
+                                 Error **errp)
+{
+    g_assert(viommu_id);
+    g_assert(out_ptr);
+
+    *out_ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, be->fd,
+                   offset);
+    trace_iommufd_backend_viommu_mmap(be->fd, viommu_id, size, offset);
+    if (*out_ptr == MAP_FAILED) {
+        error_setg_errno(errp, errno, "IOMMUFD vIOMMU mmap failed");
+        return false;
+    }
+
+    return true;
+}
+
 bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *idev,
                                            uint32_t hwpt_id, Error **errp)
 {
diff --git a/backends/trace-events b/backends/trace-events
index c5c1d95aad..b63420b73e 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -25,6 +25,7 @@ iommufd_backend_alloc_viommu(int iommufd, uint32_t dev_id, uint32_t type, uint32
 iommufd_backend_alloc_vdev(int iommufd, uint32_t dev_id, uint32_t viommu_id, uint64_t virt_id, uint32_t vdev_id, int ret) " iommufd=%d dev_id=%u viommu_id=%u virt_id=0x%"PRIx64" vdev_id=%u (%d)"
 iommufd_viommu_alloc_eventq(int iommufd, uint32_t viommu_id, uint32_t type, uint32_t veventq_id, uint32_t veventq_fd, int ret) " iommufd=%d viommu_id=%u type=%u veventq_id=%u veventq_fd=%u (%d)"
 iommufd_backend_alloc_hw_queue(int iommufd, uint32_t viommu_id, uint32_t queue_type, uint32_t index, uint64_t addr, uint64_t size, uint32_t queue_id, int ret) " iommufd=%d viommu_id=%u queue_type=%u index=%u addr=0x%"PRIx64" size=0x%"PRIx64" queue_id=%u (%d)"
+iommufd_backend_viommu_mmap(int iommufd, uint32_t viommu_id, uint64_t size, uint64_t offset) " iommufd=%d viommu_id=%u size=0x%"PRIx64" offset=0x%"PRIx64
 
 # igvm-cfg.c
 igvm_reset_enter(int type) "type=%u"
-- 
2.43.0




* [PATCH v4 05/31] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (3 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 04/31] backends/iommufd: Introduce iommufd_backend_viommu_mmap Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-04 15:00   ` Eric Auger
  2026-05-04 18:16   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 06/31] hw/arm/smmuv3-accel: Introduce CMDQV ops interface Shameer Kolothum
                   ` (25 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

The viommu field is assigned but never used. Callers freeing the
veventq already have access to the IOMMUFDViommu object through other
references, so this field is redundant.

Removing it also simplifies upcoming changes where veventq is
allocated based on the viommu id before the IOMMUFDViommu object is
created (e.g. vendor CMDQV-based veventq allocation).

No functional change.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 include/system/iommufd.h | 1 -
 hw/arm/smmuv3-accel.c    | 1 -
 2 files changed, 2 deletions(-)

diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index 38cfceca84..b6599521b8 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -58,7 +58,6 @@ typedef struct IOMMUFDVdev {
 
 /* Virtual event queue interface for a vIOMMU */
 typedef struct IOMMUFDVeventq {
-    IOMMUFDViommu *viommu;
     uint32_t veventq_id;
     uint32_t veventq_fd;
     uint32_t last_event_seq; /* Sequence number of last processed event */
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index c356ff9708..f65e654adf 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -549,7 +549,6 @@ bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp)
     veventq = g_new0(IOMMUFDVeventq, 1);
     veventq->veventq_id = veventq_id;
     veventq->veventq_fd = veventq_fd;
-    veventq->viommu = accel->viommu;
     accel->veventq = veventq;
 
     /* Set up event handler for veventq fd */
-- 
2.43.0




* [PATCH v4 06/31] hw/arm/smmuv3-accel: Introduce CMDQV ops interface
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (4 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 05/31] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-04 15:19   ` Eric Auger
  2026-05-04 18:28   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 07/31] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub Shameer Kolothum
                   ` (24 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Command Queue Virtualization (CMDQV) is a hardware extension available
on certain platforms that allows the SMMUv3 command queue to be
virtualized and passed through to a VM, improving performance.

For example, NVIDIA Tegra241 implements CMDQV to support virtualization
of multiple command queues (VCMDQs).

The term CMDQV is used here generically to refer to any platform that
provides hardware support to virtualize the SMMUv3 command queue.

CMDQV support is a specialization of the IOMMUFD-backed accelerated
SMMUv3 path. Introduce an ops interface to factor out CMDQV-specific
probe, initialization, and vIOMMU allocation logic from the base
implementation. The ops pointer and associated state are stored in
the accelerated SMMUv3 state.

This provides an extensible design to support future vendor-specific
CMDQV implementations.
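
To illustrate the ops-table idea in general terms, here is a reduced
sketch of how a base path could dispatch to an optional vendor backend.
Every type and function name is hypothetical and much smaller than the
interface this patch adds; the NULL checks in particular are an
assumption about how an absent backend would be handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SketchSmmu SketchSmmu;

/* Hypothetical, reduced ops table: two callbacks instead of five. */
typedef struct SketchCmdqvOps {
    bool (*probe)(SketchSmmu *s);
    void (*reset)(SketchSmmu *s);
} SketchCmdqvOps;

struct SketchSmmu {
    const SketchCmdqvOps *cmdqv_ops; /* NULL when no backend is present */
    int reset_count;
};

static bool sketch_vendor_probe(SketchSmmu *s)
{
    (void)s;
    return true;
}

static void sketch_vendor_reset(SketchSmmu *s)
{
    s->reset_count++;
}

static const SketchCmdqvOps sketch_vendor_ops = {
    .probe = sketch_vendor_probe,
    .reset = sketch_vendor_reset,
};

/* Base path: only dispatch when a backend and the callback exist. */
static bool sketch_accel_probe_cmdqv(SketchSmmu *s)
{
    return s->cmdqv_ops && s->cmdqv_ops->probe && s->cmdqv_ops->probe(s);
}

static void sketch_accel_reset(SketchSmmu *s)
{
    if (s->cmdqv_ops && s->cmdqv_ops->reset) {
        s->cmdqv_ops->reset(s);
    }
}

/* Small drivers so the behaviour can be checked without extra setup. */
static bool sketch_demo_probe(bool with_backend)
{
    SketchSmmu s = { .cmdqv_ops = with_backend ? &sketch_vendor_ops : NULL };
    return sketch_accel_probe_cmdqv(&s);
}

static int sketch_demo_reset_count(bool with_backend)
{
    SketchSmmu s = { .cmdqv_ops = with_backend ? &sketch_vendor_ops : NULL };
    sketch_accel_reset(&s);
    return s.reset_count;
}
```

With no backend registered, the base path behaves exactly as before,
which matches the "No functional change" expectation for this step.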

No functional change.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/smmuv3-accel.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index 7b4a0be000..86301afcb4 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -10,11 +10,28 @@
 #define HW_ARM_SMMUV3_ACCEL_H
 
 #include "hw/arm/smmu-common.h"
+#include "hw/arm/smmuv3.h"
 #include "system/iommufd.h"
 #ifdef CONFIG_LINUX
 #include <linux/iommufd.h>
 #endif
 
+/*
+ * CMDQ-Virtualization (CMDQV) hardware support extends the SMMUv3 to
+ * support multiple VCMDQs with virtualization capabilities.
+ * CMDQV specific behavior is factored behind this ops interface.
+ */
+typedef struct SMMUv3AccelCmdqvOps {
+    bool (*probe)(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev, Error **errp);
+    bool (*init)(SMMUv3State *s, Error **errp);
+    bool (*alloc_viommu)(SMMUv3State *s,
+                         HostIOMMUDeviceIOMMUFD *idev,
+                         uint32_t *out_viommu_id,
+                         Error **errp);
+    void (*free_viommu)(SMMUv3State *s);
+    void (*reset)(SMMUv3State *s);
+} SMMUv3AccelCmdqvOps;
+
 /*
  * Represents an accelerated SMMU instance backed by an iommufd vIOMMU object.
  * Holds bypass and abort proxy HWPT IDs used for device attachment.
@@ -27,6 +44,7 @@ typedef struct SMMUv3AccelState {
     QLIST_HEAD(, SMMUv3AccelDevice) device_list;
     bool auto_mode;
     bool auto_finalised;
+    const SMMUv3AccelCmdqvOps *cmdqv_ops;
 } SMMUv3AccelState;
 
 typedef struct SMMUS1Hwpt {
-- 
2.43.0




* [PATCH v4 07/31] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (5 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 06/31] hw/arm/smmuv3-accel: Introduce CMDQV ops interface Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-04 15:19   ` Eric Auger
  2026-05-04 18:23   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 08/31] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle Shameer Kolothum
                   ` (23 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Introduce a Tegra241 CMDQV backend that plugs into the SMMUv3 accelerated
CMDQV ops interface.

This patch wires up the Tegra241 CMDQV backend and provides a stub
implementation for CMDQV probe, initialization, vIOMMU allocation
and reset handling.

Functional CMDQV support is added in follow-up patches.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/tegra241-cmdqv.h       | 15 ++++++++++
 hw/arm/tegra241-cmdqv-stubs.c | 16 ++++++++++
 hw/arm/tegra241-cmdqv.c       | 56 +++++++++++++++++++++++++++++++++++
 hw/arm/Kconfig                |  5 ++++
 hw/arm/meson.build            |  2 ++
 5 files changed, 94 insertions(+)
 create mode 100644 hw/arm/tegra241-cmdqv.h
 create mode 100644 hw/arm/tegra241-cmdqv-stubs.c
 create mode 100644 hw/arm/tegra241-cmdqv.c

diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
new file mode 100644
index 0000000000..07e10e86ee
--- /dev/null
+++ b/hw/arm/tegra241-cmdqv.h
@@ -0,0 +1,15 @@
+/*
+ * Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ * NVIDIA Tegra241 CMDQ-Virtualization extension for SMMUv3
+ *
+ * Written by Nicolin Chen, Shameer Kolothum
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HW_ARM_TEGRA241_CMDQV_H
+#define HW_ARM_TEGRA241_CMDQV_H
+
+const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
+
+#endif /* HW_ARM_TEGRA241_CMDQV_H */
diff --git a/hw/arm/tegra241-cmdqv-stubs.c b/hw/arm/tegra241-cmdqv-stubs.c
new file mode 100644
index 0000000000..eabf90daf8
--- /dev/null
+++ b/hw/arm/tegra241-cmdqv-stubs.c
@@ -0,0 +1,16 @@
+/*
+ * Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ *
+ * Stubs for Tegra241 CMDQ-Virtualization extension for SMMUv3
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "smmuv3-accel.h"
+#include "hw/arm/tegra241-cmdqv.h"
+
+const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void)
+{
+    return NULL;
+}
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
new file mode 100644
index 0000000000..ad5a0d4611
--- /dev/null
+++ b/hw/arm/tegra241-cmdqv.c
@@ -0,0 +1,56 @@
+/*
+ * Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ * NVIDIA Tegra241 CMDQ-Virtualization extension for SMMUv3
+ *
+ * Written by Nicolin Chen, Shameer Kolothum
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+
+#include "hw/arm/smmuv3.h"
+#include "smmuv3-accel.h"
+#include "tegra241-cmdqv.h"
+
+static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
+{
+}
+
+static bool
+tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
+                            uint32_t *out_viommu_id, Error **errp)
+{
+    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
+    return false;
+}
+
+static void tegra241_cmdqv_reset(SMMUv3State *s)
+{
+}
+
+static bool tegra241_cmdqv_init(SMMUv3State *s, Error **errp)
+{
+    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
+    return false;
+}
+
+static bool tegra241_cmdqv_probe(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
+                                 Error **errp)
+{
+    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
+    return false;
+}
+
+static const SMMUv3AccelCmdqvOps tegra241_cmdqv_ops = {
+    .probe = tegra241_cmdqv_probe,
+    .init = tegra241_cmdqv_init,
+    .alloc_viommu = tegra241_cmdqv_alloc_viommu,
+    .free_viommu = tegra241_cmdqv_free_viommu,
+    .reset = tegra241_cmdqv_reset,
+};
+
+const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void)
+{
+    return &tegra241_cmdqv_ops;
+}
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 4e50fb1111..073f2ebaaf 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -618,6 +618,10 @@ config FSL_IMX8MP_EVK
     depends on TCG
     select FSL_IMX8MP
 
+config TEGRA241_CMDQV
+    bool
+    depends on ARM_SMMUV3_ACCEL
+
 config ARM_SMMUV3_ACCEL
     bool
     depends on ARM_SMMUV3
@@ -625,6 +629,7 @@ config ARM_SMMUV3_ACCEL
 config ARM_SMMUV3
     bool
     select ARM_SMMUV3_ACCEL if IOMMUFD
+    imply TEGRA241_CMDQV
 
 config FSL_IMX6UL
     bool
diff --git a/hw/arm/meson.build b/hw/arm/meson.build
index 3be1252c4f..64bcdc5a7c 100644
--- a/hw/arm/meson.build
+++ b/hw/arm/meson.build
@@ -87,6 +87,8 @@ arm_common_ss.add(when: 'CONFIG_FSL_IMX8MP_EVK', if_true: files('imx8mp-evk.c'))
 arm_common_ss.add(when: 'CONFIG_ARM_SMMUV3', if_true: files('smmuv3.c'))
 arm_common_ss.add(when: 'CONFIG_ARM_SMMUV3_ACCEL', if_true: files('smmuv3-accel.c'))
 stub_ss.add(files('smmuv3-accel-stubs.c'))
+arm_common_ss.add(when: 'CONFIG_TEGRA241_CMDQV', if_true: files('tegra241-cmdqv.c'))
+stub_ss.add(files('tegra241-cmdqv-stubs.c'))
 arm_common_ss.add(when: 'CONFIG_FSL_IMX6UL', if_true: files('fsl-imx6ul.c', 'mcimx6ul-evk.c'))
 arm_common_ss.add(when: 'CONFIG_NRF51_SOC', if_true: files('nrf51_soc.c'))
 arm_common_ss.add(when: 'CONFIG_XEN', if_true: files(
-- 
2.43.0




* [PATCH v4 08/31] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (6 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 07/31] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-04 15:33   ` Eric Auger
  2026-05-04 18:38   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 09/31] hw/arm/virt: Use stored SMMUv3 device list for IORT build Shameer Kolothum
                   ` (22 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Add support for selecting and initializing a CMDQV backend based on the
cmdqv OnOffAuto property.

If set to OFF, CMDQV is not used and the default IOMMUFD-backed allocation
path is taken.

If set to AUTO, QEMU attempts to probe a CMDQV backend during device setup.
If probing succeeds, the selected ops are stored in the accelerated SMMUv3
state and used. If probing fails, QEMU silently falls back to the default
path.

If set to ON, QEMU requires CMDQV support. Probing is performed during
setup and failure results in an error.

When a CMDQV backend is active, its callbacks are used for vIOMMU
allocation, free, and reset handling. Otherwise, the base implementation
is used.

The current implementation wires up the Tegra241 CMDQV backend through the
generic ops interface. Functional CMDQV behaviour is added in subsequent
patches.
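
The OFF/AUTO/ON policy above can be summarized as a small decision
function. This is a simplified sketch with hypothetical Demo* names; it
models only the probe result and leaves out the backend init step that the
real code runs after a successful probe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the tri-state property; not QEMU's OnOffAuto. */
typedef enum { DEMO_OFF, DEMO_AUTO, DEMO_ON } DemoOnOffAuto;

/* What device setup ends up doing for a given mode and probe result. */
typedef enum {
    DEMO_USE_BASE,  /* default IOMMUFD-backed allocation path */
    DEMO_USE_CMDQV, /* CMDQV backend callbacks are used */
    DEMO_FAIL       /* setup reports an error */
} DemoOutcome;

static DemoOutcome demo_select_cmdqv(DemoOnOffAuto mode, bool probe_ok)
{
    switch (mode) {
    case DEMO_OFF:
        /* No probe is attempted; probe_ok is ignored. */
        return DEMO_USE_BASE;
    case DEMO_AUTO:
        /* A failed probe is not an error; fall back silently. */
        return probe_ok ? DEMO_USE_CMDQV : DEMO_USE_BASE;
    case DEMO_ON:
        /* The user required CMDQV, so a failed probe is fatal. */
        return probe_ok ? DEMO_USE_CMDQV : DEMO_FAIL;
    }
    return DEMO_FAIL; /* unreachable for valid enum values */
}
```

Only the ON case can turn a failed probe into a setup failure in this
sketch; AUTO and OFF always leave a usable path.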

No functional change.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 include/hw/arm/smmuv3.h |  2 +
 hw/arm/smmuv3-accel.c   | 93 +++++++++++++++++++++++++++++++++++++----
 2 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index fe0493c1aa..aa6a79237a 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -74,6 +74,8 @@ struct SMMUv3State {
     OnOffAuto ats;
     OasMode oas;
     SsidSizeMode ssidsize;
+    /* SMMU CMDQV extension */
+    OnOffAuto cmdqv;
 
     Notifier machine_done;
 };
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index f65e654adf..9068e65e2b 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -18,6 +18,7 @@
 
 #include "smmuv3-internal.h"
 #include "smmuv3-accel.h"
+#include "tegra241-cmdqv.h"
 
 /*
  * The root region aliases the global system memory, and shared_as_sysmem
@@ -566,6 +567,7 @@ smmuv3_accel_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
                           Error **errp)
 {
     SMMUv3AccelState *accel = s->s_accel;
+    const SMMUv3AccelCmdqvOps *cmdqv_ops = accel->cmdqv_ops;
     struct iommu_hwpt_arm_smmuv3 bypass_data = {
         .ste = { SMMU_STE_CFG_BYPASS | SMMU_STE_VALID, 0x0ULL },
     };
@@ -576,10 +578,17 @@ smmuv3_accel_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
     uint32_t viommu_id, hwpt_id;
     IOMMUFDViommu *viommu;
 
-    if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
-                                      IOMMU_VIOMMU_TYPE_ARM_SMMUV3, s2_hwpt_id,
-                                      NULL, 0, &viommu_id, errp)) {
-        return false;
+    if (cmdqv_ops) {
+        if (!cmdqv_ops->alloc_viommu(s, idev, &viommu_id, errp)) {
+            return false;
+        }
+    } else {
+        if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
+                                          IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
+                                          s2_hwpt_id, NULL, 0, &viommu_id,
+                                          errp)) {
+            return false;
+        }
     }
 
     viommu = g_new0(IOMMUFDViommu, 1);
@@ -625,12 +634,69 @@ free_bypass_hwpt:
 free_abort_hwpt:
     iommufd_backend_free_id(idev->iommufd, accel->abort_hwpt_id);
 free_viommu:
-    iommufd_backend_free_id(idev->iommufd, viommu->viommu_id);
+    if (cmdqv_ops && cmdqv_ops->free_viommu) {
+        cmdqv_ops->free_viommu(s);
+    } else {
+        iommufd_backend_free_id(idev->iommufd, viommu->viommu_id);
+    }
     g_free(viommu);
     accel->viommu = NULL;
     return false;
 }
 
+static const SMMUv3AccelCmdqvOps *
+smmuv3_accel_probe_cmdqv(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
+                          Error **errp)
+{
+    const SMMUv3AccelCmdqvOps *ops = tegra241_cmdqv_get_ops();
+
+    if (!ops || !ops->probe) {
+        error_setg(errp, "No CMDQV ops found");
+        return NULL;
+    }
+
+    if (!ops->probe(s, idev, errp)) {
+        return NULL;
+    }
+    return ops;
+}
+
+static bool
+smmuv3_accel_select_cmdqv(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
+                          Error **errp)
+{
+    const SMMUv3AccelCmdqvOps *ops = NULL;
+
+    if (s->s_accel->cmdqv_ops) {
+        return true;
+    }
+
+    switch (s->cmdqv) {
+    case ON_OFF_AUTO_OFF:
+        s->s_accel->cmdqv_ops = NULL;
+        return true;
+    case ON_OFF_AUTO_AUTO:
+        ops = smmuv3_accel_probe_cmdqv(s, idev, NULL);
+        break;
+    case ON_OFF_AUTO_ON:
+        ops = smmuv3_accel_probe_cmdqv(s, idev, errp);
+        if (!ops) {
+            error_append_hint(errp, "CMDQV requested but not supported\n");
+            return false;
+        }
+        s->s_accel->cmdqv_ops = ops;
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    if (ops && ops->init && !ops->init(s, errp)) {
+        return false;
+    }
+    s->s_accel->cmdqv_ops = ops;
+    return true;
+}
+
 static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
                                           HostIOMMUDevice *hiod, Error **errp)
 {
@@ -665,6 +731,10 @@ static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
         goto done;
     }
 
+    if (!smmuv3_accel_select_cmdqv(s, idev, errp)) {
+        return false;
+    }
+
     if (!smmuv3_accel_alloc_viommu(s, idev, errp)) {
         error_append_hint(errp, "Unable to alloc vIOMMU: idev devid 0x%x: ",
                           idev->devid);
@@ -936,8 +1006,17 @@ bool smmuv3_accel_attach_gbpa_hwpt(SMMUv3State *s, Error **errp)
 
 void smmuv3_accel_reset(SMMUv3State *s)
 {
-     /* Attach a HWPT based on GBPA reset value */
-     smmuv3_accel_attach_gbpa_hwpt(s, NULL);
+    SMMUv3AccelState *accel = s->s_accel;
+
+    if (!accel) {
+        return;
+    }
+    /* Attach a HWPT based on GBPA reset value */
+    smmuv3_accel_attach_gbpa_hwpt(s, NULL);
+
+    if (accel->cmdqv_ops && accel->cmdqv_ops->reset) {
+        accel->cmdqv_ops->reset(s);
+    }
 }
 
 static void smmuv3_accel_as_init(SMMUv3State *s)
-- 
2.43.0




* [PATCH v4 09/31] hw/arm/virt: Use stored SMMUv3 device list for IORT build
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (7 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 08/31] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-04 18:46   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 10/31] hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support Shameer Kolothum
                   ` (21 subsequent siblings)
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Introduce a GPtrArray in VirtMachineState to track all SMMUv3 devices
created on the virt machine, and use it when building the IORT table
instead of relying on object_child_foreach_recursive() walks of the
object tree.

This avoids recursive object traversal and provides a foundation for
subsequent patches that need direct access to SMMUv3 instances for
CMDQV-related handling.
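
For each device in the stored list, the IORT builder derives one ID mapping
from the PCI bus range behind that SMMUv3. The arithmetic can be sketched in
isolation as follows; DemoIdMap and demo_idmap_for_bus_range() are
hypothetical names, and the sketch assumes the usual PCI requester ID
layout of bus number in bits 15:8 and devfn in bits 7:0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reduced form of an IORT ID mapping entry. */
typedef struct DemoIdMap {
    uint32_t input_base; /* first requester ID covered */
    uint32_t id_count;   /* number of requester IDs covered */
} DemoIdMap;

/*
 * Each bus contributes 256 requester IDs (one per devfn), so a bus
 * range [min_bus, max_bus] maps to a contiguous block of IDs.
 */
static DemoIdMap demo_idmap_for_bus_range(int min_bus, int max_bus)
{
    DemoIdMap map;

    assert(min_bus >= 0 && max_bus >= min_bus && max_bus <= 255);
    map.input_base = (uint32_t)min_bus << 8;
    map.id_count = (uint32_t)(max_bus - min_bus + 1) << 8;
    return map;
}
```

A range covering buses 1 and 2, for example, starts at requester ID 0x100
and spans 0x200 IDs.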

No functional change. No bios-tables qtest failures observed.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 include/hw/arm/virt.h    |  1 +
 hw/arm/virt-acpi-build.c | 70 ++++++++++++++++++----------------------
 hw/arm/virt.c            |  3 ++
 3 files changed, 35 insertions(+), 39 deletions(-)

diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 5fcbd1c76f..a840a97de8 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -187,6 +187,7 @@ struct VirtMachineState {
     MemoryRegion *sysmem;
     MemoryRegion *secure_sysmem;
     bool pci_preserve_config;
+    GPtrArray *smmuv3_devices;
 };
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 591cfc993c..521443de87 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -385,49 +385,41 @@ static int smmuv3_dev_idmap_compare(gconstpointer a, gconstpointer b)
     return map_a->input_base - map_b->input_base;
 }
 
-static int iort_smmuv3_devices(Object *obj, void *opaque)
-{
-    VirtMachineState *vms = VIRT_MACHINE(qdev_get_machine());
-    AcpiIortSMMUv3Dev sdev = {0};
-    GArray *sdev_blob = opaque;
-    AcpiIortIdMapping idmap;
-    PlatformBusDevice *pbus;
-    int min_bus, max_bus;
-    SysBusDevice *sbdev;
-    PCIBus *bus;
-
-    if (!object_dynamic_cast(obj, TYPE_ARM_SMMUV3)) {
-        return 0;
-    }
-
-    bus = PCI_BUS(object_property_get_link(obj, "primary-bus", &error_abort));
-    sdev.accel = object_property_get_bool(obj, "accel", &error_abort);
-    sdev.ats = smmuv3_ats_enabled(ARM_SMMUV3(obj));
-    pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
-    sbdev = SYS_BUS_DEVICE(obj);
-    sdev.base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
-    sdev.base += vms->memmap[VIRT_PLATFORM_BUS].base;
-    sdev.irq = platform_bus_get_irqn(pbus, sbdev, 0);
-    sdev.irq += vms->irqmap[VIRT_PLATFORM_BUS];
-    sdev.irq += ARM_SPI_BASE;
-
-    pci_bus_range(bus, &min_bus, &max_bus);
-    sdev.rc_smmu_idmaps = g_array_new(false, true, sizeof(AcpiIortIdMapping));
-    idmap.input_base = min_bus << 8,
-    idmap.id_count = (max_bus - min_bus + 1) << 8,
-    g_array_append_val(sdev.rc_smmu_idmaps, idmap);
-    g_array_append_val(sdev_blob, sdev);
-    return 0;
-}
-
 /*
  * Populate the struct AcpiIortSMMUv3Dev for all SMMUv3 devices and
  * return the total number of idmaps.
  */
-static int populate_smmuv3_dev(GArray *sdev_blob)
+static int populate_smmuv3_dev(VirtMachineState *vms, GArray *sdev_blob)
 {
-    object_child_foreach_recursive(object_get_root(),
-                                   iort_smmuv3_devices, sdev_blob);
+    for (int i = 0; i < vms->smmuv3_devices->len; i++) {
+        Object *obj = OBJECT(g_ptr_array_index(vms->smmuv3_devices, i));
+        AcpiIortSMMUv3Dev sdev = {0};
+        AcpiIortIdMapping idmap;
+        PlatformBusDevice *pbus;
+        int min_bus, max_bus;
+        SysBusDevice *sbdev;
+        PCIBus *bus;
+
+        bus = PCI_BUS(object_property_get_link(obj, "primary-bus",
+                                               &error_abort));
+        sdev.accel = object_property_get_bool(obj, "accel", &error_abort);
+        sdev.ats = smmuv3_ats_enabled(ARM_SMMUV3(obj));
+        pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
+        sbdev = SYS_BUS_DEVICE(obj);
+        sdev.base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
+        sdev.base += vms->memmap[VIRT_PLATFORM_BUS].base;
+        sdev.irq = platform_bus_get_irqn(pbus, sbdev, 0);
+        sdev.irq += vms->irqmap[VIRT_PLATFORM_BUS];
+        sdev.irq += ARM_SPI_BASE;
+
+        pci_bus_range(bus, &min_bus, &max_bus);
+        sdev.rc_smmu_idmaps = g_array_new(false, true,
+                                          sizeof(AcpiIortIdMapping));
+        idmap.input_base = min_bus << 8;
+        idmap.id_count = (max_bus - min_bus + 1) << 8;
+        g_array_append_val(sdev.rc_smmu_idmaps, idmap);
+        g_array_append_val(sdev_blob, sdev);
+    }
     /* Sort the smmuv3 devices(if any) by smmu idmap input_base */
     g_array_sort(sdev_blob, smmuv3_dev_idmap_compare);
     /*
@@ -561,7 +553,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     if (vms->legacy_smmuv3_present) {
         rc_smmu_idmaps_len = populate_smmuv3_legacy_dev(smmuv3_devs);
     } else {
-        rc_smmu_idmaps_len = populate_smmuv3_dev(smmuv3_devs);
+        rc_smmu_idmaps_len = populate_smmuv3_dev(vms, smmuv3_devs);
     }
 
     num_smmus = smmuv3_devs->len;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ec0d8475ca..68464ceb14 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3261,6 +3261,7 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
             }
 
             create_smmuv3_dev_dtb(vms, dev, bus, errp);
+            g_ptr_array_add(vms->smmuv3_devices, dev);
         }
     }
 
@@ -3697,6 +3698,8 @@ static void virt_instance_init(Object *obj)
     vms->oem_id = g_strndup(ACPI_BUILD_APPNAME6, 6);
     vms->oem_table_id = g_strndup(ACPI_BUILD_APPNAME8, 8);
     cxl_machine_init(obj, &vms->cxl_devices_state);
+
+    vms->smmuv3_devices = g_ptr_array_new_with_free_func(NULL);
 }
 
 static const TypeInfo virt_machine_info = {
-- 
2.43.0




* [PATCH v4 10/31] hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (8 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 09/31] hw/arm/virt: Use stored SMMUv3 device list for IORT build Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-04 18:49   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 11/31] hw/arm/tegra241-cmdqv: Implement CMDQV init Shameer Kolothum
                   ` (20 subsequent siblings)
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Use IOMMU_GET_HW_INFO to query host support for Tegra241 CMDQV.

Validate the returned data type, version, and minimum number of vCMDQs and
SIDs per Tegra241 CMDQ Virtual Interface (VI). Fail the probe if the host
does not meet these requirements.

The QEMU model supports one Virtual Interface (VI) per VM with 2 vCMDQs and
16 SIDs per VI, so the probe ensures the host implementation is compatible
with these limits.
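
The capacity check is expressed in log2 units: 2 vCMDQs corresponds to a
log2 value of 1 and 16 SIDs to a log2 value of 4. A standalone sketch of
the comparison, using hypothetical DEMO_* names that mirror the limits
stated above and leaving out the data type check, might look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Limits of the modelled Virtual Interface, as stated above. */
#define DEMO_CMDQV_VER                 1
#define DEMO_NUM_CMDQ_LOG2             1 /* 2 vCMDQs */
#define DEMO_NUM_SID_PER_VI_LOG2       4 /* 16 SIDs */

/* Convert a log2 capacity into an element count. */
static uint32_t demo_count_from_log2(uint32_t log2)
{
    assert(log2 < 32);
    return UINT32_C(1) << log2;
}

/*
 * The version must match exactly; capacities only need to be at least
 * what the model exposes, so a larger host still passes.
 */
static bool demo_host_compatible(uint32_t version, uint32_t log2vcmdqs,
                                 uint32_t log2vsids)
{
    return version == DEMO_CMDQV_VER &&
           log2vcmdqs >= DEMO_NUM_CMDQ_LOG2 &&
           log2vsids >= DEMO_NUM_SID_PER_VI_LOG2;
}
```

Comparing the log2 values directly avoids converting to counts first and
gives the same ordering, since the conversion is monotonic.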

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/tegra241-cmdqv.h |  4 ++++
 hw/arm/tegra241-cmdqv.c | 32 ++++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 07e10e86ee..c1866084f8 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -10,6 +10,10 @@
 #ifndef HW_ARM_TEGRA241_CMDQV_H
 #define HW_ARM_TEGRA241_CMDQV_H
 
+#define CMDQV_VER                 1
+#define CMDQV_NUM_CMDQ_LOG2       1
+#define CMDQV_NUM_SID_PER_VI_LOG2 4
+
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
 
 #endif /* HW_ARM_TEGRA241_CMDQV_H */
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index ad5a0d4611..3a19a1af56 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -38,8 +38,36 @@ static bool tegra241_cmdqv_init(SMMUv3State *s, Error **errp)
 static bool tegra241_cmdqv_probe(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
                                  Error **errp)
 {
-    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
-    return false;
+    uint32_t data_type = IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV;
+    struct iommu_hw_info_tegra241_cmdqv cmdqv_info;
+    uint64_t caps;
+
+    if (!iommufd_backend_get_device_info(idev->iommufd, idev->devid, &data_type,
+                                         &cmdqv_info, sizeof(cmdqv_info), &caps,
+                                         NULL, errp)) {
+        return false;
+    }
+    if (data_type != IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV) {
+        error_setg(errp, "Host CMDQV: unexpected data type %u (expected %u)",
+                   data_type, IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV);
+        return false;
+    }
+    if (cmdqv_info.version != CMDQV_VER) {
+        error_setg(errp, "Host CMDQV: unsupported version %u (expected %u)",
+                   cmdqv_info.version, CMDQV_VER);
+        return false;
+    }
+    if (cmdqv_info.log2vcmdqs < CMDQV_NUM_CMDQ_LOG2) {
+        error_setg(errp, "Host CMDQV: insufficient vCMDQs log2=%u (need >= %u)",
+                   cmdqv_info.log2vcmdqs, CMDQV_NUM_CMDQ_LOG2);
+        return false;
+    }
+    if (cmdqv_info.log2vsids < CMDQV_NUM_SID_PER_VI_LOG2) {
+        error_setg(errp, "Host CMDQV: insufficient SIDs log2=%u (need >= %u)",
+                   cmdqv_info.log2vsids, CMDQV_NUM_SID_PER_VI_LOG2);
+        return false;
+    }
+    return true;
 }
 
 static const SMMUv3AccelCmdqvOps tegra241_cmdqv_ops = {
-- 
2.43.0




* [PATCH v4 11/31] hw/arm/tegra241-cmdqv: Implement CMDQV init
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (9 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 10/31] hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 12/31] hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus Shameer Kolothum
                   ` (19 subsequent siblings)
  30 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

Tegra241 CMDQV extends SMMUv3 with support for virtual command queues
(VCMDQs) exposed via a CMDQV MMIO region. The CMDQV MMIO space is split
into 64KB pages:

0x00000  (CMDQ-V Config page)
0x10000  (CMDQ-V CMDQ Page0)
0x20000  (CMDQ-V CMDQ Page1)
0x30000  (Virtual Interface Page0)
0x40000  (Virtual Interface Page1)
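
Given this layout, an MMIO offset splits into a page index and an offset
within that page. The following standalone sketch shows the decoding; the
Demo* names are hypothetical, and the handlers added later in the series
may structure this differently.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 0x10000u /* 64KB pages */
#define DEMO_IO_LEN    0x50000u /* five pages in total */

/* Page indices in the order listed in the layout above. */
typedef enum {
    DEMO_PAGE_CONFIG = 0,
    DEMO_PAGE_CMDQ0,
    DEMO_PAGE_CMDQ1,
    DEMO_PAGE_VI0,
    DEMO_PAGE_VI1,
    DEMO_PAGE_INVALID
} DemoPage;

/* Map an offset into the CMDQV MMIO space to the page that holds it. */
static DemoPage demo_page_of(uint64_t offset)
{
    if (offset >= DEMO_IO_LEN) {
        return DEMO_PAGE_INVALID;
    }
    return (DemoPage)(offset / DEMO_PAGE_SIZE);
}

/* Offset of the access relative to the start of its page. */
static uint64_t demo_offset_in_page(uint64_t offset)
{
    return offset % DEMO_PAGE_SIZE;
}
```

For example, offset 0x30008 falls in Virtual Interface Page0 at offset 0x8
within that page.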

This patch wires up the Tegra241 CMDQV init callback and allocates
vendor-specific CMDQV state. The state pointer is stored in
SMMUv3AccelState for use by subsequent CMDQV operations.

The CMDQV MMIO region and a dedicated IRQ line are registered with the
SMMUv3 device. The MMIO read/write handlers are currently stubs and will
be implemented in later patches.

The CMDQV interrupt is edge-triggered and indicates VCMDQ or VINTF
error conditions. This patch only registers the IRQ line. Interrupt
generation and propagation to the guest will be added in a subsequent
patch.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/smmuv3-accel.h   |  1 +
 hw/arm/tegra241-cmdqv.h | 18 ++++++++++++++++++
 hw/arm/tegra241-cmdqv.c | 30 ++++++++++++++++++++++++++++--
 3 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index 86301afcb4..28bceca061 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -45,6 +45,7 @@ typedef struct SMMUv3AccelState {
     bool auto_mode;
     bool auto_finalised;
     const SMMUv3AccelCmdqvOps *cmdqv_ops;
+    void *cmdqv;  /* vendor specific CMDQV state */
 } SMMUv3AccelState;
 
 typedef struct SMMUS1Hwpt {
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index c1866084f8..2a34a4b6b4 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -14,6 +14,24 @@
 #define CMDQV_NUM_CMDQ_LOG2       1
 #define CMDQV_NUM_SID_PER_VI_LOG2 4
 
+/*
+ * Tegra241 CMDQV MMIO layout (64KB pages)
+ *
+ * 0x00000  (CMDQ-V Config page)
+ * 0x10000  (CMDQ-V CMDQ Page0)
+ * 0x20000  (CMDQ-V CMDQ Page1)
+ * 0x30000  (Virtual Interface Page0)
+ * 0x40000  (Virtual Interface Page1)
+ */
+#define TEGRA241_CMDQV_IO_LEN 0x50000
+
+typedef struct Tegra241CMDQV {
+    struct iommu_viommu_tegra241_cmdqv cmdqv_data;
+    SMMUv3AccelState *s_accel;
+    MemoryRegion mmio_cmdqv;
+    qemu_irq irq;
+} Tegra241CMDQV;
+
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
 
 #endif /* HW_ARM_TEGRA241_CMDQV_H */
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 3a19a1af56..ccd3c6d275 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -13,6 +13,16 @@
 #include "smmuv3-accel.h"
 #include "tegra241-cmdqv.h"
 
+static uint64_t tegra241_cmdqv_read(void *opaque, hwaddr offset, unsigned size)
+{
+    return 0;
+}
+
+static void tegra241_cmdqv_write(void *opaque, hwaddr offset, uint64_t value,
+                                 unsigned size)
+{
+}
+
 static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
 {
 }
@@ -29,10 +39,26 @@ static void tegra241_cmdqv_reset(SMMUv3State *s)
 {
 }
 
+static const MemoryRegionOps mmio_cmdqv_ops = {
+    .read = tegra241_cmdqv_read,
+    .write = tegra241_cmdqv_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+};
+
 static bool tegra241_cmdqv_init(SMMUv3State *s, Error **errp)
 {
-    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
-    return false;
+    SysBusDevice *sbd = SYS_BUS_DEVICE(OBJECT(s));
+    SMMUv3AccelState *accel = s->s_accel;
+    Tegra241CMDQV *cmdqv;
+
+    cmdqv = g_new0(Tegra241CMDQV, 1);
+    memory_region_init_io(&cmdqv->mmio_cmdqv, OBJECT(s), &mmio_cmdqv_ops, cmdqv,
+                          "tegra241-cmdqv", TEGRA241_CMDQV_IO_LEN);
+    sysbus_init_mmio(sbd, &cmdqv->mmio_cmdqv);
+    sysbus_init_irq(sbd, &cmdqv->irq);
+    cmdqv->s_accel = accel;
+    accel->cmdqv = cmdqv;
+    return true;
 }
 
 static bool tegra241_cmdqv_probe(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
-- 
2.43.0




* [PATCH v4 12/31] hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (10 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 11/31] hw/arm/tegra241-cmdqv: Implement CMDQV init Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-04 18:57   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 13/31] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free Shameer Kolothum
                   ` (18 subsequent siblings)
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

SMMUv3 devices with acceleration may enable CMDQV extensions
after device realize. In that case, additional MMIO regions and
IRQ lines may be registered but not yet mapped to the platform bus.

Ensure SMMUv3 device resources are linked to the platform bus
during machine_done().

This is safe to do unconditionally since the platform bus helpers
skip resources that are already mapped.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/virt.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 68464ceb14..6c5e51af37 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1832,6 +1832,25 @@ static void virt_build_smbios(VirtMachineState *vms)
     }
 }
 
+/*
+ * SMMUv3 devices with acceleration may enable CMDQV extensions
+ * after device realize. In that case, additional MMIO regions and
+ * IRQ lines may be registered but not yet mapped to the platform bus.
+ *
+ * Ensure all resources are linked to the platform bus before final
+ * machine setup.
+ */
+
+static void virt_smmuv3_dev_link_cmdqv(VirtMachineState *vms)
+{
+    for (int i = 0; i < vms->smmuv3_devices->len; i++) {
+        DeviceState *dev = g_ptr_array_index(vms->smmuv3_devices, i);
+
+        platform_bus_link_device(PLATFORM_BUS_DEVICE(vms->platform_bus_dev),
+                                 SYS_BUS_DEVICE(dev));
+    }
+}
+
 static
 void virt_machine_done(Notifier *notifier, void *data)
 {
@@ -1848,6 +1867,9 @@ void virt_machine_done(Notifier *notifier, void *data)
     if (vms->cxl_devices_state.is_enabled) {
         cxl_fmws_link_targets(&error_fatal);
     }
+
+    virt_smmuv3_dev_link_cmdqv(vms);
+
     /*
      * If the user provided a dtb, we assume the dynamic sysbus nodes
      * already are integrated there. This corresponds to a use case where
-- 
2.43.0




* [PATCH v4 13/31] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (11 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 12/31] hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-04 16:01   ` Eric Auger
  2026-05-04 19:54   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 14/31] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region Shameer Kolothum
                   ` (17 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

Replace the stub implementation with real vIOMMU allocation for
Tegra241 CMDQV.

Allocate a matching vEVENTQ together with the vIOMMU, since it is
specific to the Tegra241 CMDQV vIOMMU and used to receive CMDQV
events.

Free both objects on teardown.
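
For readers following along outside the QEMU tree, the paired
alloc/free ordering described above can be sketched in standalone C.
This is only an illustration under stated assumptions: SketchBackend,
sketch_alloc_id() and the other sketch_* names are hypothetical
stand-ins, not the iommufd backend API used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the iommufd backend; it only counts live IDs. */
typedef struct SketchBackend {
    uint32_t next_id;
    int live_objects;
    bool fail_veventq;      /* test knob: make the second allocation fail */
} SketchBackend;

typedef struct SketchVeventq {
    uint32_t id;
} SketchVeventq;

typedef struct SketchCmdqv {
    uint32_t viommu_id;     /* 0 means "no vIOMMU allocated" */
    SketchVeventq *veventq;
} SketchCmdqv;

static bool sketch_alloc_id(SketchBackend *be, bool fail, uint32_t *out_id)
{
    if (fail) {
        return false;
    }
    *out_id = ++be->next_id;
    be->live_objects++;
    return true;
}

static void sketch_free_id(SketchBackend *be, uint32_t id)
{
    assert(id != 0 && be->live_objects > 0);
    be->live_objects--;
}

/* Allocate the vIOMMU first, then its event queue; unwind on failure. */
static bool sketch_cmdqv_alloc(SketchBackend *be, SketchCmdqv *c)
{
    uint32_t viommu_id, veventq_id;
    SketchVeventq *veventq;

    if (!sketch_alloc_id(be, false, &viommu_id)) {
        return false;
    }
    if (!sketch_alloc_id(be, be->fail_veventq, &veventq_id)) {
        goto free_viommu;
    }
    veventq = malloc(sizeof(*veventq));
    if (!veventq) {
        sketch_free_id(be, veventq_id);
        goto free_viommu;
    }
    veventq->id = veventq_id;
    c->veventq = veventq;
    c->viommu_id = viommu_id;
    return true;

free_viommu:
    sketch_free_id(be, viommu_id);
    return false;
}

/* Tear down in reverse order: event queue first, then the vIOMMU. */
static void sketch_cmdqv_free(SketchBackend *be, SketchCmdqv *c)
{
    if (c->viommu_id == 0) {
        return;
    }
    if (c->veventq) {
        sketch_free_id(be, c->veventq->id);
        free(c->veventq);
        c->veventq = NULL;
    }
    sketch_free_id(be, c->viommu_id);
    c->viommu_id = 0;
}
```

The point of the sketch is the ordering: the caller's state is only
updated once both allocations succeed, and a failed vEVENTQ
allocation releases the vIOMMU ID before returning.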

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/tegra241-cmdqv.h |  1 +
 hw/arm/tegra241-cmdqv.c | 46 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 2a34a4b6b4..fa0aa3ab04 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -30,6 +30,7 @@ typedef struct Tegra241CMDQV {
     SMMUv3AccelState *s_accel;
     MemoryRegion mmio_cmdqv;
     qemu_irq irq;
+    IOMMUFDVeventq *veventq;
 } Tegra241CMDQV;
 
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index ccd3c6d275..2f1084b55f 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -25,13 +25,57 @@ static void tegra241_cmdqv_write(void *opaque, hwaddr offset, uint64_t value,
 
 static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
 {
+    SMMUv3AccelState *accel = s->s_accel;
+    IOMMUFDViommu *viommu = accel->viommu;
+    Tegra241CMDQV *cmdqv = accel->cmdqv;
+    IOMMUFDVeventq *veventq = cmdqv->veventq;
+
+    if (!viommu) {
+        return;
+    }
+    if (veventq) {
+        close(veventq->veventq_fd);
+        iommufd_backend_free_id(viommu->iommufd, veventq->veventq_id);
+        g_free(veventq);
+        cmdqv->veventq = NULL;
+    }
+    iommufd_backend_free_id(viommu->iommufd, viommu->viommu_id);
 }
 
 static bool
 tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
                             uint32_t *out_viommu_id, Error **errp)
 {
-    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
+    Tegra241CMDQV *cmdqv = s->s_accel->cmdqv;
+    uint32_t viommu_id, veventq_id, veventq_fd;
+    IOMMUFDVeventq *veventq;
+
+    if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
+                                      IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV,
+                                      idev->hwpt_id, &cmdqv->cmdqv_data,
+                                      sizeof(cmdqv->cmdqv_data), &viommu_id,
+                                      errp)) {
+        return false;
+    }
+
+    if (!iommufd_backend_alloc_veventq(idev->iommufd, viommu_id,
+                                       IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV,
+                                       1 << 16, &veventq_id, &veventq_fd,
+                                       errp)) {
+        error_append_hint(errp, "Tegra241 CMDQV: failed to alloc veventq\n");
+        goto free_viommu;
+    }
+
+    veventq = g_new(IOMMUFDVeventq, 1);
+    veventq->veventq_id = veventq_id;
+    veventq->veventq_fd = veventq_fd;
+    cmdqv->veventq = veventq;
+
+    *out_viommu_id = viommu_id;
+    return true;
+
+free_viommu:
+    iommufd_backend_free_id(idev->iommufd, viommu_id);
     return false;
 }
 
-- 
2.43.0




* [PATCH v4 14/31] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (12 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 13/31] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05  0:09   ` Nicolin Chen
  2026-05-05  7:26   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 15/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads Shameer Kolothum
                   ` (16 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

Tegra241 CMDQV exposes control and status registers in the CMDQ-V
Config page (offset [0x0, 0x10000)) used to configure virtual command
queue allocation and interrupt behavior.

Add read/write emulation for the CMDQ-V Config region
([CMDQV_BASE, CMDQV_CMDQ_BASE)), backed by a simple register cache.
This includes CONFIG, PARAM, STATUS, VI error and interrupt maps, CMDQ
allocation map and the VINTF0 related registers defined in the CMDQ-V
Config space. Only VINTF0 is supported; VINTF1-63 are not.
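
As a rough illustration of the register-cache model, here is a
minimal standalone sketch of the CONFIG/PARAM/STATUS handling. The
offsets and bit positions are taken from the REG32()/FIELD()
definitions in this patch; SketchRegs and the sketch_* helpers are
hypothetical names, and treating PARAM and STATUS as read-only is how
this sketch models the "unhandled write" default case.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and bit positions follow the REG32()/FIELD() lines in this patch. */
#define SK_A_CONFIG              0x0
#define SK_A_PARAM               0x4
#define SK_A_STATUS              0x8
#define SK_CONFIG_CMDQV_EN       (1u << 0)
#define SK_STATUS_CMDQV_ENABLED  (1u << 0)

/* Hypothetical register cache: one plain field per emulated register. */
typedef struct SketchRegs {
    uint32_t config;
    uint32_t param;
    uint32_t status;
} SketchRegs;

static uint32_t sketch_reg_read(const SketchRegs *r, uint64_t offset)
{
    assert(r);
    switch (offset) {
    case SK_A_CONFIG:
        return r->config;
    case SK_A_PARAM:
        return r->param;
    case SK_A_STATUS:
        return r->status;
    default:
        return 0;       /* unhandled offsets read as zero */
    }
}

static void sketch_reg_write(SketchRegs *r, uint64_t offset, uint32_t value)
{
    assert(r);
    switch (offset) {
    case SK_A_CONFIG:
        r->config = value;
        /* Mirror the enable request into the status register. */
        if (value & SK_CONFIG_CMDQV_EN) {
            r->status |= SK_STATUS_CMDQV_ENABLED;
        } else {
            r->status &= ~SK_STATUS_CMDQV_ENABLED;
        }
        break;
    default:
        /* PARAM, STATUS and unknown offsets ignore writes in this sketch. */
        break;
    }
}
```

A guest that sets CONFIG.CMDQV_EN and then polls STATUS sees
CMDQV_ENABLED immediately, since no hardware is involved at this
stage of the series.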

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/tegra241-cmdqv.h | 127 +++++++++++++++++++++++++++++++
 hw/arm/tegra241-cmdqv.c | 163 ++++++++++++++++++++++++++++++++++++++--
 hw/arm/trace-events     |   4 +
 3 files changed, 288 insertions(+), 6 deletions(-)

diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index fa0aa3ab04..965670066d 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -10,10 +10,14 @@
 #ifndef HW_ARM_TEGRA241_CMDQV_H
 #define HW_ARM_TEGRA241_CMDQV_H
 
+#include "hw/core/registerfields.h"
+
 #define CMDQV_VER                 1
 #define CMDQV_NUM_CMDQ_LOG2       1
 #define CMDQV_NUM_SID_PER_VI_LOG2 4
 
+#define TEGRA241_CMDQV_MAX_CMDQ   (1U << CMDQV_NUM_CMDQ_LOG2)
+
 /*
  * Tegra241 CMDQV MMIO layout (64KB pages)
  *
@@ -31,8 +35,131 @@ typedef struct Tegra241CMDQV {
     MemoryRegion mmio_cmdqv;
     qemu_irq irq;
     IOMMUFDVeventq *veventq;
+
+    /* Register Cache */
+    uint32_t config;
+    uint32_t param;
+    uint32_t status;
+    uint32_t vi_err_map[2];
+    uint32_t vi_int_mask[2];
+    uint32_t cmdq_err_map[4];
+    uint32_t cmdq_alloc_map[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vintf_config;
+    uint32_t vintf_status;
+    uint32_t vintf_sid_match[16];
+    uint32_t vintf_sid_replace[16];
+    uint32_t vintf_cmdq_err_map[4];
 } Tegra241CMDQV;
 
+/* CMDQ-V Config page registers (offset 0x00000) */
+REG32(CONFIG, 0x0)
+FIELD(CONFIG, CMDQV_EN, 0, 1)
+FIELD(CONFIG, CMDQV_PER_CMD_OFFSET, 1, 3)
+FIELD(CONFIG, CMDQ_MAX_CLK_BATCH, 4, 8)
+FIELD(CONFIG, CMDQ_MAX_CMD_BATCH, 12, 8)
+FIELD(CONFIG, CONS_DRAM_EN, 20, 1)
+
+REG32(PARAM, 0x4)
+FIELD(PARAM, CMDQV_VER, 0, 4)
+FIELD(PARAM, CMDQV_NUM_CMDQ_LOG2, 4, 4)
+FIELD(PARAM, CMDQV_NUM_VI_LOG2, 8, 4)
+FIELD(PARAM, CMDQV_NUM_SID_PER_VI_LOG2, 12, 4)
+
+REG32(STATUS, 0x8)
+FIELD(STATUS, CMDQV_ENABLED, 0, 1)
+
+/* SMMU_CMDQV_VI_ERR_MAP_0/1 definitions */
+#define A_VI_ERR_MAP_0 0x14
+#define A_VI_ERR_MAP_1 0x18
+#define V_VI_ERR_MAP_NO_ERROR (0)
+#define V_VI_ERR_MAP_ERROR (1)
+
+/* SMMU_CMDQV_VI_INT_MASK_0/1 definitions */
+#define A_VI_INT_MASK 0x1c
+#define A_VI_INT_MASK_1 0x20
+#define V_VI_INT_MASK_NOT_MASKED (0)
+#define V_VI_INT_MASK_MASKED (1)
+
+/* SMMU_CMDQV_CMDQ_ERR_MAP_0-3 definitions */
+#define A_CMDQ_ERR_MAP_0 0x24
+#define A_CMDQ_ERR_MAP_1 0x28
+#define A_CMDQ_ERR_MAP_2 0x2c
+#define A_CMDQ_ERR_MAP_3 0x30
+
+/*
+ * CMDQ_ALLOC_MAP: one entry per physical VCMDQ. Hardware supports up to 128
+ * entries (CMDQV_NUM_CMDQ_LOG2=7), but QEMU only exposes
+ * TEGRA241_CMDQV_MAX_CMDQ (=2) VCMDQs per VM so only entries 0 and 1 are
+ * defined here.
+ */
+/* 2 identical register entries */
+#define SMMU_CMDQV_CMDQ_ALLOC_MAP_(i)        \
+    REG32(CMDQ_ALLOC_MAP_##i, 0x200 + i * 4) \
+    FIELD(CMDQ_ALLOC_MAP_##i, ALLOC, 0, 1)   \
+    FIELD(CMDQ_ALLOC_MAP_##i, LVCMDQ, 1, 7)  \
+    FIELD(CMDQ_ALLOC_MAP_##i, VIRT_INTF_INDX, 15, 6)
+
+SMMU_CMDQV_CMDQ_ALLOC_MAP_(0)
+SMMU_CMDQV_CMDQ_ALLOC_MAP_(1)
+
+
+/* Only VINTF0 is exposed to the guest; vintf = 0 */
+#define SMMU_CMDQV_VINTFi_CONFIG_(vi)                 \
+    REG32(VINTF##vi##_CONFIG, 0x1000 + vi * 0x100) \
+    FIELD(VINTF##vi##_CONFIG, ENABLE, 0, 1)       \
+    FIELD(VINTF##vi##_CONFIG, VMID, 1, 16)        \
+    FIELD(VINTF##vi##_CONFIG, HYP_OWN, 17, 1)
+
+SMMU_CMDQV_VINTFi_CONFIG_(0)
+
+#define SMMU_CMDQV_VINTFi_STATUS_(vi)                 \
+    REG32(VINTF##vi##_STATUS, 0x1004 + vi * 0x100) \
+    FIELD(VINTF##vi##_STATUS, ENABLE_OK, 0, 1)    \
+    FIELD(VINTF##vi##_STATUS, STATUS, 1, 3)       \
+    FIELD(VINTF##vi##_STATUS, VI_NUM_LVCMDQ, 16, 8)
+
+SMMU_CMDQV_VINTFi_STATUS_(0)
+
+#define V_VINTF_STATUS_NO_ERROR (0 << 1)
+#define V_VINTF_STATUS_VCMDQ_ERROR (1 << 1)
+
+/*
+ * SID_MATCH/SID_REPLACE: 16 entries per VINTF (CMDQV_NUM_SID_PER_VI_LOG2=4).
+ * vintf = 0, 16 identical register entries
+ */
+#define SMMU_CMDQV_VINTFi_SID_MATCH_(vi, j)                          \
+    REG32(VINTF##vi##_SID_MATCH_##j, 0x1040 + j * 4 + vi * 0x100) \
+    FIELD(VINTF##vi##_SID_MATCH_##j, ENABLE, 0, 1)               \
+    FIELD(VINTF##vi##_SID_MATCH_##j, VIRT_SID, 1, 20)
+
+SMMU_CMDQV_VINTFi_SID_MATCH_(0, 0)
+/* Entries [0][1..14] omitted; not referenced directly */
+SMMU_CMDQV_VINTFi_SID_MATCH_(0, 15)
+
+/* vintf = 0, 16 identical register entries */
+#define SMMU_CMDQV_VINTFi_SID_REPLACE_(vi, j)                          \
+    REG32(VINTF##vi##_SID_REPLACE_##j, 0x1080 + j * 4 + vi * 0x100) \
+    FIELD(VINTF##vi##_SID_REPLACE_##j, PHYS_SID, 0, 20)
+
+SMMU_CMDQV_VINTFi_SID_REPLACE_(0, 0)
+/* Entries [0][1..14] omitted; not referenced directly */
+SMMU_CMDQV_VINTFi_SID_REPLACE_(0, 15)
+
+/*
+ * LVCMDQ_ERR_MAP: hardware defines 4 registers per VINTF (offset
+ * 0x10c0..0x10cc), each covering 32 logical VCMDQs. All 4 are accessible
+ * by the guest. With TEGRA241_CMDQV_MAX_CMDQ=2 only MAP_0 bits [1:0]
+ * carry meaningful error state; MAP_1..MAP_3 always read as 0.
+ * vintf = 0, 4 identical register entries
+ */
+#define SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(vi, j)                          \
+    REG32(VINTF##vi##_LVCMDQ_ERR_MAP_##j, 0x10c0 + j * 4 + vi * 0x100) \
+    FIELD(VINTF##vi##_LVCMDQ_ERR_MAP_##j, LVCMDQ_ERR_MAP, 0, 32)
+
+SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 0)
+/* MAP_1 and MAP_2 omitted; not referenced directly */
+SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 3)
+
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
 
 #endif /* HW_ARM_TEGRA241_CMDQV_H */
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 2f1084b55f..3b08ed0ff3 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -8,19 +8,170 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/log.h"
 
 #include "hw/arm/smmuv3.h"
 #include "smmuv3-accel.h"
 #include "tegra241-cmdqv.h"
+#include "trace.h"
 
-static uint64_t tegra241_cmdqv_read(void *opaque, hwaddr offset, unsigned size)
+static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
+                                                 hwaddr offset)
 {
-    return 0;
+    int i;
+
+    switch (offset) {
+    case A_VINTF0_CONFIG:
+        return cmdqv->vintf_config;
+    case A_VINTF0_STATUS:
+        return cmdqv->vintf_status;
+    case A_VINTF0_SID_MATCH_0 ... A_VINTF0_SID_MATCH_15:
+        i = (offset - A_VINTF0_SID_MATCH_0) / 4;
+        return cmdqv->vintf_sid_match[i];
+    case A_VINTF0_SID_REPLACE_0 ... A_VINTF0_SID_REPLACE_15:
+        i = (offset - A_VINTF0_SID_REPLACE_0) / 4;
+        return cmdqv->vintf_sid_replace[i];
+    case A_VINTF0_LVCMDQ_ERR_MAP_0 ... A_VINTF0_LVCMDQ_ERR_MAP_3:
+        i = (offset - A_VINTF0_LVCMDQ_ERR_MAP_0) / 4;
+        return cmdqv->vintf_cmdq_err_map[i];
+    default:
+        /*
+         * GLB_FILT_CFG_0 (offset 0xC) and GLB_FILT_DATA_0 (offset 0x10) are
+         * filter config and filter data registers. They are not required for
+         * normal VINTF operation and are not emulated.
+         */
+        qemu_log_mask(LOG_UNIMP, "%s unhandled read access at 0x%" PRIx64 "\n",
+                      __func__, offset);
+        return 0;
+    }
+}
+
+static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
+                                              hwaddr offset, uint64_t value)
+{
+    int i;
+
+    switch (offset) {
+    case A_VINTF0_CONFIG:
+        /*
+         * Mask out HYP_OWN on guest writes. This bit selects Hypervisor (1) vs
+         * Guest (0) ownership of the CMDQ. Force it to 0 so the VINTF always
+         * remains guest-owned.
+         */
+        value &= ~R_VINTF0_CONFIG_HYP_OWN_MASK;
+
+        cmdqv->vintf_config = value;
+        if (value & R_VINTF0_CONFIG_ENABLE_MASK) {
+            cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
+        } else {
+            cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
+        }
+        break;
+    case A_VINTF0_SID_MATCH_0 ... A_VINTF0_SID_MATCH_15:
+        i = (offset - A_VINTF0_SID_MATCH_0) / 4;
+        cmdqv->vintf_sid_match[i] = value;
+        break;
+    case A_VINTF0_SID_REPLACE_0 ... A_VINTF0_SID_REPLACE_15:
+        i = (offset - A_VINTF0_SID_REPLACE_0) / 4;
+        cmdqv->vintf_sid_replace[i] = value;
+        break;
+    default:
+        /*
+         * GLB_FILT_CFG_0 (offset 0xC) and GLB_FILT_DATA_0 (offset 0x10) are
+         * filter config and filter data registers. They are not required for
+         * normal VINTF operation and are not emulated.
+         */
+        qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
+                      __func__, offset);
+        return;
+    }
+}
+
+static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
+                                         unsigned size)
+{
+    Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
+    uint64_t val = 0;
+
+    if (offset >= TEGRA241_CMDQV_IO_LEN) {
+        qemu_log_mask(LOG_UNIMP,
+                      "%s offset 0x%" PRIx64 " off limit (0x%x)\n", __func__,
+                      offset, TEGRA241_CMDQV_IO_LEN);
+        goto out;
+    }
+
+    switch (offset) {
+    case A_CONFIG:
+        val = cmdqv->config;
+        break;
+    case A_PARAM:
+        val = cmdqv->param;
+        break;
+    case A_STATUS:
+        val = cmdqv->status;
+        break;
+    case A_VI_ERR_MAP_0 ... A_VI_ERR_MAP_1:
+        val = cmdqv->vi_err_map[(offset - A_VI_ERR_MAP_0) / 4];
+        break;
+    case A_VI_INT_MASK ... A_VI_INT_MASK_1:
+        val = cmdqv->vi_int_mask[(offset - A_VI_INT_MASK) / 4];
+        break;
+    case A_CMDQ_ERR_MAP_0 ... A_CMDQ_ERR_MAP_3:
+        val = cmdqv->cmdq_err_map[(offset - A_CMDQ_ERR_MAP_0) / 4];
+        break;
+    case A_CMDQ_ALLOC_MAP_0 ... A_CMDQ_ALLOC_MAP_1:
+        val = cmdqv->cmdq_alloc_map[(offset - A_CMDQ_ALLOC_MAP_0) / 4];
+        break;
+    case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
+        val = tegra241_cmdqv_config_vintf_read(cmdqv, offset);
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP, "%s unhandled read access at 0x%" PRIx64 "\n",
+                      __func__, offset);
+    }
+
+out:
+    trace_tegra241_cmdqv_read_mmio(offset, val, size);
+    return val;
 }
 
-static void tegra241_cmdqv_write(void *opaque, hwaddr offset, uint64_t value,
-                                 unsigned size)
+static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
+                                      uint64_t value, unsigned size)
 {
+    Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
+
+    if (offset >= TEGRA241_CMDQV_IO_LEN) {
+        qemu_log_mask(LOG_UNIMP,
+                      "%s offset 0x%" PRIx64 " off limit (0x%x)\n", __func__,
+                      offset, TEGRA241_CMDQV_IO_LEN);
+        goto out;
+    }
+
+    switch (offset) {
+    case A_CONFIG:
+        cmdqv->config = value;
+        if (value & R_CONFIG_CMDQV_EN_MASK) {
+            cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
+        } else {
+            cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
+        }
+        break;
+    case A_VI_INT_MASK ... A_VI_INT_MASK_1:
+        cmdqv->vi_int_mask[(offset - A_VI_INT_MASK) / 4] = value;
+        break;
+    case A_CMDQ_ALLOC_MAP_0 ... A_CMDQ_ALLOC_MAP_1:
+        cmdqv->cmdq_alloc_map[(offset - A_CMDQ_ALLOC_MAP_0) / 4] = value;
+        break;
+    case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
+        tegra241_cmdqv_config_vintf_write(cmdqv, offset, value);
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
+                      __func__, offset);
+    }
+
+out:
+    trace_tegra241_cmdqv_write_mmio(offset, value, size);
 }
 
 static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
@@ -84,8 +235,8 @@ static void tegra241_cmdqv_reset(SMMUv3State *s)
 }
 
 static const MemoryRegionOps mmio_cmdqv_ops = {
-    .read = tegra241_cmdqv_read,
-    .write = tegra241_cmdqv_write,
+    .read = tegra241_cmdqv_read_mmio,
+    .write = tegra241_cmdqv_write_mmio,
     .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 3457536fb0..8c61d66a26 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -72,6 +72,10 @@ smmuv3_accel_unset_iommu_device(int devfn, uint32_t devid) "devfn=0x%x (idev dev
 smmuv3_accel_translate_ste(uint32_t vsid, uint32_t hwpt_id, uint64_t ste_1, uint64_t ste_0) "vSID=0x%x hwpt_id=0x%x ste=%"PRIx64":%"PRIx64
 smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vSID=0x%x ste type=%s hwpt_id=0x%x"
 
+# tegra241-cmdqv
+tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
+tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
+
 # strongarm.c
 strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
 strongarm_ssp_read_underrun(void) "SSP rx underrun"
-- 
2.43.0




* [PATCH v4 15/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (13 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 14/31] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05 10:12   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 16/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes Shameer Kolothum
                   ` (15 subsequent siblings)
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

Tegra241 CMDQV exposes per-VCMDQ register windows through two MMIO
apertures:

  CMDQV_CMDQ_BASE (0x10000/0x20000): VCMDQ Page0/Page1
  CMDQV_VI_CMDQ_BASE (0x30000/0x40000): VINTF VCMDQ Page0/Page1

VINTF Page0 (0x30000) and VCMDQ Page0 (0x10000) are hardware aliases
addressing the same underlying registers. Add read emulation for both
apertures, backed by a register cache. VINTF Page0 reads are translated
to their VCMDQ Page0 equivalent and served from the same cached state.

Once IOMMU_HW_QUEUE_ALLOC and viommu_mmap are wired up in a subsequent
patch, Page0 register reads will be served directly from the
hardware-backed mmap'd page instead of the cache. Page1 registers are
always served from cache.
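
The offset arithmetic behind the aperture translation may be easier
to follow in isolation. Below is a minimal standalone sketch: the
page bases, the 0x80 stride and the two-queue limit come from this
patch, while SketchDecode and sketch_decode_vcmdq() are hypothetical
names that do not exist in the series. It folds both VINTF pages onto
the matching VCMDQ page, the same way the read handler in this patch
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the CMDQV_* defines added by this patch. */
#define SK_VCMDQ_PAGE0_BASE 0x10000u
#define SK_VCMDQ_PAGE1_BASE 0x20000u
#define SK_VINTF_PAGE0_BASE 0x30000u
#define SK_IO_LEN           0x50000u
#define SK_VCMDQ_STRIDE     0x80u
#define SK_MAX_CMDQ         2u

/* Result of decoding: which VCMDQ, and the equivalent VCMDQ0_* offset. */
typedef struct SketchDecode {
    unsigned index;
    uint64_t offset0;
} SketchDecode;

static bool sketch_decode_vcmdq(uint64_t offset, SketchDecode *out)
{
    uint64_t page_base, index;

    assert(out);
    if (offset < SK_VCMDQ_PAGE0_BASE || offset >= SK_IO_LEN) {
        return false;   /* Config page or outside the MMIO window */
    }

    /*
     * VINTF Page0/Page1 sit 0x20000 above VCMDQ Page0/Page1, so one
     * subtraction folds either VINTF page onto its VCMDQ counterpart.
     */
    if (offset >= SK_VINTF_PAGE0_BASE) {
        offset -= SK_VINTF_PAGE0_BASE - SK_VCMDQ_PAGE0_BASE;
    }

    page_base = offset >= SK_VCMDQ_PAGE1_BASE ? SK_VCMDQ_PAGE1_BASE
                                              : SK_VCMDQ_PAGE0_BASE;
    index = (offset - page_base) / SK_VCMDQ_STRIDE;
    if (index >= SK_MAX_CMDQ) {
        return false;   /* beyond the VCMDQs exposed to the guest */
    }

    out->index = (unsigned)index;
    out->offset0 = offset - index * SK_VCMDQ_STRIDE;
    return true;
}
```

For example, a read of VI_VCMDQ1_PROD_INDX at 0x30084 decodes to
index 1 and the VCMDQ0_PROD_INDX offset 0x10004, so a single helper
keyed on VCMDQ0_* offsets can serve every instance.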

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/tegra241-cmdqv.h | 185 ++++++++++++++++++++++++++++++++++++++++
 hw/arm/tegra241-cmdqv.c |  73 ++++++++++++++++
 2 files changed, 258 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 965670066d..b8bd8cd8ff 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -29,6 +29,13 @@
  */
 #define TEGRA241_CMDQV_IO_LEN 0x50000
 
+/* CMDQV MMIO aperture bases and VCMDQ stride */
+#define CMDQV_VCMDQ_PAGE0_BASE  0x10000  /* CMDQV_CMDQ_BASE */
+#define CMDQV_VCMDQ_PAGE1_BASE  0x20000
+#define CMDQV_VINTF_PAGE0_BASE  0x30000  /* CMDQV_VI_CMDQ_BASE */
+#define CMDQV_VINTF_PAGE1_BASE  0x40000
+#define CMDQV_VCMDQ_STRIDE      0x80
+
 typedef struct Tegra241CMDQV {
     struct iommu_viommu_tegra241_cmdqv cmdqv_data;
     SMMUv3AccelState *s_accel;
@@ -49,6 +56,14 @@ typedef struct Tegra241CMDQV {
     uint32_t vintf_sid_match[16];
     uint32_t vintf_sid_replace[16];
     uint32_t vintf_cmdq_err_map[4];
+    uint32_t vcmdq_cons_indx[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vcmdq_prod_indx[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vcmdq_config[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vcmdq_status[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vcmdq_gerror[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vcmdq_gerrorn[TEGRA241_CMDQV_MAX_CMDQ];
+    uint64_t vcmdq_base[TEGRA241_CMDQV_MAX_CMDQ];
+    uint64_t vcmdq_cons_indx_base[TEGRA241_CMDQV_MAX_CMDQ];
 } Tegra241CMDQV;
 
 /* CMDQ-V Config page registers (offset 0x00000) */
@@ -160,6 +175,176 @@ SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 0)
 /* MAP_1 and MAP_2 omitted; not referenced directly */
 SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 3)
 
+/*
+ * VCMDQ register windows.
+ *
+ * Page 0 @ 0x10000: VCMDQ control and status registers
+ * Page 1 @ 0x20000: VCMDQ base and DRAM address registers
+ */
+#define A_VCMDQi_CONS_INDX(i)                       \
+    REG32(VCMDQ##i##_CONS_INDX, 0x10000 + i * 0x80) \
+    FIELD(VCMDQ##i##_CONS_INDX, RD, 0, 20)          \
+    FIELD(VCMDQ##i##_CONS_INDX, ERR, 24, 7)
+
+A_VCMDQi_CONS_INDX(0)
+A_VCMDQi_CONS_INDX(1)
+
+#define V_VCMDQ_CONS_INDX_ERR_CERROR_NONE 0
+#define V_VCMDQ_CONS_INDX_ERR_CERROR_ILL_OPCODE 1
+#define V_VCMDQ_CONS_INDX_ERR_CERROR_ABT 2
+#define V_VCMDQ_CONS_INDX_ERR_CERROR_ATC_INV_SYNC 3
+#define V_VCMDQ_CONS_INDX_ERR_CERROR_ILL_ACCESS 4
+
+#define A_VCMDQi_PROD_INDX(i)                             \
+    REG32(VCMDQ##i##_PROD_INDX, 0x10000 + 0x4 + i * 0x80) \
+    FIELD(VCMDQ##i##_PROD_INDX, WR, 0, 20)
+
+A_VCMDQi_PROD_INDX(0)
+A_VCMDQi_PROD_INDX(1)
+
+#define A_VCMDQi_CONFIG(i)                             \
+    REG32(VCMDQ##i##_CONFIG, 0x10000 + 0x8 + i * 0x80) \
+    FIELD(VCMDQ##i##_CONFIG, CMDQ_EN, 0, 1)
+
+A_VCMDQi_CONFIG(0)
+A_VCMDQi_CONFIG(1)
+
+#define A_VCMDQi_STATUS(i)                             \
+    REG32(VCMDQ##i##_STATUS, 0x10000 + 0xc + i * 0x80) \
+    FIELD(VCMDQ##i##_STATUS, CMDQ_EN_OK, 0, 1)
+
+A_VCMDQi_STATUS(0)
+A_VCMDQi_STATUS(1)
+
+#define A_VCMDQi_GERROR(i)                               \
+    REG32(VCMDQ##i##_GERROR, 0x10000 + 0x10 + i * 0x80)  \
+    FIELD(VCMDQ##i##_GERROR, CMDQ_ERR, 0, 1)             \
+    FIELD(VCMDQ##i##_GERROR, CONS_DRAM_WR_ABT_ERR, 1, 1) \
+    FIELD(VCMDQ##i##_GERROR, CMDQ_INIT_ERR, 2, 1)
+
+A_VCMDQi_GERROR(0)
+A_VCMDQi_GERROR(1)
+
+#define A_VCMDQi_GERRORN(i)                               \
+    REG32(VCMDQ##i##_GERRORN, 0x10000 + 0x14 + i * 0x80)  \
+    FIELD(VCMDQ##i##_GERRORN, CMDQ_ERR, 0, 1)             \
+    FIELD(VCMDQ##i##_GERRORN, CONS_DRAM_WR_ABT_ERR, 1, 1) \
+    FIELD(VCMDQ##i##_GERRORN, CMDQ_INIT_ERR, 2, 1)
+
+A_VCMDQi_GERRORN(0)
+A_VCMDQi_GERRORN(1)
+
+#define A_VCMDQi_BASE_L(i)                       \
+    REG32(VCMDQ##i##_BASE_L, 0x20000 + i * 0x80) \
+    FIELD(VCMDQ##i##_BASE_L, LOG2SIZE, 0, 5)     \
+    FIELD(VCMDQ##i##_BASE_L, ADDR, 5, 27)
+
+A_VCMDQi_BASE_L(0)
+A_VCMDQi_BASE_L(1)
+
+#define A_VCMDQi_BASE_H(i)                             \
+    REG32(VCMDQ##i##_BASE_H, 0x20000 + 0x4 + i * 0x80) \
+    FIELD(VCMDQ##i##_BASE_H, ADDR, 0, 16)
+
+A_VCMDQi_BASE_H(0)
+A_VCMDQi_BASE_H(1)
+
+#define A_VCMDQi_CONS_INDX_BASE_DRAM_L(i)                             \
+    REG32(VCMDQ##i##_CONS_INDX_BASE_DRAM_L, 0x20000 + 0x8 + i * 0x80) \
+    FIELD(VCMDQ##i##_CONS_INDX_BASE_DRAM_L, ADDR, 0, 32)
+
+A_VCMDQi_CONS_INDX_BASE_DRAM_L(0)
+A_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
+
+#define A_VCMDQi_CONS_INDX_BASE_DRAM_H(i)                             \
+    REG32(VCMDQ##i##_CONS_INDX_BASE_DRAM_H, 0x20000 + 0xc + i * 0x80) \
+    FIELD(VCMDQ##i##_CONS_INDX_BASE_DRAM_H, ADDR, 0, 16)
+
+A_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
+A_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
+
+/*
+ * VI_VCMDQ register windows (VCMDQs mapped via VINTF).
+ *
+ * Page 0 @ 0x30000: VI_VCMDQ control and status registers
+ * Page 1 @ 0x40000: VI_VCMDQ base and DRAM address registers
+ */
+#define A_VI_VCMDQi_CONS_INDX(i)                       \
+    REG32(VI_VCMDQ##i##_CONS_INDX, 0x30000 + i * 0x80) \
+    FIELD(VI_VCMDQ##i##_CONS_INDX, RD, 0, 20)          \
+    FIELD(VI_VCMDQ##i##_CONS_INDX, ERR, 24, 7)
+
+A_VI_VCMDQi_CONS_INDX(0)
+A_VI_VCMDQi_CONS_INDX(1)
+
+#define A_VI_VCMDQi_PROD_INDX(i)                             \
+    REG32(VI_VCMDQ##i##_PROD_INDX, 0x30000 + 0x4 + i * 0x80) \
+    FIELD(VI_VCMDQ##i##_PROD_INDX, WR, 0, 20)
+
+A_VI_VCMDQi_PROD_INDX(0)
+A_VI_VCMDQi_PROD_INDX(1)
+
+#define A_VI_VCMDQi_CONFIG(i)                             \
+    REG32(VI_VCMDQ##i##_CONFIG, 0x30000 + 0x8 + i * 0x80) \
+    FIELD(VI_VCMDQ##i##_CONFIG, CMDQ_EN, 0, 1)
+
+A_VI_VCMDQi_CONFIG(0)
+A_VI_VCMDQi_CONFIG(1)
+
+#define A_VI_VCMDQi_STATUS(i)                             \
+    REG32(VI_VCMDQ##i##_STATUS, 0x30000 + 0xc + i * 0x80) \
+    FIELD(VI_VCMDQ##i##_STATUS, CMDQ_EN_OK, 0, 1)
+
+A_VI_VCMDQi_STATUS(0)
+A_VI_VCMDQi_STATUS(1)
+
+#define A_VI_VCMDQi_GERROR(i)                               \
+    REG32(VI_VCMDQ##i##_GERROR, 0x30000 + 0x10 + i * 0x80)  \
+    FIELD(VI_VCMDQ##i##_GERROR, CMDQ_ERR, 0, 1)             \
+    FIELD(VI_VCMDQ##i##_GERROR, CONS_DRAM_WR_ABT_ERR, 1, 1) \
+    FIELD(VI_VCMDQ##i##_GERROR, CMDQ_INIT_ERR, 2, 1)
+
+A_VI_VCMDQi_GERROR(0)
+A_VI_VCMDQi_GERROR(1)
+
+#define A_VI_VCMDQi_GERRORN(i)                               \
+    REG32(VI_VCMDQ##i##_GERRORN, 0x30000 + 0x14 + i * 0x80)  \
+    FIELD(VI_VCMDQ##i##_GERRORN, CMDQ_ERR, 0, 1)             \
+    FIELD(VI_VCMDQ##i##_GERRORN, CONS_DRAM_WR_ABT_ERR, 1, 1) \
+    FIELD(VI_VCMDQ##i##_GERRORN, CMDQ_INIT_ERR, 2, 1)
+
+A_VI_VCMDQi_GERRORN(0)
+A_VI_VCMDQi_GERRORN(1)
+
+#define A_VI_VCMDQi_BASE_L(i)                       \
+    REG32(VI_VCMDQ##i##_BASE_L, 0x40000 + i * 0x80) \
+    FIELD(VI_VCMDQ##i##_BASE_L, LOG2SIZE, 0, 5)     \
+    FIELD(VI_VCMDQ##i##_BASE_L, ADDR, 5, 27)
+
+A_VI_VCMDQi_BASE_L(0)
+A_VI_VCMDQi_BASE_L(1)
+
+#define A_VI_VCMDQi_BASE_H(i)                             \
+    REG32(VI_VCMDQ##i##_BASE_H, 0x40000 + 0x4 + i * 0x80) \
+    FIELD(VI_VCMDQ##i##_BASE_H, ADDR, 0, 16)
+
+A_VI_VCMDQi_BASE_H(0)
+A_VI_VCMDQi_BASE_H(1)
+
+#define A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(i)                             \
+    REG32(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_L, 0x40000 + 0x8 + i * 0x80) \
+    FIELD(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_L, ADDR, 0, 32)
+
+A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(0)
+A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
+
+#define A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(i)                             \
+    REG32(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_H, 0x40000 + 0xc + i * 0x80) \
+    FIELD(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_H, ADDR, 0, 16)
+
+A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
+A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
+
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
 
 #endif /* HW_ARM_TEGRA241_CMDQV_H */
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 3b08ed0ff3..35e6f0bbd6 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -15,6 +15,46 @@
 #include "tegra241-cmdqv.h"
 #include "trace.h"
 
+/*
+ * Read a VCMDQ register using VCMDQ0_* offsets.
+ *
+ * The caller normalizes the MMIO offset such that @offset0 always refers
+ * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
+ *
+ * All VCMDQ accesses return cached registers.
+ */
+static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
+                                          int index)
+{
+    switch (offset0) {
+    case A_VCMDQ0_CONS_INDX:
+        return cmdqv->vcmdq_cons_indx[index];
+    case A_VCMDQ0_PROD_INDX:
+        return cmdqv->vcmdq_prod_indx[index];
+    case A_VCMDQ0_CONFIG:
+        return cmdqv->vcmdq_config[index];
+    case A_VCMDQ0_STATUS:
+        return cmdqv->vcmdq_status[index];
+    case A_VCMDQ0_GERROR:
+        return cmdqv->vcmdq_gerror[index];
+    case A_VCMDQ0_GERRORN:
+        return cmdqv->vcmdq_gerrorn[index];
+    case A_VCMDQ0_BASE_L:
+        return cmdqv->vcmdq_base[index];
+    case A_VCMDQ0_BASE_H:
+        return cmdqv->vcmdq_base[index] >> 32;
+    case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
+        return cmdqv->vcmdq_cons_indx_base[index];
+    case A_VCMDQ0_CONS_INDX_BASE_DRAM_H:
+        return cmdqv->vcmdq_cons_indx_base[index] >> 32;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s unhandled read access at 0x%" PRIx64 "\n",
+                      __func__, offset0);
+        return 0;
+    }
+}
+
 static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
                                                  hwaddr offset)
 {
@@ -92,6 +132,7 @@ static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
 {
     Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
     uint64_t val = 0;
+    int index;
 
     if (offset >= TEGRA241_CMDQV_IO_LEN) {
         qemu_log_mask(LOG_UNIMP,
@@ -125,6 +166,38 @@ static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
     case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
         val = tegra241_cmdqv_config_vintf_read(cmdqv, offset);
         break;
+    case A_VI_VCMDQ0_CONS_INDX ... A_VI_VCMDQ1_GERRORN:
+        /*
+         * VINTF Page0 registers have the same per-VCMDQ layout as the
+         * VCMDQ Page0 registers. Translate the VINTF aperture offset to the
+         * equivalent VCMDQ aperture offset, then fall through to reuse the
+         * common VCMDQ decoding logic below.
+         */
+        offset -= CMDQV_VINTF_PAGE0_BASE - CMDQV_VCMDQ_PAGE0_BASE;
+        QEMU_FALLTHROUGH;
+    case A_VCMDQ0_CONS_INDX ... A_VCMDQ1_GERRORN:
+        /*
+         * Decode a per-VCMDQ register access.
+         *
+         * The hardware supports up to 128 identical VCMDQ instances; we
+         * currently expose TEGRA241_CMDQV_MAX_CMDQ (= 2). Each VCMDQ
+         * occupies a CMDQV_VCMDQ_STRIDE-byte window within the page.
+         *
+         * Extract the VCMDQ index and normalize to the VCMDQ0_* register
+         * offset. A single helper services all instances via @index.
+         */
+        index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
+        return tegra241_cmdqv_read_vcmdq(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index);
+    case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
+        /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
+        offset -= CMDQV_VINTF_PAGE1_BASE - CMDQV_VCMDQ_PAGE1_BASE;
+        QEMU_FALLTHROUGH;
+    case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
+        /* Same decode logic as VCMDQ Page0 case above */
+        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
+        return tegra241_cmdqv_read_vcmdq(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index);
     default:
         qemu_log_mask(LOG_UNIMP, "%s unhandled read access at 0x%" PRIx64 "\n",
                       __func__, offset);
-- 
2.43.0




* [PATCH v4 16/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (14 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 15/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05 10:42   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 17/31] hw/arm/tegra241-cmdqv: mmap VINTF Page0 for CMDQV Shameer Kolothum
                   ` (14 subsequent siblings)
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

This is the write-side counterpart of the VCMDQ read emulation. Add write
handling for both CMDQV_CMDQ_BASE and CMDQV_VI_CMDQ_BASE apertures using
the same index decoding and VINTF-to-VCMDQ translation logic as the read
path.

VINTF aperture writes are translated to their CMDQV_CMDQ_BASE equivalent
and update the same cached state. Page1 registers (BASE, CONS_INDX_BASE)
always update the cache. Once IOMMU_HW_QUEUE_ALLOC and viommu_mmap are
wired up in a subsequent patch, Page0 register writes will be forwarded
to the hardware-backed mmap'd page.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
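Reviewer note: the index decoding and VINTF-to-VCMDQ translation described
above can be sketched in isolation as below. This is only an illustration.
The aperture bases 0x10000 and 0x30000 are the ones quoted in the patch 20
commit message; VCMDQ_STRIDE (0x80), MAX_CMDQ and the names VcmdqDecode and
decode_page0() are assumptions made for this sketch and are not taken from
tegra241-cmdqv.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration; the real values live in tegra241-cmdqv.h. */
#define VCMDQ_PAGE0_BASE  0x10000u  /* direct VCMDQ aperture */
#define VINTF_PAGE0_BASE  0x30000u  /* VINTF (VI_) aperture */
#define VCMDQ_STRIDE      0x80u     /* assumed per-VCMDQ window size */
#define MAX_CMDQ          2u        /* number of exposed VCMDQs */

typedef struct {
    unsigned index;    /* which VCMDQ instance is addressed */
    uint64_t offset0;  /* offset normalized to the VCMDQ0_* register */
} VcmdqDecode;

/* Decode a Page0 offset from either aperture into (index, VCMDQ0 offset). */
static VcmdqDecode decode_page0(uint64_t offset)
{
    VcmdqDecode d;

    /* Translate a VINTF aperture offset to its direct-aperture alias. */
    if (offset >= VINTF_PAGE0_BASE) {
        offset -= VINTF_PAGE0_BASE - VCMDQ_PAGE0_BASE;
    }
    assert(offset >= VCMDQ_PAGE0_BASE);

    /* Each VCMDQ occupies one stride-sized window within the page. */
    d.index = (unsigned)((offset - VCMDQ_PAGE0_BASE) / VCMDQ_STRIDE);
    assert(d.index < MAX_CMDQ);

    /* Strip the per-instance window so one helper can serve all instances. */
    d.offset0 = offset - (uint64_t)d.index * VCMDQ_STRIDE;
    return d;
}
```

With these assumed constants, decode_page0(0x30084) and decode_page0(0x10084)
both yield index 1 and offset0 0x10004, which is why a single helper can
service both apertures.
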
 hw/arm/tegra241-cmdqv.c | 99 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 35e6f0bbd6..d4ba2ada92 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -55,6 +55,70 @@ static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
     }
 }
 
+/*
+ * Write a VCMDQ register using VCMDQ0_* offsets.
+ *
+ * The caller normalizes the MMIO offset such that @offset0 always refers
+ * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
+ */
+static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
+                                       int index, uint64_t value,
+                                       unsigned size)
+{
+    switch (offset0) {
+    case A_VCMDQ0_CONS_INDX:
+        cmdqv->vcmdq_cons_indx[index] = (uint32_t)value;
+        return;
+    case A_VCMDQ0_PROD_INDX:
+        cmdqv->vcmdq_prod_indx[index] = (uint32_t)value;
+        return;
+    case A_VCMDQ0_CONFIG:
+        if (value & R_VCMDQ0_CONFIG_CMDQ_EN_MASK) {
+            cmdqv->vcmdq_status[index] |= R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
+        } else {
+            cmdqv->vcmdq_status[index] &= ~R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
+        }
+        cmdqv->vcmdq_config[index] = (uint32_t)value;
+        return;
+    case A_VCMDQ0_GERRORN:
+        cmdqv->vcmdq_gerrorn[index] = (uint32_t)value;
+        return;
+    case A_VCMDQ0_BASE_L:
+        if (size == 8) {
+            cmdqv->vcmdq_base[index] = value;
+        } else {
+            cmdqv->vcmdq_base[index] =
+                (cmdqv->vcmdq_base[index] & 0xffffffff00000000ULL) |
+                (value & 0xffffffffULL);
+        }
+        return;
+    case A_VCMDQ0_BASE_H:
+        cmdqv->vcmdq_base[index] =
+            (cmdqv->vcmdq_base[index] & 0xffffffffULL) |
+            ((uint64_t)value << 32);
+        return;
+    case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
+        if (size == 8) {
+            cmdqv->vcmdq_cons_indx_base[index] = value;
+        } else {
+            cmdqv->vcmdq_cons_indx_base[index] =
+                (cmdqv->vcmdq_cons_indx_base[index] & 0xffffffff00000000ULL) |
+                (value & 0xffffffffULL);
+        }
+        return;
+    case A_VCMDQ0_CONS_INDX_BASE_DRAM_H:
+        cmdqv->vcmdq_cons_indx_base[index] =
+            (cmdqv->vcmdq_cons_indx_base[index] & 0xffffffffULL) |
+            ((uint64_t)value << 32);
+        return;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s unhandled write access at 0x%" PRIx64 "\n",
+                      __func__, offset0);
+        return;
+    }
+}
+
 static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
                                                  hwaddr offset)
 {
@@ -212,6 +276,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
                                       uint64_t value, unsigned size)
 {
     Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
+    int index;
 
     if (offset >= TEGRA241_CMDQV_IO_LEN) {
         qemu_log_mask(LOG_UNIMP,
@@ -238,6 +303,40 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
     case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
         tegra241_cmdqv_config_vintf_write(cmdqv, offset, value);
         break;
+    case A_VI_VCMDQ0_CONS_INDX ... A_VI_VCMDQ1_GERRORN:
+        /*
+         * VINTF Page0 registers have the same per-VCMDQ layout as the
+         * VCMDQ Page0 registers. Translate the VINTF aperture offset to the
+         * equivalent VCMDQ aperture offset, then fall through to reuse the
+         * common VCMDQ decoding logic below.
+         */
+        offset -= CMDQV_VINTF_PAGE0_BASE - CMDQV_VCMDQ_PAGE0_BASE;
+        QEMU_FALLTHROUGH;
+    case A_VCMDQ0_CONS_INDX ... A_VCMDQ1_GERRORN:
+        /*
+         * Decode a per-VCMDQ register access.
+         *
+         * The hardware supports up to 128 identical VCMDQ instances; we
+         * currently expose TEGRA241_CMDQV_MAX_CMDQ (= 2). Each VCMDQ
+         * occupies a CMDQV_VCMDQ_STRIDE-byte window within the page.
+         *
+         * Extract the VCMDQ index and normalize to the VCMDQ0_* register
+         * offset. A single helper services all instances via @index.
+         */
+        index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
+        tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
+                                   index, value, size);
+        break;
+    case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
+        /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
+        offset -= CMDQV_VINTF_PAGE1_BASE - CMDQV_VCMDQ_PAGE1_BASE;
+        QEMU_FALLTHROUGH;
+    case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
+        /* Same decode logic as VCMDQ Page0 case above */
+        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
+        tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
+                                   index, value, size);
+        break;
     default:
         qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
                       __func__, offset);
-- 
2.43.0




* [PATCH v4 17/31] hw/arm/tegra241-cmdqv: mmap VINTF Page0 for CMDQV
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (15 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 16/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 18/31] system/physmem: Add address_space_is_ram() helper Shameer Kolothum
                   ` (13 subsequent siblings)
  30 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

The CMDQ-V CMDQ pages provide a VM-wide view of all VCMDQs, while the
VINTF pages expose a logical view local to a given VINTF. Although real
hardware may support multiple VINTFs, the kernel currently exposes a
single VINTF per VM.

The kernel provides an mmap offset for the VINTF Page0 region during
vIOMMU allocation. However, the logical-to-physical association between
VCMDQs and a VINTF is only established after HW_QUEUE allocation. Prior
to that, the mapped Page0 does not back any real VCMDQ state.

When VINTF is enabled, mmap the kernel-provided Page0 region and set
ENABLE_OK only if the mmap succeeds. Unmap it when VINTF is disabled.
This prepares the VINTF mapping in advance of subsequent patches that
add VCMDQ allocation support.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
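Reviewer note: a minimal standalone model of the enable/disable behaviour
described above, where ENABLE_OK is reported only if the mapping succeeds.
It is a sketch under stated assumptions: calloc()/free() stand in for the
viommu mmap/munmap, the bit positions are invented, and Vintf, fail_map,
vintf_map(), vintf_unmap() and vintf_config_write() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE0_SIZE        0x10000u   /* VINTF_PAGE_SIZE in the patch */
#define CONFIG_ENABLE     (1u << 0)  /* assumed bit position */
#define STATUS_ENABLE_OK  (1u << 0)  /* assumed bit position */

typedef struct {
    void *page0;       /* NULL while unmapped */
    uint32_t status;   /* models the VINTF status register */
    bool fail_map;     /* test hook: force the mapping step to fail */
} Vintf;

/* Stand-in for the viommu mmap; idempotent like the patch's helper. */
static bool vintf_map(Vintf *v)
{
    assert(v != NULL);
    if (v->page0) {
        return true;
    }
    if (v->fail_map) {
        return false;
    }
    v->page0 = calloc(1, PAGE0_SIZE);
    return v->page0 != NULL;
}

/* Stand-in for munmap; safe to call when nothing is mapped. */
static void vintf_unmap(Vintf *v)
{
    free(v->page0);    /* free(NULL) is a no-op */
    v->page0 = NULL;
}

/* Model of a guest write to the VINTF config register. */
static void vintf_config_write(Vintf *v, uint32_t value)
{
    if (value & CONFIG_ENABLE) {
        /* Report ENABLE_OK only when the backing page is really there. */
        if (vintf_map(v)) {
            v->status |= STATUS_ENABLE_OK;
        }
    } else {
        vintf_unmap(v);
        v->status &= ~STATUS_ENABLE_OK;
    }
}
```

In this model a guest that polls the status register after a failed mapping
never observes ENABLE_OK, so it does not go on to program VCMDQs against a
page that does not exist.
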
 hw/arm/tegra241-cmdqv.h |  3 +++
 hw/arm/tegra241-cmdqv.c | 47 ++++++++++++++++++++++++++++++++++++++---
 2 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index b8bd8cd8ff..88572ad939 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -18,6 +18,8 @@
 
 #define TEGRA241_CMDQV_MAX_CMDQ   (1U << CMDQV_NUM_CMDQ_LOG2)
 
+#define VINTF_PAGE_SIZE 0x10000
+
 /*
  * Tegra241 CMDQV MMIO layout (64KB pages)
  *
@@ -42,6 +44,7 @@ typedef struct Tegra241CMDQV {
     MemoryRegion mmio_cmdqv;
     qemu_irq irq;
     IOMMUFDVeventq *veventq;
+    void *vintf_page0;
 
     /* Register Cache */
     uint32_t config;
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index d4ba2ada92..cdd941cec9 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -119,6 +119,39 @@ static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
     }
 }
 
+static bool
+tegra241_cmdqv_munmap_vintf_page0(Tegra241CMDQV *cmdqv, Error **errp)
+{
+    if (!cmdqv->vintf_page0) {
+        return true;
+    }
+
+    if (munmap(cmdqv->vintf_page0, VINTF_PAGE_SIZE) < 0) {
+        error_setg_errno(errp, errno, "Failed to unmap VINTF page0");
+        return false;
+    }
+    cmdqv->vintf_page0 = NULL;
+    return true;
+}
+
+static bool tegra241_cmdqv_mmap_vintf_page0(Tegra241CMDQV *cmdqv, Error **errp)
+{
+    IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
+
+    if (cmdqv->vintf_page0) {
+        return true;
+    }
+
+    if (!iommufd_backend_viommu_mmap(viommu->iommufd, viommu->viommu_id,
+                                     VINTF_PAGE_SIZE,
+                                     cmdqv->cmdqv_data.out_vintf_mmap_offset,
+                                     &cmdqv->vintf_page0, errp)) {
+        return false;
+    }
+
+    return true;
+}
+
 static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
                                                  hwaddr offset)
 {
@@ -151,7 +184,8 @@ static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
 }
 
 static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
-                                              hwaddr offset, uint64_t value)
+                                              hwaddr offset, uint64_t value,
+                                              Error **errp)
 {
     int i;
 
@@ -166,8 +200,11 @@ static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
 
         cmdqv->vintf_config = value;
         if (value & R_VINTF0_CONFIG_ENABLE_MASK) {
-            cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
+            if (tegra241_cmdqv_mmap_vintf_page0(cmdqv, errp)) {
+                cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
+            }
         } else {
+            tegra241_cmdqv_munmap_vintf_page0(cmdqv, errp);
             cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
         }
         break;
@@ -276,6 +313,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
                                       uint64_t value, unsigned size)
 {
     Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
+    Error *local_err = NULL;
     int index;
 
     if (offset >= TEGRA241_CMDQV_IO_LEN) {
@@ -301,7 +339,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
         cmdqv->cmdq_alloc_map[(offset - A_CMDQ_ALLOC_MAP_0) / 4] = value;
         break;
     case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
-        tegra241_cmdqv_config_vintf_write(cmdqv, offset, value);
+        tegra241_cmdqv_config_vintf_write(cmdqv, offset, value, &local_err);
         break;
     case A_VI_VCMDQ0_CONS_INDX ... A_VI_VCMDQ1_GERRORN:
         /*
@@ -343,6 +381,9 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
     }
 
 out:
+    if (local_err) {
+        error_report_err(local_err);
+    }
     trace_tegra241_cmdqv_write_mmio(offset, value, size);
 }
 
-- 
2.43.0




* [PATCH v4 18/31] system/physmem: Add address_space_is_ram() helper
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (16 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 17/31] hw/arm/tegra241-cmdqv: mmap VINTF Page0 for CMDQV Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05  0:24   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming Shameer Kolothum
                   ` (12 subsequent siblings)
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Introduce address_space_is_ram(), a helper to determine whether
a guest physical address resolves to a RAM-backed MemoryRegion within
an AddressSpace.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
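Reviewer note: the question the helper answers can be modelled without any
QEMU types as a lookup over a table of address ranges. The sketch below is
not how QEMU's flat view works internally; Region, RegionKind and
addr_is_ram() are hypothetical names, and the memory map used in the usage
example is invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { REGION_RAM, REGION_MMIO } RegionKind;

typedef struct {
    uint64_t base;
    uint64_t size;
    RegionKind kind;
} Region;

/* Return true only if @addr falls inside a RAM-backed range of @map. */
static bool addr_is_ram(const Region *map, size_t n, uint64_t addr)
{
    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const Region *r = &map[i];

        /* Written as a subtraction so base + size cannot overflow. */
        if (addr >= r->base && addr - r->base < r->size) {
            return r->kind == REGION_RAM;
        }
    }
    return false;  /* unmapped addresses are not RAM */
}
```

A caller validating a buffer would check both its first and its last byte,
which is how a later patch in this series uses the real helper for the VCMDQ
base address.
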
 include/system/memory.h | 10 ++++++++++
 system/physmem.c        | 11 +++++++++++
 2 files changed, 21 insertions(+)

diff --git a/include/system/memory.h b/include/system/memory.h
index d7b18b632d..7aed255e81 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -2841,6 +2841,16 @@ bool address_space_access_valid(AddressSpace *as, hwaddr addr, hwaddr len,
  */
 bool address_space_is_io(AddressSpace *as, hwaddr addr);
 
+/**
+ * address_space_is_ram: check whether a guest physical address within
+ *                       an address space is RAM.
+ *
+ * @as: #AddressSpace to be accessed
+ * @addr: address within that address space
+ */
+
+bool address_space_is_ram(AddressSpace *as, hwaddr addr);
+
 /* address_space_map: map a physical memory region into a host virtual address
  *
  * May map a subset of the requested range, given by and returned in @plen.
diff --git a/system/physmem.c b/system/physmem.c
index 4e26f1a1d4..b67dde80fb 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -3674,6 +3674,17 @@ bool address_space_is_io(AddressSpace *as, hwaddr addr)
     return !(memory_region_is_ram(mr) || memory_region_is_romd(mr));
 }
 
+bool address_space_is_ram(AddressSpace *as, hwaddr addr)
+{
+    MemoryRegion *mr;
+
+    RCU_READ_LOCK_GUARD();
+    mr = address_space_translate(as, addr, &addr, NULL, false,
+                                 MEMTXATTRS_UNSPECIFIED);
+
+    return memory_region_is_ram(mr);
+}
+
 static hwaddr
 flatview_extend_translation(FlatView *fv, hwaddr addr,
                             hwaddr target_len,
-- 
2.43.0




* [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (17 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 18/31] system/physmem: Add address_space_is_ram() helper Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05  0:40   ` Nicolin Chen
                     ` (2 more replies)
  2026-04-15 10:55 ` [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing Shameer Kolothum
                   ` (11 subsequent siblings)
  30 siblings, 3 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

Add support for allocating IOMMUFD hardware queues when the guest
programs the VCMDQ BASE registers.

VCMDQ_EN is part of the VCMDQ_CONFIG register, which is accessed
through the VINTF Page0 region. A subsequent patch maps this region
directly into the guest address space, so QEMU does not trap writes
to VCMDQ_CONFIG.

Since VCMDQ_EN writes are not trapped, QEMU cannot allocate the
hardware queue based on that bit. Instead, allocate the IOMMUFD
hardware queue when the guest writes a VCMDQ BASE register with a
valid RAM-backed address while both CMDQV and VINTF are enabled.

If a hardware queue was previously allocated for the same VCMDQ,
free it before reallocation.

Writes with invalid addresses are ignored.

All allocated VCMDQs are freed when CMDQV or VINTF is disabled.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/tegra241-cmdqv.h | 11 +++++++
 hw/arm/tegra241-cmdqv.c | 70 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 78 insertions(+), 3 deletions(-)

diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 88572ad939..039d86374f 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -44,6 +44,7 @@ typedef struct Tegra241CMDQV {
     MemoryRegion mmio_cmdqv;
     qemu_irq irq;
     IOMMUFDVeventq *veventq;
+    IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
     void *vintf_page0;
 
     /* Register Cache */
@@ -348,6 +349,16 @@ A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
 A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
 A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
 
+static inline bool tegra241_cmdq_enabled(Tegra241CMDQV *cmdqv)
+{
+    return cmdqv->status & R_STATUS_CMDQV_ENABLED_MASK;
+}
+
+static inline bool tegra241_vintf_enabled(Tegra241CMDQV *cmdqv)
+{
+    return cmdqv->vintf_status & R_VINTF0_STATUS_ENABLE_OK_MASK;
+}
+
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
 
 #endif /* HW_ARM_TEGRA241_CMDQV_H */
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index cdd941cec9..b5f2f74cf2 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -15,6 +15,66 @@
 #include "tegra241-cmdqv.h"
 #include "trace.h"
 
+static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
+{
+    IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
+    IOMMUFDHWqueue *vcmdq = cmdqv->vcmdq[index];
+
+    if (!vcmdq) {
+        return;
+    }
+    iommufd_backend_free_id(viommu->iommufd, vcmdq->hw_queue_id);
+    g_free(vcmdq);
+    cmdqv->vcmdq[index] = NULL;
+}
+
+static void tegra241_cmdqv_free_all_vcmdq(Tegra241CMDQV *cmdqv)
+{
+    /* Free in reverse order to avoid "resource busy" error */
+    for (int i = (TEGRA241_CMDQV_MAX_CMDQ - 1); i >= 0; i--) {
+        tegra241_cmdqv_free_vcmdq(cmdqv, i);
+    }
+}
+
+static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
+                                       Error **errp)
+{
+    SMMUv3AccelState *accel = cmdqv->s_accel;
+    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
+                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
+    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
+    uint64_t log2 = cmdqv->vcmdq_base[index] & R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
+    uint64_t size = 1ULL << (log2 + 4);
+    IOMMUFDViommu *viommu = accel->viommu;
+    IOMMUFDHWqueue *hw_queue;
+    uint32_t hw_queue_id;
+
+    /* Ignore any invalid address. This may come as part of reset etc. */
+    if (!address_space_is_ram(&address_space_memory, addr) ||
+        !address_space_is_ram(&address_space_memory, addr + size - 1)) {
+        return true;
+    }
+
+    if (!tegra241_cmdq_enabled(cmdqv) || !tegra241_vintf_enabled(cmdqv)) {
+        return true;
+    }
+
+    tegra241_cmdqv_free_vcmdq(cmdqv, index);
+
+    if (!iommufd_backend_alloc_hw_queue(viommu->iommufd, viommu->viommu_id,
+                                        IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV,
+                                        index, addr, size, &hw_queue_id,
+                                        errp)) {
+        return false;
+    }
+    hw_queue = g_new(IOMMUFDHWqueue, 1);
+    hw_queue->hw_queue_id = hw_queue_id;
+    hw_queue->viommu = viommu;
+    cmdqv->vcmdq[index] = hw_queue;
+
+    return true;
+}
+
 /*
  * Read a VCMDQ register using VCMDQ0_* offsets.
  *
@@ -63,7 +123,7 @@ static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
  */
 static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
                                        int index, uint64_t value,
-                                       unsigned size)
+                                       unsigned size, Error **errp)
 {
     switch (offset0) {
     case A_VCMDQ0_CONS_INDX:
@@ -91,11 +151,13 @@ static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
                 (cmdqv->vcmdq_base[index] & 0xffffffff00000000ULL) |
                 (value & 0xffffffffULL);
         }
+        tegra241_cmdqv_setup_vcmdq(cmdqv, index, errp);
         return;
     case A_VCMDQ0_BASE_H:
         cmdqv->vcmdq_base[index] =
             (cmdqv->vcmdq_base[index] & 0xffffffffULL) |
             ((uint64_t)value << 32);
+        tegra241_cmdqv_setup_vcmdq(cmdqv, index, errp);
         return;
     case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
         if (size == 8) {
@@ -204,6 +266,7 @@ static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
                 cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
             }
         } else {
+            tegra241_cmdqv_free_all_vcmdq(cmdqv);
             tegra241_cmdqv_munmap_vintf_page0(cmdqv, errp);
             cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
         }
@@ -329,6 +392,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
         if (value & R_CONFIG_CMDQV_EN_MASK) {
             cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
         } else {
+            tegra241_cmdqv_free_all_vcmdq(cmdqv);
             cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
         }
         break;
@@ -363,7 +427,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
          */
         index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
         tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
-                                   index, value, size);
+                                   index, value, size, &local_err);
         break;
     case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
         /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
@@ -373,7 +437,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
         /* Same decode logic as VCMDQ Page0 case above */
         index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
         tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
-                                   index, value, size);
+                                   index, value, size, &local_err);
         break;
     default:
         qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
-- 
2.43.0




* [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (18 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05  0:50   ` Nicolin Chen
  2026-05-06 12:27   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 21/31] memory: Allow RAM device regions to skip IOMMU mapping Shameer Kolothum
                   ` (10 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Introduce tegra241_cmdqv_vintf_ptr() to route VCMDQ register accesses
through the mmap'd VINTF page0 backing once a hardware queue has been
allocated.

There are two QEMU-trapped MMIO apertures for VCMDQ registers:

  - Direct VCMDQ aperture (offset 0x10000)
  - VINTF Page0 (offset 0x30000)

These are hardware aliases: they address the same underlying registers.
A subsequent patch maps the VINTF aperture as a guest-direct RAM region;
in this patch both remain QEMU-trapped.

VCMDQ register accesses operate in one of two mutually exclusive modes,
depending on whether a hardware queue (IOMMU_HW_QUEUE_ALLOC) has been
allocated for the VCMDQ:

Pre-alloc: vintf_ptr is NULL. Both apertures use QEMU's register
cache. Hardware is not yet engaged.

Post-alloc: vintf_ptr is valid. Both QEMU-trapped apertures access
registers directly via the mmap'd vintf_page0 pointer, bypassing
the cache. Hardware is the single source of truth.

The pre-to-post-alloc transition is triggered by the BASE register write
that initiates IOMMU_HW_QUEUE_ALLOC. No cache-to-hardware synchronisation
is needed at transition time. The hardware-mandated init sequence requires
BASE to be written first; PROD_INDX, CONS_INDX and CONFIG.CMDQ_EN are
programmed only after BASE and are therefore always post-alloc.

Any pre-alloc writes to those registers update only the register cache,
which is discarded at the transition.

CMDQV acceleration only becomes active once the guest enables VINTF and
programs the VCMDQ BASE register. Until then, all VCMDQ accesses are
served from the emulated register cache with no real hardware command
processing. This matches the CMDQV hardware specification: if the logical
CMDQ index does not map to any allocated Virtual CMDQ, "the access is
dropped with no Fault/Interrupt".

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
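Reviewer note: the two access modes described above can be modelled with a
plain buffer standing in for the mmap'd page0. This is a sketch under stated
assumptions: the stride (0x80) and the PROD_INDX offset (0x4) are assumed
values, only one register is modelled, and Cmdqv, hw_reg(), write_prod() and
read_prod() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VCMDQ_STRIDE   0x80u  /* assumed per-VCMDQ window size */
#define REG_PROD_INDX  0x4u   /* assumed offset within a VCMDQ window */
#define MAX_CMDQ       2

typedef struct {
    uint8_t *page0;               /* stands in for mmap'd VINTF page0 */
    bool allocated[MAX_CMDQ];     /* hardware queue exists for this VCMDQ */
    uint32_t prod_cache[MAX_CMDQ];
} Cmdqv;

/* Post-alloc: pointer into the backing page. Pre-alloc: NULL. */
static uint32_t *hw_reg(Cmdqv *s, int index, uint32_t reg)
{
    assert(index >= 0 && index < MAX_CMDQ);
    if (!s->page0 || !s->allocated[index]) {
        return NULL;
    }
    return (uint32_t *)(s->page0 + (size_t)index * VCMDQ_STRIDE + reg);
}

static void write_prod(Cmdqv *s, int index, uint32_t value)
{
    uint32_t *p = hw_reg(s, index, REG_PROD_INDX);

    if (p) {
        *p = value;                    /* hardware is the source of truth */
    } else {
        s->prod_cache[index] = value;  /* emulated register cache */
    }
}

static uint32_t read_prod(Cmdqv *s, int index)
{
    uint32_t *p = hw_reg(s, index, REG_PROD_INDX);

    return p ? *p : s->prod_cache[index];
}
```

Once allocated[index] is set, a value written earlier to the cache is no
longer visible through read_prod(), which mirrors the statement that
pre-alloc cache contents are discarded at the transition.
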
 hw/arm/tegra241-cmdqv.c | 48 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index b5f2f74cf2..eb619e1134 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -75,17 +75,45 @@ static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
     return true;
 }
 
+static inline uint32_t *tegra241_cmdqv_vintf_ptr(Tegra241CMDQV *cmdqv,
+                                                 int index, hwaddr offset0)
+{
+    if (!cmdqv->vcmdq[index] || !cmdqv->vintf_page0) {
+        return NULL;
+    }
+    return (uint32_t *)(cmdqv->vintf_page0 +
+                        (index * CMDQV_VCMDQ_STRIDE) +
+                        (offset0 - CMDQV_VCMDQ_PAGE0_BASE));
+}
+
 /*
  * Read a VCMDQ register using VCMDQ0_* offsets.
  *
  * The caller normalizes the MMIO offset such that @offset0 always refers
  * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
  *
- * All VCMDQ accesses return cached registers.
+ * If the VCMDQ is allocated and VINTF page0 is mmap'd, read directly
+ * from the VINTF page0 backing. Otherwise, fall back to cached state.
  */
 static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
                                           int index)
 {
+    uint32_t *ptr = tegra241_cmdqv_vintf_ptr(cmdqv, index, offset0);
+
+    if (ptr) {
+        switch (offset0) {
+        case A_VCMDQ0_CONS_INDX:
+        case A_VCMDQ0_PROD_INDX:
+        case A_VCMDQ0_CONFIG:
+        case A_VCMDQ0_STATUS:
+        case A_VCMDQ0_GERROR:
+        case A_VCMDQ0_GERRORN:
+            return *ptr;
+        default:
+            break;
+        }
+    }
+
     switch (offset0) {
     case A_VCMDQ0_CONS_INDX:
         return cmdqv->vcmdq_cons_indx[index];
@@ -120,11 +148,29 @@ static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
  *
  * The caller normalizes the MMIO offset such that @offset0 always refers
  * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
+ *
+ * If the VCMDQ is allocated and VINTF page0 is mmap'd, write directly
+ * to the VINTF page0 backing. Otherwise, update cached state.
  */
 static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
                                        int index, uint64_t value,
                                        unsigned size, Error **errp)
 {
+    uint32_t *ptr = tegra241_cmdqv_vintf_ptr(cmdqv, index, offset0);
+
+    if (ptr) {
+        switch (offset0) {
+        case A_VCMDQ0_CONS_INDX:
+        case A_VCMDQ0_PROD_INDX:
+        case A_VCMDQ0_CONFIG:
+        case A_VCMDQ0_GERRORN:
+            *ptr = (uint32_t)value;
+            return;
+        default:
+            break;
+        }
+    }
+
     switch (offset0) {
     case A_VCMDQ0_CONS_INDX:
         cmdqv->vcmdq_cons_indx[index] = (uint32_t)value;
-- 
2.43.0




* [PATCH v4 21/31] memory: Allow RAM device regions to skip IOMMU mapping
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (19 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-06 12:39   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space Shameer Kolothum
                   ` (9 subsequent siblings)
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Some RAM device regions created with memory_region_init_ram_device_ptr()
are not intended to be P2P DMA targets.

The VFIO listener currently treats all RAM device regions as DMA
capable and attempts to map them into the IOMMU. For regions without
dma-buf backing this fails and prints warnings such as:

  IOMMU_IOAS_MAP failed: Bad address, PCI BAR?

Introduce a MemoryRegion flag (ram_device_skip_iommu_map) to mark RAM
device regions that should not be IOMMU mapped. When set, the VFIO
listener skips DMA mapping for that region.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
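Reviewer note: the opt-out described above amounts to an early return in the
listener before the DMA map call. A minimal model follows; Region,
dma_map() and region_add() are hypothetical stand-ins for MemoryRegion,
vfio_container_dma_map() and vfio_container_region_add(), and the counter
exists only so the effect can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool is_ram_device;              /* models memory_region_is_ram_device() */
    bool ram_device_skip_iommu_map;  /* the new per-region opt-out flag */
} Region;

static int dma_map_calls;  /* counts calls into the stand-in mapper */

/* Stand-in for the real DMA map operation. */
static void dma_map(const Region *r)
{
    (void)r;
    dma_map_calls++;
}

/* Model of the listener's region-add path. */
static void region_add(const Region *r)
{
    assert(r != NULL);

    /* The flag is honoured only for RAM device regions. */
    if (r->is_ram_device && r->ram_device_skip_iommu_map) {
        return;
    }
    dma_map(r);
}
```

Note that in this model, as in the patch, the flag has no effect on a region
that is not a RAM device region.
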
 include/system/memory.h | 2 ++
 hw/vfio/listener.c      | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/include/system/memory.h b/include/system/memory.h
index 7aed255e81..9df15e833a 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -864,6 +864,8 @@ struct MemoryRegion {
 
     /* For devices designed to perform re-entrant IO into their own IO MRs */
     bool disable_reentrancy_guard;
+    /* RAM device region that does not require IOMMU mapping for P2P */
+    bool ram_device_skip_iommu_map;
 };
 
 struct IOMMUMemoryRegion {
diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
index 960da9e0a9..32d33a740a 100644
--- a/hw/vfio/listener.c
+++ b/hw/vfio/listener.c
@@ -614,6 +614,11 @@ void vfio_container_region_add(VFIOContainer *bcontainer,
         }
     }
 
+    if (memory_region_is_ram_device(section->mr) &&
+        section->mr->ram_device_skip_iommu_map) {
+        return;
+    }
+
     ret = vfio_container_dma_map(bcontainer, iova, int128_get64(llsize),
                                  vaddr, section->readonly, section->mr);
     if (ret) {
-- 
2.43.0




* [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (20 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 21/31] memory: Allow RAM device regions to skip IOMMU mapping Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-06 12:44   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 23/31] hw/arm/smmuv3-accel: Introduce common helper for veventq read Shameer Kolothum
                   ` (8 subsequent siblings)
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

Once a VCMDQ is allocated, map the mmap'd vintf_page0 region directly
into the guest-visible MMIO space at offset 0x30000 as a RAM-backed
MemoryRegion. This eliminates QEMU trapping for hot-path CONS/PROD
index updates.

After this patch, the two VCMDQ apertures use different access paths:
the direct aperture (0x10000) remains QEMU-trapped and writes via
vintf_ptr, while the VI aperture (0x30000) is a direct guest RAM
mapping. Both paths write to the same underlying vintf_page0 memory,
so no synchronisation between the apertures is needed.

The mapping is installed lazily on first successful VCMDQ hardware
queue allocation and removed when CMDQV or VINTF is disabled.
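
As a rough sketch of the lazy install/remove behaviour described above
(not the actual implementation), both operations can be made idempotent
with a single guard. ModelCmdqv and the model_* helpers are hypothetical;
page0_mapped stands in for "mr_vintf_page0 is non-NULL".

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ModelCmdqv {
    bool page0_mapped; /* models mr_vintf_page0 != NULL */
    int map_ops;       /* number of real install operations performed */
    int unmap_ops;     /* number of real remove operations performed */
} ModelCmdqv;

/* Install the mapping on first use; later calls are no-ops. */
static void model_map_page0(ModelCmdqv *c)
{
    if (c->page0_mapped) {
        return;
    }
    c->page0_mapped = true;
    c->map_ops++;
}

/* Remove the mapping if present; safe to call when nothing is mapped. */
static void model_unmap_page0(ModelCmdqv *c)
{
    if (!c->page0_mapped) {
        return;
    }
    c->page0_mapped = false;
    c->unmap_ops++;
}
```

With this shape, every VCMDQ allocation can call the map helper and every
disable path can call the unmap helper without tracking who mapped first.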

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/tegra241-cmdqv.h |  1 +
 hw/arm/tegra241-cmdqv.c | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 039d86374f..2befa6205e 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -46,6 +46,7 @@ typedef struct Tegra241CMDQV {
     IOMMUFDVeventq *veventq;
     IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
     void *vintf_page0;
+    MemoryRegion *mr_vintf_page0;
 
     /* Register Cache */
     uint32_t config;
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index eb619e1134..bf989dd51f 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -15,6 +15,40 @@
 #include "tegra241-cmdqv.h"
 #include "trace.h"
 
+static void tegra241_cmdqv_guest_unmap_vintf_page0(Tegra241CMDQV *cmdqv)
+{
+    if (!cmdqv->mr_vintf_page0) {
+        return;
+    }
+
+    memory_region_del_subregion(&cmdqv->mmio_cmdqv, cmdqv->mr_vintf_page0);
+    object_unparent(OBJECT(cmdqv->mr_vintf_page0));
+    g_free(cmdqv->mr_vintf_page0);
+    cmdqv->mr_vintf_page0 = NULL;
+}
+
+static void tegra241_cmdqv_guest_map_vintf_page0(Tegra241CMDQV *cmdqv)
+{
+    char *name;
+
+    if (cmdqv->mr_vintf_page0) {
+        return;
+    }
+
+    name = g_strdup_printf("%s vintf-page0",
+                           memory_region_name(&cmdqv->mmio_cmdqv));
+    cmdqv->mr_vintf_page0 = g_malloc0(sizeof(*cmdqv->mr_vintf_page0));
+    memory_region_init_ram_device_ptr(cmdqv->mr_vintf_page0,
+                                      memory_region_owner(&cmdqv->mmio_cmdqv),
+                                      name, VINTF_PAGE_SIZE,
+                                      cmdqv->vintf_page0);
+    cmdqv->mr_vintf_page0->ram_device_skip_iommu_map = true;
+    memory_region_add_subregion_overlap(&cmdqv->mmio_cmdqv,
+                                        CMDQV_VINTF_PAGE0_BASE,
+                                        cmdqv->mr_vintf_page0, 1);
+    g_free(name);
+}
+
 static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
 {
     IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
@@ -72,6 +106,7 @@ static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
     hw_queue->viommu = viommu;
     cmdqv->vcmdq[index] = hw_queue;
 
+    tegra241_cmdqv_guest_map_vintf_page0(cmdqv);
     return true;
 }
 
@@ -312,6 +347,7 @@ static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
                 cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
             }
         } else {
+            tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
             tegra241_cmdqv_free_all_vcmdq(cmdqv);
             tegra241_cmdqv_munmap_vintf_page0(cmdqv, errp);
             cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
@@ -438,6 +474,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
         if (value & R_CONFIG_CMDQV_EN_MASK) {
             cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
         } else {
+            tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
             tegra241_cmdqv_free_all_vcmdq(cmdqv);
             cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
         }
-- 
2.43.0




* [PATCH v4 23/31] hw/arm/smmuv3-accel: Introduce common helper for veventq read
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (21 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05  1:07   ` Nicolin Chen
  2026-05-06 12:49   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 24/31] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors Shameer Kolothum
                   ` (7 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Move the vEVENTQ read and validation logic into a common helper
smmuv3_accel_event_read_validate(). The helper performs the read(),
checks for overflow and short reads, validates the sequence number,
and updates the sequence state.

This helper can be reused for Tegra241 CMDQV vEVENTQ support in a
subsequent patch.

Error handling is slightly adjusted: instead of reporting errors
directly in the read handler, the helper now returns errors via
Error **. Sequence gaps are reported as warnings.
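
For illustration only, the sequence check can be modelled on its own.
ModelSeqState and model_seq_update() are hypothetical names. The sketch
assumes, as the helper does, that sequence numbers are 32-bit and increase
by one per event, so unsigned subtraction also handles wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct ModelSeqState {
    uint32_t last_seq; /* sequence number of the last event seen */
    bool started;      /* false until the first event has been read */
} ModelSeqState;

/* Records seq as the latest event and returns how many were lost before it. */
static uint32_t model_seq_update(ModelSeqState *st, uint32_t seq)
{
    uint32_t lost = 0;

    /* Unsigned arithmetic wraps modulo 2^32, so 0 follows 0xFFFFFFFF. */
    if (st->started && (uint32_t)(seq - st->last_seq) != 1) {
        lost = seq - st->last_seq - 1;
    }
    st->last_seq = seq;
    st->started = true;
    return lost;
}
```

The first event never reports a gap, since there is no previous sequence
number to compare against.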

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/smmuv3-accel.h       |  2 ++
 hw/arm/smmuv3-accel-stubs.c | 11 ++++++
 hw/arm/smmuv3-accel.c       | 67 ++++++++++++++++++++++---------------
 3 files changed, 53 insertions(+), 27 deletions(-)

diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index 28bceca061..448f47c0ca 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -71,6 +71,8 @@ bool smmuv3_accel_issue_inv_cmd(SMMUv3State *s, void *cmd, SMMUDevice *sdev,
                                 Error **errp);
 void smmuv3_accel_idr_override(SMMUv3State *s);
 bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp);
+bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
+                                      void *buf, size_t size, Error **errp);
 void smmuv3_accel_reset(SMMUv3State *s);
 
 #endif /* HW_ARM_SMMUV3_ACCEL_H */
diff --git a/hw/arm/smmuv3-accel-stubs.c b/hw/arm/smmuv3-accel-stubs.c
index c08caa6fa4..e8f08dc833 100644
--- a/hw/arm/smmuv3-accel-stubs.c
+++ b/hw/arm/smmuv3-accel-stubs.c
@@ -41,6 +41,17 @@ void smmuv3_accel_idr_override(SMMUv3State *s)
 {
 }
 
+bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp)
+{
+    return true;
+}
+
+bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
+                                      void *buf, size_t size, Error **errp)
+{
+    return true;
+}
+
 void smmuv3_accel_reset(SMMUv3State *s)
 {
 }
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 9068e65e2b..230f608f03 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -436,47 +436,60 @@ bool smmuv3_accel_issue_inv_cmd(SMMUv3State *bs, void *cmd, SMMUDevice *sdev,
                    sizeof(Cmd), &entry_num, cmd, errp);
 }
 
-static void smmuv3_accel_event_read(void *opaque)
+bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
+                                      void *buf, size_t size, Error **errp)
 {
-    SMMUv3State *s = opaque;
-    IOMMUFDVeventq *veventq = s->s_accel->veventq;
-    struct {
-        struct iommufd_vevent_header hdr;
-        struct iommu_vevent_arm_smmuv3 vevent;
-    } buf;
-    enum iommu_veventq_type type = IOMMU_VEVENTQ_TYPE_ARM_SMMUV3;
-    uint32_t id = veventq->veventq_id;
     uint32_t last_seq = veventq->last_event_seq;
+    uint32_t id = veventq->veventq_id;
+    struct iommufd_vevent_header *hdr;
     ssize_t bytes;
 
-    bytes = read(veventq->veventq_fd, &buf, sizeof(buf));
+    bytes = read(veventq->veventq_fd, buf, size);
     if (bytes <= 0) {
         if (errno == EAGAIN || errno == EINTR) {
-            return;
+            return true;
         }
-        error_report_once("vEVENTQ(type %u id %u): read failed (%m)", type, id);
-        return;
+        error_setg(errp, "vEVENTQ(type %u id %u): read failed (%m)", type, id);
+        return false;
     }
-
-    if (bytes == sizeof(buf.hdr) &&
-        (buf.hdr.flags & IOMMU_VEVENTQ_FLAG_LOST_EVENTS)) {
-        error_report_once("vEVENTQ(type %u id %u): overflowed", type, id);
+    hdr = (struct iommufd_vevent_header *)buf;
+    if (bytes == sizeof(*hdr) &&
+        (hdr->flags & IOMMU_VEVENTQ_FLAG_LOST_EVENTS)) {
+        error_setg(errp, "vEVENTQ(type %u id %u): overflowed", type, id);
         veventq->event_start = false;
-        return;
+        return false;
     }
-    if (bytes < sizeof(buf)) {
-        error_report_once("vEVENTQ(type %u id %u): short read(%zd/%zd bytes)",
-                          type, id, bytes, sizeof(buf));
-        return;
+    if (bytes < size) {
+        error_setg(errp, "vEVENTQ(type %u id %u): short read(%zd/%zu bytes)",
+                   type, id, bytes, size);
+        return false;
     }
-
     /* Check sequence in hdr for lost events if any */
-    if (veventq->event_start && (buf.hdr.sequence - last_seq != 1)) {
-        error_report_once("vEVENTQ(type %u id %u): lost %u event(s)",
-                          type, id, buf.hdr.sequence - last_seq - 1);
+    if (veventq->event_start && (hdr->sequence - last_seq != 1)) {
+        warn_report("vEVENTQ(type %u id %u): lost %u event(s)",
+                    type, id, hdr->sequence - last_seq - 1);
     }
-    veventq->last_event_seq = buf.hdr.sequence;
+    veventq->last_event_seq = hdr->sequence;
     veventq->event_start = true;
+    return true;
+}
+
+static void smmuv3_accel_event_read(void *opaque)
+{
+    SMMUv3State *s = opaque;
+    IOMMUFDVeventq *veventq = s->s_accel->veventq;
+    struct {
+        struct iommufd_vevent_header hdr;
+        struct iommu_vevent_arm_smmuv3 vevent;
+    } buf;
+    Error *local_err = NULL;
+
+    if (!smmuv3_accel_event_read_validate(veventq,
+                                          IOMMU_VEVENTQ_TYPE_ARM_SMMUV3, &buf,
+                                          sizeof(buf), &local_err)) {
+        warn_report_err_once(local_err);
+        return;
+    }
     smmuv3_propagate_event(s, (Evt *)&buf.vevent);
 }
 
-- 
2.43.0




* [PATCH v4 24/31] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (22 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 23/31] hw/arm/smmuv3-accel: Introduce common helper for veventq read Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05  1:13   ` Nicolin Chen
  2026-05-07 16:40   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler Shameer Kolothum
                   ` (6 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Install an event handler on the CMDQV vEVENTQ fd to read and propagate
host-received CMDQV errors to the guest.

The handler runs in QEMU’s main loop, using a non-blocking fd registered
via qemu_set_fd_handler().
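
For illustration only, the non-blocking setup can be demonstrated with
plain POSIX calls on a pipe instead of a vEVENTQ fd. The model_* helpers
are hypothetical. The point of the sketch is that a read on an empty
non-blocking fd returns immediately with EAGAIN rather than stalling the
main loop.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Adds O_NONBLOCK to fd while preserving its other status flags. */
static bool model_set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0) {
        return false;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

/* Returns true if reading an empty non-blocking pipe reports EAGAIN. */
static bool model_empty_read_is_eagain(void)
{
    int fds[2];
    char c;
    ssize_t n;
    bool ok;

    if (pipe(fds) < 0) {
        return false;
    }
    ok = model_set_nonblock(fds[0]);
    if (ok) {
        n = read(fds[0], &c, 1);
        ok = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
    }
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```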

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/tegra241-cmdqv.c | 55 +++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events     |  1 +
 2 files changed, 56 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index bf989dd51f..9c2fc02b92 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -11,6 +11,7 @@
 #include "qemu/log.h"
 
 #include "hw/arm/smmuv3.h"
+#include "hw/core/irq.h"
 #include "smmuv3-accel.h"
 #include "tegra241-cmdqv.h"
 #include "trace.h"
@@ -534,6 +535,43 @@ out:
     trace_tegra241_cmdqv_write_mmio(offset, value, size);
 }
 
+static void tegra241_cmdqv_event_read(void *opaque)
+{
+    Tegra241CMDQV *cmdqv = opaque;
+    IOMMUFDVeventq *veventq = cmdqv->veventq;
+    struct {
+        struct iommufd_vevent_header hdr;
+        struct iommu_vevent_tegra241_cmdqv vevent;
+    } buf;
+    Error *local_err = NULL;
+
+    if (!smmuv3_accel_event_read_validate(veventq,
+                                          IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV,
+                                          &buf, sizeof(buf), &local_err)) {
+        warn_report_err_once(local_err);
+        return;
+    }
+
+    if (buf.vevent.lvcmdq_err_map[0] || buf.vevent.lvcmdq_err_map[1]) {
+        cmdqv->vintf_cmdq_err_map[0] =
+            buf.vevent.lvcmdq_err_map[0] & 0xffffffff;
+        cmdqv->vintf_cmdq_err_map[1] =
+            (buf.vevent.lvcmdq_err_map[0] >> 32) & 0xffffffff;
+        cmdqv->vintf_cmdq_err_map[2] =
+            buf.vevent.lvcmdq_err_map[1] & 0xffffffff;
+        cmdqv->vintf_cmdq_err_map[3] =
+            (buf.vevent.lvcmdq_err_map[1] >> 32) & 0xffffffff;
+        for (int i = 0; i < 4; i++) {
+            cmdqv->cmdq_err_map[i] = cmdqv->vintf_cmdq_err_map[i];
+        }
+        cmdqv->vi_err_map[0] |= 0x1;
+        qemu_irq_pulse(cmdqv->irq);
+        trace_tegra241_cmdqv_err_map(
+            cmdqv->vintf_cmdq_err_map[3], cmdqv->vintf_cmdq_err_map[2],
+            cmdqv->vintf_cmdq_err_map[1], cmdqv->vintf_cmdq_err_map[0]);
+    }
+}
+
 static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
 {
     SMMUv3AccelState *accel = s->s_accel;
@@ -545,6 +583,7 @@ static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
         return;
     }
     if (veventq) {
+        qemu_set_fd_handler(veventq->veventq_fd, NULL, NULL, NULL);
         close(veventq->veventq_fd);
         iommufd_backend_free_id(viommu->iommufd, veventq->veventq_id);
         g_free(veventq);
@@ -560,6 +599,7 @@ tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
     Tegra241CMDQV *cmdqv = s->s_accel->cmdqv;
     uint32_t viommu_id, veventq_id, veventq_fd;
     IOMMUFDVeventq *veventq;
+    int flags;
 
     if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
                                       IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV,
@@ -577,14 +617,29 @@ tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
         goto free_viommu;
     }
 
+    flags = fcntl(veventq_fd, F_GETFL);
+    if (flags < 0) {
+        error_setg(errp, "Failed to get flags for vEVENTQ fd");
+        goto free_veventq;
+    }
+    if (fcntl(veventq_fd, F_SETFL, O_NONBLOCK | flags) < 0) {
+        error_setg(errp, "Failed to set O_NONBLOCK on vEVENTQ fd");
+        goto free_veventq;
+    }
+
     veventq = g_new(IOMMUFDVeventq, 1);
     veventq->veventq_id = veventq_id;
     veventq->veventq_fd = veventq_fd;
     cmdqv->veventq = veventq;
 
+    /* Set up event handler for veventq fd */
+    qemu_set_fd_handler(veventq_fd, tegra241_cmdqv_event_read, NULL, cmdqv);
     *out_viommu_id = viommu_id;
     return true;
 
+free_veventq:
+    close(veventq_fd);
+    iommufd_backend_free_id(idev->iommufd, veventq_id);
 free_viommu:
     iommufd_backend_free_id(idev->iommufd, viommu_id);
     return false;
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 8c61d66a26..fd6441bfa7 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -75,6 +75,7 @@ smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vS
 # tegra241-cmdqv
 tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
 tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
+tegra241_cmdqv_err_map(uint32_t map3, uint32_t map2, uint32_t map1, uint32_t map0) "hw irq received. error (hex) maps: %04X:%04X:%04X:%04X"
 
 # strongarm.c
 strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
-- 
2.43.0




* [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (23 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 24/31] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-07 16:51   ` Eric Auger
  2026-05-07 17:03   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size Shameer Kolothum
                   ` (5 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

Introduce a reset handler for the Tegra241 CMDQV and initialize its
register state.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/tegra241-cmdqv.h |  2 ++
 hw/arm/tegra241-cmdqv.c | 50 +++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events     |  1 +
 3 files changed, 53 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 2befa6205e..b2a444daef 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -79,6 +79,8 @@ FIELD(CONFIG, CMDQ_MAX_CLK_BATCH, 4, 8)
 FIELD(CONFIG, CMDQ_MAX_CMD_BATCH, 12, 8)
 FIELD(CONFIG, CONS_DRAM_EN, 20, 1)
 
+#define V_CONFIG_RESET 0x00020403
+
 REG32(PARAM, 0x4)
 FIELD(PARAM, CMDQV_VER, 0, 4)
 FIELD(PARAM, CMDQV_NUM_CMDQ_LOG2, 4, 4)
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 9c2fc02b92..af68add2f0 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -8,6 +8,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/error-report.h"
 #include "qemu/log.h"
 
 #include "hw/arm/smmuv3.h"
@@ -645,8 +646,57 @@ free_viommu:
     return false;
 }
 
+static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
+{
+    int i;
+
+    cmdqv->config = V_CONFIG_RESET;
+    cmdqv->param = FIELD_DP32(0, PARAM, CMDQV_VER, CMDQV_VER);
+    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_CMDQ_LOG2,
+                              CMDQV_NUM_CMDQ_LOG2);
+    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_SID_PER_VI_LOG2,
+                              CMDQV_NUM_SID_PER_VI_LOG2);
+    trace_tegra241_cmdqv_init_regs(cmdqv->param);
+    cmdqv->status = R_STATUS_CMDQV_ENABLED_MASK;
+    for (i = 0; i < 2; i++) {
+        cmdqv->vi_err_map[i] = 0;
+        cmdqv->vi_int_mask[i] = 0;
+        cmdqv->cmdq_err_map[i] = 0;
+    }
+    cmdqv->cmdq_err_map[2] = 0;
+    cmdqv->cmdq_err_map[3] = 0;
+    cmdqv->vintf_config = 0;
+    cmdqv->vintf_status = 0;
+    for (i = 0; i < 4; i++) {
+        cmdqv->vintf_cmdq_err_map[i] = 0;
+    }
+    for (i = 0; i < TEGRA241_CMDQV_MAX_CMDQ; i++) {
+        cmdqv->cmdq_alloc_map[i] = 0;
+        cmdqv->vcmdq_cons_indx[i] = 0;
+        cmdqv->vcmdq_prod_indx[i] = 0;
+        cmdqv->vcmdq_config[i] = 0;
+        cmdqv->vcmdq_status[i] = 0;
+        cmdqv->vcmdq_gerror[i] = 0;
+        cmdqv->vcmdq_gerrorn[i] = 0;
+        cmdqv->vcmdq_base[i] = 0;
+        cmdqv->vcmdq_cons_indx_base[i] = 0;
+    }
+}
+
 static void tegra241_cmdqv_reset(SMMUv3State *s)
 {
+    SMMUv3AccelState *accel = s->s_accel;
+    Tegra241CMDQV *cmdqv = accel->cmdqv;
+
+    if (!cmdqv) {
+        return;
+    }
+
+    tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
+    tegra241_cmdqv_munmap_vintf_page0(cmdqv, NULL);
+    tegra241_cmdqv_free_all_vcmdq(cmdqv);
+
+    tegra241_cmdqv_init_regs(s, cmdqv);
 }
 
 static const MemoryRegionOps mmio_cmdqv_ops = {
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index fd6441bfa7..6f602b9eda 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -76,6 +76,7 @@ smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vS
 tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
 tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
 tegra241_cmdqv_err_map(uint32_t map3, uint32_t map2, uint32_t map1, uint32_t map0) "hw irq received. error (hex) maps: %04X:%04X:%04X:%04X"
+tegra241_cmdqv_init_regs(uint32_t param) "hw info received. param: 0x%04X"
 
 # strongarm.c
 strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
-- 
2.43.0




* [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (24 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05  1:26   ` Nicolin Chen
  2026-05-07 17:23   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 27/31] hw/arm/smmuv3: Add per-device identifier property Shameer Kolothum
                   ` (4 subsequent siblings)
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

CMDQV HW reads guest queue memory through the host physical address that
is set up via IOMMUFD. This requires the guest queue memory to be
contiguous not only in guest PA space but also in host PA space. With
Tegra241 CMDQV enabled, we must only advertise a CMDQS that the host can
safely back with physically contiguous memory. Allowing a queue larger
than the host page size could cause the hardware to DMA across page
boundaries, leading to faults.

Walk the RAMBlock list to find the smallest memory-backend page size, then
limit IDR1.CMDQS so the guest cannot configure a command queue that exceeds
that contiguous backing. Fall back to the real host page size if no
memory-backend RAM blocks are found.
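
For illustration only, the CMDQS arithmetic can be worked through in
isolation. CMDQS is the log2 of the number of queue entries and each
SMMUv3 command is 16 bytes (2^4), so a queue fits in one page when
CMDQS <= log2(page size) - 4. The model_* names are hypothetical and the
sketch assumes the page size is a power of two of at least 16 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* SMMUv3 command queue entries are 16 bytes, i.e. 2^4. */
#define MODEL_CMD_ENTRY_SHIFT 4

/* log2 of a power-of-two value (assumed by the caller). */
static unsigned model_log2_pow2(size_t v)
{
    unsigned n = 0;

    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Clamp the advertised CMDQS so the whole queue fits in one backing page. */
static unsigned model_limit_cmdqs(size_t pgsize, unsigned cmdqs)
{
    unsigned max = model_log2_pow2(pgsize) - MODEL_CMD_ENTRY_SHIFT;

    return cmdqs < max ? cmdqs : max;
}
```

With 4 KiB backing pages this caps the queue at 2^8 = 256 entries
(256 * 16 = 4096 bytes), while 2 MiB hugepages allow up to 2^17 entries.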

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/tegra241-cmdqv.c | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index af68add2f0..2870886783 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -14,6 +14,9 @@
 #include "hw/arm/smmuv3.h"
 #include "hw/core/irq.h"
 #include "smmuv3-accel.h"
+#include "smmuv3-internal.h"
+#include "system/ramblock.h"
+#include "system/ramlist.h"
 #include "tegra241-cmdqv.h"
 #include "trace.h"
 
@@ -646,9 +649,38 @@ free_viommu:
     return false;
 }
 
+static size_t tegra241_cmdqv_min_ram_pagesize(void)
+{
+    RAMBlock *rb;
+    size_t pg, min_pg = SIZE_MAX;
+
+    RAMBLOCK_FOREACH(rb) {
+        MemoryRegion *mr = rb->mr;
+
+        /* Only consider real RAM regions */
+        if (!mr || !memory_region_is_ram(mr)) {
+            continue;
+        }
+
+        /* Skip RAM regions that are not backed by a memory-backend */
+        if (!object_dynamic_cast(mr->owner, TYPE_MEMORY_BACKEND)) {
+            continue;
+        }
+
+        pg = qemu_ram_pagesize(rb);
+        if (pg && pg < min_pg) {
+            min_pg = pg;
+        }
+    }
+
+    return (min_pg == SIZE_MAX) ? qemu_real_host_page_size() : min_pg;
+}
+
 static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
 {
     int i;
+    size_t pgsize;
+    uint32_t val;
 
     cmdqv->config = V_CONFIG_RESET;
     cmdqv->param = FIELD_DP32(0, PARAM, CMDQV_VER, CMDQV_VER);
@@ -681,6 +713,15 @@ static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
         cmdqv->vcmdq_base[i] = 0;
         cmdqv->vcmdq_cons_indx_base[i] = 0;
     }
+
+    /*
+     * CMDQ must not cross a physical RAM backend page. Adjust CMDQS so the
+     * queue fits entirely within the smallest backend page size, ensuring
+     * the command queue is physically contiguous in host memory.
+     */
+    pgsize = tegra241_cmdqv_min_ram_pagesize();
+    val = FIELD_EX32(s->idr[1], IDR1, CMDQS);
+    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, CMDQS, MIN(ctz64(pgsize) - 4, val));
 }
 
 static void tegra241_cmdqv_reset(SMMUv3State *s)
-- 
2.43.0




* [PATCH v4 27/31] hw/arm/smmuv3: Add per-device identifier property
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (25 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05  1:30   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 28/31] hw/arm/smmuv3-accel: Introduce helper to query CMDQV type Shameer Kolothum
                   ` (3 subsequent siblings)
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Add an "identifier" property to the SMMUv3 device and use it when
building the ACPI IORT SMMUv3 node Identifier field.

This avoids relying on device enumeration order and provides a stable
per-device identifier. A subsequent patch will use the same identifier
when generating the DSDT description for Tegra241 CMDQV, ensuring that
the IORT and DSDT entries refer to the same SMMUv3 instance.

The identifier is assigned at pre-plug time, accounting for the ITS Group
node that build_iort() places before SMMUv3 nodes in the IORT table, so
that identifiers are globally unique across all IORT nodes.
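
For illustration only, the assignment rule can be modelled as a small
function. model_assign_identifier() is a hypothetical name; the counter
argument stands in for the per-machine count of SMMUv3 devices plugged so
far.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns the identifier for the next SMMUv3 device and advances the
 * counter. When an ITS node is present it takes identifier 0, so SMMUv3
 * identifiers start at 1.
 */
static uint8_t model_assign_identifier(bool its_present, uint8_t *counter)
{
    uint8_t its_offset = its_present ? 1 : 0;

    return (uint8_t)(its_offset + (*counter)++);
}
```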

No functional change: IORT blob content for bios-tables qtest is identical
to before.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 include/hw/arm/smmuv3.h  |  1 +
 hw/arm/smmuv3.c          |  2 ++
 hw/arm/virt-acpi-build.c |  5 ++++-
 hw/arm/virt.c            | 12 ++++++++++++
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index aa6a79237a..0fce564619 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -64,6 +64,7 @@ struct SMMUv3State {
     qemu_irq     irq[4];
     QemuMutex mutex;
     char *stage;
+    uint8_t identifier;
 
     /* SMMU has HW accelerator support for nested S1 + s2 */
     bool accel;
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 682d89c3ea..1d6fdd776c 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -2144,6 +2144,8 @@ static const Property smmuv3_properties[] = {
      * Defaults to stage 1
      */
     DEFINE_PROP_STRING("stage", SMMUv3State, stage),
+    /* Identifier used for ACPI IORT SMMUv3 (and DSDT for CMDQV) generation */
+    DEFINE_PROP_UINT8("identifier", SMMUv3State, identifier, 0),
     DEFINE_PROP_BOOL("accel", SMMUv3State, accel, false),
     /* GPA of MSI doorbell, for SMMUv3 accel use. */
     DEFINE_PROP_UINT64("msi-gpa", SMMUv3State, msi_gpa, 0),
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 521443de87..65ccc96349 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -342,6 +342,7 @@ static int iort_idmap_compare(gconstpointer a, gconstpointer b)
 typedef struct AcpiIortSMMUv3Dev {
     int irq;
     hwaddr base;
+    uint8_t id;
     GArray *rc_smmu_idmaps;
     /* Offset of the SMMUv3 IORT Node relative to the start of the IORT */
     size_t offset;
@@ -404,6 +405,7 @@ static int populate_smmuv3_dev(VirtMachineState *vms, GArray *sdev_blob)
                                                &error_abort));
         sdev.accel = object_property_get_bool(obj, "accel", &error_abort);
         sdev.ats = smmuv3_ats_enabled(ARM_SMMUV3(obj));
+        sdev.id = object_property_get_uint(obj, "identifier", &error_abort);
         pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
         sbdev = SYS_BUS_DEVICE(obj);
         sdev.base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
@@ -630,7 +632,8 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
                      (ID_MAPPING_ENTRY_SIZE * smmu_mapping_count);
         build_append_int_noprefix(table_data, node_size, 2); /* Length */
         build_append_int_noprefix(table_data, 4, 1); /* Revision */
-        build_append_int_noprefix(table_data, id++, 4); /* Identifier */
+        build_append_int_noprefix(table_data, sdev->id, 4); /* Identifier */
+        id++;  /* advance shared counter for RC/RMR node uniqueness */
         /* Number of ID mappings */
         build_append_int_noprefix(table_data, smmu_mapping_count, 4);
         /* Reference to ID Array */
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6c5e51af37..22d6b9eec9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -240,6 +240,9 @@ static MemMapEntry extended_memmap[] = {
     /* Any CXL Fixed memory windows come here */
 };
 
+/* Counts SMMUv3 devices plugged; used to assign stable IORT identifiers */
+static uint8_t smmuv3_dev_id;
+
 static const int a15irqmap[] = {
     [VIRT_UART0] = 1,
     [VIRT_RTC] = 2,
@@ -3226,6 +3229,15 @@ static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
                                      OBJECT(vms->sysmem), NULL);
             object_property_set_link(OBJECT(dev), "secure-memory",
                                      OBJECT(vms->secure_sysmem), NULL);
+            /*
+             * In build_iort(), the ITS node(id=0) precedes SMMUv3 nodes
+             * when present. Account for it so this SMMUv3's identifier
+             * is globally unique across all IORT nodes.
+             */
+            uint8_t its_offset = (vms->msi_controller == VIRT_MSI_CTRL_ITS)
+                                  ? 1 : 0;
+            object_property_set_uint(OBJECT(dev), "identifier",
+                                     its_offset + smmuv3_dev_id++, NULL);
         }
         if (object_property_get_bool(OBJECT(dev), "accel", &error_abort)) {
             hwaddr db_start = 0;
-- 
2.43.0




* [PATCH v4 28/31] hw/arm/smmuv3-accel: Introduce helper to query CMDQV type
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (26 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 27/31] hw/arm/smmuv3: Add per-device identifier property Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05  1:32   ` Nicolin Chen
  2026-04-15 10:55 ` [PATCH v4 29/31] hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT Shameer Kolothum
                   ` (2 subsequent siblings)
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Introduce a SMMUv3AccelCmdqvType enum and a helper to query the
CMDQV implementation type associated with an accelerated SMMUv3
instance.

A subsequent patch will use this helper when generating the
Tegra241 CMDQV DSDT.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/smmuv3-accel.h       |  7 +++++++
 hw/arm/smmuv3-accel-stubs.c |  5 +++++
 hw/arm/smmuv3-accel.c       | 12 ++++++++++++
 hw/arm/tegra241-cmdqv.c     |  6 ++++++
 4 files changed, 30 insertions(+)

diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index 448f47c0ca..3ed94ed05c 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -16,6 +16,11 @@
 #include <linux/iommufd.h>
 #endif
 
+typedef enum SMMUv3AccelCmdqvType {
+    SMMUV3_CMDQV_NONE = 0,
+    SMMUV3_CMDQV_TEGRA241,
+} SMMUv3AccelCmdqvType;
+
 /*
  * CMDQ-Virtualization (CMDQV) hardware support, extends the SMMUv3 to
  * support multiple VCMDQs with virtualization capabilities.
@@ -29,6 +34,7 @@ typedef struct SMMUv3AccelCmdqvOps {
                          uint32_t *out_viommu_id,
                          Error **errp);
     void (*free_viommu)(SMMUv3State *s);
+    SMMUv3AccelCmdqvType (*get_type)(void);
     void (*reset)(SMMUv3State *s);
 } SMMUv3AccelCmdqvOps;
 
@@ -74,5 +80,6 @@ bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp);
 bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
                                       void *buf, size_t size, Error **errp);
 void smmuv3_accel_reset(SMMUv3State *s);
+SMMUv3AccelCmdqvType smmuv3_accel_cmdqv_type(Object *obj);
 
 #endif /* HW_ARM_SMMUV3_ACCEL_H */
diff --git a/hw/arm/smmuv3-accel-stubs.c b/hw/arm/smmuv3-accel-stubs.c
index e8f08dc833..08de01d909 100644
--- a/hw/arm/smmuv3-accel-stubs.c
+++ b/hw/arm/smmuv3-accel-stubs.c
@@ -55,3 +55,8 @@ bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
 void smmuv3_accel_reset(SMMUv3State *s)
 {
 }
+
+SMMUv3AccelCmdqvType smmuv3_accel_cmdqv_type(Object *obj)
+{
+    return SMMUV3_CMDQV_NONE;
+}
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 230f608f03..a58815ded2 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -1049,6 +1049,18 @@ static void smmuv3_accel_as_init(SMMUv3State *s)
     address_space_init(shared_as_sysmem, &root, "smmuv3-accel-as-sysmem");
 }
 
+SMMUv3AccelCmdqvType smmuv3_accel_cmdqv_type(Object *obj)
+{
+    SMMUv3State *s = ARM_SMMUV3(obj);
+    SMMUv3AccelState *accel = s->s_accel;
+
+    if (!accel || !accel->cmdqv_ops || !accel->cmdqv_ops->get_type) {
+        return SMMUV3_CMDQV_NONE;
+    }
+
+    return accel->cmdqv_ops->get_type();
+}
+
 bool smmuv3_accel_init(SMMUv3State *s, Error **errp)
 {
     SMMUState *bs = ARM_SMMU(s);
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 2870886783..71f89abcb4 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -762,6 +762,11 @@ static bool tegra241_cmdqv_init(SMMUv3State *s, Error **errp)
     return true;
 }
 
+static SMMUv3AccelCmdqvType tegra241_cmdqv_get_type(void)
+{
+    return SMMUV3_CMDQV_TEGRA241;
+}
+
 static bool tegra241_cmdqv_probe(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
                                  Error **errp)
 {
@@ -802,6 +807,7 @@ static const SMMUv3AccelCmdqvOps tegra241_cmdqv_ops = {
     .init = tegra241_cmdqv_init,
     .alloc_viommu = tegra241_cmdqv_alloc_viommu,
     .free_viommu = tegra241_cmdqv_free_viommu,
+    .get_type = tegra241_cmdqv_get_type,
     .reset = tegra241_cmdqv_reset,
 };
 
-- 
2.43.0
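
Note: the helper in this patch follows a common pattern, an optional callback
in an ops table with a safe default when any link in the chain is missing. A
minimal standalone sketch of that pattern follows. All Demo* and demo_* names
are hypothetical stand-ins for the QEMU types, not the real definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef enum DemoCmdqvType {
    DEMO_CMDQV_NONE = 0,
    DEMO_CMDQV_TEGRA241,
} DemoCmdqvType;

typedef struct DemoCmdqvOps {
    DemoCmdqvType (*get_type)(void); /* optional callback */
} DemoCmdqvOps;

typedef struct DemoAccel {
    const DemoCmdqvOps *cmdqv_ops; /* NULL when no backend was selected */
} DemoAccel;

DemoCmdqvType demo_tegra241_get_type(void)
{
    return DEMO_CMDQV_TEGRA241;
}

const DemoCmdqvOps demo_tegra241_ops = {
    .get_type = demo_tegra241_get_type,
};

/* Fall back to NONE if the state, the ops table or the callback is absent. */
DemoCmdqvType demo_cmdqv_type(const DemoAccel *accel)
{
    if (!accel || !accel->cmdqv_ops || !accel->cmdqv_ops->get_type) {
        return DEMO_CMDQV_NONE;
    }
    return accel->cmdqv_ops->get_type();
}
```

Callers such as the DSDT builder can then compare the result against a single
enum value without knowing anything about the backend.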




* [PATCH v4 29/31] hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (27 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 28/31] hw/arm/smmuv3-accel: Introduce helper to query CMDQV type Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-07 17:32   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 30/31] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active Shameer Kolothum
  2026-04-15 10:55 ` [PATCH v4 31/31] hw/arm/smmuv3: Add cmdqv property for SMMUv3 device Shameer Kolothum
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

From: Nicolin Chen <nicolinc@nvidia.com>

Add ACPI DSDT support for Tegra241 CMDQV when the SMMUv3 instance is
created with tegra241-cmdqv.

The SMMUv3 device identifier is used as the ACPI _UID. This matches
the Identifier field of the corresponding SMMUv3 IORT node, allowing
the CMDQV DSDT device to be correctly associated with its SMMU.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/virt-acpi-build.c | 52 ++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events      |  1 +
 2 files changed, 53 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 65ccc96349..fbc793d06e 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -65,6 +65,9 @@
 #include "target/arm/cpu.h"
 #include "target/arm/multiprocessing.h"
 
+#include "smmuv3-accel.h"
+#include "tegra241-cmdqv.h"
+
 #define ARM_SPI_BASE 32
 
 #define ACPI_BUILD_TABLE_SIZE             0x20000
@@ -1114,6 +1117,51 @@ static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
     build_fadt(table_data, linker, &fadt, vms->oem_id, vms->oem_table_id);
 }
 
+static void acpi_dsdt_add_tegra241_cmdqv(Aml *scope, VirtMachineState *vms)
+{
+    for (int i = 0; i < vms->smmuv3_devices->len; i++) {
+        Object *obj = OBJECT(g_ptr_array_index(vms->smmuv3_devices, i));
+        PlatformBusDevice *pbus;
+        Aml *dev, *crs, *addr;
+        SysBusDevice *sbdev;
+        hwaddr base;
+        uint32_t id;
+        int irq;
+
+        if (smmuv3_accel_cmdqv_type(obj) != SMMUV3_CMDQV_TEGRA241) {
+            continue;
+        }
+        id = object_property_get_uint(obj, "identifier", &error_abort);
+        pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
+        sbdev = SYS_BUS_DEVICE(obj);
+        base = platform_bus_get_mmio_addr(pbus, sbdev, 1);
+        base += vms->memmap[VIRT_PLATFORM_BUS].base;
+        irq = platform_bus_get_irqn(pbus, sbdev, NUM_SMMU_IRQS);
+        irq += vms->irqmap[VIRT_PLATFORM_BUS];
+        irq += ARM_SPI_BASE;
+
+        dev = aml_device("CV%.02u", id);
+        aml_append(dev, aml_name_decl("_HID", aml_string("NVDA200C")));
+        aml_append(dev, aml_name_decl("_UID", aml_int(id)));
+        aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
+
+        crs = aml_resource_template();
+        addr = aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED, AML_MAX_FIXED,
+                                AML_CACHEABLE, AML_READ_WRITE, 0x0, base,
+                                base + TEGRA241_CMDQV_IO_LEN - 0x1, 0x0,
+                                TEGRA241_CMDQV_IO_LEN);
+        aml_append(crs, addr);
+        aml_append(crs, aml_interrupt(AML_CONSUMER, AML_EDGE,
+                                      AML_ACTIVE_HIGH, AML_EXCLUSIVE,
+                                      (uint32_t *)&irq, 1));
+        aml_append(dev, aml_name_decl("_CRS", crs));
+
+        aml_append(scope, dev);
+
+        trace_virt_acpi_dsdt_tegra241_cmdqv(id, base, irq);
+    }
+}
+
 /* DSDT */
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
@@ -1178,6 +1226,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     acpi_dsdt_add_tpm(scope, vms);
 #endif
 
+    if (!vms->legacy_smmuv3_present) {
+        acpi_dsdt_add_tegra241_cmdqv(scope, vms);
+    }
+
     aml_append(dsdt, scope);
 
     pci0_scope = aml_scope("\\_SB.PCI0");
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 6f602b9eda..e5e4e93324 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -9,6 +9,7 @@ omap1_lpg_led(const char *onoff) "omap1 LPG: LED is %s"
 
 # virt-acpi-build.c
 virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
+virt_acpi_dsdt_tegra241_cmdqv(int smmu_id, uint64_t base, uint32_t irq) "DSDT: add cmdqv node for (id=%d), base=0x%" PRIx64 ", irq=%d"
 
 # smmu-common.c
 smmu_add_mr(const char *name) "%s"
-- 
2.43.0
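
Note: the arithmetic behind the DSDT node can be checked in isolation. The
sketch below reproduces the device name format, the inclusive end of the MMIO
window and the interrupt number computed in acpi_dsdt_add_tegra241_cmdqv().
The Demo* names and all numeric inputs are made up for illustration; in
particular the window length is a parameter here because the value of
TEGRA241_CMDQV_IO_LEN is not shown in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_SPI_BASE 32 /* same value as ARM_SPI_BASE in the patch */

typedef struct DemoCmdqvNode {
    char name[16];  /* device name, "CV" plus the identifier */
    uint64_t min;   /* first byte of the MMIO window */
    uint64_t max;   /* last byte of the MMIO window, inclusive */
    uint32_t gsiv;  /* interrupt number placed in _CRS */
} DemoCmdqvNode;

void demo_describe_cmdqv(DemoCmdqvNode *n, uint32_t id,
                         uint64_t bus_base, uint64_t mmio_offset,
                         uint64_t io_len,
                         uint32_t bus_first_irq, uint32_t irq_index)
{
    assert(n != NULL && io_len > 0);
    /* Same format string as the patch: at least two digits, zero padded. */
    snprintf(n->name, sizeof(n->name), "CV%.02u", (unsigned)id);
    n->min = bus_base + mmio_offset;
    n->max = n->min + io_len - 1; /* the range end is inclusive */
    n->gsiv = irq_index + bus_first_irq + DEMO_SPI_BASE;
}
```

ACPI name segments are four characters long, so this name format fits
identifiers below 100.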




* [PATCH v4 30/31] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (28 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 29/31] hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-05  1:35   ` Nicolin Chen
  2026-05-07 17:36   ` Eric Auger
  2026-04-15 10:55 ` [PATCH v4 31/31] hw/arm/smmuv3: Add cmdqv property for SMMUv3 device Shameer Kolothum
  30 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

When CMDQV is active, the first cold-plugged VFIO device establishes the
vIOMMU to host SMMUv3 association. Block its hot-unplug to preserve this
association and the guest's boot-time CMDQV configuration.

Also abort at machine_done if cmdqv=on is requested but no cold-plugged
VFIO device was present to initialize it.

Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/smmuv3-accel.h |  1 +
 hw/arm/smmuv3-accel.c | 12 ++++++++++++
 hw/arm/smmuv3.c       |  6 ++++++
 3 files changed, 19 insertions(+)

diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index 3ed94ed05c..c4441d5b3f 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -65,6 +65,7 @@ typedef struct SMMUv3AccelDevice {
     IOMMUFDVdev *vdev;
     QLIST_ENTRY(SMMUv3AccelDevice) next;
     SMMUv3AccelState *s_accel;
+    Error *unplug_blocker; /* set when CMDQV is active to block hot-unplug */
 } SMMUv3AccelDevice;
 
 bool smmuv3_accel_init(SMMUv3State *s, Error **errp);
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index a58815ded2..f381702a08 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -754,6 +754,18 @@ static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
         return false;
     }
 
+    /*
+     * CMDQV is active: block hot-unplug of the device that established the
+     * viommu association. Removing it would allow the vIOMMU to host SMMUv3
+     * association to be changed by a later device hot-plug.
+     */
+    if (s->s_accel->cmdqv_ops) {
+        PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
+        error_setg(&accel_dev->unplug_blocker,
+                   "CMDQV is active: removing the device that established the "
+                   "viommu association would break the guest CMDQV");
+        qdev_add_unplug_blocker(DEVICE(pdev), accel_dev->unplug_blocker);
+    }
 done:
     accel_dev->idev = idev;
     accel_dev->s_accel = s->s_accel;
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 1d6fdd776c..c9ff6298f5 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -2020,6 +2020,12 @@ static void smmuv3_machine_done(Notifier *notifier, void *data)
                      "at least one cold-plugged VFIO device");
         exit(1);
     }
+
+    if (s->cmdqv == ON_OFF_AUTO_ON && !accel->cmdqv) {
+        error_report("arm-smmuv3 cmdqv=on requires at least one cold-plugged "
+                     "VFIO device");
+        exit(1);
+    }
 }
 
 static void smmu_realize(DeviceState *d, Error **errp)
-- 
2.43.0
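
Note: the unplug-blocker idea in this patch can be reduced to a device that
carries an optional reason for refusing removal. The sketch below is a toy
model, not QEMU's qdev implementation: DemoDevice and both demo_* functions
are hypothetical, and a single reason pointer stands in for whatever
bookkeeping the real API does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoDevice {
    const char *unplug_blocker; /* reason text; NULL means unplug is allowed */
} DemoDevice;

/* Record why this device must stay plugged. */
void demo_add_unplug_blocker(DemoDevice *dev, const char *reason)
{
    assert(dev != NULL && reason != NULL);
    dev->unplug_blocker = reason;
}

/* Return true if the unplug may proceed; otherwise report why not. */
bool demo_request_unplug(const DemoDevice *dev, const char **reason)
{
    assert(dev != NULL && reason != NULL);
    *reason = dev->unplug_blocker;
    return dev->unplug_blocker == NULL;
}
```

In this model only the device that established the association gets a
blocker; devices attached afterwards remain removable.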




* [PATCH v4 31/31] hw/arm/smmuv3: Add cmdqv property for SMMUv3 device
  2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
                   ` (29 preceding siblings ...)
  2026-04-15 10:55 ` [PATCH v4 30/31] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active Shameer Kolothum
@ 2026-04-15 10:55 ` Shameer Kolothum
  2026-05-07 17:28   ` Eric Auger
  30 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum @ 2026-04-15 10:55 UTC (permalink / raw)
  To: qemu-arm, qemu-devel
  Cc: eric.auger, peter.maydell, clg, alex, nicolinc, nathanc, mochs,
	jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju, phrdina,
	skolothumtho

Introduce a "cmdqv" property to enable Tegra241 CMDQV support.
This is only enabled for accelerated SMMUv3 devices.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/smmuv3.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index c9ff6298f5..51b7d01da5 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1993,6 +1993,10 @@ static bool smmu_validate_property(SMMUv3State *s, Error **errp)
             error_setg(errp, "ssidsize can only be set if accel=on");
             return false;
         }
+        if (s->cmdqv == ON_OFF_AUTO_ON) {
+            error_setg(errp, "cmdqv can only be enabled if accel=on");
+            return false;
+        }
         return true;
     }
 
@@ -2161,6 +2165,7 @@ static const Property smmuv3_properties[] = {
     DEFINE_PROP_OAS_MODE("oas", SMMUv3State, oas, OAS_MODE_AUTO),
     DEFINE_PROP_SSIDSIZE_MODE("ssidsize", SMMUv3State, ssidsize,
                               SSID_SIZE_MODE_AUTO),
+    DEFINE_PROP_ON_OFF_AUTO("cmdqv", SMMUv3State, cmdqv, ON_OFF_AUTO_AUTO),
 };
 
 static void smmuv3_instance_init(Object *obj)
@@ -2200,6 +2205,8 @@ static void smmuv3_class_init(ObjectClass *klass, const void *data)
         "Valid range is 0-20, where 0 disables SubstreamID support. "
         "Defaults to auto. A value greater than 0 is required to enable "
         "PASID support.");
+    object_class_property_set_description(klass, "cmdqv",
+        "Enable/disable CMDQ-Virtualisation support (for accel=on)");
 }
 
 static int smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
-- 
2.43.0
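
Note: taken together with the accel=on check in smmu_validate_property(), the
tri-state "cmdqv" property resolves to one of three outcomes. The sketch below
is one way to read that decision table. The Demo* names are hypothetical, and
probe_ok is a simplification that folds "the host offers CMDQV" and "a
cold-plugged VFIO device was there to probe it" into a single flag.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { DEMO_AUTO, DEMO_ON, DEMO_OFF } DemoOnOffAuto;

typedef enum {
    DEMO_RESULT_ERROR, /* configuration rejected */
    DEMO_RESULT_BASE,  /* plain path, no CMDQV */
    DEMO_RESULT_CMDQV, /* CMDQV backend in use */
} DemoResult;

DemoResult demo_resolve_cmdqv(bool accel, DemoOnOffAuto cmdqv, bool probe_ok)
{
    if (!accel) {
        /* Without accel=on, an explicit cmdqv=on is a configuration error. */
        return cmdqv == DEMO_ON ? DEMO_RESULT_ERROR : DEMO_RESULT_BASE;
    }
    switch (cmdqv) {
    case DEMO_OFF:
        return DEMO_RESULT_BASE;
    case DEMO_AUTO:
        /* A failed probe falls back silently. */
        return probe_ok ? DEMO_RESULT_CMDQV : DEMO_RESULT_BASE;
    case DEMO_ON:
        /* A failed probe is fatal when CMDQV was requested. */
        return probe_ok ? DEMO_RESULT_CMDQV : DEMO_RESULT_ERROR;
    }
    return DEMO_RESULT_ERROR;
}
```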




* Re: [PATCH v4 05/31] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq
  2026-04-15 10:55 ` [PATCH v4 05/31] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq Shameer Kolothum
@ 2026-05-04 15:00   ` Eric Auger
  2026-05-04 18:16   ` Nicolin Chen
  1 sibling, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-04 15:00 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> The viommu field is assigned but never used. Callers freeing the
> veventq already have access to the IOMMUFDViommu object through other
> references, so this field is redundant.
>
> Removing it also simplifies upcoming changes where veventq is
> allocated based on the viommu id before the IOMMUFDViommu object is
> created (e.g. vendor CMDQV-based veventq allocation).
>
> No functional change.
>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  include/system/iommufd.h | 1 -
>  hw/arm/smmuv3-accel.c    | 1 -
>  2 files changed, 2 deletions(-)
>
> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
> index 38cfceca84..b6599521b8 100644
> --- a/include/system/iommufd.h
> +++ b/include/system/iommufd.h
> @@ -58,7 +58,6 @@ typedef struct IOMMUFDVdev {
>  
>  /* Virtual event queue interface for a vIOMMU */
>  typedef struct IOMMUFDVeventq {
> -    IOMMUFDViommu *viommu;
>      uint32_t veventq_id;
>      uint32_t veventq_fd;
>      uint32_t last_event_seq; /* Sequence number of last processed event */
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index c356ff9708..f65e654adf 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -549,7 +549,6 @@ bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp)
>      veventq = g_new0(IOMMUFDVeventq, 1);
>      veventq->veventq_id = veventq_id;
>      veventq->veventq_fd = veventq_fd;
> -    veventq->viommu = accel->viommu;
>      accel->veventq = veventq;
>  
>      /* Set up event handler for veventq fd */




* Re: [PATCH v4 06/31] hw/arm/smmuv3-accel: Introduce CMDQV ops interface
  2026-04-15 10:55 ` [PATCH v4 06/31] hw/arm/smmuv3-accel: Introduce CMDQV ops interface Shameer Kolothum
@ 2026-05-04 15:19   ` Eric Auger
  2026-05-04 18:28   ` Nicolin Chen
  1 sibling, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-04 15:19 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina

Hi Shameer,

On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> Command Queue Virtualization (CMDQV) is a hardware extension available
> on certain platforms that allows the SMMUv3 command queue to be
> virtualized and passed through to a VM, improving performance.
>
> For example, NVIDIA Tegra241 implements CMDQV to support virtualization
> of multiple command queues (VCMDQs).
>
> The term CMDQV is used here generically to refer to any platform that
> provides hardware support to virtualize the SMMUv3 command queue.
>
> CMDQV support is a specialization of the IOMMUFD-backed accelerated
> SMMUv3 path. Introduce an ops interface to factor out CMDQV-specific
> probe, initialization, and vIOMMU allocation logic from the base
> implementation. The ops pointer and associated state are stored in
> the accelerated SMMUv3 state.
>
> This provides an extensible design to support future vendor-specific
> CMDQV implementations.
>
> No functional change.
>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/smmuv3-accel.h | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
> index 7b4a0be000..86301afcb4 100644
> --- a/hw/arm/smmuv3-accel.h
> +++ b/hw/arm/smmuv3-accel.h
> @@ -10,11 +10,28 @@
>  #define HW_ARM_SMMUV3_ACCEL_H
>  
>  #include "hw/arm/smmu-common.h"
> +#include "hw/arm/smmuv3.h"
>  #include "system/iommufd.h"
>  #ifdef CONFIG_LINUX
>  #include <linux/iommufd.h>
>  #endif
>  
> +/*
> + * CMDQ-Virtualization (CMDQV) hardware support, extends the SMMUv3 to
> + * support multiple VCMDQs with virtualization capabilities.
> + * CMDQV specific behavior is factored behind this ops interface.
I would doc-comment which of those ops are mandated. For instance, it
looks like alloc_viommu is mandatory if the whole ops exists. By the way,
it looks strange that !cmdqv_ops->free_viommu does not cause any assert
in 8/31, because we may not free the viommu_id if it does not exist.
Can't we enforce that each mandated op is implemented?

Thanks

Eric
> + */
> +typedef struct SMMUv3AccelCmdqvOps {
> +    bool (*probe)(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev, Error **errp);
> +    bool (*init)(SMMUv3State *s, Error **errp);
> +    bool (*alloc_viommu)(SMMUv3State *s,
> +                         HostIOMMUDeviceIOMMUFD *idev,
> +                         uint32_t *out_viommu_id,
> +                         Error **errp);
> +    void (*free_viommu)(SMMUv3State *s);
> +    void (*reset)(SMMUv3State *s);
> +} SMMUv3AccelCmdqvOps;
> +
>  /*
>   * Represents an accelerated SMMU instance backed by an iommufd vIOMMU object.
>   * Holds bypass and abort proxy HWPT IDs used for device attachment.
> @@ -27,6 +44,7 @@ typedef struct SMMUv3AccelState {
>      QLIST_HEAD(, SMMUv3AccelDevice) device_list;
>      bool auto_mode;
>      bool auto_finalised;
> +    const SMMUv3AccelCmdqvOps *cmdqv_ops;
>  } SMMUv3AccelState;
>  
>  typedef struct SMMUS1Hwpt {

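Note: one way to enforce mandated ops, as asked in the review comment above,
is to validate the table once when the backend is registered. The sketch below
is only an illustration. The Demo* names are hypothetical, and the split into
mandatory (probe, alloc_viommu) and optional (init, free_viommu, reset)
callbacks is an assumption read from how patch 08 calls them, not something
the series documents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoState DemoState; /* opaque placeholder for SMMUv3State */

typedef struct DemoCmdqvOps {
    bool (*probe)(DemoState *s);        /* treated as mandatory here */
    bool (*init)(DemoState *s);         /* optional */
    bool (*alloc_viommu)(DemoState *s); /* treated as mandatory here */
    void (*free_viommu)(DemoState *s);  /* optional */
    void (*reset)(DemoState *s);        /* optional (assumption) */
} DemoCmdqvOps;

/* True when every callback assumed to be mandatory is present. */
bool demo_cmdqv_ops_valid(const DemoCmdqvOps *ops)
{
    return ops != NULL && ops->probe != NULL && ops->alloc_viommu != NULL;
}

/* Catch incomplete backends at registration time, not at first use. */
const DemoCmdqvOps *demo_cmdqv_register(const DemoCmdqvOps *ops)
{
    assert(demo_cmdqv_ops_valid(ops));
    return ops;
}

/* Trivial callbacks used to build example tables. */
bool demo_probe_ok(DemoState *s) { (void)s; return true; }
bool demo_alloc_ok(DemoState *s) { (void)s; return true; }
```
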



* Re: [PATCH v4 07/31] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub
  2026-04-15 10:55 ` [PATCH v4 07/31] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub Shameer Kolothum
@ 2026-05-04 15:19   ` Eric Auger
  2026-05-04 18:23   ` Nicolin Chen
  1 sibling, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-04 15:19 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina

Hi Shameer,

On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> Introduce a Tegra241 CMDQV backend that plugs into the SMMUv3 accelerated
> CMDQV ops interface.
>
> This patch wires up the Tegra241 CMDQV backend and provides a stub
> implementation for CMDQV probe, initialization, vIOMMU allocation
> and reset handling.
>
> Functional CMDQV support is added in follow-up patches.
>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.h       | 15 ++++++++++
>  hw/arm/tegra241-cmdqv-stubs.c | 16 ++++++++++
>  hw/arm/tegra241-cmdqv.c       | 56 +++++++++++++++++++++++++++++++++++
>  hw/arm/Kconfig                |  5 ++++
>  hw/arm/meson.build            |  2 ++
>  5 files changed, 94 insertions(+)
>  create mode 100644 hw/arm/tegra241-cmdqv.h
>  create mode 100644 hw/arm/tegra241-cmdqv-stubs.c
>  create mode 100644 hw/arm/tegra241-cmdqv.c
>
> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> new file mode 100644
> index 0000000000..07e10e86ee
> --- /dev/null
> +++ b/hw/arm/tegra241-cmdqv.h
> @@ -0,0 +1,15 @@
> +/*
> + * Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved
> + * NVIDIA Tegra241 CMDQ-Virtualiisation extension for SMMUv3
virtualization
> + *
> + * Written by Nicolin Chen, Shameer Kolothum
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#ifndef HW_ARM_TEGRA241_CMDQV_H
> +#define HW_ARM_TEGRA241_CMDQV_H
> +
> +const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
> +
> +#endif /* HW_ARM_TEGRA241_CMDQV_H */
> diff --git a/hw/arm/tegra241-cmdqv-stubs.c b/hw/arm/tegra241-cmdqv-stubs.c
> new file mode 100644
> index 0000000000..eabf90daf8
> --- /dev/null
> +++ b/hw/arm/tegra241-cmdqv-stubs.c
> @@ -0,0 +1,16 @@
> +/*
> + * Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved
> + *
> + * Stubs for Tegra241 CMDQ-Virtualiisation extension for SMMUv3
same
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#include "qemu/osdep.h"
> +#include "smmuv3-accel.h"
> +#include "hw/arm/tegra241-cmdqv.h"
> +
> +const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void)
> +{
> +    return NULL;
> +}
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> new file mode 100644
> index 0000000000..ad5a0d4611
> --- /dev/null
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -0,0 +1,56 @@
> +/*
> + * Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved
> + * NVIDIA Tegra241 CMDQ-Virtualization extension for SMMUv3
> + *
> + * Written by Nicolin Chen, Shameer Kolothum
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#include "qemu/osdep.h"
> +
> +#include "hw/arm/smmuv3.h"
> +#include "smmuv3-accel.h"
> +#include "tegra241-cmdqv.h"
> +
> +static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
> +{
> +}
> +
> +static bool
> +tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
> +                            uint32_t *out_viommu_id, Error **errp)
> +{
> +    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
> +    return false;
> +}
> +
> +static void tegra241_cmdqv_reset(SMMUv3State *s)
> +{
> +}
> +
> +static bool tegra241_cmdqv_init(SMMUv3State *s, Error **errp)
> +{
> +    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
> +    return false;
> +}
> +
> +static bool tegra241_cmdqv_probe(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
> +                                 Error **errp)
> +{
> +    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
> +    return false;
> +}
> +
> +static const SMMUv3AccelCmdqvOps tegra241_cmdqv_ops = {
> +    .probe = tegra241_cmdqv_probe,
> +    .init = tegra241_cmdqv_init,
> +    .alloc_viommu = tegra241_cmdqv_alloc_viommu,
> +    .free_viommu = tegra241_cmdqv_free_viommu,
> +    .reset = tegra241_cmdqv_reset,
> +};
> +
> +const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void)
> +{
> +    return &tegra241_cmdqv_ops;
> +}
> diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
> index 4e50fb1111..073f2ebaaf 100644
> --- a/hw/arm/Kconfig
> +++ b/hw/arm/Kconfig
> @@ -618,6 +618,10 @@ config FSL_IMX8MP_EVK
>      depends on TCG
>      select FSL_IMX8MP
>  
> +config TEGRA241_CMDQV
> +    bool
> +    depends on ARM_SMMUV3_ACCEL
> +
>  config ARM_SMMUV3_ACCEL
>      bool
>      depends on ARM_SMMUV3
> @@ -625,6 +629,7 @@ config ARM_SMMUV3_ACCEL
>  config ARM_SMMUV3
>      bool
>      select ARM_SMMUV3_ACCEL if IOMMUFD
> +    imply TEGRA241_CMDQV
>  
>  config FSL_IMX6UL
>      bool
> diff --git a/hw/arm/meson.build b/hw/arm/meson.build
> index 3be1252c4f..64bcdc5a7c 100644
> --- a/hw/arm/meson.build
> +++ b/hw/arm/meson.build
> @@ -87,6 +87,8 @@ arm_common_ss.add(when: 'CONFIG_FSL_IMX8MP_EVK', if_true: files('imx8mp-evk.c'))
>  arm_common_ss.add(when: 'CONFIG_ARM_SMMUV3', if_true: files('smmuv3.c'))
>  arm_common_ss.add(when: 'CONFIG_ARM_SMMUV3_ACCEL', if_true: files('smmuv3-accel.c'))
>  stub_ss.add(files('smmuv3-accel-stubs.c'))
> +arm_common_ss.add(when: 'CONFIG_TEGRA241_CMDQV', if_true: files('tegra241-cmdqv.c'))
> +stub_ss.add(files('tegra241-cmdqv-stubs.c'))
>  arm_common_ss.add(when: 'CONFIG_FSL_IMX6UL', if_true: files('fsl-imx6ul.c', 'mcimx6ul-evk.c'))
>  arm_common_ss.add(when: 'CONFIG_NRF51_SOC', if_true: files('nrf51_soc.c'))
>  arm_common_ss.add(when: 'CONFIG_XEN', if_true: files(
Besides

Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric






* Re: [PATCH v4 08/31] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle
  2026-04-15 10:55 ` [PATCH v4 08/31] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle Shameer Kolothum
@ 2026-05-04 15:33   ` Eric Auger
  2026-05-05  7:47     ` Shameer Kolothum Thodi
  2026-05-04 18:38   ` Nicolin Chen
  1 sibling, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-04 15:33 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> Add support for selecting and initializing a CMDQV backend based on the
> cmdqv OnOffAuto property.
>
> If set to OFF, CMDQV is not used and the default IOMMUFD-backed allocation
> path is taken.
>
> If set to AUTO, QEMU attempts to probe a CMDQV backend during device setup.
> If probing succeeds, the selected ops are stored in the accelerated SMMUv3
> state and used. If probing fails, QEMU silently falls back to the default
> path.
>
> If set to ON, QEMU requires CMDQV support. Probing is performed during
> setup and failure results in an error.
>
> When a CMDQV backend is active, its callbacks are used for vIOMMU
> allocation, free, and reset handling. Otherwise, the base implementation
> is used.
>
> The current implementation wires up the Tegra241 CMDQV backend through the
> generic ops interface. Functional CMDQV behaviour is added in subsequent
> patches.
>
> No functional change.
>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  include/hw/arm/smmuv3.h |  2 +
>  hw/arm/smmuv3-accel.c   | 93 +++++++++++++++++++++++++++++++++++++----
>  2 files changed, 88 insertions(+), 7 deletions(-)
>
> diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
> index fe0493c1aa..aa6a79237a 100644
> --- a/include/hw/arm/smmuv3.h
> +++ b/include/hw/arm/smmuv3.h
> @@ -74,6 +74,8 @@ struct SMMUv3State {
>      OnOffAuto ats;
>      OasMode oas;
>      SsidSizeMode ssidsize;
> +    /* SMMU CMDQV extension */
> +    OnOffAuto cmdqv;
>  
>      Notifier machine_done;
>  };
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index f65e654adf..9068e65e2b 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -18,6 +18,7 @@
>  
>  #include "smmuv3-internal.h"
>  #include "smmuv3-accel.h"
> +#include "tegra241-cmdqv.h"
>  
>  /*
>   * The root region aliases the global system memory, and shared_as_sysmem
> @@ -566,6 +567,7 @@ smmuv3_accel_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
>                            Error **errp)
>  {
>      SMMUv3AccelState *accel = s->s_accel;
> +    const SMMUv3AccelCmdqvOps *cmdqv_ops = accel->cmdqv_ops;
>      struct iommu_hwpt_arm_smmuv3 bypass_data = {
>          .ste = { SMMU_STE_CFG_BYPASS | SMMU_STE_VALID, 0x0ULL },
>      };
> @@ -576,10 +578,17 @@ smmuv3_accel_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
>      uint32_t viommu_id, hwpt_id;
>      IOMMUFDViommu *viommu;
>  
> -    if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
> -                                      IOMMU_VIOMMU_TYPE_ARM_SMMUV3, s2_hwpt_id,
> -                                      NULL, 0, &viommu_id, errp)) {
> -        return false;
> +    if (cmdqv_ops) {
> +        if (!cmdqv_ops->alloc_viommu(s, idev, &viommu_id, errp)) {
> +            return false;
> +        }
> +    } else {
> +        if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
> +                                          IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
> +                                          s2_hwpt_id, NULL, 0, &viommu_id,
> +                                          errp)) {
> +            return false;
> +        }
>      }
>  
>      viommu = g_new0(IOMMUFDViommu, 1);
> @@ -625,12 +634,69 @@ free_bypass_hwpt:
>  free_abort_hwpt:
>      iommufd_backend_free_id(idev->iommufd, accel->abort_hwpt_id);
>  free_viommu:
> -    iommufd_backend_free_id(idev->iommufd, viommu->viommu_id);
> +    if (cmdqv_ops && cmdqv_ops->free_viommu) {
> +        cmdqv_ops->free_viommu(s);
Hum, actually we do free the iommu id below in case free_viommu is
not implemented. So forget my previous comment. Besides, this means
free_viommu shall be documented as not mandatory.
> +    } else {
> +        iommufd_backend_free_id(idev->iommufd, viommu->viommu_id);
> +    }
>      g_free(viommu);
>      accel->viommu = NULL;
>      return false;
>  }
>  
> +static const SMMUv3AccelCmdqvOps *
> +smmuv3_accel_probe_cmdqv(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
> +                          Error **errp)
> +{
> +    const SMMUv3AccelCmdqvOps *ops = tegra241_cmdqv_get_ops();
> +
> +    if (!ops || !ops->probe) {
> +        error_setg(errp, "No CMDQV ops found");
> +        return NULL;
> +    }
> +
> +    if (!ops->probe(s, idev, errp)) {
> +        return NULL;
> +    }
> +    return ops;
> +}
> +
> +static bool
> +smmuv3_accel_select_cmdqv(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
> +                          Error **errp)
> +{
> +    const SMMUv3AccelCmdqvOps *ops = NULL;
> +
> +    if (s->s_accel->cmdqv_ops) {
> +        return true;
> +    }
> +
> +    switch (s->cmdqv) {
> +    case ON_OFF_AUTO_OFF:
> +        s->s_accel->cmdqv_ops = NULL;
> +        return true;
> +    case ON_OFF_AUTO_AUTO:
> +        ops = smmuv3_accel_probe_cmdqv(s, idev, NULL);
> +        break;
> +    case ON_OFF_AUTO_ON:
> +        ops = smmuv3_accel_probe_cmdqv(s, idev, errp);
> +        if (!ops) {
> +            error_append_hint(errp, "CMDQV requested but not supported");
> +            return false;
> +        }
> +        s->s_accel->cmdqv_ops = ops;
Shouldn't you remove the above setting as you do it again below in case
init succeeds?
> +        break;
> +    default:
> +        g_assert_not_reached();
> +    }
> +
> +    if (ops && ops->init && !ops->init(s, errp)) {
> +        return false;
> +    }
> +    s->s_accel->cmdqv_ops = ops;
> +    return true;
> +}
> +
>  static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>                                            HostIOMMUDevice *hiod, Error **errp)
>  {
> @@ -665,6 +731,10 @@ static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>          goto done;
>      }
>  
> +    if (!smmuv3_accel_select_cmdqv(s, idev, errp)) {
> +        return false;
> +    }
> +
>      if (!smmuv3_accel_alloc_viommu(s, idev, errp)) {
>          error_append_hint(errp, "Unable to alloc vIOMMU: idev devid 0x%x: ",
>                            idev->devid);
> @@ -936,8 +1006,17 @@ bool smmuv3_accel_attach_gbpa_hwpt(SMMUv3State *s, Error **errp)
>  
>  void smmuv3_accel_reset(SMMUv3State *s)
>  {
> -     /* Attach a HWPT based on GBPA reset value */
> -     smmuv3_accel_attach_gbpa_hwpt(s, NULL);
> +    SMMUv3AccelState *accel = s->s_accel;
> +
> +    if (!accel) {
> +        return;
> +    }
> +    /* Attach a HWPT based on GBPA reset value */
> +    smmuv3_accel_attach_gbpa_hwpt(s, NULL);
> +
> +    if (accel->cmdqv_ops && accel->cmdqv_ops->reset) {
> +        accel->cmdqv_ops->reset(s);
> +    }
>  }
>  
>  static void smmuv3_accel_as_init(SMMUv3State *s)
Besides that,
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Eric
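
For readers following the free_viommu discussion above, the "optional callback with a generic fallback" pattern can be sketched in isolation. This is a minimal standalone illustration, not the QEMU code: every `Demo*`/`demo_*` name is hypothetical, and the boolean flags only stand in for the real free operations.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ops table and the device state. */
typedef struct DemoState DemoState;

typedef struct DemoOps {
    /* Optional: may be NULL, in which case the generic path is used. */
    void (*free_viommu)(DemoState *s);
} DemoOps;

struct DemoState {
    const DemoOps *ops;   /* NULL when no vendor backend is active */
    bool vendor_freed;    /* set by the vendor callback (illustration) */
    bool generic_freed;   /* set by the generic fallback (illustration) */
};

/* Generic path, standing in for freeing the object id directly. */
static void demo_generic_free(DemoState *s)
{
    s->generic_freed = true;
}

/* Prefer the backend callback; fall back when it is absent. */
static void demo_free_viommu(DemoState *s)
{
    if (s->ops && s->ops->free_viommu) {
        s->ops->free_viommu(s);
    } else {
        demo_generic_free(s);
    }
}

/* Example vendor callback, assumed to release everything it allocated. */
static void demo_vendor_free(DemoState *s)
{
    s->vendor_freed = true;
}
```

Because the dispatcher checks both the ops pointer and the callback, a backend that omits the callback still gets its id released through the generic path, which is why the callback can be documented as optional.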



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 13/31] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free
  2026-04-15 10:55 ` [PATCH v4 13/31] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free Shameer Kolothum
@ 2026-05-04 16:01   ` Eric Auger
  2026-05-04 19:54   ` Nicolin Chen
  1 sibling, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-04 16:01 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Replace the stub implementation with real vIOMMU allocation for
> Tegra241 CMDQV.
>
> Allocate a matching vEVENTQ together with the vIOMMU, since it is
> specific to the Tegra241 CMDQV vIOMMU and used to receive CMDQV
> events.
>
> Free both objects on teardown.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  hw/arm/tegra241-cmdqv.h |  1 +
>  hw/arm/tegra241-cmdqv.c | 46 ++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 46 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> index 2a34a4b6b4..fa0aa3ab04 100644
> --- a/hw/arm/tegra241-cmdqv.h
> +++ b/hw/arm/tegra241-cmdqv.h
> @@ -30,6 +30,7 @@ typedef struct Tegra241CMDQV {
>      SMMUv3AccelState *s_accel;
>      MemoryRegion mmio_cmdqv;
>      qemu_irq irq;
> +    IOMMUFDVeventq *veventq;
>  } Tegra241CMDQV;
>  
>  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index ccd3c6d275..2f1084b55f 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -25,13 +25,57 @@ static void tegra241_cmdqv_write(void *opaque, hwaddr offset, uint64_t value,
>  
>  static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
>  {
> +    SMMUv3AccelState *accel = s->s_accel;
> +    IOMMUFDViommu *viommu = accel->viommu;
> +    Tegra241CMDQV *cmdqv = accel->cmdqv;
> +    IOMMUFDVeventq *veventq = cmdqv->veventq;
> +
> +    if (!viommu) {
> +        return;
> +    }
> +    if (veventq) {
> +        close(veventq->veventq_fd);
> +        iommufd_backend_free_id(viommu->iommufd, veventq->veventq_id);
> +        g_free(veventq);
> +        cmdqv->veventq = NULL;
> +    }
> +    iommufd_backend_free_id(viommu->iommufd, viommu->viommu_id);
>  }
>  
>  static bool
>  tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
>                              uint32_t *out_viommu_id, Error **errp)
>  {
> -    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
> +    Tegra241CMDQV *cmdqv = s->s_accel->cmdqv;
> +    uint32_t viommu_id, veventq_id, veventq_fd;
> +    IOMMUFDVeventq *veventq;
> +
> +    if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
> +                                      IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV,
> +                                      idev->hwpt_id, &cmdqv->cmdqv_data,
> +                                      sizeof(cmdqv->cmdqv_data), &viommu_id,
> +                                      errp)) {
> +        return false;
> +    }
> +
> +    if (!iommufd_backend_alloc_veventq(idev->iommufd, viommu_id,
> +                                       IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV,
> +                                       1 << 16, &veventq_id, &veventq_fd,
> +                                       errp)) {
> +        error_append_hint(errp, "Tegra241 CMDQV: failed to alloc veventq");
> +        goto free_viommu;
> +    }
> +
> +    veventq = g_new(IOMMUFDVeventq, 1);
> +    veventq->veventq_id = veventq_id;
> +    veventq->veventq_fd = veventq_fd;
> +    cmdqv->veventq = veventq;
> +
> +    *out_viommu_id = viommu_id;
> +    return true;
> +
> +free_viommu:
> +    iommufd_backend_free_id(idev->iommufd, viommu_id);
>      return false;
>  }
>  




* Re: [PATCH v4 05/31] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq
  2026-04-15 10:55 ` [PATCH v4 05/31] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq Shameer Kolothum
  2026-05-04 15:00   ` Eric Auger
@ 2026-05-04 18:16   ` Nicolin Chen
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-04 18:16 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:26AM +0100, Shameer Kolothum wrote:
> The viommu field is assigned but never used. Callers freeing the
> veventq already have access to the IOMMUFDViommu object through other
> references, so this field is redundant.
> 
> Removing it also simplifies upcoming changes where veventq is
> allocated based on the viommu id before the IOMMUFDViommu object is
> created (e.g. vendor CMDQV-based veventq allocation).
> 
> No functional change.
> 
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
 
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>



* Re: [PATCH v4 07/31] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub
  2026-04-15 10:55 ` [PATCH v4 07/31] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub Shameer Kolothum
  2026-05-04 15:19   ` Eric Auger
@ 2026-05-04 18:23   ` Nicolin Chen
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-04 18:23 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:28AM +0100, Shameer Kolothum wrote:
> Introduce a Tegra241 CMDQV backend that plugs into the SMMUv3 accelerated
> CMDQV ops interface.
> 
> This patch wires up the Tegra241 CMDQV backend and provides a stub
> implementation for CMDQV probe, initialization, vIOMMU allocation
> and reset handling.
> 
> Functional CMDQV support is added in follow-up patches.
> 
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>



* Re: [PATCH v4 06/31] hw/arm/smmuv3-accel: Introduce CMDQV ops interface
  2026-04-15 10:55 ` [PATCH v4 06/31] hw/arm/smmuv3-accel: Introduce CMDQV ops interface Shameer Kolothum
  2026-05-04 15:19   ` Eric Auger
@ 2026-05-04 18:28   ` Nicolin Chen
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-04 18:28 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:27AM +0100, Shameer Kolothum wrote:
> Command Queue Virtualization (CMDQV) is a hardware extension available
> on certain platforms that allows the SMMUv3 command queue to be
> virtualized and passed through to a VM, improving performance.
> 
> For example, NVIDIA Tegra241 implements CMDQV to support virtualization
> of multiple command queues (VCMDQs).
> 
> The term CMDQV is used here generically to refer to any platform that
> provides hardware support to virtualize the SMMUv3 command queue.
> 
> CMDQV support is a specialization of the IOMMUFD-backed accelerated
> SMMUv3 path. Introduce an ops interface to factor out CMDQV-specific
> probe, initialization, and vIOMMU allocation logic from the base
> implementation. The ops pointer and associated state are stored in
> the accelerated SMMUv3 state.
> 
> This provides an extensible design to support future vendor-specific
> CMDQV implementations.
> 
> No functional change.
> 
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>



* Re: [PATCH v4 08/31] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle
  2026-04-15 10:55 ` [PATCH v4 08/31] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle Shameer Kolothum
  2026-05-04 15:33   ` Eric Auger
@ 2026-05-04 18:38   ` Nicolin Chen
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-04 18:38 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:29AM +0100, Shameer Kolothum wrote:
> Add support for selecting and initializing a CMDQV backend based on the
> cmdqv OnOffAuto property.
> 
> If set to OFF, CMDQV is not used and the default IOMMUFD-backed allocation
> path is taken.
> 
> If set to AUTO, QEMU attempts to probe a CMDQV backend during device setup.
> If probing succeeds, the selected ops are stored in the accelerated SMMUv3
> state and used. If probing fails, QEMU silently falls back to the default
> path.
> 
> If set to ON, QEMU requires CMDQV support. Probing is performed during
> setup and failure results in an error.
> 
> When a CMDQV backend is active, its callbacks are used for vIOMMU
> allocation, free, and reset handling. Otherwise, the base implementation
> is used.
> 
> The current implementation wires up the Tegra241 CMDQV backend through the
> generic ops interface. Functional CMDQV behaviour is added in subsequent
> patches.
> 
> No functional change.
> 
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

With Eric's comments addressed,

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
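
The OFF/AUTO/ON behaviour described in the quoted commit message can be sketched as a small standalone function. This is only an illustration under stated assumptions: `DemoOnOffAuto`, `DemoOps`, `demo_probe()` and `demo_select()` are hypothetical names, the probe is reduced to a boolean, and the error is a plain string rather than a QEMU `Error` object. It also assigns the selected ops in a single place, as suggested in Eric's review.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tri-state property, mirroring an on/off/auto option. */
typedef enum { DEMO_AUTO, DEMO_ON, DEMO_OFF } DemoOnOffAuto;

typedef struct DemoOps {
    const char *name;
} DemoOps;

static const DemoOps demo_backend = { "demo-cmdqv" };

/* Hypothetical probe: returns the backend only if the host supports it. */
static const DemoOps *demo_probe(bool host_has_cmdqv)
{
    return host_has_cmdqv ? &demo_backend : NULL;
}

/*
 * Select a backend according to the requested mode.
 * Returns false (and sets *err) only when the backend was explicitly
 * required but is unavailable. On success, *out_ops is the backend or
 * NULL when the default path should be used.
 */
static bool demo_select(DemoOnOffAuto mode, bool host_has_cmdqv,
                        const DemoOps **out_ops, const char **err)
{
    const DemoOps *ops = NULL;

    *err = NULL;
    switch (mode) {
    case DEMO_OFF:
        break;                              /* never probe */
    case DEMO_AUTO:
        ops = demo_probe(host_has_cmdqv);   /* failure is silent */
        break;
    case DEMO_ON:
        ops = demo_probe(host_has_cmdqv);
        if (!ops) {
            *err = "CMDQV requested but not supported";
            return false;
        }
        break;
    }

    *out_ops = ops;                         /* single assignment point */
    return true;
}
```

With this shape, AUTO on an unsupported host returns true with a NULL backend, while ON on the same host returns false with an error string.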



* Re: [PATCH v4 09/31] hw/arm/virt: Use stored SMMUv3 device list for IORT build
  2026-04-15 10:55 ` [PATCH v4 09/31] hw/arm/virt: Use stored SMMUv3 device list for IORT build Shameer Kolothum
@ 2026-05-04 18:46   ` Nicolin Chen
  0 siblings, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-04 18:46 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:30AM +0100, Shameer Kolothum wrote:
> -static int iort_smmuv3_devices(Object *obj, void *opaque)
> -{
> -    VirtMachineState *vms = VIRT_MACHINE(qdev_get_machine());
> -    AcpiIortSMMUv3Dev sdev = {0};
> -    GArray *sdev_blob = opaque;
> -    AcpiIortIdMapping idmap;
> -    PlatformBusDevice *pbus;
> -    int min_bus, max_bus;
> -    SysBusDevice *sbdev;
> -    PCIBus *bus;
> -
> -    if (!object_dynamic_cast(obj, TYPE_ARM_SMMUV3)) {
> -        return 0;
> -    }
> -
> -    bus = PCI_BUS(object_property_get_link(obj, "primary-bus", &error_abort));
> -    sdev.accel = object_property_get_bool(obj, "accel", &error_abort);
> -    sdev.ats = smmuv3_ats_enabled(ARM_SMMUV3(obj));
> -    pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
> -    sbdev = SYS_BUS_DEVICE(obj);
> -    sdev.base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
> -    sdev.base += vms->memmap[VIRT_PLATFORM_BUS].base;
> -    sdev.irq = platform_bus_get_irqn(pbus, sbdev, 0);
> -    sdev.irq += vms->irqmap[VIRT_PLATFORM_BUS];
> -    sdev.irq += ARM_SPI_BASE;
> -
> -    pci_bus_range(bus, &min_bus, &max_bus);
> -    sdev.rc_smmu_idmaps = g_array_new(false, true, sizeof(AcpiIortIdMapping));
> -    idmap.input_base = min_bus << 8,
> -    idmap.id_count = (max_bus - min_bus + 1) << 8,
> -    g_array_append_val(sdev.rc_smmu_idmaps, idmap);
> -    g_array_append_val(sdev_blob, sdev);
> -    return 0;
> -}

This could just stay as:

static int __populate_smmuv3_dev(VirtMachineState *vms, GArray *sdev_blob)

?

Either way,

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>



* Re: [PATCH v4 10/31] hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support
  2026-04-15 10:55 ` [PATCH v4 10/31] hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support Shameer Kolothum
@ 2026-05-04 18:49   ` Nicolin Chen
  0 siblings, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-04 18:49 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:31AM +0100, Shameer Kolothum wrote:
> Use IOMMU_GET_HW_INFO to query host support for Tegra241 CMDQV.
> 
> Validate the returned data type, version, and minimum number of vCMDQs and
> SIDs per Tegra241 CMDQ Virtual Interface (VI). Fail the probe if the host
> does not meet these requirements.
> 
> The QEMU model supports one Virtual Interface (VI) per VM with 2 vCMDQs and
> 16 SIDs per VI, so the probe ensures the host implementation is compatible
> with these limits.
> 
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
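
As a rough illustration of the compatibility check described in the quoted commit message, the sketch below validates a host capability record against the model limits (2 vCMDQs and 16 SIDs per VI). It is not the patch code: the `DemoHostCmdqvInfo` layout, the `DEMO_TYPE_TEGRA241_CMDQV` value and the exact-match version check are assumptions. Only the three `CMDQV_*` constants are taken from the header quoted elsewhere in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model limits, as defined in the tegra241-cmdqv.h hunk in this thread. */
#define CMDQV_VER                 1
#define CMDQV_NUM_CMDQ_LOG2       1   /* 2 vCMDQs per VI */
#define CMDQV_NUM_SID_PER_VI_LOG2 4   /* 16 SIDs per VI  */

/* Hypothetical type tag and host info layout, for illustration only. */
#define DEMO_TYPE_TEGRA241_CMDQV  1

typedef struct DemoHostCmdqvInfo {
    uint32_t type;
    uint32_t version;
    uint32_t log2_vcmdqs;   /* log2 of vCMDQs available per VI */
    uint32_t log2_vsids;    /* log2 of SIDs available per VI   */
} DemoHostCmdqvInfo;

/* Return true only if the host can back the emulated model. */
static bool demo_probe_ok(const DemoHostCmdqvInfo *info)
{
    if (info->type != DEMO_TYPE_TEGRA241_CMDQV) {
        return false;
    }
    if (info->version != CMDQV_VER) {       /* assumed exact match */
        return false;
    }
    return info->log2_vcmdqs >= CMDQV_NUM_CMDQ_LOG2 &&
           info->log2_vsids >= CMDQV_NUM_SID_PER_VI_LOG2;
}
```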



* Re: [PATCH v4 12/31] hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus
  2026-04-15 10:55 ` [PATCH v4 12/31] hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus Shameer Kolothum
@ 2026-05-04 18:57   ` Nicolin Chen
  0 siblings, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-04 18:57 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:33AM +0100, Shameer Kolothum wrote:
> SMMUv3 devices with acceleration may enable CMDQV extensions
> after device realize. In that case, additional MMIO regions and
> IRQ lines may be registered but not yet mapped to the platform bus.
> 
> Ensure SMMUv3 device resources are linked to the platform bus
> during machine_done().
> 
> This is safe to do unconditionally since the platform bus helpers
> skip resources that are already mapped.
> 
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>



* Re: [PATCH v4 13/31] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free
  2026-04-15 10:55 ` [PATCH v4 13/31] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free Shameer Kolothum
  2026-05-04 16:01   ` Eric Auger
@ 2026-05-04 19:54   ` Nicolin Chen
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-04 19:54 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:34AM +0100, Shameer Kolothum wrote:
> +    if (!iommufd_backend_alloc_veventq(idev->iommufd, viommu_id,
> +                                       IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV,
> +                                       1 << 16, &veventq_id, &veventq_fd,

I forgot why we use 16 here. But perhaps we could do SMMU_EVENTQS?

Nicolin



* Re: [PATCH v4 14/31] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region
  2026-04-15 10:55 ` [PATCH v4 14/31] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region Shameer Kolothum
@ 2026-05-05  0:09   ` Nicolin Chen
  2026-05-05  7:26   ` Eric Auger
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05  0:09 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:35AM +0100, Shameer Kolothum wrote:

LGTM. Some nits:

> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> index fa0aa3ab04..965670066d 100644
> --- a/hw/arm/tegra241-cmdqv.h
> +++ b/hw/arm/tegra241-cmdqv.h
> @@ -10,10 +10,14 @@
>  #ifndef HW_ARM_TEGRA241_CMDQV_H
>  #define HW_ARM_TEGRA241_CMDQV_H
>  
> +#include "hw/core/registerfields.h"
> +
>  #define CMDQV_VER                 1
>  #define CMDQV_NUM_CMDQ_LOG2       1
>  #define CMDQV_NUM_SID_PER_VI_LOG2 4
>  
> +#define TEGRA241_CMDQV_MAX_CMDQ   (1U << CMDQV_NUM_CMDQ_LOG2)

Maybe add:
#define TEGRA241_CMDQV_MAX_NUM_SID   (1U << CMDQV_NUM_SID_PER_VI_LOG2)

>  /*
>   * Tegra241 CMDQV MMIO layout (64KB pages)
>   *
> @@ -31,8 +35,131 @@ typedef struct Tegra241CMDQV {
>      MemoryRegion mmio_cmdqv;
>      qemu_irq irq;
>      IOMMUFDVeventq *veventq;
> +
> +    /* Register Cache */
> +    uint32_t config;
> +    uint32_t param;
> +    uint32_t status;
> +    uint32_t vi_err_map[2];
> +    uint32_t vi_int_mask[2];
> +    uint32_t cmdq_err_map[4];
> +    uint32_t cmdq_alloc_map[TEGRA241_CMDQV_MAX_CMDQ];
> +    uint32_t vintf_config;
> +    uint32_t vintf_status;
[...]
> +    uint32_t vintf_sid_match[16];
> +    uint32_t vintf_sid_replace[16];

Then s/16/TEGRA241_CMDQV_MAX_NUM_SID

> +#define SMMU_CMDQV_CMDQ_ALLOC_MAP_(i)        \
> +    REG32(CMDQ_ALLOC_MAP_##i, 0x200 + i * 4) \
> +    FIELD(CMDQ_ALLOC_MAP_##i, ALLOC, 0, 1)   \
> +    FIELD(CMDQ_ALLOC_MAP_##i, LVCMDQ, 1, 7)  \
> +    FIELD(CMDQ_ALLOC_MAP_##i, VIRT_INTF_INDX, 15, 6)
> +
> +SMMU_CMDQV_CMDQ_ALLOC_MAP_(0)
> +SMMU_CMDQV_CMDQ_ALLOC_MAP_(1)
> +
> +

Can drop the extra line.

>  static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
> @@ -84,8 +235,8 @@ static void tegra241_cmdqv_reset(SMMUv3State *s)
>  }
>  
>  static const MemoryRegionOps mmio_cmdqv_ops = {
> -    .read = tegra241_cmdqv_read,
> -    .write = tegra241_cmdqv_write,
> +    .read = tegra241_cmdqv_read_mmio,
> +    .write = tegra241_cmdqv_write_mmio,
>      .endianness = DEVICE_LITTLE_ENDIAN,

This could squash to PATCH-11.

Nicolin
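
The suggestion above to replace the literal 16 with a derived macro could look roughly like this. `DemoRegCache` is a hypothetical cut-down struct used only to show the array sizing. The two macro definitions are the ones quoted and proposed in this message.

```c
#include <stdint.h>

#define CMDQV_NUM_SID_PER_VI_LOG2  4
#define TEGRA241_CMDQV_MAX_NUM_SID (1U << CMDQV_NUM_SID_PER_VI_LOG2)

/* Hypothetical subset of the register cache, sized from the macro. */
typedef struct DemoRegCache {
    uint32_t vintf_sid_match[TEGRA241_CMDQV_MAX_NUM_SID];
    uint32_t vintf_sid_replace[TEGRA241_CMDQV_MAX_NUM_SID];
} DemoRegCache;
```

Deriving the array length from the log2 constant keeps the cache size and the advertised SID count from drifting apart if the constant changes.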



* Re: [PATCH v4 18/31] system/physmem: Add address_space_is_ram() helper
  2026-04-15 10:55 ` [PATCH v4 18/31] system/physmem: Add address_space_is_ram() helper Shameer Kolothum
@ 2026-05-05  0:24   ` Nicolin Chen
  0 siblings, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05  0:24 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:39AM +0100, Shameer Kolothum wrote:
> Introduce address_space_is_ram(), a helper to determine whether
> a guest physical address resolves to a RAM-backed MemoryRegion within
> an AddressSpace.
> 
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>



* Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-04-15 10:55 ` [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming Shameer Kolothum
@ 2026-05-05  0:40   ` Nicolin Chen
  2026-05-05  9:59     ` Shameer Kolothum Thodi
  2026-05-05 13:25   ` Eric Auger
  2026-05-06 16:51   ` Eric Auger
  2 siblings, 1 reply; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05  0:40 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:40AM +0100, Shameer Kolothum wrote:
> +static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
> +                                       Error **errp)
> +{
> +    SMMUv3AccelState *accel = cmdqv->s_accel;
> +    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
> +                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
> +    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
> +    uint64_t log2 = cmdqv->vcmdq_base[index] & R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
> +    uint64_t size = 1ULL << (log2 + 4);
> +    IOMMUFDViommu *viommu = accel->viommu;
> +    IOMMUFDHWqueue *hw_queue;
> +    uint32_t hw_queue_id;
> +
> +    /* Ignore any invalid address. This may come as part of reset etc. */
> +    if (!address_space_is_ram(&address_space_memory, addr) ||
> +        !address_space_is_ram(&address_space_memory, addr + size - 1)) {

Overflow prevention:
size - 1 + addr

> +    if (!tegra241_cmdq_enabled(cmdqv) || !tegra241_vintf_enabled(cmdqv)) {
> +        return true;
> +    }

It's good to have these two checks. But if vcmdq setup is skipped
while vintf is disabled, don't we need to call this setup() again once
vintf gets enabled?

Also, do we fence against unassigned vcmdq? Corner case is that a
guest might write base address registers via direct (global) MMIO
space.

> @@ -363,7 +427,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>           */
>          index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
>          tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
> -                                   index, value, size);
> +                                   index, value, size, &local_err);
>          break;
>      case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
>          /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
> @@ -373,7 +437,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>          /* Same decode logic as VCMDQ Page0 case above */
>          index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
>          tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
> -                                   index, value, size);
> +                                   index, value, size, &local_err);

Should these two be squashed into an earlier patch?

Nicolin
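
To make the base-register decode and the overflow remark above concrete, here is a minimal standalone sketch. The bit layout (LOG2SIZE in bits [4:0], address in bits [47:5]) and all `demo_*`/`DEMO_*` names are assumptions for illustration. The size formula `1 << (log2 + 4)` matches the quoted hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define DEMO_BASE_LOG2SIZE_MASK 0x1fULL               /* bits [4:0]  */
#define DEMO_BASE_ADDR_MASK     0x0000ffffffffffe0ULL /* bits [47:5] */

/*
 * Compute the last byte of [addr, addr + size) without wrapping.
 * Subtracting 1 from size first and comparing against the remaining
 * headroom avoids ever forming an out-of-range sum.
 */
static bool demo_range_last(uint64_t addr, uint64_t size, uint64_t *last)
{
    if (size == 0 || size - 1 > UINT64_MAX - addr) {
        return false;
    }
    *last = addr + (size - 1);
    return true;
}

/* Decode a queue base value into its first and last byte addresses. */
static bool demo_vcmdq_range(uint64_t base, uint64_t *first, uint64_t *last)
{
    uint64_t addr = base & DEMO_BASE_ADDR_MASK;
    uint64_t log2 = base & DEMO_BASE_LOG2SIZE_MASK;
    uint64_t size = 1ULL << (log2 + 4); /* 16-byte entries; shift <= 35 */

    *first = addr;
    return demo_range_last(addr, size, last);
}
```

Both ends of the range would then be checked against RAM before the queue is handed to the host, as in the quoted hunk.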



* Re: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing
  2026-04-15 10:55 ` [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing Shameer Kolothum
@ 2026-05-05  0:50   ` Nicolin Chen
  2026-05-05 15:13     ` Shameer Kolothum Thodi
  2026-05-06 12:27   ` Eric Auger
  1 sibling, 1 reply; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05  0:50 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:41AM +0100, Shameer Kolothum wrote:
> The pre-to-post-alloc transition is triggered by the BASE register write
> that initiates IOMMU_HW_QUEUE_ALLOC. No cache-to-hardware synchronisation
> is needed at transition time. The hardware-mandated init sequence requires
> BASE to be written first; PROD_INDX, CONS_INDX and CONFIG.CMDQ_EN are
> programmed only after BASE and are therefore always post-alloc.
> 
> Any pre-alloc writes to those registers update only the register cache,
> which is discarded at the transition.

Is "discard" the correct action?

Wouldn't the guest OS expect the HW (as seen by the VM) to retain what
it writes to those page0 registers?

Nicolin



* Re: [PATCH v4 23/31] hw/arm/smmuv3-accel: Introduce common helper for veventq read
  2026-04-15 10:55 ` [PATCH v4 23/31] hw/arm/smmuv3-accel: Introduce common helper for veventq read Shameer Kolothum
@ 2026-05-05  1:07   ` Nicolin Chen
  2026-05-06 12:49   ` Eric Auger
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05  1:07 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:44AM +0100, Shameer Kolothum wrote:
> Move the vEVENTQ read and validation logic into a common helper
> smmuv3_accel_event_read_validate(). The helper performs the read(),
> checks for overflow and short reads, validates the sequence number,
> and updates the sequence state.
> 
> This helper can be reused for Tegra241 CMDQV vEVENTQ support in a
> subsequent patch.
> 
> Error handling is slightly adjusted: instead of reporting errors
> directly in the read handler, the helper now returns errors via
> Error **. Sequence gaps are reported as warnings.
> 
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
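
The read-and-validate flow described in the quoted commit message (read, overflow check, short-read check, sequence check, state update) can be sketched as follows. This is not the helper from the patch: the event header layout, the `DEMO_FLAG_LOST_EVENTS` flag and all `Demo*` names are hypothetical, and results are returned as an enum instead of through `Error **`.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical event layout: a small header followed by a payload. */
typedef struct DemoEventHdr {
    uint32_t flags;
    uint32_t sequence;
} DemoEventHdr;

#define DEMO_FLAG_LOST_EVENTS (1u << 0)   /* assumed overflow marker */

typedef struct DemoSeqState {
    bool valid;          /* false until the first event is seen */
    uint32_t last_seq;
} DemoSeqState;

typedef enum {
    DEMO_EV_OK,
    DEMO_EV_OK_GAP,      /* event delivered, but earlier ones were missed */
    DEMO_EV_ERR_READ,
    DEMO_EV_ERR_SHORT,
    DEMO_EV_ERR_OVERFLOW,
} DemoEvResult;

/* Read one event of exactly len bytes from fd and validate it. */
static DemoEvResult demo_event_read(int fd, DemoSeqState *st,
                                    void *buf, size_t len)
{
    DemoEventHdr hdr;
    ssize_t n = read(fd, buf, len);
    bool gap;

    if (n < 0) {
        return DEMO_EV_ERR_READ;
    }
    if ((size_t)n < sizeof(hdr)) {
        return DEMO_EV_ERR_SHORT;
    }
    memcpy(&hdr, buf, sizeof(hdr));
    if (hdr.flags & DEMO_FLAG_LOST_EVENTS) {
        return DEMO_EV_ERR_OVERFLOW;
    }
    if ((size_t)n < len) {
        return DEMO_EV_ERR_SHORT;
    }

    /* A gap is reported as a warning-level result, not as an error. */
    gap = st->valid && hdr.sequence != st->last_seq + 1;
    st->valid = true;
    st->last_seq = hdr.sequence;
    return gap ? DEMO_EV_OK_GAP : DEMO_EV_OK;
}
```

A caller would treat the `DEMO_EV_ERR_*` results as failures and log `DEMO_EV_OK_GAP` while still processing the event, in line with the "sequence gaps are reported as warnings" note above.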



* Re: [PATCH v4 24/31] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors
  2026-04-15 10:55 ` [PATCH v4 24/31] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors Shameer Kolothum
@ 2026-05-05  1:13   ` Nicolin Chen
  2026-05-07 16:40   ` Eric Auger
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05  1:13 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:45AM +0100, Shameer Kolothum wrote:
> Install an event handler on the CMDQV vEVENTQ fd to read and propagate
> host-received CMDQV errors to the guest.
> 
> The handler runs in QEMU’s main loop, using a non-blocking fd registered
> via qemu_set_fd_handler().
> 
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>



* Re: [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size
  2026-04-15 10:55 ` [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size Shameer Kolothum
@ 2026-05-05  1:26   ` Nicolin Chen
  2026-05-07 17:23   ` Eric Auger
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05  1:26 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:47AM +0100, Shameer Kolothum wrote:
> +    /*
> +     * CMDQ must not cross a physical RAM backend page. Adjust CMDQS so the
> +     * queue fits entirely within the smallest backend page size, ensuring
> +     * the command queue is physically contiguous in host memory.
> +     */
> +    pgsize = tegra241_cmdqv_min_ram_pagesize();
> +    val = FIELD_EX32(s->idr[1], IDR1, CMDQS);
> +    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, CMDQS, MIN(ctz64(pgsize) - 4, val));

Might be helpful to add in the note:
   idr1.cmdqs = log2(max_qsz) - entry shift

Otherwise, lgtm.

Nicolin
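
A worked version of the relation noted above, `idr1.cmdqs = log2(max_qsz) - entry shift`, may help. The sketch assumes 16-byte command entries (shift of 4, matching the `- 4` in the quoted hunk) and uses hypothetical helper names. `demo_ctz64()` is a portable stand-in for a count-trailing-zeros helper.

```c
#include <stdint.h>

#define DEMO_CMDQ_ENTRY_SHIFT 4   /* 16-byte command queue entries */

/* Count trailing zero bits; returns 64 for an input of 0. */
static unsigned demo_ctz64(uint64_t v)
{
    unsigned n = 0;

    if (v == 0) {
        return 64;
    }
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

/*
 * Clamp the advertised log2 queue size (in entries) so that a full
 * queue never exceeds one backend page of pgsize bytes.
 * pgsize is assumed to be a power of two.
 */
static uint32_t demo_clamp_cmdqs(uint32_t cmdqs, uint64_t pgsize)
{
    unsigned page_shift = demo_ctz64(pgsize);
    uint32_t max_cmdqs;

    if (page_shift < DEMO_CMDQ_ENTRY_SHIFT) {
        return 0;
    }
    max_cmdqs = page_shift - DEMO_CMDQ_ENTRY_SHIFT;
    return cmdqs < max_cmdqs ? cmdqs : max_cmdqs;
}
```

For a 4 KiB backend page this gives 12 - 4 = 8, i.e. 256 entries of 16 bytes, which is exactly one page. For a 2 MiB page the limit becomes 21 - 4 = 17.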



* Re: [PATCH v4 27/31] hw/arm/smmuv3: Add per-device identifier property
  2026-04-15 10:55 ` [PATCH v4 27/31] hw/arm/smmuv3: Add per-device identifier property Shameer Kolothum
@ 2026-05-05  1:30   ` Nicolin Chen
  0 siblings, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05  1:30 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:48AM +0100, Shameer Kolothum wrote:
> Add an "identifier" property to the SMMUv3 device and use it when
> building the ACPI IORT SMMUv3 node Identifier field.
> 
> This avoids relying on device enumeration order and provides a stable
> per-device identifier. A subsequent patch will use the same identifier
> when generating the DSDT description for Tegra241 CMDQV, ensuring that
> the IORT and DSDT entries refer to the same SMMUv3 instance.
> 
> The identifier is assigned at pre-plug time, accounting for the ITS Group
> node that build_iort() places before SMMUv3 nodes in the IORT table, so
> that identifiers are globally unique across all IORT nodes.
> 
> No functional change: IORT blob content for bios-tables qtest is identical
> to before.
> 
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>



* Re: [PATCH v4 28/31] hw/arm/smmuv3-accel: Introduce helper to query CMDQV type
  2026-04-15 10:55 ` [PATCH v4 28/31] hw/arm/smmuv3-accel: Introduce helper to query CMDQV type Shameer Kolothum
@ 2026-05-05  1:32   ` Nicolin Chen
  0 siblings, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05  1:32 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:49AM +0100, Shameer Kolothum wrote:
> Introduce a SMMUv3AccelCmdqvType enum and a helper to query the
> CMDQV implementation type associated with an accelerated SMMUv3
> instance.
> 
> A subsequent patch will use this helper when generating the
> Tegra241 CMDQV DSDT.
> 
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>



* Re: [PATCH v4 30/31] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active
  2026-04-15 10:55 ` [PATCH v4 30/31] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active Shameer Kolothum
@ 2026-05-05  1:35   ` Nicolin Chen
  2026-05-07 17:36   ` Eric Auger
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05  1:35 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: qemu-arm, qemu-devel, eric.auger, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, Apr 15, 2026 at 11:55:51AM +0100, Shameer Kolothum wrote:
> When CMDQV is active, the first cold-plugged VFIO device establishes the
> viommu to host SMMUv3 association. Block its hot-unplug to preserve this
> association and the guest's boot time CMDQV configuration.
> 
> Also abort at machine_done if cmdqv=on is requested but no cold-plugged
> VFIO device was present to initialize it.
> 
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>



* Re: [PATCH v4 14/31] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region
  2026-04-15 10:55 ` [PATCH v4 14/31] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region Shameer Kolothum
  2026-05-05  0:09   ` Nicolin Chen
@ 2026-05-05  7:26   ` Eric Auger
  2026-05-05 10:28     ` Shameer Kolothum Thodi
  1 sibling, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-05  7:26 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina

Hi Shameer,

On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Tegra241 CMDQV exposes control and status registers in the CMDQ-V
> Config page (offset [0x0, 0x10000)) used to configure virtual command
> queue allocation and interrupt behavior.
>
> Add read/write emulation for the CMDQ-V Config region
> ([CMDQV_BASE, CMDQV_CMDQ_BASE]), backed by a simple register cache.
> This includes CONFIG, PARAM, STATUS, VI error and interrupt maps, CMDQ
> allocation map and the VINTF0 related registers defined in the CMDQ-V
> Config space. Only VINTF0 is supported; VINTF1-63 are not.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.h | 127 +++++++++++++++++++++++++++++++
>  hw/arm/tegra241-cmdqv.c | 163 ++++++++++++++++++++++++++++++++++++++--
>  hw/arm/trace-events     |   4 +
>  3 files changed, 288 insertions(+), 6 deletions(-)
>
> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> index fa0aa3ab04..965670066d 100644
> --- a/hw/arm/tegra241-cmdqv.h
> +++ b/hw/arm/tegra241-cmdqv.h
> @@ -10,10 +10,14 @@
>  #ifndef HW_ARM_TEGRA241_CMDQV_H
>  #define HW_ARM_TEGRA241_CMDQV_H
>  
> +#include "hw/core/registerfields.h"
> +
>  #define CMDQV_VER                 1
>  #define CMDQV_NUM_CMDQ_LOG2       1
>  #define CMDQV_NUM_SID_PER_VI_LOG2 4
>  
> +#define TEGRA241_CMDQV_MAX_CMDQ   (1U << CMDQV_NUM_CMDQ_LOG2)
> +
>  /*
>   * Tegra241 CMDQV MMIO layout (64KB pages)
>   *
> @@ -31,8 +35,131 @@ typedef struct Tegra241CMDQV {
>      MemoryRegion mmio_cmdqv;
>      qemu_irq irq;
>      IOMMUFDVeventq *veventq;
> +
> +    /* Register Cache */
> +    uint32_t config;
> +    uint32_t param;
> +    uint32_t status;
> +    uint32_t vi_err_map[2];
> +    uint32_t vi_int_mask[2];
> +    uint32_t cmdq_err_map[4];
> +    uint32_t cmdq_alloc_map[TEGRA241_CMDQV_MAX_CMDQ];
> +    uint32_t vintf_config;
> +    uint32_t vintf_status;
> +    uint32_t vintf_sid_match[16];
> +    uint32_t vintf_sid_replace[16];
> +    uint32_t vintf_cmdq_err_map[4];
>  } Tegra241CMDQV;
>  
> +/* CMDQ-V Config page registers (offset 0x00000) */
> +REG32(CONFIG, 0x0)
> +FIELD(CONFIG, CMDQV_EN, 0, 1)
> +FIELD(CONFIG, CMDQV_PER_CMD_OFFSET, 1, 3)
> +FIELD(CONFIG, CMDQ_MAX_CLK_BATCH, 4, 8)
> +FIELD(CONFIG, CMDQ_MAX_CMD_BATCH, 12, 8)
> +FIELD(CONFIG, CONS_DRAM_EN, 20, 1)
> +
> +REG32(PARAM, 0x4)
> +FIELD(PARAM, CMDQV_VER, 0, 4)
> +FIELD(PARAM, CMDQV_NUM_CMDQ_LOG2, 4, 4)
> +FIELD(PARAM, CMDQV_NUM_VI_LOG2, 8, 4)
> +FIELD(PARAM, CMDQV_NUM_SID_PER_VI_LOG2, 12, 4)
> +
> +REG32(STATUS, 0x8)
> +FIELD(STATUS, CMDQV_ENABLED, 0, 1)
> +
> +/* SMMU_CMDQV_VI_ERR_MAP_0/1 definitions */
> +#define A_VI_ERR_MAP_0 0x14
> +#define A_VI_ERR_MAP_1 0x18
> +#define V_VI_ERR_MAP_NO_ERROR (0)
> +#define V_VI_ERR_MAP_ERROR (1)
> +
> +/* SMMU_CMDQV_VI_INT_MASK_0/1 definitions */
> +#define A_VI_INT_MASK 0x1c
Please use the spec terminology here: A_VI_INT_MASK_0.


> +#define A_VI_INT_MASK_1 0x20
> +#define V_VI_INT_MASK_NOT_MASKED (0)
> +#define V_VI_INT_MASK_MASKED (1)
> +
> +/* SMMU_CMDQV_CMDQ_ERR_MAP_0-3 definitions */
> +#define A_CMDQ_ERR_MAP_0 0x24
> +#define A_CMDQ_ERR_MAP_1 0x28
> +#define A_CMDQ_ERR_MAP_2 0x2c
> +#define A_CMDQ_ERR_MAP_3 0x30
> +
> +/*
> + * CMDQ_ALLOC_MAP: one entry per physical VCMDQ. Hardware supports up to 128
> + * entries (CMDQV_NUM_CMDQ_LOG2=7), but QEMU only exposes
> + * TEGRA241_CMDQV_MAX_CMDQ (=2) VCMDQs per VM so only entries 0 and 1 are
> + * defined here.
> + */
> +/* 2 identical register entries */
> +#define SMMU_CMDQV_CMDQ_ALLOC_MAP_(i)        \
> +    REG32(CMDQ_ALLOC_MAP_##i, 0x200 + i * 4) \
> +    FIELD(CMDQ_ALLOC_MAP_##i, ALLOC, 0, 1)   \
> +    FIELD(CMDQ_ALLOC_MAP_##i, LVCMDQ, 1, 7)  \
> +    FIELD(CMDQ_ALLOC_MAP_##i, VIRT_INTF_INDX, 15, 6)
> +
> +SMMU_CMDQV_CMDQ_ALLOC_MAP_(0)
> +SMMU_CMDQV_CMDQ_ALLOC_MAP_(1)
> +
> +
> +/* Only VINTF0 is exposed to the guest; vintf = 0 */
Is this supposed to evolve? Otherwise we can keep things simple and just
define a single SMMU_CMDQV_VINTF0_CONFIG_0, and the same for the rest.


> +#define SMMU_CMDQV_VINTFi_CONFIG_(vi)                 \
> +    REG32(VINTF##vi##_CONFIG, 0x1000 + vi * 0x100) \
> +    FIELD(VINTF##vi##_CONFIG, ENABLE, 0, 1)       \
> +    FIELD(VINTF##vi##_CONFIG, VMID, 1, 16)        \
> +    FIELD(VINTF##vi##_CONFIG, HYP_OWN, 17, 1)
> +
> +SMMU_CMDQV_VINTFi_CONFIG_(0)
> +
> +#define SMMU_CMDQV_VINTFi_STATUS_(vi)                 \
> +    REG32(VINTF##vi##_STATUS, 0x1004 + vi * 0x100) \
> +    FIELD(VINTF##vi##_STATUS, ENABLE_OK, 0, 1)    \
> +    FIELD(VINTF##vi##_STATUS, STATUS, 1, 3)       \
> +    FIELD(VINTF##vi##_STATUS, VI_NUM_LVCMDQ, 16, 8)
> +
> +SMMU_CMDQV_VINTFi_STATUS_(0)
> +
> +#define V_VINTF_STATUS_NO_ERROR (0 << 1)
> +#define V_VINTF_STATUS_VCMDQ_ERROR (1 << 1)
> +
> +/*
> + * SID_MATCH/SID_REPLACE: 16 entries per VINTF (CMDQV_NUM_SID_PER_VI_LOG2=4).
> + * vintf = 0, 16 identical register entries
> + */
> +#define SMMU_CMDQV_VINTFi_SID_MATCH_(vi, j)                       \
This could also be simplified.
> +    REG32(VINTF##vi##_SID_MATCH_##j, 0x1040 + j * 4 + vi * 0x100) \
> +    FIELD(VINTF##vi##_SID_MATCH_##j, ENABLE, 0, 1)               \
> +    FIELD(VINTF##vi##_SID_MATCH_##j, VIRT_SID, 1, 20)
> +
> +SMMU_CMDQV_VINTFi_SID_MATCH_(0, 0)
> +/* Omitting [0][1~14] as not being directly called */
> +SMMU_CMDQV_VINTFi_SID_MATCH_(0, 15)
> +
> +/* vintf = 0, 16 identical register entries */
> +#define SMMU_CMDQV_VINTFi_SID_REPLACE_(vi, j)                          \
> +    REG32(VINTF##vi##_SID_REPLACE_##j, 0x1080 + j * 4 + vi * 0x100) \
> +    FIELD(VINTF##vi##_SID_REPLACE_##j, PHYS_SID, 0, 20)
> +
> +SMMU_CMDQV_VINTFi_SID_REPLACE_(0, 0)
> +/* Omitting [0][1~14] as not being directly called */
Why are these omitted?
> +SMMU_CMDQV_VINTFi_SID_REPLACE_(0, 15)
> +
> +/*
> + * LVCMDQ_ERR_MAP: hardware defines 4 registers per VINTF (offset
> + * 0x10c0..0x10cc), each covering 32 logical VCMDQs. All 4 are accessible
> + * by the guest. With TEGRA241_CMDQV_MAX_CMDQ=2 only MAP_0 bits [1:0]
> + * carry meaningful error state; MAP_1..MAP_3 always read as 0.
> + * vintf = 0, 4 identical register entries
> + */
> +#define SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(vi, j)                       \
Again, I wouldn't add this extra complexity; get rid of vi everywhere for now.
> +    REG32(VINTF##vi##_LVCMDQ_ERR_MAP_##j, 0x10c0 + j * 4 + vi * 0x100) \
> +    FIELD(VINTF##vi##_LVCMDQ_ERR_MAP_##j, LVCMDQ_ERR_MAP, 0, 32)
> +
> +SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 0)
> +/* MAP_1 and MAP_2 omitted; not referenced directly */
> +SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 3)
> +
>  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
>  
>  #endif /* HW_ARM_TEGRA241_CMDQV_H */
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index 2f1084b55f..3b08ed0ff3 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -8,19 +8,170 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/log.h"
>  
>  #include "hw/arm/smmuv3.h"
>  #include "smmuv3-accel.h"
>  #include "tegra241-cmdqv.h"
> +#include "trace.h"
>  
> -static uint64_t tegra241_cmdqv_read(void *opaque, hwaddr offset, unsigned size)
> +static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
> +                                                 hwaddr offset)
>  {
> -    return 0;
> +    int i;
> +
> +    switch (offset) {
> +    case A_VINTF0_CONFIG:
> +        return cmdqv->vintf_config;
> +    case A_VINTF0_STATUS:
> +        return cmdqv->vintf_status;
> +    case A_VINTF0_SID_MATCH_0 ... A_VINTF0_SID_MATCH_15:
> +        i = (offset - A_VINTF0_SID_MATCH_0) / 4;
> +        return cmdqv->vintf_sid_match[i];
> +    case A_VINTF0_SID_REPLACE_0 ... A_VINTF0_SID_REPLACE_15:
> +        i = (offset - A_VINTF0_SID_REPLACE_0) / 4;
> +        return cmdqv->vintf_sid_replace[i];
> +    case A_VINTF0_LVCMDQ_ERR_MAP_0 ... A_VINTF0_LVCMDQ_ERR_MAP_3:
> +        i = (offset - A_VINTF0_LVCMDQ_ERR_MAP_0) / 4;
> +        return cmdqv->vintf_cmdq_err_map[i];
> +    default:
> +        /*
> +         * GLB_FILT_CFG_0 (offset 0xC) and GLB_FILT_DATA_0 (offset 0x10) are
> +         * filter config and filter data registers. They are not required for
> +         * normal VINTF operation and are not emulated.
> +         */
> +        qemu_log_mask(LOG_UNIMP, "%s unhandled read access at 0x%" PRIx64 "\n",
> +                      __func__, offset);
> +        return 0;
> +    }
> +}
> +
> +static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
> +                                              hwaddr offset, uint64_t value)
> +{
> +    int i;
> +
> +    switch (offset) {
> +    case A_VINTF0_CONFIG:
> +        /*
> +         * Mask out HYP_OWN on guest writes. This bit selects Hypervisor (1) vs
> +         * Guest (0) ownership of the CMDQ. Force it to 0 so the VINTF always
> +         * remains guest-owned.
> +         */
> +        value &= ~R_VINTF0_CONFIG_HYP_OWN_MASK;
> +
> +        cmdqv->vintf_config = value;
> +        if (value & R_VINTF0_CONFIG_ENABLE_MASK) {
> +            cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
> +        } else {
> +            cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
> +        }
> +        break;
> +    case A_VINTF0_SID_MATCH_0 ... A_VINTF0_SID_MATCH_15:
> +        i = (offset - A_VINTF0_SID_MATCH_0) / 4;
> +        cmdqv->vintf_sid_match[i] = value;
> +        break;
> +    case A_VINTF0_SID_REPLACE_0 ... A_VINTF0_SID_REPLACE_15:
> +        i = (offset - A_VINTF0_SID_REPLACE_0) / 4;
> +        cmdqv->vintf_sid_replace[i] = value;
> +        break;
> +    default:
> +        /*
> +         * GLB_FILT_CFG_0 (offset 0xC) and GLB_FILT_DATA_0 (offset 0x10) are
> +         * filter config and filter data registers. They are not required for
> +         * normal VINTF operation and are not emulated.
> +         */
> +        qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
> +                      __func__, offset);
> +        return;
> +    }
> +}
> +
> +static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
> +                                         unsigned size)
> +{
> +    Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
> +    uint64_t val = 0;
> +
> +    if (offset >= TEGRA241_CMDQV_IO_LEN) {
> +        qemu_log_mask(LOG_UNIMP,
> +                      "%s offset 0x%" PRIx64 " off limit (0x%x)\n", __func__,
> +                      offset, TEGRA241_CMDQV_IO_LEN);
> +        goto out;
> +    }
> +
> +    switch (offset) {
> +    case A_CONFIG:
> +        val = cmdqv->config;
> +        break;
> +    case A_PARAM:
> +        val = cmdqv->param;
> +        break;
> +    case A_STATUS:
> +        val = cmdqv->status;
> +        break;
> +    case A_VI_ERR_MAP_0 ... A_VI_ERR_MAP_1:
> +        val = cmdqv->vi_err_map[(offset - A_VI_ERR_MAP_0) / 4];
> +        break;
> +    case A_VI_INT_MASK ... A_VI_INT_MASK_1:
> +        val = cmdqv->vi_int_mask[(offset - A_VI_INT_MASK) / 4];
> +        break;
> +    case A_CMDQ_ERR_MAP_0 ... A_CMDQ_ERR_MAP_3:
> +        val = cmdqv->cmdq_err_map[(offset - A_CMDQ_ERR_MAP_0) / 4];
> +        break;
> +    case A_CMDQ_ALLOC_MAP_0 ... A_CMDQ_ALLOC_MAP_1:
> +        val = cmdqv->cmdq_alloc_map[(offset - A_CMDQ_ALLOC_MAP_0) / 4];
> +        break;
> +    case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
> +        val = tegra241_cmdqv_config_vintf_read(cmdqv, offset);
> +        break;
> +    default:
> +        qemu_log_mask(LOG_UNIMP, "%s unhandled read access at 0x%" PRIx64 "\n",
> +                      __func__, offset);
> +    }
> +
> +out:
> +    trace_tegra241_cmdqv_read_mmio(offset, val, size);
> +    return val;
>  }
>  
> -static void tegra241_cmdqv_write(void *opaque, hwaddr offset, uint64_t value,
> -                                 unsigned size)
> +static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
> +                                      uint64_t value, unsigned size)
>  {
> +    Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
> +
> +    if (offset >= TEGRA241_CMDQV_IO_LEN) {
> +        qemu_log_mask(LOG_UNIMP,
> +                      "%s offset 0x%" PRIx64 " off limit (0x%x)\n", __func__,
> +                      offset, TEGRA241_CMDQV_IO_LEN);
> +        goto out;
> +    }
> +
> +    switch (offset) {
> +    case A_CONFIG:
> +        cmdqv->config = value;
> +        if (value & R_CONFIG_CMDQV_EN_MASK) {
> +            cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
> +        } else {
> +            cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
> +        }
> +        break;
> +    case A_VI_INT_MASK ... A_VI_INT_MASK_1:
> +        cmdqv->vi_int_mask[(offset - A_VI_INT_MASK) / 4] = value;
> +        break;
> +    case A_CMDQ_ALLOC_MAP_0 ... A_CMDQ_ALLOC_MAP_1:
> +        cmdqv->cmdq_alloc_map[(offset - A_CMDQ_ALLOC_MAP_0) / 4] = value;
> +        break;
> +    case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
> +        tegra241_cmdqv_config_vintf_write(cmdqv, offset, value);
> +        break;
> +    default:
> +        qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
> +                      __func__, offset);
> +    }
> +
> +out:
> +    trace_tegra241_cmdqv_write_mmio(offset, value, size);
>  }
>  
>  static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
> @@ -84,8 +235,8 @@ static void tegra241_cmdqv_reset(SMMUv3State *s)
>  }
>  
>  static const MemoryRegionOps mmio_cmdqv_ops = {
> -    .read = tegra241_cmdqv_read,
> -    .write = tegra241_cmdqv_write,
> +    .read = tegra241_cmdqv_read_mmio,
> +    .write = tegra241_cmdqv_write_mmio,
>      .endianness = DEVICE_LITTLE_ENDIAN,
>  };
>  
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 3457536fb0..8c61d66a26 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -72,6 +72,10 @@ smmuv3_accel_unset_iommu_device(int devfn, uint32_t devid) "devfn=0x%x (idev dev
>  smmuv3_accel_translate_ste(uint32_t vsid, uint32_t hwpt_id, uint64_t ste_1, uint64_t ste_0) "vSID=0x%x hwpt_id=0x%x ste=%"PRIx64":%"PRIx64
>  smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vSID=0x%x ste type=%s hwpt_id=0x%x"
>  
> +# tegra241-cmdqv
> +tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
> +tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
> +
>  # strongarm.c
>  strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
>  strongarm_ssp_read_underrun(void) "SSP rx underrun"
Thanks

Eric
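
For reference, the simplification suggested in the review above (dropping
the vi parameter and describing VINTF0 only) could look roughly like the
sketch below. It deliberately avoids QEMU's REG32()/FIELD() macros so that
it stands alone. The offsets and bit positions are copied from the quoted
patch, but every name here (VINTF0_*, DemoVintf0,
demo_vintf0_config_write) is hypothetical and not part of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flat VINTF0 register offsets, taken from the quoted patch (vi == 0). */
#define VINTF0_CONFIG            0x1000u
#define VINTF0_STATUS            0x1004u
#define VINTF0_SID_MATCH(j)      (0x1040u + (j) * 4u)   /* j = 0..15 */
#define VINTF0_SID_REPLACE(j)    (0x1080u + (j) * 4u)   /* j = 0..15 */
#define VINTF0_LVCMDQ_ERR_MAP(j) (0x10c0u + (j) * 4u)   /* j = 0..3 */

/* Bit positions as in the quoted FIELD() definitions. */
#define VINTF0_CONFIG_ENABLE     (1u << 0)
#define VINTF0_CONFIG_HYP_OWN    (1u << 17)
#define VINTF0_STATUS_ENABLE_OK  (1u << 0)

/* Hypothetical, reduced register cache holding only CONFIG and STATUS. */
typedef struct DemoVintf0 {
    uint32_t config;
    uint32_t status;
} DemoVintf0;

/*
 * Mirrors the quoted VINTF0_CONFIG write path: HYP_OWN is forced to 0 and
 * STATUS.ENABLE_OK follows CONFIG.ENABLE.
 */
static void demo_vintf0_config_write(DemoVintf0 *v, uint32_t value)
{
    assert(v != NULL);

    value &= ~VINTF0_CONFIG_HYP_OWN;
    v->config = value;
    if (value & VINTF0_CONFIG_ENABLE) {
        v->status |= VINTF0_STATUS_ENABLE_OK;
    } else {
        v->status &= ~VINTF0_STATUS_ENABLE_OK;
    }
}
```

With flat macros the case ranges in the read/write handlers would use
VINTF0_SID_MATCH(0) ... VINTF0_SID_MATCH(15) directly, so no per-index
definitions need to be emitted or omitted.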




* RE: [PATCH v4 08/31] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle
  2026-05-04 15:33   ` Eric Auger
@ 2026-05-05  7:47     ` Shameer Kolothum Thodi
  0 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-05  7:47 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 04 May 2026 16:33
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 08/31] hw/arm/smmuv3-accel: Wire CMDQV ops into
> accel lifecycle
> 
> 
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > Add support for selecting and initializing a CMDQV backend based on the
> > cmdqv OnOffAuto property.
> >
> > If set to OFF, CMDQV is not used and the default IOMMUFD-backed allocation
> > path is taken.
> >
> > If set to AUTO, QEMU attempts to probe a CMDQV backend during device setup.
> > If probing succeeds, the selected ops are stored in the accelerated SMMUv3
> > state and used. If probing fails, QEMU silently falls back to the default
> > path.
> >
> > If set to ON, QEMU requires CMDQV support. Probing is performed during
> > setup and failure results in an error.
> >
> > When a CMDQV backend is active, its callbacks are used for vIOMMU
> > allocation, free, and reset handling. Otherwise, the base implementation
> > is used.
> >
> > The current implementation wires up the Tegra241 CMDQV backend through the
> > generic ops interface. Functional CMDQV behaviour is added in subsequent
> > patches.
> >
> > No functional change.
> >
> > Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > ---
> >  include/hw/arm/smmuv3.h |  2 +
> >  hw/arm/smmuv3-accel.c   | 93
> +++++++++++++++++++++++++++++++++++++----
> >  2 files changed, 88 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
> > index fe0493c1aa..aa6a79237a 100644
> > --- a/include/hw/arm/smmuv3.h
> > +++ b/include/hw/arm/smmuv3.h
> > @@ -74,6 +74,8 @@ struct SMMUv3State {
> >      OnOffAuto ats;
> >      OasMode oas;
> >      SsidSizeMode ssidsize;
> > +    /* SMMU CMDQV extension */
> > +    OnOffAuto cmdqv;
> >
> >      Notifier machine_done;
> >  };
> > diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> > index f65e654adf..9068e65e2b 100644
> > --- a/hw/arm/smmuv3-accel.c
> > +++ b/hw/arm/smmuv3-accel.c
> > @@ -18,6 +18,7 @@
> >
> >  #include "smmuv3-internal.h"
> >  #include "smmuv3-accel.h"
> > +#include "tegra241-cmdqv.h"
> >
> >  /*
> >   * The root region aliases the global system memory, and
> shared_as_sysmem
> > @@ -566,6 +567,7 @@ smmuv3_accel_alloc_viommu(SMMUv3State *s,
> HostIOMMUDeviceIOMMUFD *idev,
> >                            Error **errp)
> >  {
> >      SMMUv3AccelState *accel = s->s_accel;
> > +    const SMMUv3AccelCmdqvOps *cmdqv_ops = accel->cmdqv_ops;
> >      struct iommu_hwpt_arm_smmuv3 bypass_data = {
> >          .ste = { SMMU_STE_CFG_BYPASS | SMMU_STE_VALID, 0x0ULL },
> >      };
> > @@ -576,10 +578,17 @@ smmuv3_accel_alloc_viommu(SMMUv3State *s,
> HostIOMMUDeviceIOMMUFD *idev,
> >      uint32_t viommu_id, hwpt_id;
> >      IOMMUFDViommu *viommu;
> >
> > -    if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
> > -                                      IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
> s2_hwpt_id,
> > -                                      NULL, 0, &viommu_id, errp)) {
> > -        return false;
> > +    if (cmdqv_ops) {
> > +        if (!cmdqv_ops->alloc_viommu(s, idev, &viommu_id, errp)) {
> > +            return false;
> > +        }
> > +    } else {
> > +        if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
> > +                                          IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
> > +                                          s2_hwpt_id, NULL, 0, &viommu_id,
> > +                                          errp)) {
> > +            return false;
> > +        }
> >      }
> >
> >      viommu = g_new0(IOMMUFDViommu, 1);
> > @@ -625,12 +634,69 @@ free_bypass_hwpt:
> >  free_abort_hwpt:
> >      iommufd_backend_free_id(idev->iommufd, accel->abort_hwpt_id);
> >  free_viommu:
> > -    iommufd_backend_free_id(idev->iommufd, viommu->viommu_id);
> > +    if (cmdqv_ops && cmdqv_ops->free_viommu) {
> > +        cmdqv_ops->free_viommu(s);
> Hum, actually we do free the iommu id below in case free_viommu is
> not implemented. So forget my previous comment. Besides, this means
> free_viommu shall be documented as not mandatory.

Sure. I will add documentation to make it clear which callbacks are
mandatory and which are optional.
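
As an illustration of the optional-callback contract discussed here, a
minimal standalone sketch follows. It is not the series' code:
DemoCmdqvOps, demo_free_viommu() and the other demo_* names are
hypothetical stand-ins for SMMUv3AccelCmdqvOps and the error path in
smmuv3_accel_alloc_viommu(). The only behaviour taken from the quoted patch
is that alloc_viommu is called unconditionally when ops are present, while
free_viommu is checked for NULL and the default free path is used
otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for SMMUv3AccelCmdqvOps. */
typedef struct DemoCmdqvOps {
    int (*alloc_viommu)(void *state);  /* mandatory when ops are present */
    void (*free_viommu)(void *state);  /* optional: may be NULL */
} DemoCmdqvOps;

typedef enum {
    DEMO_FREED_BY_BACKEND,
    DEMO_FREED_BY_DEFAULT,
} DemoFreePath;

/* Example backend callback: counts how often it was invoked. */
static void demo_backend_free(void *state)
{
    assert(state != NULL);
    (*(int *)state)++;
}

/* Stand-in for the default path (iommufd_backend_free_id() in the patch). */
static void demo_default_free(void *state)
{
    (void)state;
}

/* Use the backend callback if provided, else fall back to the default. */
static DemoFreePath demo_free_viommu(const DemoCmdqvOps *ops, void *state)
{
    if (ops && ops->free_viommu) {
        ops->free_viommu(state);
        return DEMO_FREED_BY_BACKEND;
    }
    demo_default_free(state);
    return DEMO_FREED_BY_DEFAULT;
}
```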

> > +    } else {
> > +        iommufd_backend_free_id(idev->iommufd, viommu->viommu_id);
> > +    }
> >      g_free(viommu);
> >      accel->viommu = NULL;
> >      return false;
> >  }
> >
> > +static const SMMUv3AccelCmdqvOps *
> > +smmuv3_accel_probe_cmdqv(SMMUv3State *s,
> HostIOMMUDeviceIOMMUFD *idev,
> > +                          Error **errp)
> > +{
> > +    const SMMUv3AccelCmdqvOps *ops = tegra241_cmdqv_get_ops();
> > +
> > +    if (!ops || !ops->probe) {
> > +        error_setg(errp, "No CMDQV ops found");
> > +        return NULL;
> > +    }
> > +
> > +    if (!ops->probe(s, idev, errp)) {
> > +        return NULL;
> > +    }
> > +    return ops;
> > +}
> > +
> > +static bool
> > +smmuv3_accel_select_cmdqv(SMMUv3State *s,
> HostIOMMUDeviceIOMMUFD *idev,
> > +                          Error **errp)
> > +{
> > +    const SMMUv3AccelCmdqvOps *ops = NULL;
> > +
> > +    if (s->s_accel->cmdqv_ops) {
> > +        return true;
> > +    }
> > +
> > +    switch (s->cmdqv) {
> > +    case ON_OFF_AUTO_OFF:
> > +        s->s_accel->cmdqv_ops = NULL;
> > +        return true;
> > +    case ON_OFF_AUTO_AUTO:
> > +        ops = smmuv3_accel_probe_cmdqv(s, idev, NULL);
> > +        break;
> > +    case ON_OFF_AUTO_ON:
> > +        ops = smmuv3_accel_probe_cmdqv(s, idev, errp);
> > +        if (!ops) {
> > +            error_append_hint(errp, "CMDQV requested but not supported");
> > +            return false;
> > +        }
> > +        s->s_accel->cmdqv_ops = ops;
> Shouldn't you remove the above setting as you do it again below in case
> init succeeds?

Correct. We don’t need to assign it here. 
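
A rough standalone sketch of the selection flow with the single assignment
at the end is shown below. DemoOnOffAuto, DemoOps and demo_select() are
hypothetical simplifications of OnOffAuto, SMMUv3AccelCmdqvOps and
smmuv3_accel_select_cmdqv(); probing is reduced to a pointer that is NULL
when no backend was found, and error reporting is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { DEMO_AUTO, DEMO_ON, DEMO_OFF } DemoOnOffAuto;

/* Hypothetical ops table; only the optional init hook is modelled. */
typedef struct DemoOps {
    bool (*init)(void);
} DemoOps;

static bool demo_init_ok(void)   { return true; }
static bool demo_init_fail(void) { return false; }

/*
 * probed is the probe result (NULL if no backend was found).
 * Returns false only on a hard failure. *selected is written at entry
 * (cleared) and once more on success, after init has passed.
 */
static bool demo_select(DemoOnOffAuto mode, const DemoOps *probed,
                        const DemoOps **selected)
{
    const DemoOps *ops = NULL;

    assert(selected != NULL);
    *selected = NULL;

    switch (mode) {
    case DEMO_OFF:
        return true;              /* CMDQV not used at all */
    case DEMO_AUTO:
        ops = probed;             /* a failed probe is not an error */
        break;
    case DEMO_ON:
        if (!probed) {
            return false;         /* requested but not supported */
        }
        ops = probed;             /* no early publication here */
        break;
    }

    if (ops && ops->init && !ops->init()) {
        return false;
    }
    *selected = ops;              /* the only place ops are published */
    return true;
}
```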

> > +        break;
> > +    default:
> > +        g_assert_not_reached();
> > +    }
> > +
> > +    if (ops && ops->init && !ops->init(s, errp)) {
> > +        return false;
> > +    }
> > +    s->s_accel->cmdqv_ops = ops;
> > +    return true;
> > +}
> > +
> >  static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque,
> int devfn,
> >                                            HostIOMMUDevice *hiod, Error **errp)
> >  {
> > @@ -665,6 +731,10 @@ static bool
> smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
> >          goto done;
> >      }
> >
> > +    if (!smmuv3_accel_select_cmdqv(s, idev, errp)) {
> > +        return false;
> > +    }
> > +
> >      if (!smmuv3_accel_alloc_viommu(s, idev, errp)) {
> >          error_append_hint(errp, "Unable to alloc vIOMMU: idev devid 0x%x: ",
> >                            idev->devid);
> > @@ -936,8 +1006,17 @@ bool
> smmuv3_accel_attach_gbpa_hwpt(SMMUv3State *s, Error **errp)
> >
> >  void smmuv3_accel_reset(SMMUv3State *s)
> >  {
> > -     /* Attach a HWPT based on GBPA reset value */
> > -     smmuv3_accel_attach_gbpa_hwpt(s, NULL);
> > +    SMMUv3AccelState *accel = s->s_accel;
> > +
> > +    if (!accel) {
> > +        return;
> > +    }
> > +    /* Attach a HWPT based on GBPA reset value */
> > +    smmuv3_accel_attach_gbpa_hwpt(s, NULL);
> > +
> > +    if (accel->cmdqv_ops && accel->cmdqv_ops->reset) {
> > +        accel->cmdqv_ops->reset(s);
> > +    }
> >  }
> >
> >  static void smmuv3_accel_as_init(SMMUv3State *s)
> Besides
> Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks,
Shameer


* RE: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-05-05  0:40   ` Nicolin Chen
@ 2026-05-05  9:59     ` Shameer Kolothum Thodi
  2026-05-05 19:38       ` Nicolin Chen
  0 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-05  9:59 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org, eric.auger@redhat.com,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: 05 May 2026 01:41
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org;
> eric.auger@redhat.com; peter.maydell@linaro.org; clg@redhat.com;
> alex@shazbot.org; Nathan Chen <nathanc@nvidia.com>; Matt Ochs
> <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW
> VCMDQs on base register programming
> 
> On Wed, Apr 15, 2026 at 11:55:40AM +0100, Shameer Kolothum wrote:
> > +static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int
> index,
> > +                                       Error **errp)
> > +{
> > +    SMMUv3AccelState *accel = cmdqv->s_accel;
> > +    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
> > +                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
> > +    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
> > +    uint64_t log2 = cmdqv->vcmdq_base[index] &
> R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
> > +    uint64_t size = 1ULL << (log2 + 4);
> > +    IOMMUFDViommu *viommu = accel->viommu;
> > +    IOMMUFDHWqueue *hw_queue;
> > +    uint32_t hw_queue_id;
> > +
> > +    /* Ignore any invalid address. This may come as part of reset etc. */
> > +    if (!address_space_is_ram(&address_space_memory, addr) ||
> > +        !address_space_is_ram(&address_space_memory, addr + size - 1)) {
> 
> Overflow prevention:
> size - 1 + addr

Ok.
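
For illustration only, one way to make the end-of-range computation robust
is an explicit wrap check. Note that for uint64_t operands "addr + size - 1"
and "size - 1 + addr" give the same value modulo 2^64, so the reordering
alone does not detect a wrapped range. The helper below is a hypothetical
sketch (range_last() is not a QEMU function), independent of
address_space_is_ram().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the last byte address of [addr, addr + size).
 * Returns false if the range is empty or would wrap past UINT64_MAX;
 * *last is only written on success.
 */
static bool range_last(uint64_t addr, uint64_t size, uint64_t *last)
{
    assert(last != NULL);

    if (size == 0 || size - 1 > UINT64_MAX - addr) {
        return false;
    }
    *last = addr + (size - 1);
    return true;
}
```

A caller would validate addr and the returned last address only when
range_last() succeeds, and ignore the BASE value otherwise.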

> 
> > +    if (!tegra241_cmdq_enabled(cmdqv) ||
> !tegra241_vintf_enabled(cmdqv)) {
> > +        return true;
> > +    }
> 
> It's good to have these two checks. But if vcmdq setup is skipped
> for vintf=disabled, do we need to call this setup() again once vintf
> gets enabled?

Yeah. It looks more restrictive than the spec requires. I will
address this.

> 
> Also, do we fence against unassigned vcmdq? Corner case is that a
> guest might write base address registers via direct (global) MMIO
> space.
> 

I'm not sure I follow that completely.

The spec (p. 176) has:

"While the software can program the Virtual CMDQ(s) directly using the
direct VCMDQ aperture (and not through the Virtual Interface), it is
required that the VCMDQ be allocated to a Virtual Interface before it
is used to send commands to the SMMU."

The spec only restricts sending commands before allocation, not
programming BASE. In our model, the BASE write itself triggers
alloc_hw_queue, so there's nothing to fence there. For the others
(Page 0: CONS_INDX, PROD_INDX etc.), the vintf_ptr() check already drops
them silently if vcmdq[index] is not yet allocated, consistent
with spec p. 172:

"If no Virtual CMDQ is mapped to the Guest, or if the logical CMDQ index
in the Virtual Interface being accessed by the software does not map to
any Virtual CMDQ, the access is dropped with no Fault/Interrupt".

Please let me know whether the above interpretation is correct.
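
To make the "dropped with no Fault/Interrupt" rule concrete, here is a small
standalone sketch. It is an assumption-laden illustration, not the
vintf_ptr() logic from the series (which is not shown in this mail): it
assumes each CMDQ_ALLOC_MAP entry describes one physical VCMDQ, using the
ALLOC, LVCMDQ and VIRT_INTF_INDX field positions from patch 14, and all
demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CMDQ 2

/* Field layout copied from the CMDQ_ALLOC_MAP definition in patch 14. */
#define DEMO_ALLOC(v)          ((v) & 0x1u)
#define DEMO_LVCMDQ(v)         (((v) >> 1) & 0x7fu)
#define DEMO_VIRT_INTF_INDX(v) (((v) >> 15) & 0x3fu)

/*
 * Find the physical VCMDQ mapped to (vintf, logical index).
 * Returns -1 when nothing is mapped.
 */
static int demo_find_vcmdq(const uint32_t *map, unsigned vintf,
                           unsigned lvcmdq)
{
    assert(map != NULL);

    for (int i = 0; i < DEMO_MAX_CMDQ; i++) {
        if (DEMO_ALLOC(map[i]) &&
            DEMO_VIRT_INTF_INDX(map[i]) == vintf &&
            DEMO_LVCMDQ(map[i]) == lvcmdq) {
            return i;
        }
    }
    return -1;
}

/*
 * Logical-aperture PROD_INDX write: applied only if the logical index
 * maps to a physical VCMDQ, otherwise dropped silently.
 * Returns true if the write was applied.
 */
static bool demo_write_prod(uint32_t *prod, const uint32_t *map,
                            unsigned vintf, unsigned lvcmdq, uint32_t value)
{
    int phys = demo_find_vcmdq(map, vintf, lvcmdq);

    if (phys < 0) {
        return false;   /* no fault, no interrupt, no state change */
    }
    prod[phys] = value;
    return true;
}
```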

> > @@ -363,7 +427,7 @@ static void tegra241_cmdqv_write_mmio(void
> *opaque, hwaddr offset,
> >           */
> >          index = (offset - CMDQV_VCMDQ_PAGE0_BASE) /
> CMDQV_VCMDQ_STRIDE;
> >          tegra241_cmdqv_write_vcmdq(cmdqv, offset - index *
> CMDQV_VCMDQ_STRIDE,
> > -                                   index, value, size);
> > +                                   index, value, size, &local_err);
> >          break;
> >      case A_VI_VCMDQ0_BASE_L ...
> A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
> >          /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
> > @@ -373,7 +437,7 @@ static void tegra241_cmdqv_write_mmio(void
> *opaque, hwaddr offset,
> >          /* Same decode logic as VCMDQ Page0 case above */
> >          index = (offset - CMDQV_VCMDQ_PAGE1_BASE) /
> CMDQV_VCMDQ_STRIDE;
> >          tegra241_cmdqv_write_vcmdq(cmdqv, offset - index *
> CMDQV_VCMDQ_STRIDE,
> > -                                   index, value, size);
> > +                                   index, value, size, &local_err);
> 
> Should these two be squashed into an earlier patch?

Yes.
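
For readers following the decode in the hunks above, the aperture
arithmetic can be sketched standalone as below. The base addresses, the
0x80 stride and the 0x50000 MMIO length are the values defined in patch 15;
DemoDecode and demo_decode() are hypothetical. One difference from the
quoted code is labelled in the comments: the patch passes
"offset - index * CMDQV_VCMDQ_STRIDE" (still aperture-absolute) to the
per-VCMDQ handler, whereas this sketch returns a register offset relative
to the start of the VCMDQ's 0x80 window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CMDQV_VCMDQ_PAGE0_BASE 0x10000u
#define CMDQV_VCMDQ_STRIDE     0x80u
#define DEMO_PAGE_SIZE         0x10000u  /* 64KB pages */
#define DEMO_IO_LEN            0x50000u  /* TEGRA241_CMDQV_IO_LEN */

typedef struct DemoDecode {
    bool logical;    /* true for the VINTF (0x30000/0x40000) apertures */
    unsigned page;   /* 0 or 1 */
    unsigned index;  /* VCMDQ index within the aperture */
    uint32_t reg;    /* offset inside the VCMDQ's 0x80-byte window */
} DemoDecode;

/* Returns false for offsets outside the four VCMDQ apertures. */
static bool demo_decode(uint64_t offset, DemoDecode *out)
{
    assert(out != NULL);

    if (offset < CMDQV_VCMDQ_PAGE0_BASE || offset >= DEMO_IO_LEN) {
        return false;
    }

    uint64_t rel = offset - CMDQV_VCMDQ_PAGE0_BASE;
    unsigned aperture = (unsigned)(rel / DEMO_PAGE_SIZE);  /* 0..3 */
    uint64_t in_page = rel % DEMO_PAGE_SIZE;

    out->logical = aperture >= 2;
    out->page = aperture & 1u;
    out->index = (unsigned)(in_page / CMDQV_VCMDQ_STRIDE);
    out->reg = (uint32_t)(in_page % CMDQV_VCMDQ_STRIDE);
    return true;
}
```

A real handler would additionally reject indices at or above
TEGRA241_CMDQV_MAX_CMDQ; that check is left out here.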

Thanks,
Shameer




* Re: [PATCH v4 15/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads
  2026-04-15 10:55 ` [PATCH v4 15/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads Shameer Kolothum
@ 2026-05-05 10:12   ` Eric Auger
  2026-05-05 13:27     ` Shameer Kolothum Thodi
  0 siblings, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-05 10:12 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina

Hi,

On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Tegra241 CMDQV exposes per-VCMDQ register windows through two MMIO
> apertures:
>
>   CMDQV_CMDQ_BASE (0x10000/0x20000): VCMDQ Page0/Page1 (global)
>   CMDQV_VI_CMDQ_BASE (0x30000/0x40000): VINTF VCMDQ Page0/Page1 (logical)
>
> VINTF Page0 (0x30000) and VCMDQ Page0 (0x10000) are hardware aliases
What about Page1?
> addressing the same underlying registers. Add read emulation for both
> apertures, backed by a register cache. VINTF Page0 reads are translated
> to their VCMDQ Page0 equivalent and served from the same cached state.
>
> Once IOMMU_HW_QUEUE_ALLOC and viommu_mmap are wired up in a subsequent
> patch, Page0 register reads will be served directly from the hardware
> backed mmap'd page instead of the cache. Page1 registers are always
> served from cache.

I would add that Page 0 contains the VCMDQ control and status registers,
while Page 1 contains the VCMDQ base and DRAM address registers.

I would also add Nicolin's explanation:

"The global page 0 is programmable at any time so long as CMDQV_EN
is enabled.

The logical (VINTF) page 0 is programmable only when SW allocates and maps
global vcmdq(s) to a VINTF. "logical" also means "local" to that
VINTF."

> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.h | 185 ++++++++++++++++++++++++++++++++++++++++
>  hw/arm/tegra241-cmdqv.c |  73 ++++++++++++++++
>  2 files changed, 258 insertions(+)
>
> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> index 965670066d..b8bd8cd8ff 100644
> --- a/hw/arm/tegra241-cmdqv.h
> +++ b/hw/arm/tegra241-cmdqv.h
> @@ -29,6 +29,13 @@
>   */
>  #define TEGRA241_CMDQV_IO_LEN 0x50000
>  
> +/* CMDQV MMIO aperture bases and VCMDQ stride */
> +#define CMDQV_VCMDQ_PAGE0_BASE  0x10000  /* CMDQV_CMDQ_BASE */
> +#define CMDQV_VCMDQ_PAGE1_BASE  0x20000
> +#define CMDQV_VINTF_PAGE0_BASE  0x30000  /* CMDQV_VI_CMDQ_BASE */
> +#define CMDQV_VINTF_PAGE1_BASE  0x40000
> +#define CMDQV_VCMDQ_STRIDE      0x80
> +
>  typedef struct Tegra241CMDQV {
>      struct iommu_viommu_tegra241_cmdqv cmdqv_data;
>      SMMUv3AccelState *s_accel;
> @@ -49,6 +56,14 @@ typedef struct Tegra241CMDQV {
>      uint32_t vintf_sid_match[16];
>      uint32_t vintf_sid_replace[16];
>      uint32_t vintf_cmdq_err_map[4];
I would add a comment explaining that those cached registers hold the
same values for both the global and the logical registers (they are
identical).
> +    uint32_t vcmdq_cons_indx[TEGRA241_CMDQV_MAX_CMDQ];
> +    uint32_t vcmdq_prod_indx[TEGRA241_CMDQV_MAX_CMDQ];
> +    uint32_t vcmdq_config[TEGRA241_CMDQV_MAX_CMDQ];
> +    uint32_t vcmdq_status[TEGRA241_CMDQV_MAX_CMDQ];
> +    uint32_t vcmdq_gerror[TEGRA241_CMDQV_MAX_CMDQ];
> +    uint32_t vcmdq_gerrorn[TEGRA241_CMDQV_MAX_CMDQ];
> +    uint64_t vcmdq_base[TEGRA241_CMDQV_MAX_CMDQ];
> +    uint64_t vcmdq_cons_indx_base[TEGRA241_CMDQV_MAX_CMDQ];
>  } Tegra241CMDQV;
>  
>  /* CMDQ-V Config page registers (offset 0x00000) */
> @@ -160,6 +175,176 @@ SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 0)
>  /* MAP_1 and MAP_2 omitted; not referenced directly */
>  SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 3)
>  
> +/*
> + * VCMDQ register windows.
> + *
> + * Page 0 @ 0x10000: VCMDQ control and status registers
> + * Page 1 @ 0x20000: VCMDQ base and DRAM address registers
Can you clearly separate the registers that belong to page 0 from those
that belong to page 1?
> + */
> +#define A_VCMDQi_CONS_INDX(i)                       \
Same remark as for the config registers: please use the spec terminology,
i.e. SMMU_CMDQV_VCMDQi_CONS_INDX_0.
> +    REG32(VCMDQ##i##_CONS_INDX, 0x10000 + i * 0x80) \
> +    FIELD(VCMDQ##i##_CONS_INDX, RD, 0, 20)          \
> +    FIELD(VCMDQ##i##_CONS_INDX, ERR, 24, 7)
> +
> +A_VCMDQi_CONS_INDX(0)
> +A_VCMDQi_CONS_INDX(1)
> +
> +#define V_VCMDQ_CONS_INDX_ERR_CERROR_NONE 0
> +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ILL_OPCODE 1
> +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ABT 2
> +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ATC_INV_SYNC 3
> +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ILL_ACCESS 4
> +
> +#define A_VCMDQi_PROD_INDX(i)                             \
SMMU_CMDQV_VCMDQi_PROD_INDX_0
> +    REG32(VCMDQ##i##_PROD_INDX, 0x10000 + 0x4 + i * 0x80) \
> +    FIELD(VCMDQ##i##_PROD_INDX, WR, 0, 20)
> +
> +A_VCMDQi_PROD_INDX(0)
> +A_VCMDQi_PROD_INDX(1)
> +
> +#define A_VCMDQi_CONFIG(i)                             \
SMMU_CMDQV_VCMDQi_CONFIG_0(i)
> +    REG32(VCMDQ##i##_CONFIG, 0x10000 + 0x8 + i * 0x80) \
Isn't 0x80 the stride you defined earlier, i.e. CMDQV_VCMDQ_STRIDE?

> +    FIELD(VCMDQ##i##_CONFIG, CMDQ_EN, 0, 1)
> +
> +A_VCMDQi_CONFIG(0)
> +A_VCMDQi_CONFIG(1)
> +
> +#define A_VCMDQi_STATUS(i)                             \
> +    REG32(VCMDQ##i##_STATUS, 0x10000 + 0xc + i * 0x80) \
> +    FIELD(VCMDQ##i##_STATUS, CMDQ_EN_OK, 0, 1)
> +
> +A_VCMDQi_STATUS(0)
> +A_VCMDQi_STATUS(1)
> +
> +#define A_VCMDQi_GERROR(i)                               \
> +    REG32(VCMDQ##i##_GERROR, 0x10000 + 0x10 + i * 0x80)  \
> +    FIELD(VCMDQ##i##_GERROR, CMDQ_ERR, 0, 1)             \
> +    FIELD(VCMDQ##i##_GERROR, CONS_DRAM_WR_ABT_ERR, 1, 1) \
> +    FIELD(VCMDQ##i##_GERROR, CMDQ_INIT_ERR, 2, 1)
> +
> +A_VCMDQi_GERROR(0)
> +A_VCMDQi_GERROR(1)
> +
> +#define A_VCMDQi_GERRORN(i)                               \
> +    REG32(VCMDQ##i##_GERRORN, 0x10000 + 0x14 + i * 0x80)  \
> +    FIELD(VCMDQ##i##_GERRORN, CMDQ_ERR, 0, 1)             \
> +    FIELD(VCMDQ##i##_GERRORN, CONS_DRAM_WR_ABT_ERR, 1, 1) \
> +    FIELD(VCMDQ##i##_GERRORN, CMDQ_INIT_ERR, 2, 1)
> +
> +A_VCMDQi_GERRORN(0)
> +A_VCMDQi_GERRORN(1)

/* Page 1 */
> +
> +#define A_VCMDQi_BASE_L(i)                       \
Ditto: remove the A_ prefix and use the spec terminology.
> +    REG32(VCMDQ##i##_BASE_L, 0x20000 + i * 0x80) \
> +    FIELD(VCMDQ##i##_BASE_L, LOG2SIZE, 0, 5)     \
> +    FIELD(VCMDQ##i##_BASE_L, ADDR, 5, 27)
> +
> +A_VCMDQi_BASE_L(0)
> +A_VCMDQi_BASE_L(1)
> +
> +#define A_VCMDQi_BASE_H(i)                             \
> +    REG32(VCMDQ##i##_BASE_H, 0x20000 + 0x4 + i * 0x80) \
Maybe, instead of using 0x20000, define macros for the base address of
each region.
> +    FIELD(VCMDQ##i##_BASE_H, ADDR, 0, 16)
> +
> +A_VCMDQi_BASE_H(0)
> +A_VCMDQi_BASE_H(1)
> +
> +#define A_VCMDQi_CONS_INDX_BASE_DRAM_L(i)                             \
> +    REG32(VCMDQ##i##_CONS_INDX_BASE_DRAM_L, 0x20000 + 0x8 + i * 0x80) \
> +    FIELD(VCMDQ##i##_CONS_INDX_BASE_DRAM_L, ADDR, 0, 32)
> +
> +A_VCMDQi_CONS_INDX_BASE_DRAM_L(0)
> +A_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
> +
> +#define A_VCMDQi_CONS_INDX_BASE_DRAM_H(i)                             \
> +    REG32(VCMDQ##i##_CONS_INDX_BASE_DRAM_H, 0x20000 + 0xc + i * 0x80) \
> +    FIELD(VCMDQ##i##_CONS_INDX_BASE_DRAM_H, ADDR, 0, 16)
> +
> +A_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
> +A_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
I would group all register definitions per page. The layout would look
clearer to me.
> +
> +/*
> + * VI_VCMDQ register windows (VCMDQs mapped via VINTF).
> + *
> + * Page 0 @ 0x30000: VI_VCMDQ control and status registers
> + * Page 1 @ 0x40000: VI_VCMDQ base and DRAM address registers
same here, pls separate page 0 and page 1 definitions
> + */
> +#define A_VI_VCMDQi_CONS_INDX(i)                       \
Same comment about the A_ prefix.
> +    REG32(VI_VCMDQ##i##_CONS_INDX, 0x30000 + i * 0x80) \
> +    FIELD(VI_VCMDQ##i##_CONS_INDX, RD, 0, 20)          \
> +    FIELD(VI_VCMDQ##i##_CONS_INDX, ERR, 24, 7)
> +
> +A_VI_VCMDQi_CONS_INDX(0)
> +A_VI_VCMDQi_CONS_INDX(1)
> +
> +#define A_VI_VCMDQi_PROD_INDX(i)                             \
> +    REG32(VI_VCMDQ##i##_PROD_INDX, 0x30000 + 0x4 + i * 0x80) \
> +    FIELD(VI_VCMDQ##i##_PROD_INDX, WR, 0, 20)
> +
> +A_VI_VCMDQi_PROD_INDX(0)
> +A_VI_VCMDQi_PROD_INDX(1)
> +
> +#define A_VI_VCMDQi_CONFIG(i)                             \
> +    REG32(VI_VCMDQ##i##_CONFIG, 0x30000 + 0x8 + i * 0x80) \
> +    FIELD(VI_VCMDQ##i##_CONFIG, CMDQ_EN, 0, 1)
> +
> +A_VI_VCMDQi_CONFIG(0)
> +A_VI_VCMDQi_CONFIG(1)
> +
> +#define A_VI_VCMDQi_STATUS(i)                             \
> +    REG32(VI_VCMDQ##i##_STATUS, 0x30000 + 0xc + i * 0x80) \
> +    FIELD(VI_VCMDQ##i##_STATUS, CMDQ_EN_OK, 0, 1)
> +
> +A_VI_VCMDQi_STATUS(0)
> +A_VI_VCMDQi_STATUS(1)
> +
> +#define A_VI_VCMDQi_GERROR(i)                               \
> +    REG32(VI_VCMDQ##i##_GERROR, 0x30000 + 0x10 + i * 0x80)  \
> +    FIELD(VI_VCMDQ##i##_GERROR, CMDQ_ERR, 0, 1)             \
> +    FIELD(VI_VCMDQ##i##_GERROR, CONS_DRAM_WR_ABT_ERR, 1, 1) \
> +    FIELD(VI_VCMDQ##i##_GERROR, CMDQ_INIT_ERR, 2, 1)
> +
> +A_VI_VCMDQi_GERROR(0)
> +A_VI_VCMDQi_GERROR(1)
> +
> +#define A_VI_VCMDQi_GERRORN(i)                               \
> +    REG32(VI_VCMDQ##i##_GERRORN, 0x30000 + 0x14 + i * 0x80)  \
> +    FIELD(VI_VCMDQ##i##_GERRORN, CMDQ_ERR, 0, 1)             \
> +    FIELD(VI_VCMDQ##i##_GERRORN, CONS_DRAM_WR_ABT_ERR, 1, 1) \
> +    FIELD(VI_VCMDQ##i##_GERRORN, CMDQ_INIT_ERR, 2, 1)
> +
> +A_VI_VCMDQi_GERRORN(0)
> +A_VI_VCMDQi_GERRORN(1)
> +
> +#define A_VI_VCMDQi_BASE_L(i)                       \
> +    REG32(VI_VCMDQ##i##_BASE_L, 0x40000 + i * 0x80) \
> +    FIELD(VI_VCMDQ##i##_BASE_L, LOG2SIZE, 0, 5)     \
> +    FIELD(VI_VCMDQ##i##_BASE_L, ADDR, 5, 27)
> +
> +A_VI_VCMDQi_BASE_L(0)
> +A_VI_VCMDQi_BASE_L(1)
> +
> +#define A_VI_VCMDQi_BASE_H(i)                             \
> +    REG32(VI_VCMDQ##i##_BASE_H, 0x40000 + 0x4 + i * 0x80) \
> +    FIELD(VI_VCMDQ##i##_BASE_H, ADDR, 0, 16)
> +
> +A_VI_VCMDQi_BASE_H(0)
> +A_VI_VCMDQi_BASE_H(1)
> +
> +#define A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(i)                             \
> +    REG32(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_L, 0x40000 + 0x8 + i * 0x80) \
> +    FIELD(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_L, ADDR, 0, 32)
> +
> +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(0)
> +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
> +
> +#define A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(i)                             \
> +    REG32(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_H, 0x40000 + 0xc + i * 0x80) \
> +    FIELD(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_H, ADDR, 0, 16)
> +
> +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
> +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
> +
>  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
>  
>  #endif /* HW_ARM_TEGRA241_CMDQV_H */
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index 3b08ed0ff3..35e6f0bbd6 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -15,6 +15,46 @@
>  #include "tegra241-cmdqv.h"
>  #include "trace.h"
>  
> +/*
> + * Read a VCMDQ register using VCMDQ0_* offsets.
> + *
> + * The caller normalizes the MMIO offset such that @offset0 always refers
> + * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
> + *
> + * All VCMDQ accesses return cached registers.
> + */
> +static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
> +                                          int index)
> +{
> +    switch (offset0) {
> +    case A_VCMDQ0_CONS_INDX:
> +        return cmdqv->vcmdq_cons_indx[index];
> +    case A_VCMDQ0_PROD_INDX:
> +        return cmdqv->vcmdq_prod_indx[index];
> +    case A_VCMDQ0_CONFIG:
> +        return cmdqv->vcmdq_config[index];
> +    case A_VCMDQ0_STATUS:
> +        return cmdqv->vcmdq_status[index];
> +    case A_VCMDQ0_GERROR:
> +        return cmdqv->vcmdq_gerror[index];
> +    case A_VCMDQ0_GERRORN:
> +        return cmdqv->vcmdq_gerrorn[index];
> +    case A_VCMDQ0_BASE_L:
> +        return cmdqv->vcmdq_base[index];
> +    case A_VCMDQ0_BASE_H:
> +        return cmdqv->vcmdq_base[index] >> 32;
> +    case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
> +        return cmdqv->vcmdq_cons_indx_base[index];
> +    case A_VCMDQ0_CONS_INDX_BASE_DRAM_H:
> +        return cmdqv->vcmdq_cons_indx_base[index] >> 32;
> +    default:
> +        qemu_log_mask(LOG_UNIMP,
> +                      "%s unhandled read access at 0x%" PRIx64 "\n",
> +                      __func__, offset0);
> +        return 0;
> +    }
> +}
> +
>  static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
>                                                   hwaddr offset)
>  {
> @@ -92,6 +132,7 @@ static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
>  {
>      Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
>      uint64_t val = 0;
> +    int index;
>  
>      if (offset >= TEGRA241_CMDQV_IO_LEN) {
>          qemu_log_mask(LOG_UNIMP,
> @@ -125,6 +166,38 @@ static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
>      case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
>          val = tegra241_cmdqv_config_vintf_read(cmdqv, offset);
>          break;
> +    case A_VI_VCMDQ0_CONS_INDX ... A_VI_VCMDQ1_GERRORN:
> +        /*
> +         * VINTF Page0 registers have the same per-VCMDQ layout as the
> +         * VCMDQ Page0 registers. Translate the VINTF aperture offset to the
> +         * equivalent VCMDQ aperture offset, then fall through to reuse the
> +         * common VCMDQ decoding logic below.
> +         */
> +        offset -= CMDQV_VINTF_PAGE0_BASE - CMDQV_VCMDQ_PAGE0_BASE;
> +        QEMU_FALLTHROUGH;
> +    case A_VCMDQ0_CONS_INDX ... A_VCMDQ1_GERRORN:
> +        /*
> +         * Decode a per-VCMDQ register access.
> +         *
> +         * The hardware supports up to 128 identical VCMDQ instances; we
> +         * currently expose TEGRA241_CMDQV_MAX_CMDQ (= 2). Each VCMDQ
> +         * occupies a CMDQV_VCMDQ_STRIDE-byte window within the page.
> +         *
> +         * Extract the VCMDQ index and normalize to the VCMDQ0_* register
> +         * offset. A single helper services all instances via @index.
> +         */
> +        index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
> +        return tegra241_cmdqv_read_vcmdq(cmdqv,
> +                offset - index * CMDQV_VCMDQ_STRIDE, index);
> +    case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
> +        /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
> +        offset -= CMDQV_VINTF_PAGE1_BASE - CMDQV_VCMDQ_PAGE1_BASE;
> +        QEMU_FALLTHROUGH;
> +    case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
> +        /* Same decode logic as VCMDQ Page0 case above */
> +        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
> +        return tegra241_cmdqv_read_vcmdq(cmdqv,
> +                offset - index * CMDQV_VCMDQ_STRIDE, index);
Please add trace points for read & write accesses

Thanks

Eric
>      default:
>          qemu_log_mask(LOG_UNIMP, "%s unhandled read access at 0x%" PRIx64 "\n",
>                        __func__, offset);
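
The decode discussed in the hunk above can be sketched standalone as
follows. This is only an illustration under stated assumptions: the
aperture bases and the stride are the values from the patch, but
vcmdq_decode(), struct vcmdq_ref, CMDQV_PAGE_SIZE and MAX_CMDQ are
hypothetical names that are not part of the series, and the sketch does
not reject the unused holes inside each 0x80 window the way the patch's
case ranges do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch's tegra241-cmdqv.h */
#define CMDQV_VCMDQ_PAGE0_BASE  0x10000
#define CMDQV_VCMDQ_PAGE1_BASE  0x20000
#define CMDQV_VINTF_PAGE0_BASE  0x30000
#define CMDQV_VINTF_PAGE1_BASE  0x40000
#define CMDQV_VCMDQ_STRIDE      0x80

/* Assumptions for this sketch only (64KB pages, 2 exposed VCMDQs) */
#define CMDQV_PAGE_SIZE         0x10000
#define MAX_CMDQ                2

/* Hypothetical result type: which VCMDQ, and which VCMDQ0_* register */
struct vcmdq_ref {
    uint64_t offset0; /* offset normalized to the VCMDQ0_* register */
    uint64_t index;   /* VCMDQ instance */
};

static bool vcmdq_decode(uint64_t offset, struct vcmdq_ref *out)
{
    uint64_t page_base, index;

    if (offset < CMDQV_VCMDQ_PAGE0_BASE ||
        offset >= CMDQV_VINTF_PAGE1_BASE + CMDQV_PAGE_SIZE) {
        return false; /* not in any VCMDQ aperture */
    }

    /* Both VINTF pages sit 0x20000 above their VCMDQ equivalents */
    if (offset >= CMDQV_VINTF_PAGE0_BASE) {
        offset -= CMDQV_VINTF_PAGE0_BASE - CMDQV_VCMDQ_PAGE0_BASE;
    }

    page_base = offset >= CMDQV_VCMDQ_PAGE1_BASE ? CMDQV_VCMDQ_PAGE1_BASE
                                                 : CMDQV_VCMDQ_PAGE0_BASE;
    index = (offset - page_base) / CMDQV_VCMDQ_STRIDE;
    if (index >= MAX_CMDQ) {
        return false; /* only MAX_CMDQ instances are exposed */
    }

    out->index = index;
    out->offset0 = offset - index * CMDQV_VCMDQ_STRIDE;
    return true;
}
```

With these constants, a VINTF Page0 access at 0x30084 normalizes to
offset0 0x10004 (the VCMDQ0 PROD_INDX slot) with index 1.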




* RE: [PATCH v4 14/31] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region
  2026-05-05  7:26   ` Eric Auger
@ 2026-05-05 10:28     ` Shameer Kolothum Thodi
  0 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-05 10:28 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 05 May 2026 08:27
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>;
> qemu-arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 14/31] hw/arm/tegra241-cmdqv: Emulate CMDQ-V
> Config region
> 
> Hi Shameer,
> 
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> >
> > Tegra241 CMDQV exposes control and status registers in the CMDQ-V
> > Config page (offset [0x0, 0x10000)) used to configure virtual command
> > queue allocation and interrupt behavior.
> >
> > Add read/write emulation for the CMDQ-V Config region
> > ([CMDQV_BASE, CMDQV_CMDQ_BASE]), backed by a simple register cache.
> > This includes CONFIG, PARAM, STATUS, VI error and interrupt maps, CMDQ
> > allocation map and the VINTF0 related registers defined in the CMDQ-V
> > Config space. Only VINTF0 is supported; VINTF1-63 are not.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > ---
> >  hw/arm/tegra241-cmdqv.h | 127 +++++++++++++++++++++++++++++++
> >  hw/arm/tegra241-cmdqv.c | 163 ++++++++++++++++++++++++++++++++++++++--
> >  hw/arm/trace-events     |   4 +
> >  3 files changed, 288 insertions(+), 6 deletions(-)
> >
> > diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> > index fa0aa3ab04..965670066d 100644
> > --- a/hw/arm/tegra241-cmdqv.h
> > +++ b/hw/arm/tegra241-cmdqv.h
> > @@ -10,10 +10,14 @@
> >  #ifndef HW_ARM_TEGRA241_CMDQV_H
> >  #define HW_ARM_TEGRA241_CMDQV_H
> >
> > +#include "hw/core/registerfields.h"
> > +
> >  #define CMDQV_VER                 1
> >  #define CMDQV_NUM_CMDQ_LOG2       1
> >  #define CMDQV_NUM_SID_PER_VI_LOG2 4
> >
> > +#define TEGRA241_CMDQV_MAX_CMDQ   (1U << CMDQV_NUM_CMDQ_LOG2)
> > +
> >  /*
> >   * Tegra241 CMDQV MMIO layout (64KB pages)
> >   *
> > @@ -31,8 +35,131 @@ typedef struct Tegra241CMDQV {
> >      MemoryRegion mmio_cmdqv;
> >      qemu_irq irq;
> >      IOMMUFDVeventq *veventq;
> > +
> > +    /* Register Cache */
> > +    uint32_t config;
> > +    uint32_t param;
> > +    uint32_t status;
> > +    uint32_t vi_err_map[2];
> > +    uint32_t vi_int_mask[2];
> > +    uint32_t cmdq_err_map[4];
> > +    uint32_t cmdq_alloc_map[TEGRA241_CMDQV_MAX_CMDQ];
> > +    uint32_t vintf_config;
> > +    uint32_t vintf_status;
> > +    uint32_t vintf_sid_match[16];
> > +    uint32_t vintf_sid_replace[16];
> > +    uint32_t vintf_cmdq_err_map[4];
> >  } Tegra241CMDQV;
> >
> > +/* CMDQ-V Config page registers (offset 0x00000) */
> > +REG32(CONFIG, 0x0)
> > +FIELD(CONFIG, CMDQV_EN, 0, 1)
> > +FIELD(CONFIG, CMDQV_PER_CMD_OFFSET, 1, 3)
> > +FIELD(CONFIG, CMDQ_MAX_CLK_BATCH, 4, 8)
> > +FIELD(CONFIG, CMDQ_MAX_CMD_BATCH, 12, 8)
> > +FIELD(CONFIG, CONS_DRAM_EN, 20, 1)
> > +
> > +REG32(PARAM, 0x4)
> > +FIELD(PARAM, CMDQV_VER, 0, 4)
> > +FIELD(PARAM, CMDQV_NUM_CMDQ_LOG2, 4, 4)
> > +FIELD(PARAM, CMDQV_NUM_VI_LOG2, 8, 4)
> > +FIELD(PARAM, CMDQV_NUM_SID_PER_VI_LOG2, 12, 4)
> > +
> > +REG32(STATUS, 0x8)
> > +FIELD(STATUS, CMDQV_ENABLED, 0, 1)
> > +
> > +/* SMMU_CMDQV_VI_ERR_MAP_0/1 definitions */
> > +#define A_VI_ERR_MAP_0 0x14
> > +#define A_VI_ERR_MAP_1 0x18
> > +#define V_VI_ERR_MAP_NO_ERROR (0)
> > +#define V_VI_ERR_MAP_ERROR (1)
> > +
> > +/* SMMU_CMDQV_VI_INT_MASK_0/1 definitions */
> > +#define A_VI_INT_MASK 0x1c
> pls use spec terminology: _0

Sure. 

> 
> 
> > +#define A_VI_INT_MASK_1 0x20
> > +#define V_VI_INT_MASK_NOT_MASKED (0)
> > +#define V_VI_INT_MASK_MASKED (1)
> > +
> > +/* SMMU_CMDQV_CMDQ_ERR_MAP_0-3 definitions */
> > +#define A_CMDQ_ERR_MAP_0 0x24
> > +#define A_CMDQ_ERR_MAP_1 0x28
> > +#define A_CMDQ_ERR_MAP_2 0x2c
> > +#define A_CMDQ_ERR_MAP_3 0x30
> > +
> > +/*
> > + * CMDQ_ALLOC_MAP: one entry per physical VCMDQ. Hardware supports up to 128
> > + * entries (CMDQV_NUM_CMDQ_LOG2=7), but QEMU only exposes
> > + * TEGRA241_CMDQV_MAX_CMDQ (=2) VCMDQs per VM so only entries 0 and 1 are
> > + * defined here.
> > + */
> > +/* 2 identical register entries */
> > +#define SMMU_CMDQV_CMDQ_ALLOC_MAP_(i)        \
> > +    REG32(CMDQ_ALLOC_MAP_##i, 0x200 + i * 4) \
> > +    FIELD(CMDQ_ALLOC_MAP_##i, ALLOC, 0, 1)   \
> > +    FIELD(CMDQ_ALLOC_MAP_##i, LVCMDQ, 1, 7)  \
> > +    FIELD(CMDQ_ALLOC_MAP_##i, VIRT_INTF_INDX, 15, 6)
> > +
> > +SMMU_CMDQV_CMDQ_ALLOC_MAP_(0)
> > +SMMU_CMDQV_CMDQ_ALLOC_MAP_(1)
> > +
> > +
> > +/* Only VINTF0 is exposed to the guest; vintf = 0 */
> is it supposed to evolve? Otherwise we can keep things simple and just
> define a single

I don't think it will in the near future. I will change it to a single VINTF.

> SMMU_CMDQV_VINTF0_CONFIG_0 and same for the rest.
> 
> 
> > +#define SMMU_CMDQV_VINTFi_CONFIG_(vi)                 \
> > +    REG32(VINTF##vi##_CONFIG, 0x1000 + vi * 0x100) \
> > +    FIELD(VINTF##vi##_CONFIG, ENABLE, 0, 1)       \
> > +    FIELD(VINTF##vi##_CONFIG, VMID, 1, 16)        \
> > +    FIELD(VINTF##vi##_CONFIG, HYP_OWN, 17, 1)
> > +
> > +SMMU_CMDQV_VINTFi_CONFIG_(0)
> > +
> > +#define SMMU_CMDQV_VINTFi_STATUS_(vi)                 \
> > +    REG32(VINTF##vi##_STATUS, 0x1004 + vi * 0x100) \
> > +    FIELD(VINTF##vi##_STATUS, ENABLE_OK, 0, 1)    \
> > +    FIELD(VINTF##vi##_STATUS, STATUS, 1, 3)       \
> > +    FIELD(VINTF##vi##_STATUS, VI_NUM_LVCMDQ, 16, 8)
> > +
> > +SMMU_CMDQV_VINTFi_STATUS_(0)
> > +
> > +#define V_VINTF_STATUS_NO_ERROR (0 << 1)
> > +#define V_VINTF_STATUS_VCMDQ_ERROR (1 << 1)
> > +
> > +/*
> > + * SID_MATCH/SID_REPLACE: 16 entries per VINTF (CMDQV_NUM_SID_PER_VI_LOG2=4).
> > + * vintf = 0, 16 identical register entries
> > + */
> > +#define SMMU_CMDQV_VINTFi_SID_MATCH_(vi, j)
> this could also be simplified
> >                         \
> > +    REG32(VINTF##vi##_SID_MATCH_##j, 0x1040 + j * 4 + vi * 0x100) \
> > +    FIELD(VINTF##vi##_SID_MATCH_##j, ENABLE, 0, 1)               \
> > +    FIELD(VINTF##vi##_SID_MATCH_##j, VIRT_SID, 1, 20)
> > +
> > +SMMU_CMDQV_VINTFi_SID_MATCH_(0, 0)
> > +/* Omitting [0][1~14] as not being directly called */
> > +SMMU_CMDQV_VINTFi_SID_MATCH_(0, 15)
> > +
> > +/* vintf = 0, 16 identical register entries */
> > +#define SMMU_CMDQV_VINTFi_SID_REPLACE_(vi, j)                          \
> > +    REG32(VINTF##vi##_SID_REPLACE_##j, 0x1080 + j * 4 + vi * 0x100) \
> > +    FIELD(VINTF##vi##_SID_REPLACE_##j, PHYS_SID, 0, 20)
> > +
> > +SMMU_CMDQV_VINTFi_SID_REPLACE_(0, 0)
> > +/* Omitting [0][1~14] as not being directly called */
> why?

Because the switch only uses them as range bounds:
 A_VINTF0_SID_REPLACE_0 ... A_VINTF0_SID_REPLACE_15

I will make that clear in the comment.

> > +SMMU_CMDQV_VINTFi_SID_REPLACE_(0, 15)
> > +
> > +/*
> > + * LVCMDQ_ERR_MAP: hardware defines 4 registers per VINTF (offset
> > + * 0x10c0..0x10cc), each covering 32 logical VCMDQs. All 4 are accessible
> > + * by the guest. With TEGRA241_CMDQV_MAX_CMDQ=2 only MAP_0 bits [1:0]
> > + * carry meaningful error state; MAP_1..MAP_3 always read as 0.
> > + * vintf = 0, 4 identical register entries
> > + */
> > +#define SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(vi, j)
> Again I won't add this extra complexity, get rid of vi everywhere for now.

Ok.

Thanks,
Shameer
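
To illustrate the point about range bounds, here is a minimal sketch of
how a handler can serve all 16 entries while naming only the first and
last register. The handler body is not quoted in this message, so
vintf_sid_read() and struct vintf_cache are hypothetical; the offsets
follow the REG32() definitions quoted above for VINTF0 (0x1040 + j * 4
and 0x1080 + j * 4). Case ranges are a GNU C extension, which the quoted
QEMU code already relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Only the first and last entry of each bank need a named offset */
#define VINTF0_SID_MATCH_0     0x1040
#define VINTF0_SID_MATCH_15    (0x1040 + 15 * 4)
#define VINTF0_SID_REPLACE_0   0x1080
#define VINTF0_SID_REPLACE_15  (0x1080 + 15 * 4)

/* Hypothetical stand-in for the register cache in Tegra241CMDQV */
struct vintf_cache {
    uint32_t sid_match[16];
    uint32_t sid_replace[16];
};

static uint32_t vintf_sid_read(const struct vintf_cache *c, uint64_t offset)
{
    switch (offset) {
    case VINTF0_SID_MATCH_0 ... VINTF0_SID_MATCH_15:
        /* Entries are 4 bytes apart, so the index falls out of the offset */
        return c->sid_match[(offset - VINTF0_SID_MATCH_0) / 4];
    case VINTF0_SID_REPLACE_0 ... VINTF0_SID_REPLACE_15:
        return c->sid_replace[(offset - VINTF0_SID_REPLACE_0) / 4];
    default:
        return 0; /* outside both banks */
    }
}
```

Entries 1 to 14 never appear by name, which is why the header only has
to instantiate the macros for 0 and 15.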


* Re: [PATCH v4 16/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes
  2026-04-15 10:55 ` [PATCH v4 16/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes Shameer Kolothum
@ 2026-05-05 10:42   ` Eric Auger
  2026-05-05 13:49     ` Shameer Kolothum Thodi
  0 siblings, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-05 10:42 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina

Hi,
On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> This is the write side counterpart of the VCMDQ read emulation. Add write
> handling for both CMDQV_CMDQ_BASE and CMDQV_VI_CMDQ_BASE apertures using
> the same index decoding and VINTF-to-VCMDQ translation logic as the read
> path.
>
> VINTF aperture writes are translated to their CMDQV_CMDQ_BASE equivalent
> and update the same cached state. Page1 registers (BASE, CONS_INDX_BASE)
> always update the cache. Once IOMMU_HW_QUEUE_ALLOC and viommu_mmap are
> wired up in a subsequent patch, Page0 register writes will be forwarded
> to the hardware-backed mmap'd page.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.c | 99 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 99 insertions(+)
>
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index 35e6f0bbd6..d4ba2ada92 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -55,6 +55,70 @@ static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
>      }
>  }
>  
> +/*
> + * Write a VCMDQ register using VCMDQ0_* offsets.
> + *
> + * The caller normalizes the MMIO offset such that @offset0 always refers
> + * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
> + */
> +static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
> +                                       int index, uint64_t value,
> +                                       unsigned size)
> +{
> +    switch (offset0) {
> +    case A_VCMDQ0_CONS_INDX:
> +        cmdqv->vcmdq_cons_indx[index] = (uint32_t)value;
> +        return;
> +    case A_VCMDQ0_PROD_INDX:
> +        cmdqv->vcmdq_prod_indx[index] = (uint32_t)value;
> +        return;
> +    case A_VCMDQ0_CONFIG:
> +        if (value & R_VCMDQ0_CONFIG_CMDQ_EN_MASK) {
> +            cmdqv->vcmdq_status[index] |= R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
> +        } else {
> +            cmdqv->vcmdq_status[index] &= ~R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
> +        }
> +        cmdqv->vcmdq_config[index] = (uint32_t)value;
> +        return;
> +    case A_VCMDQ0_GERRORN:
> +        cmdqv->vcmdq_gerrorn[index] = (uint32_t)value;
> +        return;
> +    case A_VCMDQ0_BASE_L:
> +        if (size == 8) {
> +            cmdqv->vcmdq_base[index] = value;
> +        } else {
> +            cmdqv->vcmdq_base[index] =
> +                (cmdqv->vcmdq_base[index] & 0xffffffff00000000ULL) |
> +                (value & 0xffffffffULL);
Can you explain the access pattern?
This does not look like the SMMUv3 A_STRTAB_BASE reads/writes.

> +        }
> +        return;
> +    case A_VCMDQ0_BASE_H:
> +        cmdqv->vcmdq_base[index] =
> +            (cmdqv->vcmdq_base[index] & 0xffffffffULL) |
> +            ((uint64_t)value << 32);
> +        return;
> +    case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
> +        if (size == 8) {
> +            cmdqv->vcmdq_cons_indx_base[index] = value;
> +        } else {
> +            cmdqv->vcmdq_cons_indx_base[index] =
> +                (cmdqv->vcmdq_cons_indx_base[index] & 0xffffffff00000000ULL) |
> +                (value & 0xffffffffULL);
> +        }
> +        return;
> +    case A_VCMDQ0_CONS_INDX_BASE_DRAM_H:
> +        cmdqv->vcmdq_cons_indx_base[index] =
> +            (cmdqv->vcmdq_cons_indx_base[index] & 0xffffffffULL) |
> +            ((uint64_t)value << 32);
> +        return;
> +    default:
> +        qemu_log_mask(LOG_UNIMP,
> +                      "%s unhandled write access at 0x%" PRIx64 "\n",
> +                      __func__, offset0);
> +        return;
> +    }
> +}
> +
>  static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
>                                                   hwaddr offset)
>  {
> @@ -212,6 +276,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>                                        uint64_t value, unsigned size)
>  {
>      Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
> +    int index;
>  
>      if (offset >= TEGRA241_CMDQV_IO_LEN) {
>          qemu_log_mask(LOG_UNIMP,
> @@ -238,6 +303,40 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>      case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
>          tegra241_cmdqv_config_vintf_write(cmdqv, offset, value);
>          break;
> +    case A_VI_VCMDQ0_CONS_INDX ... A_VI_VCMDQ1_GERRORN:
> +        /*
> +         * VINTF Page0 registers have the same per-VCMDQ layout as the
> +         * VCMDQ Page0 registers. Translate the VINTF aperture offset to the
> +         * equivalent VCMDQ aperture offset, then fall through to reuse the
> +         * common VCMDQ decoding logic below.
> +         */
> +        offset -= CMDQV_VINTF_PAGE0_BASE - CMDQV_VCMDQ_PAGE0_BASE;
> +        QEMU_FALLTHROUGH;
> +    case A_VCMDQ0_CONS_INDX ... A_VCMDQ1_GERRORN:
> +        /*
> +         * Decode a per-VCMDQ register access.
> +         *
> +         * The hardware supports up to 128 identical VCMDQ instances; we
> +         * currently expose TEGRA241_CMDQV_MAX_CMDQ (= 2). Each VCMDQ
> +         * occupies a CMDQV_VCMDQ_STRIDE-byte window within the page.
> +         *
> +         * Extract the VCMDQ index and normalize to the VCMDQ0_* register
> +         * offset. A single helper services all instances via @index.
> +         */
> +        index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
> +        tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
> +                                   index, value, size);
> +        break;
> +    case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
> +        /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
> +        offset -= CMDQV_VINTF_PAGE1_BASE - CMDQV_VCMDQ_PAGE1_BASE;
> +        QEMU_FALLTHROUGH;
> +    case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
> +        /* Same decode logic as VCMDQ Page0 case above */
> +        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
> +        tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
> +                                   index, value, size);
> +        break;
>      default:
>          qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
>                        __func__, offset);
Thanks

Eric
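
For reference, the access pattern in question, as far as it can be read
from the quoted hunk, is: an 8-byte write at the _L offset replaces the
whole cached 64-bit value, a 4-byte write at _L replaces only the low
half, and a write at _H replaces only the high half. The sketch below
restates that with hypothetical helper names (base_write_low() and
base_write_high() are not in the patch); whether the hardware actually
expects this pattern is exactly the open question.

```c
#include <assert.h>
#include <stdint.h>

/* Write at the *_L offset: full replace for 8 bytes, low half otherwise */
static uint64_t base_write_low(uint64_t cached, uint64_t value, unsigned size)
{
    if (size == 8) {
        return value;
    }
    return (cached & 0xffffffff00000000ULL) | (value & 0xffffffffULL);
}

/* Write at the *_H offset: keep the low half, replace the high half */
static uint64_t base_write_high(uint64_t cached, uint64_t value)
{
    return (cached & 0xffffffffULL) | (value << 32);
}
```

Two 4-byte writes, low then high, therefore end up with the same cached
value as a single 8-byte write of the combined value.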




* Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-04-15 10:55 ` [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming Shameer Kolothum
  2026-05-05  0:40   ` Nicolin Chen
@ 2026-05-05 13:25   ` Eric Auger
  2026-05-05 14:26     ` Shameer Kolothum Thodi
  2026-05-06 16:51   ` Eric Auger
  2 siblings, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-05 13:25 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina

Hi,

On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Add support for allocating IOMMUFD hardware queues when the guest
> programs the VCMDQ BASE registers.
>
> VCMDQ_EN is part of the VCMDQ_CONFIG register, which is accessed
> through the VINTF Page0 region. A subsequent patch maps this region
> directly into the guest address space, so QEMU does not trap writes
> to VCMDQ_CONFIG.
>
> Since VCMDQ_EN writes are not trapped, QEMU cannot allocate the
> hardware queue based on that bit. Instead, allocate the IOMMUFD
> hardware queue when the guest writes a VCMDQ BASE register with a
> valid RAM-backed address and when CMDQV and VINTF are enabled.
> If a hardware queue was previously allocated for the same VCMDQ,
> free it before reallocation.
The asymmetric alloc/free sounds unusual. Are there alternatives?
>
> Writes with invalid addresses are ignored.
>
> All allocated VCMDQs are freed when CMDQV or VINTF is disabled.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.h | 11 +++++++
>  hw/arm/tegra241-cmdqv.c | 70 +++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 78 insertions(+), 3 deletions(-)
>
> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> index 88572ad939..039d86374f 100644
> --- a/hw/arm/tegra241-cmdqv.h
> +++ b/hw/arm/tegra241-cmdqv.h
> @@ -44,6 +44,7 @@ typedef struct Tegra241CMDQV {
>      MemoryRegion mmio_cmdqv;
>      qemu_irq irq;
>      IOMMUFDVeventq *veventq;
> +    IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
>      void *vintf_page0;
>  
>      /* Register Cache */
> @@ -348,6 +349,16 @@ A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
>  A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
>  A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
>  
> +static inline bool tegra241_cmdq_enabled(Tegra241CMDQV *cmdqv)
> +{
> +    return cmdqv->status & R_STATUS_CMDQV_ENABLED_MASK;
> +}
> +
> +static inline bool tegra241_vintf_enabled(Tegra241CMDQV *cmdqv)
> +{
> +    return cmdqv->vintf_status & R_VINTF0_STATUS_ENABLE_OK_MASK;
> +}
> +
>  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
>  
>  #endif /* HW_ARM_TEGRA241_CMDQV_H */
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index cdd941cec9..b5f2f74cf2 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -15,6 +15,66 @@
>  #include "tegra241-cmdqv.h"
>  #include "trace.h"
>  
> +static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
> +{
> +    IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
> +    IOMMUFDHWqueue *vcmdq = cmdqv->vcmdq[index];
> +
> +    if (!vcmdq) {
> +        return;
> +    }
> +    iommufd_backend_free_id(viommu->iommufd, vcmdq->hw_queue_id);
> +    g_free(vcmdq);
> +    cmdqv->vcmdq[index] = NULL;
> +}
> +
> +static void tegra241_cmdqv_free_all_vcmdq(Tegra241CMDQV *cmdqv)
> +{
> +    /* Free in reverse order to avoid "resource busy" error */
Can you provide additional details about the above problem? Is it
documented in the spec?
> +    for (int i = (TEGRA241_CMDQV_MAX_CMDQ - 1); i >= 0; i--) {
> +        tegra241_cmdqv_free_vcmdq(cmdqv, i);
> +    }
> +}
> +
> +static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
> +                                       Error **errp)
> +{
> +    SMMUv3AccelState *accel = cmdqv->s_accel;
> +    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
> +                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
> +    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
> +    uint64_t log2 = cmdqv->vcmdq_base[index] & R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
> +    uint64_t size = 1ULL << (log2 + 4);
> +    IOMMUFDViommu *viommu = accel->viommu;
> +    IOMMUFDHWqueue *hw_queue;
> +    uint32_t hw_queue_id;
> +
> +    /* Ignore any invalid address. This may come as part of reset etc. */
> +    if (!address_space_is_ram(&address_space_memory, addr) ||
> +        !address_space_is_ram(&address_space_memory, addr + size - 1)) {
this check looks a little bit risky, no? Don't we have a better way to
test the address has been properly set?
> +        return true;
> +    }
> +
> +    if (!tegra241_cmdq_enabled(cmdqv) || !tegra241_vintf_enabled(cmdqv)) {
> +        return true;
> +    }
> +
> +    tegra241_cmdqv_free_vcmdq(cmdqv, index);
would deserve a comment also here.
> +
> +    if (!iommufd_backend_alloc_hw_queue(viommu->iommufd, viommu->viommu_id,
> +                                        IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV,
> +                                        index, addr, size, &hw_queue_id,
> +                                        errp)) {
> +        return false;
> +    }
> +    hw_queue = g_new(IOMMUFDHWqueue, 1);
> +    hw_queue->hw_queue_id = hw_queue_id;
> +    hw_queue->viommu = viommu;
> +    cmdqv->vcmdq[index] = hw_queue;
> +
> +    return true;
> +}
> +
>  /*
>   * Read a VCMDQ register using VCMDQ0_* offsets.
>   *
> @@ -63,7 +123,7 @@ static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
>   */
>  static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
>                                         int index, uint64_t value,
> -                                       unsigned size)
> +                                       unsigned size, Error **errp)
>  {
>      switch (offset0) {
>      case A_VCMDQ0_CONS_INDX:
> @@ -91,11 +151,13 @@ static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
>                  (cmdqv->vcmdq_base[index] & 0xffffffff00000000ULL) |
>                  (value & 0xffffffffULL);
>          }
> +        tegra241_cmdqv_setup_vcmdq(cmdqv, index, errp);
>          return;
>      case A_VCMDQ0_BASE_H:
>          cmdqv->vcmdq_base[index] =
>              (cmdqv->vcmdq_base[index] & 0xffffffffULL) |
>              ((uint64_t)value << 32);
> +        tegra241_cmdqv_setup_vcmdq(cmdqv, index, errp);
>          return;
>      case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
>          if (size == 8) {
> @@ -204,6 +266,7 @@ static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
>                  cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
>              }
>          } else {
> +            tegra241_cmdqv_free_all_vcmdq(cmdqv);
>              tegra241_cmdqv_munmap_vintf_page0(cmdqv, errp);
>              cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
>          }
> @@ -329,6 +392,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>          if (value & R_CONFIG_CMDQV_EN_MASK) {
>              cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
>          } else {
> +            tegra241_cmdqv_free_all_vcmdq(cmdqv);
>              cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
>          }
>          break;
> @@ -363,7 +427,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>           */
>          index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
>          tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
> -                                   index, value, size);
> +                                   index, value, size, &local_err);
>          break;
>      case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
>          /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
> @@ -373,7 +437,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>          /* Same decode logic as VCMDQ Page0 case above */
>          index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
>          tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
> -                                   index, value, size);
> +                                   index, value, size, &local_err);
>          break;
>      default:
>          qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
Eric



^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 15/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads
  2026-05-05 10:12   ` Eric Auger
@ 2026-05-05 13:27     ` Shameer Kolothum Thodi
  2026-05-06 11:14       ` Eric Auger
  0 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-05 13:27 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 05 May 2026 11:13
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 15/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ
> register reads
> 
> Hi,
> 
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> >
> > Tegra241 CMDQV exposes per-VCMDQ register windows through two MMIO
> > apertures:
> >
> >   CMDQV_CMDQ_BASE (0x10000/0x20000): VCMDQ Page0/Page1
> (global)
> >   CMDQV_VI_CMDQ_BASE (0x30000/0x40000): VINTF VCMDQ Page0/Page1
> (logical)
> >
> > VINTF Page0 (0x30000) and VCMDQ Page0 (0x10000) are hardware aliases
> what about page1?

Right. Page1 is also an alias, but the kernel only provides mmap support for Page0.
Hence Page1 is always served from the register cache. I will update.

> > addressing the same underlying registers. Add read emulation for both
> > apertures, backed by a register cache. VINTF Page0 reads are translated
> > to their VCMDQ Page0 equivalent and served from the same cached state.
> >
> > Once IOMMU_HW_QUEUE_ALLOC and viommu_mmap are wired up in a subsequent
> > patch, Page0 register reads will be served directly from the hardware
> > backed mmap'd page instead of the cache. Page1 registers are always
> > served from cache.
> 
> I would add that
> Page 0 contains VCMDQ control and status registers
> while Page 1 contains VCMDQ base and DRAM address registers
> 
> 
> I would add Nicolin's explanation
> "
> 
> The global page 0 is programmable at any time so long as CMDQV_EN
> is enabled.
> 
> The logical (VINTF) page 0 is programmable only when SW allocates and
> maps global vcmdq(s) to a VINTF. "logical" also means "local" to that
> VINTF.
> "

Ok. Will add.

> 
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > ---
> >  hw/arm/tegra241-cmdqv.h | 185 ++++++++++++++++++++++++++++++++++++++++
> >  hw/arm/tegra241-cmdqv.c |  73 ++++++++++++++++
> >  2 files changed, 258 insertions(+)
> >
> > diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> > index 965670066d..b8bd8cd8ff 100644
> > --- a/hw/arm/tegra241-cmdqv.h
> > +++ b/hw/arm/tegra241-cmdqv.h
> > @@ -29,6 +29,13 @@
> >   */
> >  #define TEGRA241_CMDQV_IO_LEN 0x50000
> >
> > +/* CMDQV MMIO aperture bases and VCMDQ stride */
> > +#define CMDQV_VCMDQ_PAGE0_BASE  0x10000  /* CMDQV_CMDQ_BASE */
> > +#define CMDQV_VCMDQ_PAGE1_BASE  0x20000
> > +#define CMDQV_VINTF_PAGE0_BASE  0x30000  /* CMDQV_VI_CMDQ_BASE */
> > +#define CMDQV_VINTF_PAGE1_BASE  0x40000
> > +#define CMDQV_VCMDQ_STRIDE      0x80
> > +
> >  typedef struct Tegra241CMDQV {
> >      struct iommu_viommu_tegra241_cmdqv cmdqv_data;
> >      SMMUv3AccelState *s_accel;
> > @@ -49,6 +56,14 @@ typedef struct Tegra241CMDQV {
> >      uint32_t vintf_sid_match[16];
> >      uint32_t vintf_sid_replace[16];
> >      uint32_t vintf_cmdq_err_map[4];
> I would add a comment explaining that those cached registers store the
> consistent/identical values for both the global and logical regs (they
> are the same)
> > +    uint32_t vcmdq_cons_indx[TEGRA241_CMDQV_MAX_CMDQ];
> > +    uint32_t vcmdq_prod_indx[TEGRA241_CMDQV_MAX_CMDQ];
> > +    uint32_t vcmdq_config[TEGRA241_CMDQV_MAX_CMDQ];
> > +    uint32_t vcmdq_status[TEGRA241_CMDQV_MAX_CMDQ];
> > +    uint32_t vcmdq_gerror[TEGRA241_CMDQV_MAX_CMDQ];
> > +    uint32_t vcmdq_gerrorn[TEGRA241_CMDQV_MAX_CMDQ];
> > +    uint64_t vcmdq_base[TEGRA241_CMDQV_MAX_CMDQ];
> > +    uint64_t vcmdq_cons_indx_base[TEGRA241_CMDQV_MAX_CMDQ];
> >  } Tegra241CMDQV;
> >
> >  /* CMDQ-V Config page registers (offset 0x00000) */
> > @@ -160,6 +175,176 @@ SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 0)
> >  /* MAP_1 and MAP_2 omitted; not referenced directly */
> >  SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 3)
> >
> > +/*
> > + * VCMDQ register windows.
> > + *
> > + * Page 0 @ 0x10000: VCMDQ control and status registers
> > + * Page 1 @ 0x20000: VCMDQ base and DRAM address registers
> Can you clearly separate regs that belong to page 0 from regs that
> belong to page 1
> > + */
> > +#define A_VCMDQi_CONS_INDX(i)
> same remark as for config registers. Please use the spec terminology
> 
> SMMU_CMDQV_VCMDQi_CONS_INDX_0
> 
> >                \
> > +    REG32(VCMDQ##i##_CONS_INDX, 0x10000 + i * 0x80) \
> > +    FIELD(VCMDQ##i##_CONS_INDX, RD, 0, 20)          \
> > +    FIELD(VCMDQ##i##_CONS_INDX, ERR, 24, 7)
> > +
> > +A_VCMDQi_CONS_INDX(0)
> > +A_VCMDQi_CONS_INDX(1)
> > +
> > +#define V_VCMDQ_CONS_INDX_ERR_CERROR_NONE 0
> > +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ILL_OPCODE 1
> > +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ABT 2
> > +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ATC_INV_SYNC 3
> > +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ILL_ACCESS 4
> > +
> > +#define A_VCMDQi_PROD_INDX(i)
> SMMU_CMDQV_VCMDQi_PROD_INDX_0
> >                        \
> > +    REG32(VCMDQ##i##_PROD_INDX, 0x10000 + 0x4 + i * 0x80) \
> > +    FIELD(VCMDQ##i##_PROD_INDX, WR, 0, 20)
> > +
> > +A_VCMDQi_PROD_INDX(0)
> > +A_VCMDQi_PROD_INDX(1)
> > +
> > +#define A_VCMDQi_CONFIG(i)
> SMMU_CMDQV_VCMDQi_CONFIG_0(i)
> >                   \
> > +    REG32(VCMDQ##i##_CONFIG, 0x10000 + 0x8 + i * 0x80) \
> 0x80, isn't it the stride you defined earlier, ie
> 
>  CMDQV_VCMDQ_STRIDE ?
> 
> > +    FIELD(VCMDQ##i##_CONFIG, CMDQ_EN, 0, 1)
> > +
> > +A_VCMDQi_CONFIG(0)
> > +A_VCMDQi_CONFIG(1)
> > +
> > +#define A_VCMDQi_STATUS(i)                             \
> > +    REG32(VCMDQ##i##_STATUS, 0x10000 + 0xc + i * 0x80) \
> > +    FIELD(VCMDQ##i##_STATUS, CMDQ_EN_OK, 0, 1)
> > +
> > +A_VCMDQi_STATUS(0)
> > +A_VCMDQi_STATUS(1)
> > +
> > +#define A_VCMDQi_GERROR(i)                               \
> > +    REG32(VCMDQ##i##_GERROR, 0x10000 + 0x10 + i * 0x80)  \
> > +    FIELD(VCMDQ##i##_GERROR, CMDQ_ERR, 0, 1)             \
> > +    FIELD(VCMDQ##i##_GERROR, CONS_DRAM_WR_ABT_ERR, 1, 1) \
> > +    FIELD(VCMDQ##i##_GERROR, CMDQ_INIT_ERR, 2, 1)
> > +
> > +A_VCMDQi_GERROR(0)
> > +A_VCMDQi_GERROR(1)
> > +
> > +#define A_VCMDQi_GERRORN(i)                               \
> > +    REG32(VCMDQ##i##_GERRORN, 0x10000 + 0x14 + i * 0x80)  \
> > +    FIELD(VCMDQ##i##_GERRORN, CMDQ_ERR, 0, 1)             \
> > +    FIELD(VCMDQ##i##_GERRORN, CONS_DRAM_WR_ABT_ERR, 1, 1) \
> > +    FIELD(VCMDQ##i##_GERRORN, CMDQ_INIT_ERR, 2, 1)
> > +
> > +A_VCMDQi_GERRORN(0)
> > +A_VCMDQi_GERRORN(1)
> 
> /* Page 1 */
> > +
> > +#define A_VCMDQi_BASE_L(i)
> ditto remove the A_ prefix and use spec terminology
> >               \
> > +    REG32(VCMDQ##i##_BASE_L, 0x20000 + i * 0x80) \
> > +    FIELD(VCMDQ##i##_BASE_L, LOG2SIZE, 0, 5)     \
> > +    FIELD(VCMDQ##i##_BASE_L, ADDR, 5, 27)
> > +
> > +A_VCMDQi_BASE_L(0)
> > +A_VCMDQi_BASE_L(1)
> > +
> > +#define A_VCMDQi_BASE_H(i)                             \
> > +    REG32(VCMDQ##i##_BASE_H, 0x20000 + 0x4 + i * 0x80) \
> maybe instead of using 0x20000, define macros for the base address of each
> region
> > +    FIELD(VCMDQ##i##_BASE_H, ADDR, 0, 16)
> > +
> > +A_VCMDQi_BASE_H(0)
> > +A_VCMDQi_BASE_H(1)
> > +
> > +#define A_VCMDQi_CONS_INDX_BASE_DRAM_L(i)                             \
> > +    REG32(VCMDQ##i##_CONS_INDX_BASE_DRAM_L, 0x20000 + 0x8 + i * 0x80) \
> > +    FIELD(VCMDQ##i##_CONS_INDX_BASE_DRAM_L, ADDR, 0, 32)
> > +
> > +A_VCMDQi_CONS_INDX_BASE_DRAM_L(0)
> > +A_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
> > +
> > +#define A_VCMDQi_CONS_INDX_BASE_DRAM_H(i)                             \
> > +    REG32(VCMDQ##i##_CONS_INDX_BASE_DRAM_H, 0x20000 + 0xc + i * 0x80) \
> > +    FIELD(VCMDQ##i##_CONS_INDX_BASE_DRAM_H, ADDR, 0, 16)
> > +
> > +A_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
> > +A_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
> I would group all register definitions per page. The layout would look
> clearer to me.
> > +
> > +/*
> > + * VI_VCMDQ register windows (VCMDQs mapped via VINTF).
> > + *
> > + * Page 0 @ 0x30000: VI_VCMDQ control and status registers
> > + * Page 1 @ 0x40000: VI_VCMDQ base and DRAM address registers
> same here, pls separate page 0 and page 1 definitions
> > + */
> > +#define A_VI_VCMDQi_CONS_INDX(i)
> A_
> >                  \
> > +    REG32(VI_VCMDQ##i##_CONS_INDX, 0x30000 + i * 0x80) \
> > +    FIELD(VI_VCMDQ##i##_CONS_INDX, RD, 0, 20)          \
> > +    FIELD(VI_VCMDQ##i##_CONS_INDX, ERR, 24, 7)
> > +
> > +A_VI_VCMDQi_CONS_INDX(0)
> > +A_VI_VCMDQi_CONS_INDX(1)
> > +
> > +#define A_VI_VCMDQi_PROD_INDX(i)                             \
> > +    REG32(VI_VCMDQ##i##_PROD_INDX, 0x30000 + 0x4 + i * 0x80) \
> > +    FIELD(VI_VCMDQ##i##_PROD_INDX, WR, 0, 20)
> > +
> > +A_VI_VCMDQi_PROD_INDX(0)
> > +A_VI_VCMDQi_PROD_INDX(1)
> > +
> > +#define A_VI_VCMDQi_CONFIG(i)                             \
> > +    REG32(VI_VCMDQ##i##_CONFIG, 0x30000 + 0x8 + i * 0x80) \
> > +    FIELD(VI_VCMDQ##i##_CONFIG, CMDQ_EN, 0, 1)
> > +
> > +A_VI_VCMDQi_CONFIG(0)
> > +A_VI_VCMDQi_CONFIG(1)
> > +
> > +#define A_VI_VCMDQi_STATUS(i)                             \
> > +    REG32(VI_VCMDQ##i##_STATUS, 0x30000 + 0xc + i * 0x80) \
> > +    FIELD(VI_VCMDQ##i##_STATUS, CMDQ_EN_OK, 0, 1)
> > +
> > +A_VI_VCMDQi_STATUS(0)
> > +A_VI_VCMDQi_STATUS(1)
> > +
> > +#define A_VI_VCMDQi_GERROR(i)                               \
> > +    REG32(VI_VCMDQ##i##_GERROR, 0x30000 + 0x10 + i * 0x80)  \
> > +    FIELD(VI_VCMDQ##i##_GERROR, CMDQ_ERR, 0, 1)             \
> > +    FIELD(VI_VCMDQ##i##_GERROR, CONS_DRAM_WR_ABT_ERR, 1, 1) \
> > +    FIELD(VI_VCMDQ##i##_GERROR, CMDQ_INIT_ERR, 2, 1)
> > +
> > +A_VI_VCMDQi_GERROR(0)
> > +A_VI_VCMDQi_GERROR(1)
> > +
> > +#define A_VI_VCMDQi_GERRORN(i)                               \
> > +    REG32(VI_VCMDQ##i##_GERRORN, 0x30000 + 0x14 + i * 0x80)  \
> > +    FIELD(VI_VCMDQ##i##_GERRORN, CMDQ_ERR, 0, 1)             \
> > +    FIELD(VI_VCMDQ##i##_GERRORN, CONS_DRAM_WR_ABT_ERR, 1, 1) \
> > +    FIELD(VI_VCMDQ##i##_GERRORN, CMDQ_INIT_ERR, 2, 1)
> > +
> > +A_VI_VCMDQi_GERRORN(0)
> > +A_VI_VCMDQi_GERRORN(1)
> > +
> > +#define A_VI_VCMDQi_BASE_L(i)                       \
> > +    REG32(VI_VCMDQ##i##_BASE_L, 0x40000 + i * 0x80) \
> > +    FIELD(VI_VCMDQ##i##_BASE_L, LOG2SIZE, 0, 5)     \
> > +    FIELD(VI_VCMDQ##i##_BASE_L, ADDR, 5, 27)
> > +
> > +A_VI_VCMDQi_BASE_L(0)
> > +A_VI_VCMDQi_BASE_L(1)
> > +
> > +#define A_VI_VCMDQi_BASE_H(i)                             \
> > +    REG32(VI_VCMDQ##i##_BASE_H, 0x40000 + 0x4 + i * 0x80) \
> > +    FIELD(VI_VCMDQ##i##_BASE_H, ADDR, 0, 16)
> > +
> > +A_VI_VCMDQi_BASE_H(0)
> > +A_VI_VCMDQi_BASE_H(1)
> > +
> > +#define A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(i)                             \
> > +    REG32(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_L, 0x40000 + 0x8 + i * 0x80) \
> > +    FIELD(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_L, ADDR, 0, 32)
> > +
> > +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(0)
> > +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
> > +
> > +#define A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(i)                             \
> > +    REG32(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_H, 0x40000 + 0xc + i * 0x80) \
> > +    FIELD(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_H, ADDR, 0, 16)
> > +
> > +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
> > +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
> > +
> >  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
> >
> >  #endif /* HW_ARM_TEGRA241_CMDQV_H */
> > diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> > index 3b08ed0ff3..35e6f0bbd6 100644
> > --- a/hw/arm/tegra241-cmdqv.c
> > +++ b/hw/arm/tegra241-cmdqv.c
> > @@ -15,6 +15,46 @@
> >  #include "tegra241-cmdqv.h"
> >  #include "trace.h"
> >
> > +/*
> > + * Read a VCMDQ register using VCMDQ0_* offsets.
> > + *
> > + * The caller normalizes the MMIO offset such that @offset0 always refers
> > + * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
> > + *
> > + * All VCMDQ accesses return cached registers.
> > + */
> > +static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
> > +                                          int index)
> > +{
> > +    switch (offset0) {
> > +    case A_VCMDQ0_CONS_INDX:
> > +        return cmdqv->vcmdq_cons_indx[index];
> > +    case A_VCMDQ0_PROD_INDX:
> > +        return cmdqv->vcmdq_prod_indx[index];
> > +    case A_VCMDQ0_CONFIG:
> > +        return cmdqv->vcmdq_config[index];
> > +    case A_VCMDQ0_STATUS:
> > +        return cmdqv->vcmdq_status[index];
> > +    case A_VCMDQ0_GERROR:
> > +        return cmdqv->vcmdq_gerror[index];
> > +    case A_VCMDQ0_GERRORN:
> > +        return cmdqv->vcmdq_gerrorn[index];
> > +    case A_VCMDQ0_BASE_L:
> > +        return cmdqv->vcmdq_base[index];
> > +    case A_VCMDQ0_BASE_H:
> > +        return cmdqv->vcmdq_base[index] >> 32;
> > +    case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
> > +        return cmdqv->vcmdq_cons_indx_base[index];
> > +    case A_VCMDQ0_CONS_INDX_BASE_DRAM_H:
> > +        return cmdqv->vcmdq_cons_indx_base[index] >> 32;
> > +    default:
> > +        qemu_log_mask(LOG_UNIMP,
> > +                      "%s unhandled read access at 0x%" PRIx64 "\n",
> > +                      __func__, offset0);
> > +        return 0;
> > +    }
> > +}
> > +
> >  static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
> >                                                   hwaddr offset)
> >  {
> > @@ -92,6 +132,7 @@ static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
> >  {
> >      Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
> >      uint64_t val = 0;
> > +    int index;
> >
> >      if (offset >= TEGRA241_CMDQV_IO_LEN) {
> >          qemu_log_mask(LOG_UNIMP,
> > @@ -125,6 +166,38 @@ static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
> >      case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
> >          val = tegra241_cmdqv_config_vintf_read(cmdqv, offset);
> >          break;
> > +    case A_VI_VCMDQ0_CONS_INDX ... A_VI_VCMDQ1_GERRORN:
> > +        /*
> > +         * VINTF Page0 registers have the same per-VCMDQ layout as the
> > +         * VCMDQ Page0 registers. Translate the VINTF aperture offset to the
> > +         * equivalent VCMDQ aperture offset, then fall through to reuse the
> > +         * common VCMDQ decoding logic below.
> > +         */
> > +        offset -= CMDQV_VINTF_PAGE0_BASE - CMDQV_VCMDQ_PAGE0_BASE;
> > +        QEMU_FALLTHROUGH;
> > +    case A_VCMDQ0_CONS_INDX ... A_VCMDQ1_GERRORN:
> > +        /*
> > +         * Decode a per-VCMDQ register access.
> > +         *
> > +         * The hardware supports up to 128 identical VCMDQ instances; we
> > +         * currently expose TEGRA241_CMDQV_MAX_CMDQ (= 2). Each VCMDQ
> > +         * occupies a CMDQV_VCMDQ_STRIDE-byte window within the page.
> > +         *
> > +         * Extract the VCMDQ index and normalize to the VCMDQ0_* register
> > +         * offset. A single helper services all instances via @index.
> > +         */
> > +        index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
> > +        return tegra241_cmdqv_read_vcmdq(cmdqv,
> > +                offset - index * CMDQV_VCMDQ_STRIDE, index);
> > +    case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
> > +        /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
> > +        offset -= CMDQV_VINTF_PAGE1_BASE - CMDQV_VCMDQ_PAGE1_BASE;
> > +        QEMU_FALLTHROUGH;
> > +    case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
> > +        /* Same decode logic as VCMDQ Page0 case above */
> > +        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
> > +        return tegra241_cmdqv_read_vcmdq(cmdqv,
> > +                offset - index * CMDQV_VCMDQ_STRIDE, index);
> Please add trace points for read & write accesses

Just to clarify, we already have trace_tegra241_cmdqv_read_mmio and
trace_tegra241_cmdqv_write_mmio, so this suggestion is to add
another pair for vcmdq read/write, right?

(Agree with all other suggestions above)

Thanks,
Shameer
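
For illustration, the offset decoding discussed in this patch can be sketched in standalone C. The constants are copied from the macros in the quoted diff; the VcmdqDecode type and both helper names are hypothetical and exist only for this sketch, not in the QEMU tree.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the macros in the quoted patch. */
#define CMDQV_VCMDQ_PAGE0_BASE  0x10000u
#define CMDQV_VINTF_PAGE0_BASE  0x30000u
#define CMDQV_VCMDQ_STRIDE      0x80u

/* Hypothetical result type, used only in this sketch. */
typedef struct {
    unsigned index;    /* VCMDQ instance selected by the access */
    uint64_t offset0;  /* equivalent offset inside the VCMDQ0 window */
} VcmdqDecode;

/* Translate a VINTF Page0 offset to its VCMDQ Page0 alias. */
static uint64_t vintf_to_vcmdq_page0(uint64_t offset)
{
    assert(offset >= CMDQV_VINTF_PAGE0_BASE);
    return offset - (CMDQV_VINTF_PAGE0_BASE - CMDQV_VCMDQ_PAGE0_BASE);
}

/* Split a VCMDQ Page0 offset into instance index and VCMDQ0_* offset. */
static VcmdqDecode decode_vcmdq_page0(uint64_t offset)
{
    VcmdqDecode d;

    assert(offset >= CMDQV_VCMDQ_PAGE0_BASE);
    d.index = (unsigned)((offset - CMDQV_VCMDQ_PAGE0_BASE) /
                         CMDQV_VCMDQ_STRIDE);
    d.offset0 = offset - (uint64_t)d.index * CMDQV_VCMDQ_STRIDE;
    return d;
}
```

With these values, an access to VI_VCMDQ1_CONFIG (0x30000 + 0x8 + 1 * 0x80) is first translated to the VCMDQ Page0 aperture and then normalized to the VCMDQ0_CONFIG offset with index 1.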

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 16/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes
  2026-05-05 10:42   ` Eric Auger
@ 2026-05-05 13:49     ` Shameer Kolothum Thodi
  0 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-05 13:49 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 05 May 2026 11:43
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 16/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ
> register writes
> 
> Hi,
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> >
> > This is the write side counterpart of the VCMDQ read emulation. Add
> > write handling for both CMDQV_CMDQ_BASE and CMDQV_VI_CMDQ_BASE
> > apertures using the same index decoding and VINTF-to-VCMDQ translation
> > logic as the read path.
> >
> > VINTF aperture writes are translated to their CMDQV_CMDQ_BASE
> > equivalent and update the same cached state. Page1 registers (BASE,
> > CONS_INDX_BASE) always update the cache. Once IOMMU_HW_QUEUE_ALLOC and
> > viommu_mmap are wired up in a subsequent patch, Page0 register writes
> > will be forwarded to the hardware-backed mmap'd page.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > ---
> >  hw/arm/tegra241-cmdqv.c | 99 +++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 99 insertions(+)
> >
> > diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> > index 35e6f0bbd6..d4ba2ada92 100644
> > --- a/hw/arm/tegra241-cmdqv.c
> > +++ b/hw/arm/tegra241-cmdqv.c
> > @@ -55,6 +55,70 @@ static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
> >      }
> >  }
> >
> > +/*
> > + * Write a VCMDQ register using VCMDQ0_* offsets.
> > + *
> > + * The caller normalizes the MMIO offset such that @offset0 always refers
> > + * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
> > + */
> > +static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
> > +                                       int index, uint64_t value,
> > +                                       unsigned size)
> > +{
> > +    switch (offset0) {
> > +    case A_VCMDQ0_CONS_INDX:
> > +        cmdqv->vcmdq_cons_indx[index] = (uint32_t)value;
> > +        return;
> > +    case A_VCMDQ0_PROD_INDX:
> > +        cmdqv->vcmdq_prod_indx[index] = (uint32_t)value;
> > +        return;
> > +    case A_VCMDQ0_CONFIG:
> > +        if (value & R_VCMDQ0_CONFIG_CMDQ_EN_MASK) {
> > +            cmdqv->vcmdq_status[index] |= R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
> > +        } else {
> > +            cmdqv->vcmdq_status[index] &= ~R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
> > +        }
> > +        cmdqv->vcmdq_config[index] = (uint32_t)value;
> > +        return;
> > +    case A_VCMDQ0_GERRORN:
> > +        cmdqv->vcmdq_gerrorn[index] = (uint32_t)value;
> > +        return;
> > +    case A_VCMDQ0_BASE_L:
> > +        if (size == 8) {
> > +            cmdqv->vcmdq_base[index] = value;
> > +        } else {
> > +            cmdqv->vcmdq_base[index] =
> > +                (cmdqv->vcmdq_base[index] & 0xffffffff00000000ULL) |
> > +                (value & 0xffffffffULL);
> can you explain the access pattern?
> this does not look like A_STRTAB_BASE smmuv3 writes/read

The size == 8 branch was added to handle a potential single 64-bit write
to BASE, but you're right, this is inconsistent with the SMMUv3 pattern
where size dispatch happens before the per-register switch, not inside it.

Also, mmio_cmdqv_ops is missing .valid/.impl size constraints. I will take
a look in the next respin and update.

Thanks,
Shameer
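
As a minimal sketch of the 32-bit access path discussed above, the cached 64-bit BASE value can be assembled from two independent 32-bit writes. The masks match the ones in the quoted diff; the helper names are hypothetical, and the sketch does not model the size == 8 case or any .valid/.impl constraints.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the low 32 bits of a cached 64-bit register, keep the high half. */
static uint64_t write_base_lo(uint64_t cached, uint32_t value)
{
    return (cached & 0xffffffff00000000ULL) | (uint64_t)value;
}

/* Replace the high 32 bits of a cached 64-bit register, keep the low half. */
static uint64_t write_base_hi(uint64_t cached, uint32_t value)
{
    return (cached & 0xffffffffULL) | ((uint64_t)value << 32);
}
```

Because each helper preserves the other half, the two writes can arrive in either order and still produce the same cached value.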

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-05-05 13:25   ` Eric Auger
@ 2026-05-05 14:26     ` Shameer Kolothum Thodi
  2026-05-06 17:49       ` Nicolin Chen
  2026-05-08 14:50       ` Eric Auger
  0 siblings, 2 replies; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-05 14:26 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org,
	Nicolin Chen
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 05 May 2026 14:26
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW
> VCMDQs on base register programming
> 
> Hi,
> 
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> >
> > Add support for allocating IOMMUFD hardware queues when the guest
> > programs the VCMDQ BASE registers.
> >
> > VCMDQ_EN is part of the VCMDQ_CONFIG register, which is accessed
> > through the VINTF Page0 region. A subsequent patch maps this region
> > directly into the guest address space, so QEMU does not trap writes
> > to VCMDQ_CONFIG.
> >
> > Since VCMDQ_EN writes are not trapped, QEMU cannot allocate the
> > hardware queue based on that bit. Instead, allocate the IOMMUFD
> > hardware queue when the guest writes a VCMDQ BASE register with a
> > valid RAM-backed address and when CMDQV and VINTF are enabled.
> > If a hardware queue was previously allocated for the same VCMDQ,
> > free it before reallocation.
> the asymmetric alloc/free sounds unusual. Are there other alternatives?

Nothing comes to mind right now. There is no "update hw_queue address"
operation, hence we have to free and allocate. Maybe we can add a small
optimisation and skip free + alloc if vcmdq_base[index] hasn't changed
since the previous alloc.
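
A rough sketch of that optimisation, assuming a per-queue record of the BASE value used for the last allocation. QueueSlot and both helpers are hypothetical names for illustration only; the patch itself does not track this state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue bookkeeping; not a structure from the patch. */
typedef struct {
    bool allocated;       /* a hardware queue currently exists */
    uint64_t alloc_base;  /* BASE value used for that allocation */
} QueueSlot;

/* Return true when the caller has to free (if needed) and allocate again. */
static bool queue_needs_realloc(const QueueSlot *slot, uint64_t new_base)
{
    assert(slot);
    return !slot->allocated || slot->alloc_base != new_base;
}

/* Record a successful allocation so an identical rewrite can be skipped. */
static void queue_mark_allocated(QueueSlot *slot, uint64_t base)
{
    assert(slot);
    slot->allocated = true;
    slot->alloc_base = base;
}
```

Note that BASE_L and BASE_H are written separately, so the comparison would need to use the full cached 64-bit value after each write.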

> >
> > Writes with invalid addresses are ignored.
> >
> > All allocated VCMDQs are freed when CMDQV or VINTF is disabled.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > ---
> >  hw/arm/tegra241-cmdqv.h | 11 +++++++
> >  hw/arm/tegra241-cmdqv.c | 70 +++++++++++++++++++++++++++++++++++++++--
> >  2 files changed, 78 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> > index 88572ad939..039d86374f 100644
> > --- a/hw/arm/tegra241-cmdqv.h
> > +++ b/hw/arm/tegra241-cmdqv.h
> > @@ -44,6 +44,7 @@ typedef struct Tegra241CMDQV {
> >      MemoryRegion mmio_cmdqv;
> >      qemu_irq irq;
> >      IOMMUFDVeventq *veventq;
> > +    IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
> >      void *vintf_page0;
> >
> >      /* Register Cache */
> > @@ -348,6 +349,16 @@ A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
> >  A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
> >  A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
> >
> > +static inline bool tegra241_cmdq_enabled(Tegra241CMDQV *cmdqv)
> > +{
> > +    return cmdqv->status & R_STATUS_CMDQV_ENABLED_MASK;
> > +}
> > +
> > +static inline bool tegra241_vintf_enabled(Tegra241CMDQV *cmdqv)
> > +{
> > +    return cmdqv->vintf_status & R_VINTF0_STATUS_ENABLE_OK_MASK;
> > +}
> > +
> >  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
> >
> >  #endif /* HW_ARM_TEGRA241_CMDQV_H */
> > diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> > index cdd941cec9..b5f2f74cf2 100644
> > --- a/hw/arm/tegra241-cmdqv.c
> > +++ b/hw/arm/tegra241-cmdqv.c
> > @@ -15,6 +15,66 @@
> >  #include "tegra241-cmdqv.h"
> >  #include "trace.h"
> >
> > +static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
> > +{
> > +    IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
> > +    IOMMUFDHWqueue *vcmdq = cmdqv->vcmdq[index];
> > +
> > +    if (!vcmdq) {
> > +        return;
> > +    }
> > +    iommufd_backend_free_id(viommu->iommufd, vcmdq->hw_queue_id);
> > +    g_free(vcmdq);
> > +    cmdqv->vcmdq[index] = NULL;
> > +}
> > +
> > +static void tegra241_cmdqv_free_all_vcmdq(Tegra241CMDQV *cmdqv)
> > +{
> > +    /* Free in reverse order to avoid "resource busy" error */
> Can you provide additional details about the problem above? Is it
> documented in the spec?

See p.176:

Deallocate a VCMDQ from a Virtual Interface 
    Logical CMDQ being deallocated for a Guest must be in decreasing order
    starting from the highest numbered LVCMDQ.

> > +    for (int i = (TEGRA241_CMDQV_MAX_CMDQ - 1); i >= 0; i--) {
> > +        tegra241_cmdqv_free_vcmdq(cmdqv, i);
> > +    }
> > +}
> > +
> > +static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
> > +                                       Error **errp)
> > +{
> > +    SMMUv3AccelState *accel = cmdqv->s_accel;
> > +    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
> > +                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
> > +    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
> > +    uint64_t log2 = cmdqv->vcmdq_base[index] & R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
> > +    uint64_t size = 1ULL << (log2 + 4);
> > +    IOMMUFDViommu *viommu = accel->viommu;
> > +    IOMMUFDHWqueue *hw_queue;
> > +    uint32_t hw_queue_id;
> > +
> > +    /* Ignore any invalid address. This may come as part of reset etc. */
> > +    if (!address_space_is_ram(&address_space_memory, addr) ||
> > +        !address_space_is_ram(&address_space_memory, addr + size - 1)) {
> this check looks a little bit risky, no? Don't we have a better way to
> test the address has been properly set?

I think the kernel will eventually handle any attempt to use an invalid
address through the IOMMU_HW_QUEUE_ALLOC ioctl path:
     iommufd_hw_queue_alloc_phys()/iommufd_access_pin_pages() etc.

Any attempt to pass an invalid address should return an error, I think.

@Nicolin, is that a safe assumption to make?

> > +        return true;
> > +    }
> > +
> > +    if (!tegra241_cmdq_enabled(cmdqv) || !tegra241_vintf_enabled(cmdqv)) {
> > +        return true;
> > +    }
> > +
> > +    tegra241_cmdqv_free_vcmdq(cmdqv, index);
> would deserve a comment also here.

Sure.

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing
  2026-05-05  0:50   ` Nicolin Chen
@ 2026-05-05 15:13     ` Shameer Kolothum Thodi
  2026-05-05 19:52       ` Nicolin Chen
  0 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-05 15:13 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org, eric.auger@redhat.com,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: 05 May 2026 01:50
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org;
> eric.auger@redhat.com; peter.maydell@linaro.org; clg@redhat.com;
> alex@shazbot.org; Nathan Chen <nathanc@nvidia.com>; Matt Ochs
> <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF
> page0 as VCMDQ backing
> 
> On Wed, Apr 15, 2026 at 11:55:41AM +0100, Shameer Kolothum wrote:
> > The pre-to-post-alloc transition is triggered by the BASE register
> > write that initiates IOMMU_HW_QUEUE_ALLOC. No cache-to-hardware
> > synchronisation is needed at transition time. The hardware mandated
> > init sequence requires BASE to be written first; PROD_INDX, CONS_INDX
> > and CONFIG.CMDQ_EN are programmed only after BASE and are therefore
> always post-alloc.
> >
> > Any pre-alloc writes to those registers update only the register
> > cache, which is discarded at the transition.
> 
> Is "discard" the correct action?
> 
> Guest OS might expect HW (VM) to retain what it writes to those
> page0 registers?

As explained above, I reached that conclusion based on spec p.174/175:

Under "Enabling the Virtual CMDQ" it specifies the init order as below:

- Program the VIRT_CMDQ_BASE 
- Init the PROD_INDX/ CONS_INDX to 0 or a consistent value
- Program the VIRT_CMDQ_CONFIG to enable the CMDQ

Since we set up the HW QUEUE on BASE write, a spec compliant guest is
expected to program the Page0 registers as above after that, and we pass
those writes directly to the VINTF Page0. There are no GERRORN writes
before HW queue setup either.

Please let me know if there is still a case to sync the registers.

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-05-05  9:59     ` Shameer Kolothum Thodi
@ 2026-05-05 19:38       ` Nicolin Chen
  2026-05-06  8:18         ` Shameer Kolothum Thodi
  0 siblings, 1 reply; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05 19:38 UTC (permalink / raw)
  To: Shameer Kolothum Thodi
  Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org, eric.auger@redhat.com,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com

On Tue, May 05, 2026 at 02:59:55AM -0700, Shameer Kolothum Thodi wrote:
> > Also, do we fence against unassigned vcmdq? Corner case is that a
> > guest might write base address registers via direct (global) MMIO
> > space.
> > 
> 
> Not sure I get that completely.
> 
> Spec(p. 176) has:
> 
> "While the software can program the Virtual CMDQ(s) directly using the
> direct VCMDQ aperture (and not through the Virtual Interface), it is
> required that the VCMDQ be allocated to a Virtual Interface before it
> is used to send commands to the SMMU."
> 
> The spec only restricts sending commands before allocation, not
> programming BASE. In our model, the BASE write itself triggers 
> alloc_hw_queue so there's nothing to fence there.

Our model assumes that the guest maps a VCMDQ to VINTF0 before doing
any meaningful programming.

What if a guest programs BASE before it maps a VCMDQ to VINTF0?
This can end up allocating a hw_queue that maps the VCMDQ to a
physical VINTFx while the guest hasn't mapped it yet.

> For other's
> (Page 0: CONS_INDX, PROD_INDX etc.), the vintf_ptr() check already drops
> them silently if vcmdq[index] is not yet allocated, consistent
> with spec p.172:
> 
> "If no Virtual CMDQ is mapped to the Guest, or if the logical CMDQ index
> in the Virtual Interface being accessed by the software does not map to
> any Virtual CMDQ, the access is dropped with no Fault/Interrupt".
 
My reading: "the access" means accessing a logical VCMDQ via VINTF's
MMIO space (0x30000 and 0x40000). Yes, silently dropping it is what
we want.

My concern here is about the access to the direct/global VCMDQ MMIO
space (0x10000 and 0x20000).

So, if this concern is a real case, should we:
 1. If the guest maps the VCMDQ to VINTF0 before it writes BASE,
    call setup() to allocate hw_queue as our model expects.
 2. If the guest writes BASE before it maps VCMDQ to VINTF0, we
    need to fence against it.
?

One more thing to check for case (2): after the guest programmed
BASE and then maps it to VINTF0, and it never writes BASE again,
should we call setup() at map?

Thanks
Nicolin


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing
  2026-05-05 15:13     ` Shameer Kolothum Thodi
@ 2026-05-05 19:52       ` Nicolin Chen
  2026-05-06 13:16         ` Shameer Kolothum Thodi
  0 siblings, 1 reply; 102+ messages in thread
From: Nicolin Chen @ 2026-05-05 19:52 UTC (permalink / raw)
  To: Shameer Kolothum Thodi
  Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org, eric.auger@redhat.com,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com

On Tue, May 05, 2026 at 08:13:40AM -0700, Shameer Kolothum Thodi wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > On Wed, Apr 15, 2026 at 11:55:41AM +0100, Shameer Kolothum wrote:
> > > The pre-to-post-alloc transition is triggered by the BASE register
> > > write that initiates IOMMU_HW_QUEUE_ALLOC. No cache-to-hardware
> > > synchronisation is needed at transition time. The hardware mandated
> > > init sequence requires BASE to be written first; PROD_INDX, CONS_INDX
> > > and CONFIG.CMDQ_EN are programmed only after BASE and are therefore
> > always post-alloc.
> > >
> > > Any pre-alloc writes to those registers update only the register
> > > cache, which is discarded at the transition.
> > 
> > Is "discard" the correct action?
> > 
> > Guest OS might expect HW (VM) to retain what it writes to those
> > page0 registers?
> 
> As explained above, I reached that conclusion based on spec p.174/175:
> 
> Under "Enabling the Virtual CMDQ" it specifies the init order as below:
> 
> - Program the VIRT_CMDQ_BASE 
> - Init the PROD_INDX/ CONS_INDX to 0 or a consistent value
> - Program the VIRT_CMDQ_CONFIG to enable the CMDQ
> 
> Since we set up the HW QUEUE on BASE write, a spec compliant guest is
> expected to program the Page0 registers as above after that, and we pass
> those writes directly to the VINTF Page0. There are no GERRORN writes
> before HW queue setup either.

I don't think we can assume that the guest will follow what the
spec suggests.

Could this be a case:
 - Write the PROD_INDX/ CONS_INDX with a consistent value (0xf)
 - Program the VIRT_CMDQ_BASE 
 - Read the PROD_INDX/CONS_INDX, expecting 0xf
 - Write the PROD_INDX/ CONS_INDX with 0x0
 - Program the VIRT_CMDQ_CONFIG to enable the CMDQ
?
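
The sequence above can be played through a toy model to see what the guest
would read back. This is only a sketch under stated assumptions: ToyVcmdq and
the toy_* helpers are hypothetical, the hardware-backed PROD_INDX is assumed
to read as 0 right after allocation, and sync_on_alloc is an invented knob
that contrasts "discard the cache" with "copy the cache to hardware at the
transition".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one VCMDQ's PROD_INDX under the two access modes. */
typedef struct {
    uint32_t cache_prod_indx; /* pre-alloc register cache */
    uint32_t hw_prod_indx;    /* stand-in for the mmap'd page0 register */
    bool allocated;           /* set once the BASE write allocated the queue */
    bool sync_on_alloc;       /* hypothetical: copy cache to HW at transition */
} ToyVcmdq;

/* Writes go to the cache before allocation and to "hardware" afterwards. */
static void toy_write_prod_indx(ToyVcmdq *q, uint32_t val)
{
    if (q->allocated) {
        q->hw_prod_indx = val;
    } else {
        q->cache_prod_indx = val;
    }
}

/* Reads follow the same routing as writes. */
static uint32_t toy_read_prod_indx(const ToyVcmdq *q)
{
    return q->allocated ? q->hw_prod_indx : q->cache_prod_indx;
}

/* BASE write: triggers the pre-alloc to post-alloc transition. */
static void toy_write_base(ToyVcmdq *q)
{
    if (q->sync_on_alloc) {
        q->hw_prod_indx = q->cache_prod_indx;
    }
    q->allocated = true;
}
```

In this model, a guest that writes 0xf, programs BASE and then reads back
sees 0x0 when the cache is discarded and 0xf when it is synced.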

Nicolin


^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-05-05 19:38       ` Nicolin Chen
@ 2026-05-06  8:18         ` Shameer Kolothum Thodi
  2026-05-06 18:18           ` Nicolin Chen
  0 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-06  8:18 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org, eric.auger@redhat.com,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: 05 May 2026 20:38
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org;
> eric.auger@redhat.com; peter.maydell@linaro.org; clg@redhat.com;
> alex@shazbot.org; Nathan Chen <nathanc@nvidia.com>; Matt Ochs
> <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW
> VCMDQs on base register programming
> 
> On Tue, May 05, 2026 at 02:59:55AM -0700, Shameer Kolothum Thodi
> wrote:
> > > Also, do we fence against unassigned vcmdq? Corner case is that a
> > > guest might write base address registers via direct (global) MMIO
> > > space.
> > >
> >
> > Not sure I get that completely.
> >
> > Spec(p. 176) has:
> >
> > "While the software can program the Virtual CMDQ(s) directly using the
> > direct VCMDQ aperture (and not through the Virtual Interface), it is
> > required that the VCMDQ be allocated to a Virtual Interface before it
> > is used to send commands to the SMMU."
> >
> > The spec only restricts sending commands before allocation, not
> > programming BASE. In our model, the BASE write itself triggers
> > alloc_hw_queue so there's nothing to fence there.
> 
> Our model has an assumption that guest would map a VCMDQ to the
> VINTF0 first before doing any meaningful programming.

Yes. And I think by "map" here you mean writing to the _ALLOC_MAP
register. If so, yes, we don't check that.

> 
> What if a guest programs BASE before it maps a VCMDQ to VINTF0?
> This can end up with allocating hw_queue that will map the VCMDQ
> to physical VINTFx, while the guest hasn't map it yet.

Yes. We probably need to gate setup_vcmdq() with:

    if (!(cmdqv->cmdq_alloc_map[index] & R_CMDQ_ALLOC_MAP_0_ALLOC_MASK)) {
        return true;
    }
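
A self-contained sketch of how such a gate would behave, for illustration
only. ToyCmdqv, the toy_* helpers and TOY_ALLOC_MAP_ALLOC_MASK (including its
bit position) are assumptions made for this example, and the hw_queue
allocation is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_CMDQ 2
/* Assumed position of the ALLOC flag within the ALLOC_MAP register. */
#define TOY_ALLOC_MAP_ALLOC_MASK 0x1u

typedef struct {
    uint32_t cmdq_alloc_map[TOY_MAX_CMDQ];
    uint64_t vcmdq_base[TOY_MAX_CMDQ];
    bool hw_queue_allocated[TOY_MAX_CMDQ]; /* stands in for the real hw_queue */
} ToyCmdqv;

/*
 * Returns true both on success and on a benign no-op, mirroring the
 * early "return true" paths in the quoted setup function.
 */
static bool toy_setup_vcmdq(ToyCmdqv *c, int index)
{
    /* Gate: do nothing until the guest has mapped this VCMDQ. */
    if (!(c->cmdq_alloc_map[index] & TOY_ALLOC_MAP_ALLOC_MASK)) {
        return true;
    }
    c->hw_queue_allocated[index] = true;
    return true;
}

/* BASE write: always cache the value, then attempt setup. */
static bool toy_write_base(ToyCmdqv *c, int index, uint64_t val)
{
    c->vcmdq_base[index] = val;
    return toy_setup_vcmdq(c, index);
}

/* ALLOC_MAP write: only records the mapping in this sketch. */
static void toy_write_alloc_map(ToyCmdqv *c, int index, uint32_t val)
{
    c->cmdq_alloc_map[index] = val;
}
```

With only this gate, a guest that writes BASE first and maps the VCMDQ
afterwards, without writing BASE again, never gets a hardware queue in this
toy model. Whether setup should also run at map time is the open question
in this thread.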
 
> > For other's
> > (Page 0: CONS_INDX, PROD_INDX etc.), the vintf_ptr() check already drops
> > them silently if vcmdq[index] is not yet allocated, consistent
> > with spec p.172:
> >
> > "If no Virtual CMDQ is mapped to the Guest, or if the logical CMDQ index
> > in the Virtual Interface being accessed by the software does not map to
> > any Virtual CMDQ, the access is dropped with no Fault/Interrupt".
> 
> My reading: "the access" means accessing logical VCMDQ via VTINF's
> MMIO space (0x30000 and 0x40000). Yes, silently dropping it is what
> we want.

Ok.
 
> My concern here is about the access to the direct/global VCMDQ MMIO
> space (0x10000 and 0x20000).
> 
> So, if this concern is a real case, should we:
>  1. If the guest maps the VCMDQ to VINTF0 before it writes BASE,
>     call setup() to allocate hw_queue as our model expects.
>  2. If the guest writes BASE before it maps VCMDQ to VINTF0, we
>     need to fence against it.
> ?
> 
> One more thing to check for case (2): after the guest programmed
> BASE and then maps it to VINTF0, and it never writes BASE again,
> should we call setup() at map?

That would be non-spec-compliant behaviour, I guess.

In p.176, "Allocate a VCMDQ to the Virtual Interface", this is
the order:

- Program the CMDQV_CMDQ_ALLOC_MAP_X to map CMDQ X to the
  Logical CMDQ (L) on Virtual Interface (V). Logical CMDQ allocated in the
  Guest must be in order starting from 0.
- CMDQ is allocated to be used by the software. Guest can now program
  the VCMDQ to use it as described in the "Enabling the Virtual CMDQ"
  section.

"Enabling the Virtual CMDQ" (p.175) is where BASE is programmed. So the
spec explicitly places ALLOC_MAP before BASE.

Whether we should support a non-spec-compliant guest is another question.
IMHO, supporting non-compliant behaviour adds complexity with
no real benefit.

Thanks,
Shameer


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 15/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads
  2026-05-05 13:27     ` Shameer Kolothum Thodi
@ 2026-05-06 11:14       ` Eric Auger
  0 siblings, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-06 11:14 UTC (permalink / raw)
  To: Shameer Kolothum Thodi, qemu-arm@nongnu.org,
	qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



On 5/5/26 3:27 PM, Shameer Kolothum Thodi wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Sent: 05 May 2026 11:13
>> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
>> arm@nongnu.org; qemu-devel@nongnu.org
>> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
>> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
>> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
>> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
>> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
>> phrdina@redhat.com
>> Subject: Re: [PATCH v4 15/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ
>> register reads
>>
>> Hi,
>>
>> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
>>> From: Nicolin Chen <nicolinc@nvidia.com>
>>>
>>> Tegra241 CMDQV exposes per-VCMDQ register windows through two
>> MMIO
>>> apertures:
>>>
>>>   CMDQV_CMDQ_BASE (0x10000/0x20000): VCMDQ Page0/Page1
>> (global)
>>>   CMDQV_VI_CMDQ_BASE (0x30000/0x40000): VINTF VCMDQ
>> Page0/Page1
>> (logical)
>>> VINTF Page0 (0x30000) and VCMDQ Page0 (0x10000) are hardware aliases
>> what about page1?
> Right. Page1 is also an alias, but kernel only provides mmap support for Page0.
> Hence Page1 is always served from register cache. I will update.
>
>>> addressing the same underlying registers. Add read emulation for both
>>> apertures, backed by a register cache. VINTF Page0 reads are translated
>>> to their VCMDQ Page0 equivalent and served from the same cached state.
>>>
>>> Once IOMMU_HW_QUEUE_ALLOC and viommu_mmap are wired up in a
>> subsequent
>>> patch, Page0 register reads will be served directly from the hardware
>>> backed mmap'd page instead of the cache. Page1 registers are always
>>> served from cache.
>> I would add that
>> Page 0 contains VCMDQ control and status registers
>> while Page 1 contains VCMDQ base and DRAM address registers
>>
>>
>> I would add Nicolin's explanation
>> "
>>
>> The global page 0 programmable at any time so long as CMDQV_EN
>> is enabled.
>>
>> The logical (VINTF) page 0 are programmable only when SW allocates and
>> maps
>> global vcmdq(s) to a VINTF. "logical" also means "local" to that
>> VINTF.
>> "
> Ok. Will add.
>
>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
>>> ---
>>>  hw/arm/tegra241-cmdqv.h | 185
>> ++++++++++++++++++++++++++++++++++++++++
>>>  hw/arm/tegra241-cmdqv.c |  73 ++++++++++++++++
>>>  2 files changed, 258 insertions(+)
>>>
>>> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
>>> index 965670066d..b8bd8cd8ff 100644
>>> --- a/hw/arm/tegra241-cmdqv.h
>>> +++ b/hw/arm/tegra241-cmdqv.h
>>> @@ -29,6 +29,13 @@
>>>   */
>>>  #define TEGRA241_CMDQV_IO_LEN 0x50000
>>>
>>> +/* CMDQV MMIO aperture bases and VCMDQ stride */
>>> +#define CMDQV_VCMDQ_PAGE0_BASE  0x10000  /* CMDQV_CMDQ_BASE
>> */
>>> +#define CMDQV_VCMDQ_PAGE1_BASE  0x20000
>>> +#define CMDQV_VINTF_PAGE0_BASE  0x30000  /*
>> CMDQV_VI_CMDQ_BASE */
>>> +#define CMDQV_VINTF_PAGE1_BASE  0x40000
>>> +#define CMDQV_VCMDQ_STRIDE      0x80
>>> +
>>>  typedef struct Tegra241CMDQV {
>>>      struct iommu_viommu_tegra241_cmdqv cmdqv_data;
>>>      SMMUv3AccelState *s_accel;
>>> @@ -49,6 +56,14 @@ typedef struct Tegra241CMDQV {
>>>      uint32_t vintf_sid_match[16];
>>>      uint32_t vintf_sid_replace[16];
>>>      uint32_t vintf_cmdq_err_map[4];
>> I would add a comment explaining that those cached registers store the
>> consistent/identical values for both the global and logical regs (they
>> are the same)
>>> +    uint32_t vcmdq_cons_indx[TEGRA241_CMDQV_MAX_CMDQ];
>>> +    uint32_t vcmdq_prod_indx[TEGRA241_CMDQV_MAX_CMDQ];
>>> +    uint32_t vcmdq_config[TEGRA241_CMDQV_MAX_CMDQ];
>>> +    uint32_t vcmdq_status[TEGRA241_CMDQV_MAX_CMDQ];
>>> +    uint32_t vcmdq_gerror[TEGRA241_CMDQV_MAX_CMDQ];
>>> +    uint32_t vcmdq_gerrorn[TEGRA241_CMDQV_MAX_CMDQ];
>>> +    uint64_t vcmdq_base[TEGRA241_CMDQV_MAX_CMDQ];
>>> +    uint64_t vcmdq_cons_indx_base[TEGRA241_CMDQV_MAX_CMDQ];
>>>  } Tegra241CMDQV;
>>>
>>>  /* CMDQ-V Config page registers (offset 0x00000) */
>>> @@ -160,6 +175,176 @@
>> SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 0)
>>>  /* MAP_1 and MAP_2 omitted; not referenced directly */
>>>  SMMU_CMDQV_VINTFi_LVCMDQ_ERR_MAP_(0, 3)
>>>
>>> +/*
>>> + * VCMDQ register windows.
>>> + *
>>> + * Page 0 @ 0x10000: VCMDQ control and status registers
>>> + * Page 1 @ 0x20000: VCMDQ base and DRAM address registers
>> Can you clearly separate regs that belong to page 0 from regs that
>> belong to page 1
>>> + */
>>> +#define A_VCMDQi_CONS_INDX(i)
>> same remark as for config registers. Please use the spec terminology
>>
>> SMMU_CMDQV_VCMDQi_CONS_INDX_0
>>
>>>                \
>>> +    REG32(VCMDQ##i##_CONS_INDX, 0x10000 + i * 0x80) \
>>> +    FIELD(VCMDQ##i##_CONS_INDX, RD, 0, 20)          \
>>> +    FIELD(VCMDQ##i##_CONS_INDX, ERR, 24, 7)
>>> +
>>> +A_VCMDQi_CONS_INDX(0)
>>> +A_VCMDQi_CONS_INDX(1)
>>> +
>>> +#define V_VCMDQ_CONS_INDX_ERR_CERROR_NONE 0
>>> +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ILL_OPCODE 1
>>> +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ABT 2
>>> +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ATC_INV_SYNC 3
>>> +#define V_VCMDQ_CONS_INDX_ERR_CERROR_ILL_ACCESS 4
>>> +
>>> +#define A_VCMDQi_PROD_INDX(i)
>> SMMU_CMDQV_VCMDQi_PROD_INDX_0
>>>                        \
>>> +    REG32(VCMDQ##i##_PROD_INDX, 0x10000 + 0x4 + i * 0x80) \
>>> +    FIELD(VCMDQ##i##_PROD_INDX, WR, 0, 20)
>>> +
>>> +A_VCMDQi_PROD_INDX(0)
>>> +A_VCMDQi_PROD_INDX(1)
>>> +
>>> +#define A_VCMDQi_CONFIG(i)
>> SMMU_CMDQV_VCMDQi_CONFIG_0(i)
>>>                   \
>>> +    REG32(VCMDQ##i##_CONFIG, 0x10000 + 0x8 + i * 0x80) \
>> 0x80, isn't it the stride you defined earlier, ie
>>
>>  CMDQV_VCMDQ_STRIDE ?
>>
>>> +    FIELD(VCMDQ##i##_CONFIG, CMDQ_EN, 0, 1)
>>> +
>>> +A_VCMDQi_CONFIG(0)
>>> +A_VCMDQi_CONFIG(1)
>>> +
>>> +#define A_VCMDQi_STATUS(i)                             \
>>> +    REG32(VCMDQ##i##_STATUS, 0x10000 + 0xc + i * 0x80) \
>>> +    FIELD(VCMDQ##i##_STATUS, CMDQ_EN_OK, 0, 1)
>>> +
>>> +A_VCMDQi_STATUS(0)
>>> +A_VCMDQi_STATUS(1)
>>> +
>>> +#define A_VCMDQi_GERROR(i)                               \
>>> +    REG32(VCMDQ##i##_GERROR, 0x10000 + 0x10 + i * 0x80)  \
>>> +    FIELD(VCMDQ##i##_GERROR, CMDQ_ERR, 0, 1)             \
>>> +    FIELD(VCMDQ##i##_GERROR, CONS_DRAM_WR_ABT_ERR, 1, 1) \
>>> +    FIELD(VCMDQ##i##_GERROR, CMDQ_INIT_ERR, 2, 1)
>>> +
>>> +A_VCMDQi_GERROR(0)
>>> +A_VCMDQi_GERROR(1)
>>> +
>>> +#define A_VCMDQi_GERRORN(i)                               \
>>> +    REG32(VCMDQ##i##_GERRORN, 0x10000 + 0x14 + i * 0x80)  \
>>> +    FIELD(VCMDQ##i##_GERRORN, CMDQ_ERR, 0, 1)             \
>>> +    FIELD(VCMDQ##i##_GERRORN, CONS_DRAM_WR_ABT_ERR, 1, 1) \
>>> +    FIELD(VCMDQ##i##_GERRORN, CMDQ_INIT_ERR, 2, 1)
>>> +
>>> +A_VCMDQi_GERRORN(0)
>>> +A_VCMDQi_GERRORN(1)
>> /* Page 1 */
>>> +
>>> +#define A_VCMDQi_BASE_L(i)
>> ditto remove the A_ prefix and use spec terminology
>>>               \
>>> +    REG32(VCMDQ##i##_BASE_L, 0x20000 + i * 0x80) \
>>> +    FIELD(VCMDQ##i##_BASE_L, LOG2SIZE, 0, 5)     \
>>> +    FIELD(VCMDQ##i##_BASE_L, ADDR, 5, 27)
>>> +
>>> +A_VCMDQi_BASE_L(0)
>>> +A_VCMDQi_BASE_L(1)
>>> +
>>> +#define A_VCMDQi_BASE_H(i)                             \
>>> +    REG32(VCMDQ##i##_BASE_H, 0x20000 + 0x4 + i * 0x80) \
>> maybe instead of using 0x20000, define macros for the base address of each
>> region
>>> +    FIELD(VCMDQ##i##_BASE_H, ADDR, 0, 16)
>>> +
>>> +A_VCMDQi_BASE_H(0)
>>> +A_VCMDQi_BASE_H(1)
>>> +
>>> +#define A_VCMDQi_CONS_INDX_BASE_DRAM_L(i)                             \
>>> +    REG32(VCMDQ##i##_CONS_INDX_BASE_DRAM_L, 0x20000 + 0x8 + i *
>> 0x80) \
>>> +    FIELD(VCMDQ##i##_CONS_INDX_BASE_DRAM_L, ADDR, 0, 32)
>>> +
>>> +A_VCMDQi_CONS_INDX_BASE_DRAM_L(0)
>>> +A_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
>>> +
>>> +#define A_VCMDQi_CONS_INDX_BASE_DRAM_H(i)                             \
>>> +    REG32(VCMDQ##i##_CONS_INDX_BASE_DRAM_H, 0x20000 + 0xc + i *
>> 0x80) \
>>> +    FIELD(VCMDQ##i##_CONS_INDX_BASE_DRAM_H, ADDR, 0, 16)
>>> +
>>> +A_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
>>> +A_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
>> I would group all reg definitions per page. The layout would look
>> clearer to me.
>>> +
>>> +/*
>>> + * VI_VCMDQ register windows (VCMDQs mapped via VINTF).
>>> + *
>>> + * Page 0 @ 0x30000: VI_VCMDQ control and status registers
>>> + * Page 1 @ 0x40000: VI_VCMDQ base and DRAM address registers
>> same here, pls separate page 0 and page 1 definitions
>>> + */
>>> +#define A_VI_VCMDQi_CONS_INDX(i)
>> A_
>>>                  \
>>> +    REG32(VI_VCMDQ##i##_CONS_INDX, 0x30000 + i * 0x80) \
>>> +    FIELD(VI_VCMDQ##i##_CONS_INDX, RD, 0, 20)          \
>>> +    FIELD(VI_VCMDQ##i##_CONS_INDX, ERR, 24, 7)
>>> +
>>> +A_VI_VCMDQi_CONS_INDX(0)
>>> +A_VI_VCMDQi_CONS_INDX(1)
>>> +
>>> +#define A_VI_VCMDQi_PROD_INDX(i)                             \
>>> +    REG32(VI_VCMDQ##i##_PROD_INDX, 0x30000 + 0x4 + i * 0x80) \
>>> +    FIELD(VI_VCMDQ##i##_PROD_INDX, WR, 0, 20)
>>> +
>>> +A_VI_VCMDQi_PROD_INDX(0)
>>> +A_VI_VCMDQi_PROD_INDX(1)
>>> +
>>> +#define A_VI_VCMDQi_CONFIG(i)                             \
>>> +    REG32(VI_VCMDQ##i##_CONFIG, 0x30000 + 0x8 + i * 0x80) \
>>> +    FIELD(VI_VCMDQ##i##_CONFIG, CMDQ_EN, 0, 1)
>>> +
>>> +A_VI_VCMDQi_CONFIG(0)
>>> +A_VI_VCMDQi_CONFIG(1)
>>> +
>>> +#define A_VI_VCMDQi_STATUS(i)                             \
>>> +    REG32(VI_VCMDQ##i##_STATUS, 0x30000 + 0xc + i * 0x80) \
>>> +    FIELD(VI_VCMDQ##i##_STATUS, CMDQ_EN_OK, 0, 1)
>>> +
>>> +A_VI_VCMDQi_STATUS(0)
>>> +A_VI_VCMDQi_STATUS(1)
>>> +
>>> +#define A_VI_VCMDQi_GERROR(i)                               \
>>> +    REG32(VI_VCMDQ##i##_GERROR, 0x30000 + 0x10 + i * 0x80)  \
>>> +    FIELD(VI_VCMDQ##i##_GERROR, CMDQ_ERR, 0, 1)             \
>>> +    FIELD(VI_VCMDQ##i##_GERROR, CONS_DRAM_WR_ABT_ERR, 1, 1) \
>>> +    FIELD(VI_VCMDQ##i##_GERROR, CMDQ_INIT_ERR, 2, 1)
>>> +
>>> +A_VI_VCMDQi_GERROR(0)
>>> +A_VI_VCMDQi_GERROR(1)
>>> +
>>> +#define A_VI_VCMDQi_GERRORN(i)                               \
>>> +    REG32(VI_VCMDQ##i##_GERRORN, 0x30000 + 0x14 + i * 0x80)  \
>>> +    FIELD(VI_VCMDQ##i##_GERRORN, CMDQ_ERR, 0, 1)             \
>>> +    FIELD(VI_VCMDQ##i##_GERRORN, CONS_DRAM_WR_ABT_ERR, 1, 1) \
>>> +    FIELD(VI_VCMDQ##i##_GERRORN, CMDQ_INIT_ERR, 2, 1)
>>> +
>>> +A_VI_VCMDQi_GERRORN(0)
>>> +A_VI_VCMDQi_GERRORN(1)
>>> +
>>> +#define A_VI_VCMDQi_BASE_L(i)                       \
>>> +    REG32(VI_VCMDQ##i##_BASE_L, 0x40000 + i * 0x80) \
>>> +    FIELD(VI_VCMDQ##i##_BASE_L, LOG2SIZE, 0, 5)     \
>>> +    FIELD(VI_VCMDQ##i##_BASE_L, ADDR, 5, 27)
>>> +
>>> +A_VI_VCMDQi_BASE_L(0)
>>> +A_VI_VCMDQi_BASE_L(1)
>>> +
>>> +#define A_VI_VCMDQi_BASE_H(i)                             \
>>> +    REG32(VI_VCMDQ##i##_BASE_H, 0x40000 + 0x4 + i * 0x80) \
>>> +    FIELD(VI_VCMDQ##i##_BASE_H, ADDR, 0, 16)
>>> +
>>> +A_VI_VCMDQi_BASE_H(0)
>>> +A_VI_VCMDQi_BASE_H(1)
>>> +
>>> +#define A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(i)                             \
>>> +    REG32(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_L, 0x40000 + 0x8 + i
>> * 0x80) \
>>> +    FIELD(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_L, ADDR, 0, 32)
>>> +
>>> +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(0)
>>> +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
>>> +
>>> +#define A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(i)                             \
>>> +    REG32(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_H, 0x40000 + 0xc + i
>> * 0x80) \
>>> +    FIELD(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_H, ADDR, 0, 16)
>>> +
>>> +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
>>> +A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
>>> +
>>>  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
>>>
>>>  #endif /* HW_ARM_TEGRA241_CMDQV_H */
>>> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
>>> index 3b08ed0ff3..35e6f0bbd6 100644
>>> --- a/hw/arm/tegra241-cmdqv.c
>>> +++ b/hw/arm/tegra241-cmdqv.c
>>> @@ -15,6 +15,46 @@
>>>  #include "tegra241-cmdqv.h"
>>>  #include "trace.h"
>>>
>>> +/*
>>> + * Read a VCMDQ register using VCMDQ0_* offsets.
>>> + *
>>> + * The caller normalizes the MMIO offset such that @offset0 always refers
>>> + * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
>>> + *
>>> + * All VCMDQ accesses return cached registers.
>>> + */
>>> +static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv,
>> hwaddr offset0,
>>> +                                          int index)
>>> +{
>>> +    switch (offset0) {
>>> +    case A_VCMDQ0_CONS_INDX:
>>> +        return cmdqv->vcmdq_cons_indx[index];
>>> +    case A_VCMDQ0_PROD_INDX:
>>> +        return cmdqv->vcmdq_prod_indx[index];
>>> +    case A_VCMDQ0_CONFIG:
>>> +        return cmdqv->vcmdq_config[index];
>>> +    case A_VCMDQ0_STATUS:
>>> +        return cmdqv->vcmdq_status[index];
>>> +    case A_VCMDQ0_GERROR:
>>> +        return cmdqv->vcmdq_gerror[index];
>>> +    case A_VCMDQ0_GERRORN:
>>> +        return cmdqv->vcmdq_gerrorn[index];
>>> +    case A_VCMDQ0_BASE_L:
>>> +        return cmdqv->vcmdq_base[index];
>>> +    case A_VCMDQ0_BASE_H:
>>> +        return cmdqv->vcmdq_base[index] >> 32;
>>> +    case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
>>> +        return cmdqv->vcmdq_cons_indx_base[index];
>>> +    case A_VCMDQ0_CONS_INDX_BASE_DRAM_H:
>>> +        return cmdqv->vcmdq_cons_indx_base[index] >> 32;
>>> +    default:
>>> +        qemu_log_mask(LOG_UNIMP,
>>> +                      "%s unhandled read access at 0x%" PRIx64 "\n",
>>> +                      __func__, offset0);
>>> +        return 0;
>>> +    }
>>> +}
>>> +
>>>  static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV
>> *cmdqv,
>>>                                                   hwaddr offset)
>>>  {
>>> @@ -92,6 +132,7 @@ static uint64_t tegra241_cmdqv_read_mmio(void
>> *opaque, hwaddr offset,
>>>  {
>>>      Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
>>>      uint64_t val = 0;
>>> +    int index;
>>>
>>>      if (offset >= TEGRA241_CMDQV_IO_LEN) {
>>>          qemu_log_mask(LOG_UNIMP,
>>> @@ -125,6 +166,38 @@ static uint64_t tegra241_cmdqv_read_mmio(void
>> *opaque, hwaddr offset,
>>>      case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
>>>          val = tegra241_cmdqv_config_vintf_read(cmdqv, offset);
>>>          break;
>>> +    case A_VI_VCMDQ0_CONS_INDX ... A_VI_VCMDQ1_GERRORN:
>>> +        /*
>>> +         * VINTF Page0 registers have the same per-VCMDQ layout as the
>>> +         * VCMDQ Page0 registers. Translate the VINTF aperture offset to the
>>> +         * equivalent VCMDQ aperture offset, then fall through to reuse the
>>> +         * common VCMDQ decoding logic below.
>>> +         */
>>> +        offset -= CMDQV_VINTF_PAGE0_BASE -
>> CMDQV_VCMDQ_PAGE0_BASE;
>>> +        QEMU_FALLTHROUGH;
>>> +    case A_VCMDQ0_CONS_INDX ... A_VCMDQ1_GERRORN:
>>> +        /*
>>> +         * Decode a per-VCMDQ register access.
>>> +         *
>>> +         * The hardware supports up to 128 identical VCMDQ instances; we
>>> +         * currently expose TEGRA241_CMDQV_MAX_CMDQ (= 2). Each
>> VCMDQ
>>> +         * occupies a CMDQV_VCMDQ_STRIDE-byte window within the page.
>>> +         *
>>> +         * Extract the VCMDQ index and normalize to the VCMDQ0_* register
>>> +         * offset. A single helper services all instances via @index.
>>> +         */
>>> +        index = (offset - CMDQV_VCMDQ_PAGE0_BASE) /
>> CMDQV_VCMDQ_STRIDE;
>>> +        return tegra241_cmdqv_read_vcmdq(cmdqv,
>>> +                offset - index * CMDQV_VCMDQ_STRIDE, index);
>>> +    case A_VI_VCMDQ0_BASE_L ...
>> A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
>>> +        /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
>>> +        offset -= CMDQV_VINTF_PAGE1_BASE -
>> CMDQV_VCMDQ_PAGE1_BASE;
>>> +        QEMU_FALLTHROUGH;
>>> +    case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
>>> +        /* Same decode logic as VCMDQ Page0 case above */
>>> +        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) /
>> CMDQV_VCMDQ_STRIDE;
>>> +        return tegra241_cmdqv_read_vcmdq(cmdqv,
>>> +                offset - index * CMDQV_VCMDQ_STRIDE, index);
>> Please add trace points for read & write accesses
> Just to clarify, we already have trace_tegra241_cmdqv_read_mmio and
> trace_tegra241_cmdqv_write_mmio, so this suggestion is to have
> another pair for vcmdq read/write, right?
No, I had not seen that.

trace_tegra241_cmdqv_write_mmio also applies to those accesses.
To be honest, I would have preferred different trace points for the
different types of MMIO (CMDQV Config, VCMDQ P0 (when not mmapped),
VCMDQ P1) to easily differentiate them, but if you feel that's too heavy,
the existing ones are enough.
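
For what it's worth, telling the MMIO types apart only needs the aperture
bases from the header. A rough sketch, not a proposal for the actual trace
code: the macro values are the ones in the quoted hunk, while ToyMmioKind,
toy_classify() and toy_vcmdq_index() are made-up names for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Aperture bases, stride and length as defined in the quoted header hunk. */
#define CMDQV_VCMDQ_PAGE0_BASE 0x10000
#define CMDQV_VCMDQ_PAGE1_BASE 0x20000
#define CMDQV_VINTF_PAGE0_BASE 0x30000
#define CMDQV_VINTF_PAGE1_BASE 0x40000
#define CMDQV_VCMDQ_STRIDE     0x80
#define TEGRA241_CMDQV_IO_LEN  0x50000

/* Hypothetical classification that a per-type trace point could key on. */
typedef enum {
    TOY_MMIO_CONFIG,
    TOY_MMIO_VCMDQ_P0,
    TOY_MMIO_VCMDQ_P1,
    TOY_MMIO_VINTF_P0,
    TOY_MMIO_VINTF_P1,
    TOY_MMIO_INVALID,
} ToyMmioKind;

/* Map an MMIO offset to the aperture it falls in. */
static ToyMmioKind toy_classify(uint64_t offset)
{
    if (offset >= TEGRA241_CMDQV_IO_LEN) {
        return TOY_MMIO_INVALID;
    }
    if (offset >= CMDQV_VINTF_PAGE1_BASE) {
        return TOY_MMIO_VINTF_P1;
    }
    if (offset >= CMDQV_VINTF_PAGE0_BASE) {
        return TOY_MMIO_VINTF_P0;
    }
    if (offset >= CMDQV_VCMDQ_PAGE1_BASE) {
        return TOY_MMIO_VCMDQ_P1;
    }
    if (offset >= CMDQV_VCMDQ_PAGE0_BASE) {
        return TOY_MMIO_VCMDQ_P0;
    }
    return TOY_MMIO_CONFIG;
}

/* Index of the per-VCMDQ window inside its page, or -1 if not applicable. */
static int toy_vcmdq_index(uint64_t offset)
{
    uint64_t base;

    switch (toy_classify(offset)) {
    case TOY_MMIO_VCMDQ_P0: base = CMDQV_VCMDQ_PAGE0_BASE; break;
    case TOY_MMIO_VCMDQ_P1: base = CMDQV_VCMDQ_PAGE1_BASE; break;
    case TOY_MMIO_VINTF_P0: base = CMDQV_VINTF_PAGE0_BASE; break;
    case TOY_MMIO_VINTF_P1: base = CMDQV_VINTF_PAGE1_BASE; break;
    default:
        return -1;
    }
    return (int)((offset - base) / CMDQV_VCMDQ_STRIDE);
}
```

The index returned here is not checked against TEGRA241_CMDQV_MAX_CMDQ; a
real caller would still need that bound.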

Eric

>
> (Agree with all other suggestions above)
>
> Thanks,
> Shameer



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing
  2026-04-15 10:55 ` [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing Shameer Kolothum
  2026-05-05  0:50   ` Nicolin Chen
@ 2026-05-06 12:27   ` Eric Auger
  1 sibling, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-06 12:27 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> Introduce tegra241_cmdqv_vintf_ptr() to route VCMDQ register accesses
> through the mmap'd VINTF page0 backing once a hardware queue has been
> allocated.
>
> There are two QEMU trapped MMIO apertures for VCMDQ registers:
s/VCMDQ registers/ VCMDQ Page0 registers
>
>   - Direct VCMDQ aperture (offset 0x10000)
emphasize this is page 0
>   - VINTF Page0 (offset 0x30000)
>
> These are hardware aliases: they address the same underlying registers.
> A subsequent patch maps the VINTF aperture as a guest-direct RAM region;
> in this patch both remain QEMU-trapped.
>
> VCMDQ register accesses operate in one of two mutually exclusive modes,
> depending on whether a hardware queue (IOMMU_HW_QUEUE_ALLOC) has been
> allocated for the VCMDQ:
>
> Pre-alloc: vintf_ptr is NULL. Both apertures use QEMU's register
> cache. Hardware is not yet engaged;
>
> Post-alloc: vintf_ptr is valid. Both QEMU trapped apertures access
> registers directly via the mmap'd vintf_page0 pointer, bypassing
> the cache. Hardware is the single source of truth.
>
> The pre-to-post-alloc transition is triggered by the BASE register write
> that initiates IOMMU_HW_QUEUE_ALLOC. No cache-to-hardware synchronisation
> is needed at transition time. The hardware mandated init sequence requires
> BASE to be written first; PROD_INDX, CONS_INDX and CONFIG.CMDQ_EN are
> programmed only after BASE and are therefore always post-alloc.
>
> Any pre-alloc writes to those registers update only the register cache,
> which is discarded at the transition.
>
> CMDQV acceleration only becomes active once the guest enables VINTF and
> programs the VCMDQ BASE register. Until then, all VCMDQ accesses are
> served from the emulated register cache with no real hardware command
> processing. This matches the CMDQV hardware specification: if the logical
> CMDQ index does not map to any allocated Virtual CMDQ, "the access is
> dropped with no Fault/Interrupt".
>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.c | 48 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 47 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index b5f2f74cf2..eb619e1134 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -75,17 +75,45 @@ static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
>      return true;
>  }
>  
> +static inline uint32_t *tegra241_cmdqv_vintf_ptr(Tegra241CMDQV *cmdqv,
> +                                                 int index, hwaddr offset0)
> +{
> +    if (!cmdqv->vcmdq[index] || !cmdqv->vintf_page0) {
> +        return NULL;
> +    }
> +    return (uint32_t *)(cmdqv->vintf_page0 +
> +                        (index * CMDQV_VCMDQ_STRIDE) +
> +                        (offset0 - CMDQV_VCMDQ_PAGE0_BASE));
> +}
> +
>  /*
>   * Read a VCMDQ register using VCMDQ0_* offsets.
>   *
>   * The caller normalizes the MMIO offset such that @offset0 always refers
>   * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
>   *
> - * All VCMDQ accesses return cached registers.
> + * If the VCMDQ is allocated and VINTF page0 is mmap'd, read directly
> + * from the VINTF page0 backing. Otherwise, fall back to cached state.
>   */
>  static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
>                                            int index)
>  {
> +    uint32_t *ptr = tegra241_cmdqv_vintf_ptr(cmdqv, index, offset0);
> +
> +    if (ptr) {
> +        switch (offset0) {
> +        case A_VCMDQ0_CONS_INDX:
> +        case A_VCMDQ0_PROD_INDX:
> +        case A_VCMDQ0_CONFIG:
> +        case A_VCMDQ0_STATUS:
> +        case A_VCMDQ0_GERROR:
> +        case A_VCMDQ0_GERRORN:
> +            return *ptr;
> +        default:
Are there other page0 registers that are not modeled, for which this
implementation would return 0?
> +            break;
> +        }
> +    }
I would prefer we have a clear separation between page0 accesses and
page1 accesses (meaning 2 different helpers).

This would help understand that page1 accesses are always trapped
whereas page0 access can be either cached or mmapped.
> +
>      switch (offset0) {
>      case A_VCMDQ0_CONS_INDX:
>          return cmdqv->vcmdq_cons_indx[index];
> @@ -120,11 +148,29 @@ static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
>   *
>   * The caller normalizes the MMIO offset such that @offset0 always refers
>   * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
> + *
> + * If the VCMDQ is allocated and VINTF page0 is mmap'd, write directly
> + * to the VINTF page0 backing. Otherwise, update cached state.
>   */
>  static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
>                                         int index, uint64_t value,
>                                         unsigned size, Error **errp)
>  {
> +    uint32_t *ptr = tegra241_cmdqv_vintf_ptr(cmdqv, index, offset0);
> +
> +    if (ptr) {
> +        switch (offset0) {
> +        case A_VCMDQ0_CONS_INDX:
> +        case A_VCMDQ0_PROD_INDX:
> +        case A_VCMDQ0_CONFIG:
> +        case A_VCMDQ0_GERRORN:
> +            *ptr = (uint32_t)value;
> +            return;
> +        default:
> +            break;
> +        }
> +    }
> +
>      switch (offset0) {
>      case A_VCMDQ0_CONS_INDX:
>          cmdqv->vcmdq_cons_indx[index] = (uint32_t)value;
Thanks

Eric
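
For readers following the thread, here is a minimal standalone sketch of the page0 routing discussed above: a pointer into the mmap'd backing is used once a queue is allocated, and the register cache is used otherwise. It is not the patch's code. The names (SketchCmdqv, sketch_page0_*), the stride and the register offsets are all hypothetical, and a plain array stands in for the mmap'd VINTF page0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout values for illustration; not taken from the series. */
#define SKETCH_MAX_VCMDQ 2
#define SKETCH_STRIDE    0x80u  /* assumed byte stride between VCMDQ slots */
#define SKETCH_REG_CONS  0x00u  /* assumed offset of CONS_INDX in a slot */
#define SKETCH_REG_PROD  0x04u  /* assumed offset of PROD_INDX in a slot */

typedef struct {
    int allocated[SKETCH_MAX_VCMDQ];     /* non-zero once a HW queue exists */
    uint8_t *page0;                      /* stand-in for mmap'd page0, or NULL */
    uint32_t cache[SKETCH_MAX_VCMDQ][2]; /* pre-alloc register cache */
} SketchCmdqv;

/* Return a pointer into the backing page, or NULL while still pre-alloc. */
static uint32_t *sketch_page0_ptr(SketchCmdqv *s, int index, uint32_t reg)
{
    assert(index >= 0 && index < SKETCH_MAX_VCMDQ);
    assert(reg == SKETCH_REG_CONS || reg == SKETCH_REG_PROD);
    if (!s->allocated[index] || !s->page0) {
        return NULL;
    }
    return (uint32_t *)(s->page0 + (size_t)index * SKETCH_STRIDE + reg);
}

/* Page0 read: backing page when allocated, register cache otherwise. */
static uint32_t sketch_page0_read(SketchCmdqv *s, int index, uint32_t reg)
{
    uint32_t *ptr = sketch_page0_ptr(s, index, reg);

    return ptr ? *ptr : s->cache[index][reg / 4];
}

/* Page0 write: same routing decision as the read path. */
static void sketch_page0_write(SketchCmdqv *s, int index, uint32_t reg,
                               uint32_t value)
{
    uint32_t *ptr = sketch_page0_ptr(s, index, reg);

    if (ptr) {
        *ptr = value;
    } else {
        s->cache[index][reg / 4] = value;
    }
}
```

Keeping the page0 helpers separate like this, with page1 handled elsewhere, is one way to make it visible that only page0 accesses ever switch between the cache and the mmap'd backing.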



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 21/31] memory: Allow RAM device regions to skip IOMMU mapping
  2026-04-15 10:55 ` [PATCH v4 21/31] memory: Allow RAM device regions to skip IOMMU mapping Shameer Kolothum
@ 2026-05-06 12:39   ` Eric Auger
  0 siblings, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-06 12:39 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> Some RAM device regions created with memory_region_init_ram_device_ptr()
> are not intended to be P2P DMA targets.
>
> The VFIO listener currently treats all RAM device regions as DMA
> capable and attempts to map them into the IOMMU. For regions without
> dma-buf backing this fails and prints warnings such as:
>
>   IOMMU_IOAS_MAP failed: Bad address, PCI BAR?
>
> Introduce a MemoryRegion flag (ram_device_skip_iommu_map) to mark RAM
> device regions that should not be IOMMU mapped. When set, the VFIO
> listener skips DMA mapping for that region.
>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  include/system/memory.h | 2 ++
>  hw/vfio/listener.c      | 5 +++++
>  2 files changed, 7 insertions(+)
>
> diff --git a/include/system/memory.h b/include/system/memory.h
> index 7aed255e81..9df15e833a 100644
> --- a/include/system/memory.h
> +++ b/include/system/memory.h
> @@ -864,6 +864,8 @@ struct MemoryRegion {
>  
>      /* For devices designed to perform re-entrant IO into their own IO MRs */
>      bool disable_reentrancy_guard;
> +    /* RAM device region that does not require IOMMU mapping for P2P */
> +    bool ram_device_skip_iommu_map;
>  };
>  
>  struct IOMMUMemoryRegion {
> diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
> index 960da9e0a9..32d33a740a 100644
> --- a/hw/vfio/listener.c
> +++ b/hw/vfio/listener.c
> @@ -614,6 +614,11 @@ void vfio_container_region_add(VFIOContainer *bcontainer,
>          }
>      }
>  
> +    if (memory_region_is_ram_device(section->mr) &&
> +        section->mr->ram_device_skip_iommu_map) {
nit: it might be better to create an accessor that includes both checks,
like memory_region_is_protected().

Thanks

Eric
> +        return;
> +    }
> +
>      ret = vfio_container_dma_map(bcontainer, iova, int128_get64(llsize),
>                                   vaddr, section->readonly, section->mr);
>      if (ret) {
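
A minimal sketch of the accessor idea raised in the review, assuming a stand-in struct rather than QEMU's MemoryRegion. The sketch_* names are hypothetical and do not exist in QEMU; the point is only that the listener would call one predicate, and the device model one setter, instead of touching the field directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the two MemoryRegion properties involved; not QEMU's struct. */
typedef struct {
    bool ram_device;                /* region is a RAM device region */
    bool ram_device_skip_iommu_map; /* flag proposed in this patch */
} SketchMemoryRegion;

/* Hypothetical setter: only meaningful for RAM device regions. */
static void sketch_mr_set_skip_iommu_map(SketchMemoryRegion *mr, bool skip)
{
    assert(mr != NULL);
    assert(mr->ram_device);
    mr->ram_device_skip_iommu_map = skip;
}

/* Hypothetical accessor folding both checks into one predicate. */
static bool sketch_mr_skip_iommu_map(const SketchMemoryRegion *mr)
{
    assert(mr != NULL);
    return mr->ram_device && mr->ram_device_skip_iommu_map;
}
```

With such a pair, the check in the listener would reduce to a single call on section->mr.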



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space
  2026-04-15 10:55 ` [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space Shameer Kolothum
@ 2026-05-06 12:44   ` Eric Auger
  2026-05-06 14:24     ` Shameer Kolothum Thodi
  0 siblings, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-06 12:44 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina

Hi Shameer,

On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Once a VCMDQ is allocated, map the mmap'd vintf_page0 region directly
> into the guest-visible MMIO space at offset 0x30000 as a RAM-backed
> MemoryRegion. This eliminates QEMU trapping for hot-path CONS/PROD
> index updates.
>
> After this patch, the two VCMDQ apertures use different access paths:
> the direct aperture (0x10000) remains QEMU-trapped and writes via
> vintf_ptr, while the VI aperture (0x30000) is a direct guest RAM
> mapping. Both paths write to the same underlying vintf_page0 memory,
> so no synchronisation between the apertures is needed.

I fail to understand when the previous trapped path using ptr in
tegra241_cmdqv_read/write_vcmdq gets used versus that path. Is it
eventually used?
>
> The mapping is installed lazily on first successful VCMDQ hardware
> queue allocation and removed when CMDQV or VINTF is disabled.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.h |  1 +
>  hw/arm/tegra241-cmdqv.c | 37 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 38 insertions(+)
>
> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> index 039d86374f..2befa6205e 100644
> --- a/hw/arm/tegra241-cmdqv.h
> +++ b/hw/arm/tegra241-cmdqv.h
> @@ -46,6 +46,7 @@ typedef struct Tegra241CMDQV {
>      IOMMUFDVeventq *veventq;
>      IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
>      void *vintf_page0;
> +    MemoryRegion *mr_vintf_page0;
>  
>      /* Register Cache */
>      uint32_t config;
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index eb619e1134..bf989dd51f 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -15,6 +15,40 @@
>  #include "tegra241-cmdqv.h"
>  #include "trace.h"
>  
> +static void tegra241_cmdqv_guest_unmap_vintf_page0(Tegra241CMDQV *cmdqv)
> +{
> +    if (!cmdqv->mr_vintf_page0) {
> +        return;
> +    }
> +
> +    memory_region_del_subregion(&cmdqv->mmio_cmdqv, cmdqv->mr_vintf_page0);
> +    object_unparent(OBJECT(cmdqv->mr_vintf_page0));
> +    g_free(cmdqv->mr_vintf_page0);
> +    cmdqv->mr_vintf_page0 = NULL;
> +}
> +
> +static void tegra241_cmdqv_guest_map_vintf_page0(Tegra241CMDQV *cmdqv)
> +{
> +    char *name;
> +
> +    if (cmdqv->mr_vintf_page0) {
> +        return;
> +    }
> +
> +    name = g_strdup_printf("%s vintf-page0",
> +                           memory_region_name(&cmdqv->mmio_cmdqv));
> +    cmdqv->mr_vintf_page0 = g_malloc0(sizeof(*cmdqv->mr_vintf_page0));
> +    memory_region_init_ram_device_ptr(cmdqv->mr_vintf_page0,
> +                                      memory_region_owner(&cmdqv->mmio_cmdqv),
> +                                      name, VINTF_PAGE_SIZE,
> +                                      cmdqv->vintf_page0);
> +    cmdqv->mr_vintf_page0->ram_device_skip_iommu_map = true;
I guess you need a setter here too.
> +    memory_region_add_subregion_overlap(&cmdqv->mmio_cmdqv,
> +                                        CMDQV_VINTF_PAGE0_BASE,
> +                                        cmdqv->mr_vintf_page0, 1);
> +    g_free(name);
> +}
> +
>  static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
>  {
>      IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
> @@ -72,6 +106,7 @@ static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
>      hw_queue->viommu = viommu;
>      cmdqv->vcmdq[index] = hw_queue;
>  
> +    tegra241_cmdqv_guest_map_vintf_page0(cmdqv);
>      return true;
>  }
>  
> @@ -312,6 +347,7 @@ static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
>                  cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
>              }
>          } else {
> +            tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
>              tegra241_cmdqv_free_all_vcmdq(cmdqv);
>              tegra241_cmdqv_munmap_vintf_page0(cmdqv, errp);
>              cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
> @@ -438,6 +474,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>          if (value & R_CONFIG_CMDQV_EN_MASK) {
>              cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
>          } else {
> +            tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
>              tegra241_cmdqv_free_all_vcmdq(cmdqv);
>              cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
>          }
Thanks

Eric



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 23/31] hw/arm/smmuv3-accel: Introduce common helper for veventq read
  2026-04-15 10:55 ` [PATCH v4 23/31] hw/arm/smmuv3-accel: Introduce common helper for veventq read Shameer Kolothum
  2026-05-05  1:07   ` Nicolin Chen
@ 2026-05-06 12:49   ` Eric Auger
  1 sibling, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-06 12:49 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> Move the vEVENTQ read and validation logic into a common helper
> smmuv3_accel_event_read_validate(). The helper performs the read(),
> checks for overflow and short reads, validates the sequence number,
> and updates the sequence state.
>
> This helper can be reused for Tegra241 CMDQV vEVENTQ support in a
> subsequent patch.
>
> Error handling is slightly adjusted: instead of reporting errors
> directly in the read handler, the helper now returns errors via
> Error **. Sequence gaps are reported as warnings.
>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  hw/arm/smmuv3-accel.h       |  2 ++
>  hw/arm/smmuv3-accel-stubs.c | 11 ++++++
>  hw/arm/smmuv3-accel.c       | 67 ++++++++++++++++++++++---------------
>  3 files changed, 53 insertions(+), 27 deletions(-)
>
> diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
> index 28bceca061..448f47c0ca 100644
> --- a/hw/arm/smmuv3-accel.h
> +++ b/hw/arm/smmuv3-accel.h
> @@ -71,6 +71,8 @@ bool smmuv3_accel_issue_inv_cmd(SMMUv3State *s, void *cmd, SMMUDevice *sdev,
>                                  Error **errp);
>  void smmuv3_accel_idr_override(SMMUv3State *s);
>  bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp);
> +bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
> +                                      void *buf, size_t size, Error **errp);
>  void smmuv3_accel_reset(SMMUv3State *s);
>  
>  #endif /* HW_ARM_SMMUV3_ACCEL_H */
> diff --git a/hw/arm/smmuv3-accel-stubs.c b/hw/arm/smmuv3-accel-stubs.c
> index c08caa6fa4..e8f08dc833 100644
> --- a/hw/arm/smmuv3-accel-stubs.c
> +++ b/hw/arm/smmuv3-accel-stubs.c
> @@ -41,6 +41,17 @@ void smmuv3_accel_idr_override(SMMUv3State *s)
>  {
>  }
>  
> +bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp)
> +{
> +    return true;
> +}
> +
> +bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
> +                                      void *buf, size_t size, Error **errp)
> +{
> +    return true;
> +}
> +
>  void smmuv3_accel_reset(SMMUv3State *s)
>  {
>  }
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index 9068e65e2b..230f608f03 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -436,47 +436,60 @@ bool smmuv3_accel_issue_inv_cmd(SMMUv3State *bs, void *cmd, SMMUDevice *sdev,
>                     sizeof(Cmd), &entry_num, cmd, errp);
>  }
>  
> -static void smmuv3_accel_event_read(void *opaque)
> +bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
> +                                      void *buf, size_t size, Error **errp)
>  {
> -    SMMUv3State *s = opaque;
> -    IOMMUFDVeventq *veventq = s->s_accel->veventq;
> -    struct {
> -        struct iommufd_vevent_header hdr;
> -        struct iommu_vevent_arm_smmuv3 vevent;
> -    } buf;
> -    enum iommu_veventq_type type = IOMMU_VEVENTQ_TYPE_ARM_SMMUV3;
> -    uint32_t id = veventq->veventq_id;
>      uint32_t last_seq = veventq->last_event_seq;
> +    uint32_t id = veventq->veventq_id;
> +    struct iommufd_vevent_header *hdr;
>      ssize_t bytes;
>  
> -    bytes = read(veventq->veventq_fd, &buf, sizeof(buf));
> +    bytes = read(veventq->veventq_fd, buf, size);
>      if (bytes <= 0) {
>          if (errno == EAGAIN || errno == EINTR) {
> -            return;
> +            return true;
>          }
> -        error_report_once("vEVENTQ(type %u id %u): read failed (%m)", type, id);
> -        return;
> +        error_setg(errp, "vEVENTQ(type %u id %u): read failed (%m)", type, id);
> +        return false;
>      }
> -
> -    if (bytes == sizeof(buf.hdr) &&
> -        (buf.hdr.flags & IOMMU_VEVENTQ_FLAG_LOST_EVENTS)) {
> -        error_report_once("vEVENTQ(type %u id %u): overflowed", type, id);
> +    hdr = (struct iommufd_vevent_header *)buf;
> +    if (bytes == sizeof(*hdr) &&
> +        (hdr->flags & IOMMU_VEVENTQ_FLAG_LOST_EVENTS)) {
> +        error_setg(errp, "vEVENTQ(type %u id %u): overflowed", type, id);
>          veventq->event_start = false;
> -        return;
> +        return false;
>      }
> -    if (bytes < sizeof(buf)) {
> -        error_report_once("vEVENTQ(type %u id %u): short read(%zd/%zd bytes)",
> -                          type, id, bytes, sizeof(buf));
> -        return;
> +    if (bytes < size) {
> +        error_setg(errp, "vEVENTQ(type %u id %u): short read(%zd/%zd bytes)",
> +                          type, id, bytes, size);
> +        return false;
>      }
> -
>      /* Check sequence in hdr for lost events if any */
> -    if (veventq->event_start && (buf.hdr.sequence - last_seq != 1)) {
> -        error_report_once("vEVENTQ(type %u id %u): lost %u event(s)",
> -                          type, id, buf.hdr.sequence - last_seq - 1);
> +    if (veventq->event_start && (hdr->sequence - last_seq != 1)) {
> +        warn_report("vEVENTQ(type %u id %u): lost %u event(s)",
> +                    type, id, hdr->sequence - last_seq - 1);
>      }
> -    veventq->last_event_seq = buf.hdr.sequence;
> +    veventq->last_event_seq = hdr->sequence;
>      veventq->event_start = true;
> +    return true;
> +}
> +
> +static void smmuv3_accel_event_read(void *opaque)
> +{
> +    SMMUv3State *s = opaque;
> +    IOMMUFDVeventq *veventq = s->s_accel->veventq;
> +    struct {
> +        struct iommufd_vevent_header hdr;
> +        struct iommu_vevent_arm_smmuv3 vevent;
> +    } buf;
> +    Error *local_err = NULL;
> +
> +    if (!smmuv3_accel_event_read_validate(veventq,
> +                                          IOMMU_VEVENTQ_TYPE_ARM_SMMUV3, &buf,
> +                                          sizeof(buf), &local_err)) {
> +        warn_report_err_once(local_err);
> +        return;
> +    }
>      smmuv3_propagate_event(s, (Evt *)&buf.vevent);
>  }
>  



^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing
  2026-05-05 19:52       ` Nicolin Chen
@ 2026-05-06 13:16         ` Shameer Kolothum Thodi
  2026-05-06 18:34           ` Nicolin Chen
  0 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-06 13:16 UTC (permalink / raw)
  To: Nicolin Chen, eric.auger@redhat.com
  Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	zhenzhong.duan@intel.com, Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: 05 May 2026 20:52
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org;
> eric.auger@redhat.com; peter.maydell@linaro.org; clg@redhat.com;
> alex@shazbot.org; Nathan Chen <nathanc@nvidia.com>; Matt Ochs
> <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF
> page0 as VCMDQ backing
> 
> On Tue, May 05, 2026 at 08:13:40AM -0700, Shameer Kolothum Thodi
> wrote:
> > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > On Wed, Apr 15, 2026 at 11:55:41AM +0100, Shameer Kolothum wrote:
> > > > The pre-to-post-alloc transition is triggered by the BASE register
> > > > write that initiates IOMMU_HW_QUEUE_ALLOC. No cache-to-hardware
> > > > synchronisation is needed at transition time. The hardware mandated
> > > > init sequence requires BASE to be written first; PROD_INDX, CONS_INDX
> > > > and CONFIG.CMDQ_EN are programmed only after BASE and are therefore
> > > > always post-alloc.
> > > >
> > > > Any pre-alloc writes to those registers update only the register
> > > > cache, which is discarded at the transition.
> > >
> > > Is "discard" the correct action?
> > >
> > > Guest OS might expect HW (VM) to retain what it writes to those
> > > page0 registers?
> >
> > As explained above, I reached that conclusion based on spec p.174/175:
> >
> > Under "Enabling the Virtual CMDQ" it specifies the init order as below:
> >
> > - Program the VIRT_CMDQ_BASE
> > - Init the PROD_INDX/ CONS_INDX to 0 or a consistent value
> > - Program the VIRT_CMDQ_CONFIG to enable the CMDQ
> >
> > Since we set up the HW QUEUE on BASE write, a spec compliant guest is
> > expected to program the Page0 registers as above after that, and we pass
> > those writes directly to the VINTF Page0. There are no GERRORN writes
> > before HW queue setup either.
> 
> I don't think we can make an assumption that guest would follow
> what spec suggests.
> 
> Could this be a case:
>  - Write the PROD_INDX/ CONS_INDX with a consistent value (0xf)
>  - Program the VIRT_CMDQ_BASE
>  - Read the PROD_INDX/CONS_INDX, expecting 0xf
>  - Write the PROD_INDX/ CONS_INDX with 0x0
>  - Program the VIRT_CMDQ_CONFIG to enable the CMDQ

Yes, if the guest is not spec compliant, then the order can be anything.

However,

>  - Program the VIRT_CMDQ_BASE -- suppose we call setup_vcmdq()
then kernel will reset PROD/CONS, right?
https://lore.kernel.org/all/20260129224341.1594785-1-nicolinc@nvidia.com/

Again, the question is whether we should handle all these non-compliant
cases to prevent the guest from shooting itself in the foot, or simply
propagate any hardware error events due to the non-compliant
behaviour.

On the other side, if we discard the register values, the read-what-you-write
contract is broken.

I am not sure what the QEMU guideline is w.r.t. non-spec-compliant guest
behaviour.

@Eric, any thoughts?

We already have a spec-compliant behaviour assumption for accel
vEVENTQ support, as discussed here:

https://lore.kernel.org/qemu-devel/CH3PR12MB75485AFF456B4133B9FB3770ABA0A@CH3PR12MB7548.namprd12.prod.outlook.com/

Thanks,
Shameer


 


^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space
  2026-05-06 12:44   ` Eric Auger
@ 2026-05-06 14:24     ` Shameer Kolothum Thodi
  2026-05-07 16:24       ` Eric Auger
  0 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-06 14:24 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com

Hi Eric,

> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 06 May 2026 13:45
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0
> into guest MMIO space
> 
> Hi Shameer,
> 
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> >
> > Once a VCMDQ is allocated, map the mmap'd vintf_page0 region directly
> > into the guest-visible MMIO space at offset 0x30000 as a RAM-backed
> > MemoryRegion. This eliminates QEMU trapping for hot-path CONS/PROD
> > index updates.
> >
> > After this patch, the two VCMDQ apertures use different access paths:
> > the direct aperture (0x10000) remains QEMU-trapped and writes via
> > vintf_ptr, while the VI aperture (0x30000) is a direct guest RAM
> > mapping. Both paths write to the same underlying vintf_page0 memory,
> > so no synchronisation between the apertures is needed.
> 
> I fail to understand when the previous trapped path using ptr in
> tegra241_cmdqv_read/write_vcmdq gets used versus that path. Is it
> eventually used?

The spec does not prevent a guest from using the 0x10000 path for allocated
VCMDQs, so the trap path remains valid and QEMU forwards those accesses to
the mmap'd vintf_page0 via vintf_ptr.

We cannot map 0x10000 directly to the guest as RAM because the kernel
mmap only backs VCMDQs actually allocated via HW_QUEUE ioctl. If the
guest allocates only 1 of 2 VCMDQs, exposing the full direct aperture page0
as RAM would give the guest access to unallocated VCMDQ slots. Hence, it
remains trapped, and QEMU forwards to VINTF page0 only for allocated queues.

Thanks,
Shameer

> >
> > The mapping is installed lazily on first successful VCMDQ hardware
> > queue allocation and removed when CMDQV or VINTF is disabled.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > ---
> >  hw/arm/tegra241-cmdqv.h |  1 +
> >  hw/arm/tegra241-cmdqv.c | 37 +++++++++++++++++++++++++++++++++++++
> >  2 files changed, 38 insertions(+)
> >
> > diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> > index 039d86374f..2befa6205e 100644
> > --- a/hw/arm/tegra241-cmdqv.h
> > +++ b/hw/arm/tegra241-cmdqv.h
> > @@ -46,6 +46,7 @@ typedef struct Tegra241CMDQV {
> >      IOMMUFDVeventq *veventq;
> >      IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
> >      void *vintf_page0;
> > +    MemoryRegion *mr_vintf_page0;
> >
> >      /* Register Cache */
> >      uint32_t config;
> > diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> > index eb619e1134..bf989dd51f 100644
> > --- a/hw/arm/tegra241-cmdqv.c
> > +++ b/hw/arm/tegra241-cmdqv.c
> > @@ -15,6 +15,40 @@
> >  #include "tegra241-cmdqv.h"
> >  #include "trace.h"
> >
> > +static void tegra241_cmdqv_guest_unmap_vintf_page0(Tegra241CMDQV *cmdqv)
> > +{
> > +    if (!cmdqv->mr_vintf_page0) {
> > +        return;
> > +    }
> > +
> > +    memory_region_del_subregion(&cmdqv->mmio_cmdqv, cmdqv->mr_vintf_page0);
> > +    object_unparent(OBJECT(cmdqv->mr_vintf_page0));
> > +    g_free(cmdqv->mr_vintf_page0);
> > +    cmdqv->mr_vintf_page0 = NULL;
> > +}
> > +
> > +static void tegra241_cmdqv_guest_map_vintf_page0(Tegra241CMDQV *cmdqv)
> > +{
> > +    char *name;
> > +
> > +    if (cmdqv->mr_vintf_page0) {
> > +        return;
> > +    }
> > +
> > +    name = g_strdup_printf("%s vintf-page0",
> > +                           memory_region_name(&cmdqv->mmio_cmdqv));
> > +    cmdqv->mr_vintf_page0 = g_malloc0(sizeof(*cmdqv->mr_vintf_page0));
> > +    memory_region_init_ram_device_ptr(cmdqv->mr_vintf_page0,
> > +                                      memory_region_owner(&cmdqv->mmio_cmdqv),
> > +                                      name, VINTF_PAGE_SIZE,
> > +                                      cmdqv->vintf_page0);
> > +    cmdqv->mr_vintf_page0->ram_device_skip_iommu_map = true;
> I guess you need a setter here too.
> > +    memory_region_add_subregion_overlap(&cmdqv->mmio_cmdqv,
> > +                                        CMDQV_VINTF_PAGE0_BASE,
> > +                                        cmdqv->mr_vintf_page0, 1);
> > +    g_free(name);
> > +}
> > +
> >  static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
> >  {
> >      IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
> > @@ -72,6 +106,7 @@ static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
> >      hw_queue->viommu = viommu;
> >      cmdqv->vcmdq[index] = hw_queue;
> >
> > +    tegra241_cmdqv_guest_map_vintf_page0(cmdqv);
> >      return true;
> >  }
> >
> > @@ -312,6 +347,7 @@ static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
> >                  cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
> >              }
> >          } else {
> > +            tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
> >              tegra241_cmdqv_free_all_vcmdq(cmdqv);
> >              tegra241_cmdqv_munmap_vintf_page0(cmdqv, errp);
> >              cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
> > @@ -438,6 +474,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
> >          if (value & R_CONFIG_CMDQV_EN_MASK) {
> >              cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
> >          } else {
> > +            tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
> >              tegra241_cmdqv_free_all_vcmdq(cmdqv);
> >              cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
> >          }
> Thanks
> 
> Eric


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-04-15 10:55 ` [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming Shameer Kolothum
  2026-05-05  0:40   ` Nicolin Chen
  2026-05-05 13:25   ` Eric Auger
@ 2026-05-06 16:51   ` Eric Auger
  2026-05-06 18:21       ` Nicolin Chen via qemu development
  2 siblings, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-06 16:51 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Add support for allocating IOMMUFD hardware queues when the guest
> programs the VCMDQ BASE registers.
>
> VCMDQ_EN is part of the VCMDQ_CONFIG register, which is accessed
> through the VINTF Page0 region. A subsequent patch maps this region
> directly into the guest address space, so QEMU does not trap writes
> to VCMDQ_CONFIG.
>
> Since VCMDQ_EN writes are not trapped, QEMU cannot allocate the
> hardware queue based on that bit. Instead, allocate the IOMMUFD
> hardware queue when the guest writes a VCMDQ BASE register with a
> valid RAM-backed address and when CMDQV and VINTF are enabled.
>
> If a hardware queue was previously allocated for the same VCMDQ,
> free it before reallocation.
>
> Writes with invalid addresses are ignored.
>
> All allocated VCMDQs are freed when CMDQV or VINTF is disabled.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.h | 11 +++++++
>  hw/arm/tegra241-cmdqv.c | 70 +++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 78 insertions(+), 3 deletions(-)
>
> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> index 88572ad939..039d86374f 100644
> --- a/hw/arm/tegra241-cmdqv.h
> +++ b/hw/arm/tegra241-cmdqv.h
> @@ -44,6 +44,7 @@ typedef struct Tegra241CMDQV {
>      MemoryRegion mmio_cmdqv;
>      qemu_irq irq;
>      IOMMUFDVeventq *veventq;
> +    IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
>      void *vintf_page0;
>  
>      /* Register Cache */
> @@ -348,6 +349,16 @@ A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
>  A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
>  A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
>  
> +static inline bool tegra241_cmdq_enabled(Tegra241CMDQV *cmdqv)
> +{
> +    return cmdqv->status & R_STATUS_CMDQV_ENABLED_MASK;
> +}
> +
> +static inline bool tegra241_vintf_enabled(Tegra241CMDQV *cmdqv)
> +{
> +    return cmdqv->vintf_status & R_VINTF0_STATUS_ENABLE_OK_MASK;
> +}
> +
>  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
>  
>  #endif /* HW_ARM_TEGRA241_CMDQV_H */
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index cdd941cec9..b5f2f74cf2 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -15,6 +15,66 @@
>  #include "tegra241-cmdqv.h"
>  #include "trace.h"
>  
> +static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
> +{
> +    IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
> +    IOMMUFDHWqueue *vcmdq = cmdqv->vcmdq[index];
> +
> +    if (!vcmdq) {
> +        return;
> +    }
> +    iommufd_backend_free_id(viommu->iommufd, vcmdq->hw_queue_id);
> +    g_free(vcmdq);
> +    cmdqv->vcmdq[index] = NULL;
> +}
> +
> +static void tegra241_cmdqv_free_all_vcmdq(Tegra241CMDQV *cmdqv)
> +{
> +    /* Free in reverse order to avoid "resource busy" error */
> +    for (int i = (TEGRA241_CMDQV_MAX_CMDQ - 1); i >= 0; i--) {
> +        tegra241_cmdqv_free_vcmdq(cmdqv, i);
uapi/linux/iommufd.h says:
* - alloc starts from the lowest @index=0 in ascending order
* - destroy starts from the last allocated @index in descending order

So this seems to be a requirement of the uAPI, if I understand correctly.

Eric

> +    }
> +}
> +
> +static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
> +                                       Error **errp)
> +{
> +    SMMUv3AccelState *accel = cmdqv->s_accel;
> +    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
> +                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
> +    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
> +    uint64_t log2 = cmdqv->vcmdq_base[index] & R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
> +    uint64_t size = 1ULL << (log2 + 4);
> +    IOMMUFDViommu *viommu = accel->viommu;
> +    IOMMUFDHWqueue *hw_queue;
> +    uint32_t hw_queue_id;
> +
> +    /* Ignore any invalid address. This may come as part of reset etc. */
> +    if (!address_space_is_ram(&address_space_memory, addr) ||
> +        !address_space_is_ram(&address_space_memory, addr + size - 1)) {
> +        return true;
> +    }
> +
> +    if (!tegra241_cmdq_enabled(cmdqv) || !tegra241_vintf_enabled(cmdqv)) {
> +        return true;
> +    }
> +
> +    tegra241_cmdqv_free_vcmdq(cmdqv, index);
> +
> +    if (!iommufd_backend_alloc_hw_queue(viommu->iommufd, viommu->viommu_id,
> +                                        IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV,
> +                                        index, addr, size, &hw_queue_id,
> +                                        errp)) {
> +        return false;
> +    }
> +    hw_queue = g_new(IOMMUFDHWqueue, 1);
> +    hw_queue->hw_queue_id = hw_queue_id;
> +    hw_queue->viommu = viommu;
> +    cmdqv->vcmdq[index] = hw_queue;
> +
> +    return true;
> +}
> +
>  /*
>   * Read a VCMDQ register using VCMDQ0_* offsets.
>   *
> @@ -63,7 +123,7 @@ static uint64_t tegra241_cmdqv_read_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
>   */
>  static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
>                                         int index, uint64_t value,
> -                                       unsigned size)
> +                                       unsigned size, Error **errp)
>  {
>      switch (offset0) {
>      case A_VCMDQ0_CONS_INDX:
> @@ -91,11 +151,13 @@ static void tegra241_cmdqv_write_vcmdq(Tegra241CMDQV *cmdqv, hwaddr offset0,
>                  (cmdqv->vcmdq_base[index] & 0xffffffff00000000ULL) |
>                  (value & 0xffffffffULL);
>          }
> +        tegra241_cmdqv_setup_vcmdq(cmdqv, index, errp);
>          return;
>      case A_VCMDQ0_BASE_H:
>          cmdqv->vcmdq_base[index] =
>              (cmdqv->vcmdq_base[index] & 0xffffffffULL) |
>              ((uint64_t)value << 32);
> +        tegra241_cmdqv_setup_vcmdq(cmdqv, index, errp);
>          return;
>      case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
>          if (size == 8) {
> @@ -204,6 +266,7 @@ static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
>                  cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
>              }
>          } else {
> +            tegra241_cmdqv_free_all_vcmdq(cmdqv);
>              tegra241_cmdqv_munmap_vintf_page0(cmdqv, errp);
>              cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
>          }
> @@ -329,6 +392,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>          if (value & R_CONFIG_CMDQV_EN_MASK) {
>              cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
>          } else {
> +            tegra241_cmdqv_free_all_vcmdq(cmdqv);
>              cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
>          }
>          break;
> @@ -363,7 +427,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>           */
>          index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
>          tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
> -                                   index, value, size);
> +                                   index, value, size, &local_err);
>          break;
>      case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
>          /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above */
> @@ -373,7 +437,7 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
>          /* Same decode logic as VCMDQ Page0 case above */
>          index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
>          tegra241_cmdqv_write_vcmdq(cmdqv, offset - index * CMDQV_VCMDQ_STRIDE,
> -                                   index, value, size);
> +                                   index, value, size, &local_err);
>          break;
>      default:
>          qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-05-05 14:26     ` Shameer Kolothum Thodi
@ 2026-05-06 17:49       ` Nicolin Chen
  2026-05-08 14:50       ` Eric Auger
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-06 17:49 UTC (permalink / raw)
  To: Shameer Kolothum Thodi
  Cc: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com

On Tue, May 05, 2026 at 07:26:29AM -0700, Shameer Kolothum Thodi wrote:
> > > +static void tegra241_cmdqv_free_all_vcmdq(Tegra241CMDQV *cmdqv)
> > > +{
> > > +    /* Free in reverse order to avoid "resource busy" error */
> > can you provide additional details about the above problematic. Is it
> > documented in the spec?
> 
> See p.176:
> 
> Deallocate a VCMDQ from a Virtual Interface 
>     Logical CMDQ being deallocated for a Guest must be in decreasing order
>     starting from the highest numbered LVCMDQ.

Maybe add "Spec demands" to things like this.

> > > +    for (int i = (TEGRA241_CMDQV_MAX_CMDQ - 1); i >= 0; i--) {
> > > +        tegra241_cmdqv_free_vcmdq(cmdqv, i);
> > > +    }
> > > +}
> > > +
> > > +static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int
> > index,
> > > +                                       Error **errp)
> > > +{
> > > +    SMMUv3AccelState *accel = cmdqv->s_accel;
> > > +    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
> > > +                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
> > > +    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
> > > +    uint64_t log2 = cmdqv->vcmdq_base[index] &
> > R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
> > > +    uint64_t size = 1ULL << (log2 + 4);
> > > +    IOMMUFDViommu *viommu = accel->viommu;
> > > +    IOMMUFDHWqueue *hw_queue;
> > > +    uint32_t hw_queue_id;
> > > +
> > > +    /* Ignore any invalid address. This may come as part of reset etc. */
> > > +    if (!address_space_is_ram(&address_space_memory, addr) ||
> > > +        !address_space_is_ram(&address_space_memory, addr + size - 1)) {
> > this check looks a little bit risky, no? Don't we have a better way to
> > test the address has been properly set?
> 
> I think eventually kernel will handle any attempt to use an invalid address
> through IOMMU_HW_QUEUE_ALLOC IOCTL:
>      iommufd_hw_queue_alloc_phys()/iommufd_access_pin_pages() etc.
> 
> Any attempt to pass an invalid address will return error, I think.
> 
> @Nicolin, is that a safe assumption to make?

The kernel validates that the gPA range is in the stage-2 page table and
physically contiguous. It might be nicer if QEMU had a way to make
sure that the range is in RAM.

Nicolin
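
To make the range under discussion concrete, here is a minimal sketch of how the checked span is derived. It follows the size computation in the quoted patch (1ULL << (log2 + 4), i.e. 2^LOG2SIZE entries of 16 bytes each). The helper names are hypothetical and the overflow guard is an assumption that is not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Queue size in bytes: 2^log2size entries of 16 bytes each. */
static uint64_t vcmdq_size_bytes(unsigned log2size)
{
    assert(log2size < 60);          /* keep the shift well-defined */
    return 1ULL << (log2size + 4);
}

/*
 * Compute the address of the last byte of the queue.
 * Returns false if base + size would wrap around 64 bits.
 * (Hypothetical helper; the patch only tests addr and addr + size - 1.)
 */
static bool vcmdq_last_byte(uint64_t base, unsigned log2size, uint64_t *last)
{
    uint64_t size = vcmdq_size_bytes(log2size);

    if (base > UINT64_MAX - (size - 1)) {
        return false;
    }
    *last = base + size - 1;
    return true;
}
```

Checking only the first and last byte does not prove that everything in between is RAM, which may be part of why the check was called risky above.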


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-05-06  8:18         ` Shameer Kolothum Thodi
@ 2026-05-06 18:18           ` Nicolin Chen
  0 siblings, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-06 18:18 UTC (permalink / raw)
  To: Shameer Kolothum Thodi
  Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org, eric.auger@redhat.com,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com

On Wed, May 06, 2026 at 01:18:58AM -0700, Shameer Kolothum Thodi wrote:
> > > Not sure I get that completely.
> > >
> > > Spec(p. 176) has:
> > >
> > > "While the software can program the Virtual CMDQ(s) directly using the
> > > direct VCMDQ aperture (and not through the Virtual Interface), it is
> > > required that the VCMDQ be allocated to a Virtual Interface before it
> > > is used to send commands to the SMMU."
> > >
> > > The spec only restricts sending commands before allocation, not
> > > programming BASE. In our model, the BASE write itself triggers
> > > alloc_hw_queue so there's nothing to fence there.
> > 
> > Our model has an assumption that guest would map a VCMDQ to the
> > VINTF0 first before doing any meaningful programming.
> 
> Yes. And I think by "map" here, you mean the writing to the _ALLOC_MAP
> register. If so, yes, we don't check that.

Yea, map == ALLOC_MAP.

> > My concern here is about the access to the direct/global VCMDQ MMIO
> > space (0x10000 and 0x20000).
> > 
> > So, if this concern is a real case, should we:
> >  1. If the guest maps the VCMDQ to VINTF0 before it writes BASE,
> >     call setup() to allocate hw_queue as our model expects.
> >  2. If the guest writes BASE before it maps VCMDQ to VINTF0, we
> >     need to fence against it.
> > ?
> > 
> > One more thing to check for case (2): after the guest programmed
> > BASE and then maps it to VINTF0, and it never writes BASE again,
> > should we call setup() at map?
> 
> That is a non-spec compliant behaviour I guess:.
> 
> In p.176: "Allocate a VCMDQ to the Virtual Interface", this is
> the order:
> 
> -Program the CMDQV_CMDQ_ALLOC_MAP_X to map CMDQ X to the
>  Logical CMDQ (L) on Virtual Interface (V). Logical CMDQ allocated in the
>  Guest must be in order starting from 0.
> -CMDQ is allocated to be used by the software. Guest can now program
>   the VCMDQ to use it   as described in the "Enabling the Virtual CMDQ"
>   section.
> 
> "Enabling the Virtual CMDQ" (p.175) is where BASE is programmed. So the
> spec explicitly places ALLOC_MAP before BASE.
> 
> Should we support a non-spec compliant Guest is another question.
> IMHO, supporting non-compliant behaviour adds complexity with
> no real benefit.

I read that as a spec recommendation (IOW, a SW guidance) not an
ordering enforcement (IOW, not a HW contract).

The real restriction for BASE registers is, per spec:
    Should be programmed before CMDQEN == 1
and HW behaviour for breaking this restriction is:
    Ignore Writes to the register when the CMDQEN == 1

My reading is that, so long as SW follows this minimal rule, it
can do whatever it wants.

(Btw, it seems we haven't applied this restriction in our code?)

The point is about how we are going to simulate the HW. If real
HW allows this (maybe worth hacking the driver to confirm this),
there seems to be no reason for us not to support that?

Nicolin
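
The BASE restriction quoted above (writes are ignored when CMDQEN == 1) could be modelled roughly as below. This is only a sketch: the struct and function names are hypothetical. It also assumes QEMU can observe the enable bit, which the patch description says is not trapped, so a real implementation would need another way to read it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-VCMDQ model; field names are illustrative only. */
typedef struct {
    bool cmdq_en;    /* mirrors the CMDQEN bit of the VCMDQ config */
    uint64_t base;   /* cached BASE register value */
} VcmdqModel;

/*
 * Apply a BASE write. Returns true if the value was latched and false
 * if it was dropped because the queue is already enabled.
 */
static bool vcmdq_write_base(VcmdqModel *q, uint64_t value)
{
    assert(q);
    if (q->cmdq_en) {
        return false;   /* spec: ignore writes when CMDQEN == 1 */
    }
    q->base = value;
    return true;
}
```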


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-05-06 16:51   ` Eric Auger
@ 2026-05-06 18:21       ` Nicolin Chen via qemu development
  0 siblings, 0 replies; 102+ messages in thread
From: Nicolin Chen via qemu development @ 2026-05-06 18:21 UTC (permalink / raw)
  To: Eric Auger
  Cc: Shameer Kolothum, qemu-arm, qemu-devel, peter.maydell, clg, alex,
	nathanc, mochs, jan, jgg, jonathan.cameron, zhenzhong.duan, kjaju,
	phrdina

On Wed, May 06, 2026 at 06:51:24PM +0200, Eric Auger wrote:
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > +static void tegra241_cmdqv_free_all_vcmdq(Tegra241CMDQV *cmdqv)
> > +{
> > +    /* Free in reverse order to avoid "resource busy" error */
> > +    for (int i = (TEGRA241_CMDQV_MAX_CMDQ - 1); i >= 0; i--) {
> > +        tegra241_cmdqv_free_vcmdq(cmdqv, i);
> uapi/linux/iommufd.h says:        
> * - alloc starts from the lowest @index=0 in ascending order
> * - destroy starts from the last allocated @index in descending order
> 
> so this seems a requirement of the uapi if I understand correctly

Oh yea. I forgot I put it there too. Fundamentally, it's demanded by
the HW contract. But I think you are right: for QEMU, saying that
it follows the uAPI fits better.

Nicolin
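
A small stand-alone model of the ordering rule, for illustration only. QueueTable, free_one and free_all are hypothetical names that mirror the shape of the quoted tegra241_cmdqv_free_all_vcmdq(), and the comment wording follows the uAPI text quoted above:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CMDQ 2 /* hypothetical stand-in for TEGRA241_CMDQV_MAX_CMDQ */

/* Hypothetical model of the per-index hardware queue slots. */
typedef struct {
    int allocated[MAX_CMDQ];   /* 1 if a queue exists at this index */
    int free_log[MAX_CMDQ];    /* order in which indexes were freed */
    size_t n_freed;
} QueueTable;

/* Free one slot; a no-op if nothing is allocated at that index. */
static void free_one(QueueTable *t, int index)
{
    assert(index >= 0 && index < MAX_CMDQ);
    if (!t->allocated[index]) {
        return;
    }
    t->allocated[index] = 0;
    t->free_log[t->n_freed++] = index;
}

/*
 * The iommufd uAPI requires destroy to start from the last allocated
 * index in descending order, so walk the table from the top down.
 */
static void free_all(QueueTable *t)
{
    for (int i = MAX_CMDQ - 1; i >= 0; i--) {
        free_one(t, i);
    }
}
```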


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing
  2026-05-06 13:16         ` Shameer Kolothum Thodi
@ 2026-05-06 18:34           ` Nicolin Chen
  2026-05-06 20:13             ` Shameer Kolothum Thodi
  0 siblings, 1 reply; 102+ messages in thread
From: Nicolin Chen @ 2026-05-06 18:34 UTC (permalink / raw)
  To: Shameer Kolothum Thodi
  Cc: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	zhenzhong.duan@intel.com, Krishnakant Jaju, phrdina@redhat.com

On Wed, May 06, 2026 at 06:16:00AM -0700, Shameer Kolothum Thodi wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Could this be a case:
> >  - Write the PROD_INDX/ CONS_INDX with a consistent value (0xf)
> >  - Program the VIRT_CMDQ_BASE
> >  - Read the PROD_INDX/CONS_INDX, expecting 0xf
> >  - Write the PROD_INDX/ CONS_INDX with 0x0
> >  - Program the VIRT_CMDQ_CONFIG to enable the CMDQ
> 
> Yes, if Guest is not spec compliant then this can be any order.
> 
> However,
> 
> >  - Program the VIRT_CMDQ_BASE -- suppose we call setup_vcmdq()

> then kernel will reset PROD/CONS, right?

But I don't have the impression that QEMU only supports guests running
Linux...

> Again, the question is whether we should handle all these non-compliant
> cases to prevent the guest from shooting itself in the foot, or simply
> propagate any hardware error events due to the non-compliant
> behaviour.

I don't find it convincing to call it non-compliant. HW does
not disallow that behaviour. Also, the flow in my example does not
shoot itself in the foot: it basically follows the spec-recommended
flow, only adding a pre-write before programming BASE. The remaining
steps still follow the spec recommendation exactly. Yes, it is
odd to do so, but it is not insane yet. There might be similar
corner cases that exist for reasons we can't imagine yet.

And adding these to make the QEMU model closer to real HW doesn't
seem very convoluted? I'd feed the spec to an AI and run some
analysis to see what else we need to cover here.

Nicolin

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing
  2026-05-06 18:34           ` Nicolin Chen
@ 2026-05-06 20:13             ` Shameer Kolothum Thodi
  2026-05-06 20:55               ` Nicolin Chen
  0 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-06 20:13 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	zhenzhong.duan@intel.com, Krishnakant Jaju, phrdina@redhat.com

> -----Original Message-----
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: 06 May 2026 19:35
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>
> Cc: eric.auger@redhat.com; qemu-arm@nongnu.org; qemu-
> devel@nongnu.org; peter.maydell@linaro.org; clg@redhat.com;
> alex@shazbot.org; Nathan Chen <nathanc@nvidia.com>; Matt Ochs
> <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; zhenzhong.duan@intel.com; Krishnakant Jaju
> <kjaju@nvidia.com>; phrdina@redhat.com
> Subject: Re: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF
> page0 as VCMDQ backing
> 
> On Wed, May 06, 2026 at 06:16:00AM -0700, Shameer Kolothum Thodi wrote:
> > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > Could this be a case:
> > >  - Write the PROD_INDX/ CONS_INDX with a consistent value (0xf)
> > >  - Program the VIRT_CMDQ_BASE
> > >  - Read the PROD_INDX/CONS_INDX, expecting 0xf
> > >  - Write the PROD_INDX/ CONS_INDX with 0x0
> > >  - Program the VIRT_CMDQ_CONFIG to enable the CMDQ
> >
> > Yes, if Guest is not spec compliant then this can be any order.
> >
> > However,
> >
> > >  - Program the VIRT_CMDQ_BASE -- suppose we call setup_vcmdq()
> 
> > then kernel will reset PROD/CONS, right?
> 
> But I don't have an impression that QEMU only supports Guest to run
> Linux...

Oh no, I was referring to host kernel behaviour on IOMMU_HW_QUEUE_ALLOC
IOCTL. It seems to reset PROD/CONS to zero in that path and I thought
it was following spec in that case.

> 
> > Again, the question is whether we should handle all these non-compliant
> > cases to prevent the guest from shooting itself in the foot, or simply
> > propagate any hardware error events due to the non-compliant
> > behaviour.
> 
> I don't really feel it convincing to call it non-compliant. HW does
> not disallow that behaviour. Also, the flow in my example does not
> shoot itself in the foot: it basically follows the spec recommended
> flow by adding a pre-write before programming BASE. The remaining
> steps are still exactly follow the spec recommendation. Yes, it is
> odd to do so, but it is not insane yet. There might be some similar
> corner case to exist for certain reasons that we can't imagine yet.
> 
> And adding these to making the QEMU model closer to a real HW isn't
> seemingly very convoluted? I'd feed the spec to an AI and run some
> analysis to see what else we need to cover here.

I did ask Claude about the need to sync vs. discard (feeding it the spec
and all this discussion) and had some fun 😊

1) First, it said:

This scenario writes PROD_INDX before BASE, which is explicitly
non-spec-compliant — TRM p.175 "Enabling the Virtual CMDQ"
mandates BASE as the first step, with PROD_INDX/CONS_INDX
initialization following. For a spec-compliant guest, PROD_INDX
is only written after BASE, at which point vintf_ptr() is
valid and writes go directly to hardware with no discard
involved.

Supporting this non-compliant ordering would require
replaying cached values after alloc, adding complexity for
behaviour the spec does not permit.

2) But then, after some prompting, it says:

Looking at SMMUv3's register write handlers, QEMU doesn't enforce
any programming order there either —   registers are written to cache
unconditionally. CMDQV should follow the same convention. Will fix by
replaying cached PROD_INDX/CONS_INDX values to hardware after
alloc_hw_queue succeeds, making the pre-to-post-alloc transition
transparent to the guest.

3) But it changes position again if you ask in a different way:

SMMUv3 has no equivalent transition point where hardware takes over.
In CMDQV, the BASE write is a fundamental transition — before it,
there is no hardware queue; after it, hardware owns the registers.
The BASE write is a fundamental transition point — before it, no
hardware queue exists; after it, hardware owns the Page0 registers.
Pre-alloc writes to PROD_INDX/CONS_INDX are undefined behavior by
spec, and "discard" at transition is the correct consequence.
There's nothing to "preserve across" in the SMMUv3 case.

So it goes on...

I think the spec is vague and doesn’t demand a strict order.
So it may be better to sync the registers after the alloc.

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing
  2026-05-06 20:13             ` Shameer Kolothum Thodi
@ 2026-05-06 20:55               ` Nicolin Chen
  0 siblings, 0 replies; 102+ messages in thread
From: Nicolin Chen @ 2026-05-06 20:55 UTC (permalink / raw)
  To: Shameer Kolothum Thodi
  Cc: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org,
	peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	zhenzhong.duan@intel.com, Krishnakant Jaju, phrdina@redhat.com

On Wed, May 06, 2026 at 01:13:53PM -0700, Shameer Kolothum Thodi wrote:
> > -----Original Message-----
> > On Wed, May 06, 2026 at 06:16:00AM -0700, Shameer Kolothum Thodi wrote:
> > > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > > Could this be a case:
> > > >  - Write the PROD_INDX/ CONS_INDX with a consistent value (0xf)
> > > >  - Program the VIRT_CMDQ_BASE
> > > >  - Read the PROD_INDX/CONS_INDX, expecting 0xf
> > > >  - Write the PROD_INDX/ CONS_INDX with 0x0
> > > >  - Program the VIRT_CMDQ_CONFIG to enable the CMDQ
> > >
> > > Yes, if Guest is not spec compliant then this can be any order.
> > >
> > > However,
> > >
> > > >  - Program the VIRT_CMDQ_BASE -- suppose we call setup_vcmdq()
> > 
> > > then kernel will reset PROD/CONS, right?
> > 
> > But I don't have an impression that QEMU only supports Guest to run
> > Linux...
> 
> Oh no, I was referring to host kernel behaviour on IOMMU_HW_QUEUE_ALLOC
> IOCTL. It seems to reset PROD/CONS to zero in that path and I thought
> it was following spec in that case.

Ah, that was fixing a kernel bug, where the VCMDQ registers never got
cleared even after a host reset (IIRC). And the solution is to clear
them before handing over to the guest.

Once QEMU allocates the hw_queue, which ALLOC_MAPs it to the VINTF, it
should probably sync the values that the guest programmed into the
register caches.

> This scenario writes PROD_INDX before BASE, which is explicitly
> non-spec-compliant — TRM p.175 "Enabling the Virtual CMDQ"
> mandates BASE as the first step, with PROD_INDX/CONS_INDX
> initialization following. For a spec-compliant guest, PROD_INDX
> is only written after BASE, at which point vintf_ptr() is
> valid and writes go directly to hardware with no discard
> involved.

I think the goal of a VMM is that a guest VM cannot distinguish
between real HW and emulated HW. If we just follow the minimal
"spec compliant" flow, it fails to do so. So, "spec compliant"
itself doesn't make a lot of sense to me.

With that being said, if QEMU is okay with the minimal support
for "spec compliant" flow only, I certainly won't be against
submitting it (for the initial version maybe?).

> 2) But then after some prompt it says:
> 
> Looking at SMMUv3's register write handlers, QEMU doesn't enforce
> any programming order there either —   registers are written to cache
> unconditionally. CMDQV should follow the same convention. Will fix by
> replaying cached PROD_INDX/CONS_INDX values to hardware after
> alloc_hw_queue succeeds, making the pre-to-post-alloc transition
> transparent to the guest.

Yes. We are indeed missing a few details in the implementation.

> 3) But it changes position again if you ask in a different way:
> 
> SMMUv3 has no equivalent transition point where hardware takes over.
> In CMDQV, the BASE write is a fundamental transition — before it,
> there is no hardware queue; after it, hardware owns the registers.
> The BASE write is a fundamental transition point — before it, no
> hardware queue exists; after it, hardware owns the Page0 registers.
> Pre-alloc writes to PROD_INDX/CONS_INDX are undefined behavior by
> spec, and "discard" at transition is the   correct consequence.
> There's nothing to "preserve across" in the SMMUv3 case.

Ask it to do deeper research before answering any question, or
to pinpoint where the spec declares that "undefined behaviour", for
example.

> I think the spec is vague and doesn’t demand a strict order.
> So may be better to sync the registers after the alloc.

Agreed.

Nicolin
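
A minimal sketch of the "sync after alloc" idea agreed on above, under stated assumptions: VcmdqRegs, VcmdqState and the helper names are hypothetical, the hardware page is modelled as plain memory rather than volatile MMIO, and the host kernel is assumed to hand over a page with PROD/CONS cleared, as described earlier in this thread:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register pair for one VCMDQ. */
typedef struct {
    uint32_t cons_indx;
    uint32_t prod_indx;
} VcmdqRegs;

typedef struct {
    VcmdqRegs cache;   /* values the guest wrote before allocation */
    VcmdqRegs *hw;     /* NULL until a hardware queue is allocated */
} VcmdqState;

/* Writes go to hardware once it exists, otherwise to the cache. */
static void vcmdq_write_prod(VcmdqState *s, uint32_t v)
{
    if (s->hw) {
        s->hw->prod_indx = v;
    } else {
        s->cache.prod_indx = v;
    }
}

static uint32_t vcmdq_read_prod(const VcmdqState *s)
{
    return s->hw ? s->hw->prod_indx : s->cache.prod_indx;
}

/* On allocation, replay the cached indexes into the hardware page. */
static void vcmdq_attach_hw(VcmdqState *s, VcmdqRegs *hw_page)
{
    assert(s && hw_page);
    hw_page->cons_indx = s->cache.cons_indx;
    hw_page->prod_indx = s->cache.prod_indx;
    s->hw = hw_page;
}
```

With this replay, the sequence from the earlier example (write 0xf, program BASE, read back 0xf) behaves the same before and after the allocation.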


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space
  2026-05-06 14:24     ` Shameer Kolothum Thodi
@ 2026-05-07 16:24       ` Eric Auger
  2026-05-08  9:03         ` Shameer Kolothum Thodi
  0 siblings, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-07 16:24 UTC (permalink / raw)
  To: Shameer Kolothum Thodi, qemu-arm@nongnu.org,
	qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



On 5/6/26 4:24 PM, Shameer Kolothum Thodi wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Sent: 06 May 2026 13:45
>> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
>> arm@nongnu.org; qemu-devel@nongnu.org
>> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
>> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
>> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
>> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
>> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
>> phrdina@redhat.com
>> Subject: Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0
>> into guest MMIO space
>>
>> External email: Use caution opening links or attachments
>>
>>
>> Hi Shameer,
>>
>> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
>>> From: Nicolin Chen <nicolinc@nvidia.com>
>>>
>>> Once a VCMDQ is allocated, map the mmap'd vintf_page0 region directly
>>> into the guest-visible MMIO space at offset 0x30000 as a RAM-backed
>>> MemoryRegion. This eliminates QEMU trapping for hot-path CONS/PROD
>>> index updates.
>>>
>>> After this patch, the two VCMDQ apertures use different access paths:
>>> the direct aperture (0x10000) remains QEMU-trapped and writes via
>>> vintf_ptr, while the VI aperture (0x30000) is a direct guest RAM
>>> mapping. Both paths write to the same underlying vintf_page0 memory,
>>> so no synchronisation between the apertures is needed.
>> I fail to understand when the previous trapped path using ptr in
>> tegra241_cmdqv_read/write_vcmdq gets used versus that path. Is it
>> eventually used?
> The spec does not prevent a guest from using the 0x10000 path for allocated
> VCMDQs, so the trap path remains valid and QEMU forwards those accesses to
> the mmap'd vintf_page0 via vintf_ptr.
>
> We cannot map 0x10000 directly to the guest as RAM because the kernel
> mmap only backs VCMDQs actually allocated via HW_QUEUE ioctl. If the
> guest allocates only 1 of 2 VCMDQs, exposing the full direct aperture page0
> as RAM would give the guest access to unallocated VCMDQ slots. Hence, it
> remains trapped and QEMU only fwds to vintf page0 for allocated queues.
Actually, my question was rather: why don't we use a subregion_overlap for
the global vcmdq page0, mapping to vintf_page0 once cmdqv->vintf_page0 is
set, just as we do for VINTF0 page0? I understand
tegra241_cmdqv_read/write_vcmdq would return the same content anyway.

Besides, I would recommend that, all over the code, page0 and page1 are
clearly differentiated in terms of accessors and comments, and that we use
the same terminology throughout the code for global vcmdq and vintf vcmdqs.

Thanks

Eric
>
> Thanks,
> Shameer
>
>>> The mapping is installed lazily on first successful VCMDQ hardware
>>> queue allocation and removed when CMDQV or VINTF is disabled.
>>>
>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>> Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
>>> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
>>> ---
>>>  hw/arm/tegra241-cmdqv.h |  1 +
>>>  hw/arm/tegra241-cmdqv.c | 37
>> +++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 38 insertions(+)
>>>
>>> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
>>> index 039d86374f..2befa6205e 100644
>>> --- a/hw/arm/tegra241-cmdqv.h
>>> +++ b/hw/arm/tegra241-cmdqv.h
>>> @@ -46,6 +46,7 @@ typedef struct Tegra241CMDQV {
>>>      IOMMUFDVeventq *veventq;
>>>      IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
>>>      void *vintf_page0;
>>> +    MemoryRegion *mr_vintf_page0;
>>>
>>>      /* Register Cache */
>>>      uint32_t config;
>>> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
>>> index eb619e1134..bf989dd51f 100644
>>> --- a/hw/arm/tegra241-cmdqv.c
>>> +++ b/hw/arm/tegra241-cmdqv.c
>>> @@ -15,6 +15,40 @@
>>>  #include "tegra241-cmdqv.h"
>>>  #include "trace.h"
>>>
>>> +static void tegra241_cmdqv_guest_unmap_vintf_page0(Tegra241CMDQV
>> *cmdqv)
>>> +{
>>> +    if (!cmdqv->mr_vintf_page0) {
>>> +        return;
>>> +    }
>>> +
>>> +    memory_region_del_subregion(&cmdqv->mmio_cmdqv, cmdqv->mr_vintf_page0);
>>> +    object_unparent(OBJECT(cmdqv->mr_vintf_page0));
>>> +    g_free(cmdqv->mr_vintf_page0);
>>> +    cmdqv->mr_vintf_page0 = NULL;
>>> +}
>>> +
>>> +static void tegra241_cmdqv_guest_map_vintf_page0(Tegra241CMDQV
>> *cmdqv)
>>> +{
>>> +    char *name;
>>> +
>>> +    if (cmdqv->mr_vintf_page0) {
>>> +        return;
>>> +    }
>>> +
>>> +    name = g_strdup_printf("%s vintf-page0",
>>> +                           memory_region_name(&cmdqv->mmio_cmdqv));
>>> +    cmdqv->mr_vintf_page0 = g_malloc0(sizeof(*cmdqv->mr_vintf_page0));
>>> +    memory_region_init_ram_device_ptr(cmdqv->mr_vintf_page0,
>>> +                                      memory_region_owner(&cmdqv->mmio_cmdqv),
>>> +                                      name, VINTF_PAGE_SIZE,
>>> +                                      cmdqv->vintf_page0);
>>> +    cmdqv->mr_vintf_page0->ram_device_skip_iommu_map = true;
>> I guess you need a setter here too.
>>> +    memory_region_add_subregion_overlap(&cmdqv->mmio_cmdqv,
>>> +                                        CMDQV_VINTF_PAGE0_BASE,
>>> +                                        cmdqv->mr_vintf_page0, 1);
>>> +    g_free(name);
>>> +}
>>> +
>>>  static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int
>> index)
>>>  {
>>>      IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
>>> @@ -72,6 +106,7 @@ static bool
>> tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
>>>      hw_queue->viommu = viommu;
>>>      cmdqv->vcmdq[index] = hw_queue;
>>>
>>> +    tegra241_cmdqv_guest_map_vintf_page0(cmdqv);
>>>      return true;
>>>  }
>>>
>>> @@ -312,6 +347,7 @@ static void
>> tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
>>>                  cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
>>>              }
>>>          } else {
>>> +            tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
>>>              tegra241_cmdqv_free_all_vcmdq(cmdqv);
>>>              tegra241_cmdqv_munmap_vintf_page0(cmdqv, errp);
>>>              cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
>>> @@ -438,6 +474,7 @@ static void tegra241_cmdqv_write_mmio(void
>> *opaque, hwaddr offset,
>>>          if (value & R_CONFIG_CMDQV_EN_MASK) {
>>>              cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
>>>          } else {
>>> +            tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
>>>              tegra241_cmdqv_free_all_vcmdq(cmdqv);
>>>              cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
>>>          }
>> Thanks
>>
>> Eric



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 24/31] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors
  2026-04-15 10:55 ` [PATCH v4 24/31] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors Shameer Kolothum
  2026-05-05  1:13   ` Nicolin Chen
@ 2026-05-07 16:40   ` Eric Auger
  2026-05-08 10:52     ` Shameer Kolothum Thodi
  1 sibling, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-07 16:40 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina

Hi Shameer,

On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> Install an event handler on the CMDQV vEVENTQ fd to read and propagate
> host received CMDQV errors to the guest.
>
> The handler runs in QEMU’s main loop, using a non-blocking fd registered
> via qemu_set_fd_handler().
>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.c | 55 +++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events     |  1 +
>  2 files changed, 56 insertions(+)
>
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index bf989dd51f..9c2fc02b92 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -11,6 +11,7 @@
>  #include "qemu/log.h"
>  
>  #include "hw/arm/smmuv3.h"
> +#include "hw/core/irq.h"
>  #include "smmuv3-accel.h"
>  #include "tegra241-cmdqv.h"
>  #include "trace.h"
> @@ -534,6 +535,43 @@ out:
>      trace_tegra241_cmdqv_write_mmio(offset, value, size);
>  }
>  
> +static void tegra241_cmdqv_event_read(void *opaque)
> +{
> +    Tegra241CMDQV *cmdqv = opaque;
> +    IOMMUFDVeventq *veventq = cmdqv->veventq;
> +    struct {
> +        struct iommufd_vevent_header hdr;
> +        struct iommu_vevent_tegra241_cmdqv vevent;
> +    } buf;
> +    Error *local_err = NULL;
> +
> +    if (!smmuv3_accel_event_read_validate(veventq,
> +                                          IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV,
> +                                          &buf, sizeof(buf), &local_err)) {
> +        warn_report_err_once(local_err);
> +        return;
> +    }
> +
> +    if (buf.vevent.lvcmdq_err_map[0] || buf.vevent.lvcmdq_err_map[1]) {
> +        cmdqv->vintf_cmdq_err_map[0] =
> +            buf.vevent.lvcmdq_err_map[0] & 0xffffffff;
can't you use extract64() here and below?
> +        cmdqv->vintf_cmdq_err_map[1] =
> +            (buf.vevent.lvcmdq_err_map[0] >> 32) & 0xffffffff;
> +        cmdqv->vintf_cmdq_err_map[2] =
> +            buf.vevent.lvcmdq_err_map[1] & 0xffffffff;
> +        cmdqv->vintf_cmdq_err_map[3] =
> +            (buf.vevent.lvcmdq_err_map[1] >> 32) & 0xffffffff;
> +        for (int i = 0; i < 4; i++) {
> +            cmdqv->cmdq_err_map[i] = cmdqv->vintf_cmdq_err_map[i];
If the cached regs are the same, do we need to have both?
> +        }
> +        cmdqv->vi_err_map[0] |= 0x1;
> +        qemu_irq_pulse(cmdqv->irq);
> +        trace_tegra241_cmdqv_err_map(
> +            cmdqv->vintf_cmdq_err_map[3], cmdqv->vintf_cmdq_err_map[2],
> +            cmdqv->vintf_cmdq_err_map[1], cmdqv->vintf_cmdq_err_map[0]);
> +    }
> +}
> +
>  static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
>  {
>      SMMUv3AccelState *accel = s->s_accel;
> @@ -545,6 +583,7 @@ static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
>          return;
>      }
>      if (veventq) {
> +        qemu_set_fd_handler(veventq->veventq_fd, NULL, NULL, NULL);
>          close(veventq->veventq_fd);
>          iommufd_backend_free_id(viommu->iommufd, veventq->veventq_id);
>          g_free(veventq);
> @@ -560,6 +599,7 @@ tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
>      Tegra241CMDQV *cmdqv = s->s_accel->cmdqv;
>      uint32_t viommu_id, veventq_id, veventq_fd;
>      IOMMUFDVeventq *veventq;
> +    int flags;
>  
>      if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
>                                        IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV,
> @@ -577,14 +617,29 @@ tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
>          goto free_viommu;
>      }
>  
> +    flags = fcntl(veventq_fd, F_GETFL);
> +    if (flags < 0) {
> +        error_setg(errp, "Failed to get flags for vEVENTQ fd");
> +        goto free_veventq;
> +    }
> +    if (fcntl(veventq_fd, F_SETFL, O_NONBLOCK | flags) < 0) {
> +        error_setg(errp, "Failed to set O_NONBLOCK on vEVENTQ fd");
> +        goto free_veventq;
> +    }
> +
>      veventq = g_new(IOMMUFDVeventq, 1);
>      veventq->veventq_id = veventq_id;
>      veventq->veventq_fd = veventq_fd;
>      cmdqv->veventq = veventq;
>  
> +    /* Set up event handler for veventq fd */
> +    qemu_set_fd_handler(veventq_fd, tegra241_cmdqv_event_read, NULL, cmdqv);
>      *out_viommu_id = viommu_id;
>      return true;
>  
> +free_veventq:
> +    close(veventq_fd);
> +    iommufd_backend_free_id(idev->iommufd, veventq_id);
>  free_viommu:
>      iommufd_backend_free_id(idev->iommufd, viommu_id);
>      return false;
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 8c61d66a26..fd6441bfa7 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -75,6 +75,7 @@ smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vS
>  # tegra241-cmdqv
>  tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
>  tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
> +tegra241_cmdqv_err_map(uint32_t map3, uint32_t map2, uint32_t map1, uint32_t map0) "hw irq received. error (hex) maps: %04X:%04X:%04X:%04X"
>  
>  # strongarm.c
>  strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
Eric
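
For reference, a minimal sketch of the extract64() suggestion made above. It is
not the patch's code: sketch_extract64() is a local stand-in assumed to mirror
the semantics of QEMU's extract64(value, start, length), and split_err_map()
is a hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in assumed to behave like QEMU's extract64(). */
static inline uint64_t sketch_extract64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/*
 * Hypothetical helper: split two 64-bit error maps into four 32-bit
 * register words, low half first, as the open-coded shifts and masks do.
 */
static void split_err_map(const uint64_t in[2], uint32_t out[4])
{
    for (int i = 0; i < 2; i++) {
        out[2 * i]     = (uint32_t)sketch_extract64(in[i], 0, 32);
        out[2 * i + 1] = (uint32_t)sketch_extract64(in[i], 32, 32);
    }
}
```

Called with buf.vevent.lvcmdq_err_map as input and vintf_cmdq_err_map as
output, a helper of this shape would replace the four open-coded assignments.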



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler
  2026-04-15 10:55 ` [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler Shameer Kolothum
@ 2026-05-07 16:51   ` Eric Auger
  2026-05-08 11:19     ` Shameer Kolothum Thodi
  2026-05-07 17:03   ` Eric Auger
  1 sibling, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-07 16:51 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Introduce a reset handler for the Tegra241 CMDQV and initialize its
> register state.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.h |  2 ++
>  hw/arm/tegra241-cmdqv.c | 50 +++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events     |  1 +
>  3 files changed, 53 insertions(+)
>
> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> index 2befa6205e..b2a444daef 100644
> --- a/hw/arm/tegra241-cmdqv.h
> +++ b/hw/arm/tegra241-cmdqv.h
> @@ -79,6 +79,8 @@ FIELD(CONFIG, CMDQ_MAX_CLK_BATCH, 4, 8)
>  FIELD(CONFIG, CMDQ_MAX_CMD_BATCH, 12, 8)
>  FIELD(CONFIG, CONS_DRAM_EN, 20, 1)
>  
> +#define V_CONFIG_RESET 0x00020403
> +
>  REG32(PARAM, 0x4)
>  FIELD(PARAM, CMDQV_VER, 0, 4)
>  FIELD(PARAM, CMDQV_NUM_CMDQ_LOG2, 4, 4)
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index 9c2fc02b92..af68add2f0 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -8,6 +8,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/error-report.h"
>  #include "qemu/log.h"
>  
>  #include "hw/arm/smmuv3.h"
> @@ -645,8 +646,57 @@ free_viommu:
>      return false;
>  }
>  
> +static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
> +{
> +    int i;
> +
> +    cmdqv->config = V_CONFIG_RESET;
> +    cmdqv->param = FIELD_DP32(0, PARAM, CMDQV_VER, CMDQV_VER);
hw/core/registerfields.h:#define FIELD_DP32(storage, reg, field, val)
What does this 0 do?
> +    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_CMDQ_LOG2,
> +                              CMDQV_NUM_CMDQ_LOG2);
I was a bit puzzled by the FIELD name being the value name but well.
> +    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_SID_PER_VI_LOG2,
> +                              CMDQV_NUM_SID_PER_VI_LOG2);
> +    trace_tegra241_cmdqv_init_regs(cmdqv->param);
> +    cmdqv->status = R_STATUS_CMDQV_ENABLED_MASK;
> +    for (i = 0; i < 2; i++) {
> +        cmdqv->vi_err_map[i] = 0;
> +        cmdqv->vi_int_mask[i] = 0;
> +        cmdqv->cmdq_err_map[i] = 0;
> +    }
> +    cmdqv->cmdq_err_map[2] = 0;
> +    cmdqv->cmdq_err_map[3] = 0;
The split looks pretty strange, i.e. a loop of 2 followed by individual
assignments. Why don't you do the init of cmdqv->cmdq_err_map in the x4 loop
below?

> +    cmdqv->vintf_config = 0;
> +    cmdqv->vintf_status = 0;
> +    for (i = 0; i < 4; i++) {
> +        cmdqv->vintf_cmdq_err_map[i] = 0;
> +    }
> +    for (i = 0; i < TEGRA241_CMDQV_MAX_CMDQ; i++) {
> +        cmdqv->cmdq_alloc_map[i] = 0;
> +        cmdqv->vcmdq_cons_indx[i] = 0;
> +        cmdqv->vcmdq_prod_indx[i] = 0;
> +        cmdqv->vcmdq_config[i] = 0;
> +        cmdqv->vcmdq_status[i] = 0;
> +        cmdqv->vcmdq_gerror[i] = 0;
> +        cmdqv->vcmdq_gerrorn[i] = 0;
> +        cmdqv->vcmdq_base[i] = 0;
> +        cmdqv->vcmdq_cons_indx_base[i] = 0;
> +    }
> +}
> +
>  static void tegra241_cmdqv_reset(SMMUv3State *s)
>  {
> +    SMMUv3AccelState *accel = s->s_accel;
> +    Tegra241CMDQV *cmdqv = accel->cmdqv;
> +
> +    if (!cmdqv) {
> +        return;
> +    }
> +
> +    tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
> +    tegra241_cmdqv_munmap_vintf_page0(cmdqv, NULL);
> +    tegra241_cmdqv_free_all_vcmdq(cmdqv);
> +
> +    tegra241_cmdqv_init_regs(s, cmdqv);
>  }
>  
>  static const MemoryRegionOps mmio_cmdqv_ops = {
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index fd6441bfa7..6f602b9eda 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -76,6 +76,7 @@ smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vS
>  tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
>  tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
>  tegra241_cmdqv_err_map(uint32_t map3, uint32_t map2, uint32_t map1, uint32_t map0) "hw irq received. error (hex) maps: %04X:%04X:%04X:%04X"
> +tegra241_cmdqv_init_regs(uint32_t param) "hw info received. param: 0x%04X"
>  
>  # strongarm.c
>  strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
Thanks

Eric
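
On the FIELD_DP32(0, ...) question above: the first argument is the storage
word the field is deposited into, so passing 0 starts from an all-zero
register. A minimal sketch of that behaviour follows. sketch_deposit32() and
build_param() are hypothetical names, not QEMU API; only the field positions
(CMDQV_VER at bits [3:0], CMDQV_NUM_CMDQ_LOG2 at bits [7:4]) are taken from
the quoted header hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a deposit helper: write val into storage[shift +: length]. */
static inline uint32_t sketch_deposit32(uint32_t storage, int shift,
                                        int length, uint32_t val)
{
    uint32_t mask;

    assert(shift >= 0 && length > 0 && length <= 32 - shift);
    mask = (~0U >> (32 - length)) << shift;
    return (storage & ~mask) | ((val << shift) & mask);
}

/* Build a PARAM-like word from scratch; the values passed in are examples. */
static uint32_t build_param(uint32_t ver, uint32_t num_cmdq_log2)
{
    /* Storage 0: every bit outside the deposited field stays clear. */
    uint32_t param = sketch_deposit32(0, 0, 4, ver);

    /* Later deposits reuse the accumulated value as storage. */
    param = sketch_deposit32(param, 4, 4, num_cmdq_log2);
    return param;
}
```

With example inputs ver = 1 and num_cmdq_log2 = 1, build_param() returns 0x11.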



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler
  2026-04-15 10:55 ` [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler Shameer Kolothum
  2026-05-07 16:51   ` Eric Auger
@ 2026-05-07 17:03   ` Eric Auger
  1 sibling, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-07 17:03 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Introduce a reset handler for the Tegra241 CMDQV and initialize its
> register state.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.h |  2 ++
>  hw/arm/tegra241-cmdqv.c | 50 +++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events     |  1 +
>  3 files changed, 53 insertions(+)
>
> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> index 2befa6205e..b2a444daef 100644
> --- a/hw/arm/tegra241-cmdqv.h
> +++ b/hw/arm/tegra241-cmdqv.h
> @@ -79,6 +79,8 @@ FIELD(CONFIG, CMDQ_MAX_CLK_BATCH, 4, 8)
>  FIELD(CONFIG, CMDQ_MAX_CMD_BATCH, 12, 8)
>  FIELD(CONFIG, CONS_DRAM_EN, 20, 1)
>  
> +#define V_CONFIG_RESET 0x00020403
Please add a comment to explain what the 0x00020403 value means.

Eric

> +
>  REG32(PARAM, 0x4)
>  FIELD(PARAM, CMDQV_VER, 0, 4)
>  FIELD(PARAM, CMDQV_NUM_CMDQ_LOG2, 4, 4)
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index 9c2fc02b92..af68add2f0 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -8,6 +8,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/error-report.h"
>  #include "qemu/log.h"
>  
>  #include "hw/arm/smmuv3.h"
> @@ -645,8 +646,57 @@ free_viommu:
>      return false;
>  }
>  
> +static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
> +{
> +    int i;
> +
> +    cmdqv->config = V_CONFIG_RESET;
> +    cmdqv->param = FIELD_DP32(0, PARAM, CMDQV_VER, CMDQV_VER);
> +    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_CMDQ_LOG2,
> +                              CMDQV_NUM_CMDQ_LOG2);
> +    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_SID_PER_VI_LOG2,
> +                              CMDQV_NUM_SID_PER_VI_LOG2);
> +    trace_tegra241_cmdqv_init_regs(cmdqv->param);
> +    cmdqv->status = R_STATUS_CMDQV_ENABLED_MASK;
> +    for (i = 0; i < 2; i++) {
> +        cmdqv->vi_err_map[i] = 0;
> +        cmdqv->vi_int_mask[i] = 0;
> +        cmdqv->cmdq_err_map[i] = 0;
> +    }
> +    cmdqv->cmdq_err_map[2] = 0;
> +    cmdqv->cmdq_err_map[3] = 0;
> +    cmdqv->vintf_config = 0;
> +    cmdqv->vintf_status = 0;
> +    for (i = 0; i < 4; i++) {
> +        cmdqv->vintf_cmdq_err_map[i] = 0;
> +    }
> +    for (i = 0; i < TEGRA241_CMDQV_MAX_CMDQ; i++) {
> +        cmdqv->cmdq_alloc_map[i] = 0;
> +        cmdqv->vcmdq_cons_indx[i] = 0;
> +        cmdqv->vcmdq_prod_indx[i] = 0;
> +        cmdqv->vcmdq_config[i] = 0;
> +        cmdqv->vcmdq_status[i] = 0;
> +        cmdqv->vcmdq_gerror[i] = 0;
> +        cmdqv->vcmdq_gerrorn[i] = 0;
> +        cmdqv->vcmdq_base[i] = 0;
> +        cmdqv->vcmdq_cons_indx_base[i] = 0;
> +    }
> +}
> +
>  static void tegra241_cmdqv_reset(SMMUv3State *s)
>  {
> +    SMMUv3AccelState *accel = s->s_accel;
> +    Tegra241CMDQV *cmdqv = accel->cmdqv;
> +
> +    if (!cmdqv) {
> +        return;
> +    }
> +
> +    tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
> +    tegra241_cmdqv_munmap_vintf_page0(cmdqv, NULL);
> +    tegra241_cmdqv_free_all_vcmdq(cmdqv);
> +
> +    tegra241_cmdqv_init_regs(s, cmdqv);
>  }
>  
>  static const MemoryRegionOps mmio_cmdqv_ops = {
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index fd6441bfa7..6f602b9eda 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -76,6 +76,7 @@ smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vS
>  tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
>  tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
>  tegra241_cmdqv_err_map(uint32_t map3, uint32_t map2, uint32_t map1, uint32_t map0) "hw irq received. error (hex) maps: %04X:%04X:%04X:%04X"
> +tegra241_cmdqv_init_regs(uint32_t param) "hw info received. param: 0x%04X"
>  
>  # strongarm.c
>  strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
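
As an aid to the request above for a comment on 0x00020403, here is a sketch
that decodes the value using only the CONFIG fields visible in the quoted
hunk: CMDQ_MAX_CLK_BATCH at bits [11:4], CMDQ_MAX_CMD_BATCH at bits [19:12]
and CONS_DRAM_EN at bit 20. The layout of bits [3:0] is not shown in this
hunk, so they are reported raw. field_ex32(), struct config_fields and
decode_config() are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define V_CONFIG_RESET 0x00020403u

/* Extract length bits starting at shift. */
static inline uint32_t field_ex32(uint32_t v, int shift, int length)
{
    assert(shift >= 0 && length > 0 && length <= 32 - shift);
    return (v >> shift) & (~0U >> (32 - length));
}

struct config_fields {
    uint32_t low_bits;       /* bits [3:0], layout not shown in the hunk */
    uint32_t max_clk_batch;  /* CMDQ_MAX_CLK_BATCH, bits [11:4] */
    uint32_t max_cmd_batch;  /* CMDQ_MAX_CMD_BATCH, bits [19:12] */
    uint32_t cons_dram_en;   /* CONS_DRAM_EN, bit 20 */
};

static struct config_fields decode_config(uint32_t v)
{
    struct config_fields f = {
        .low_bits      = field_ex32(v, 0, 4),
        .max_clk_batch = field_ex32(v, 4, 8),
        .max_cmd_batch = field_ex32(v, 12, 8),
        .cons_dram_en  = field_ex32(v, 20, 1),
    };
    return f;
}
```

For V_CONFIG_RESET this gives low_bits = 0x3, max_clk_batch = 64,
max_cmd_batch = 32 and cons_dram_en = 0.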



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size
  2026-04-15 10:55 ` [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size Shameer Kolothum
  2026-05-05  1:26   ` Nicolin Chen
@ 2026-05-07 17:23   ` Eric Auger
  2026-05-08 13:38     ` Shameer Kolothum Thodi
  1 sibling, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-07 17:23 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina

Hi Shameer,

On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> CMDQV HW reads guest queue memory in its host physical address setup via
> IOMMUFD. This requires the guest queue memory is not only contiguous in
> guest PA space but also in host PA space. With Tegra241 CMDQV enabled, we
> must only advertise a CMDQS that the host can safely back with physically
s/a CMDQS/ a command queue size (CMDQS)
> contiguous memory. Allowing a queue larger than the host page size could
a queue size
> cause the hardware to DMA across page boundaries, leading to faults.
>
> Walk the RAMBlock list to find the smallest memory-backend page size, then
> limit IDR1.CMDQS so the guest cannot configure a command queue that exceeds
what is the minimal CMDQS required? 
> that contiguous backing. Fall back to the real host page size if no
> memory-backend RAM blocks are found.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/tegra241-cmdqv.c | 41 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>
> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> index af68add2f0..2870886783 100644
> --- a/hw/arm/tegra241-cmdqv.c
> +++ b/hw/arm/tegra241-cmdqv.c
> @@ -14,6 +14,9 @@
>  #include "hw/arm/smmuv3.h"
>  #include "hw/core/irq.h"
>  #include "smmuv3-accel.h"
> +#include "smmuv3-internal.h"
> +#include "system/ramblock.h"
> +#include "system/ramlist.h"
>  #include "tegra241-cmdqv.h"
>  #include "trace.h"
>  
> @@ -646,9 +649,38 @@ free_viommu:
>      return false;
>  }
>  
> +static size_t tegra241_cmdqv_min_ram_pagesize(void)
Shouldn't we rather put that in system/physmem.c? There we also have
qemu_ram_pagesize_largest(), for instance.

> +{
> +    RAMBlock *rb;
> +    size_t pg, min_pg = SIZE_MAX;
> +
> +    RAMBLOCK_FOREACH(rb) {
> +        MemoryRegion *mr = rb->mr;
> +
> +        /* Only consider real RAM regions */
> +        if (!mr || !memory_region_is_ram(mr)) {
> +            continue;
> +        }
> +
> +        /* Skip RAM regions that are not backed by a memory-backend */
> +        if (!object_dynamic_cast(mr->owner, TYPE_MEMORY_BACKEND)) {
> +            continue;
> +        }
> +
> +        pg = qemu_ram_pagesize(rb);
> +        if (pg && pg < min_pg) {
> +            min_pg = pg;
> +        }
> +    }
> +
> +    return (min_pg == SIZE_MAX) ? qemu_real_host_page_size() : min_pg;
> +}
> +
>  static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
>  {
>      int i;
> +    size_t pgsize;
> +    uint32_t val;
>  
>      cmdqv->config = V_CONFIG_RESET;
>      cmdqv->param = FIELD_DP32(0, PARAM, CMDQV_VER, CMDQV_VER);
> @@ -681,6 +713,15 @@ static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
>          cmdqv->vcmdq_base[i] = 0;
>          cmdqv->vcmdq_cons_indx_base[i] = 0;
>      }
> +
> +    /*
> +     * CMDQ must not cross a physical RAM backend page. Adjust CMDQS so the
> +     * queue fits entirely within the smallest backend page size, ensuring
> +     * the command queue is physically contiguous in host memory.
> +     */
> +    pgsize = tegra241_cmdqv_min_ram_pagesize();
> +    val = FIELD_EX32(s->idr[1], IDR1, CMDQS);
> +    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, CMDQS, MIN(ctz64(pgsize) - 4, val));
>  }
>  
>  static void tegra241_cmdqv_reset(SMMUv3State *s)
Eric
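
A worked sketch of the CMDQS clamp in the hunk above, assuming 16-byte SMMUv3
commands: a queue of 2^CMDQS entries occupies 2^(CMDQS + 4) bytes, so it fits
within one backend page when CMDQS <= log2(pgsize) - 4. log2_pow2() stands in
for ctz64() on a power-of-two page size, and limit_cmdqs() is a hypothetical
helper name; the CMDQS input value of 19 used in the note below is only an
example.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of a power-of-two value; matches ctz64() for such inputs. */
static unsigned log2_pow2(uint64_t v)
{
    unsigned n = 0;

    assert(v != 0 && (v & (v - 1)) == 0);
    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Clamp CMDQS so that 2^CMDQS commands of 16 bytes fit in one page. */
static uint32_t limit_cmdqs(uint64_t pgsize, uint32_t cmdqs)
{
    uint32_t max_cmdqs;

    assert(pgsize >= 16);
    max_cmdqs = log2_pow2(pgsize) - 4;
    return cmdqs < max_cmdqs ? cmdqs : max_cmdqs;
}
```

With a 4 KiB backend page and an example CMDQS of 19, limit_cmdqs() yields 8,
i.e. 256 commands; with 64 KiB pages it yields 12, and with 2 MiB hugepages 17.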



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 31/31] hw/arm/smmuv3: Add cmdqv property for SMMUv3 device
  2026-04-15 10:55 ` [PATCH v4 31/31] hw/arm/smmuv3: Add cmdqv property for SMMUv3 device Shameer Kolothum
@ 2026-05-07 17:28   ` Eric Auger
  0 siblings, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-07 17:28 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> Introduce a "cmdqv" property to enable Tegra241 CMDQV support.
> This is only enabled for accelerated SMMUv3 devices.
>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/smmuv3.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index c9ff6298f5..51b7d01da5 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -1993,6 +1993,10 @@ static bool smmu_validate_property(SMMUv3State *s, Error **errp)
>              error_setg(errp, "ssidsize can only be set if accel=on");
>              return false;
>          }
> +        if (s->cmdqv == ON_OFF_AUTO_ON) {
> +            error_setg(errp, "cmdqv can only be enabled if accel=on");
> +            return false;
> +        }
>          return true;
>      }
>  
> @@ -2161,6 +2165,7 @@ static const Property smmuv3_properties[] = {
>      DEFINE_PROP_OAS_MODE("oas", SMMUv3State, oas, OAS_MODE_AUTO),
>      DEFINE_PROP_SSIDSIZE_MODE("ssidsize", SMMUv3State, ssidsize,
>                                SSID_SIZE_MODE_AUTO),
> +    DEFINE_PROP_ON_OFF_AUTO("cmdqv", SMMUv3State, cmdqv, ON_OFF_AUTO_AUTO),
>  };
>  
>  static void smmuv3_instance_init(Object *obj)
> @@ -2200,6 +2205,8 @@ static void smmuv3_class_init(ObjectClass *klass, const void *data)
>          "Valid range is 0-20, where 0 disables SubstreamID support. "
>          "Defaults to auto. A value greater than 0 is required to enable "
>          "PASID support.");
> +    object_class_property_set_description(klass, "cmdqv",
> +        "Enable/disable CMDQ-Virtualisation support (for accel=on)");
s/CMDQ-Virtualisation/CMDQ-V
>  }
>  
>  static int smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
qemu-options.hx also needs to be updated

Eric



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 29/31] hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT
  2026-04-15 10:55 ` [PATCH v4 29/31] hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT Shameer Kolothum
@ 2026-05-07 17:32   ` Eric Auger
  0 siblings, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-07 17:32 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Add ACPI DSDT support for Tegra241 CMDQV when the SMMUv3 instance is
> created with tegra241-cmdqv.
>
> The SMMUv3 device identifier is used as the ACPI _UID. This matches
> the Identifier field of the corresponding SMMUv3 IORT node, allowing
> the CMDQV DSDT device to be correctly associated with its SMMU.

Could you add a link to the ACPI spec so that the generated content can be
checked against it?

Maybe you can also describe the AML in the commit msg.

Thanks

Eric
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/virt-acpi-build.c | 52 ++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events      |  1 +
>  2 files changed, 53 insertions(+)
>
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 65ccc96349..fbc793d06e 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -65,6 +65,9 @@
>  #include "target/arm/cpu.h"
>  #include "target/arm/multiprocessing.h"
>  
> +#include "smmuv3-accel.h"
> +#include "tegra241-cmdqv.h"
> +
>  #define ARM_SPI_BASE 32
>  
>  #define ACPI_BUILD_TABLE_SIZE             0x20000
> @@ -1114,6 +1117,51 @@ static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
>      build_fadt(table_data, linker, &fadt, vms->oem_id, vms->oem_table_id);
>  }
>  
> +static void acpi_dsdt_add_tegra241_cmdqv(Aml *scope, VirtMachineState *vms)
> +{
> +    for (int i = 0; i < vms->smmuv3_devices->len; i++) {
> +        Object *obj = OBJECT(g_ptr_array_index(vms->smmuv3_devices, i));
> +        PlatformBusDevice *pbus;
> +        Aml *dev, *crs, *addr;
> +        SysBusDevice *sbdev;
> +        hwaddr base;
> +        uint32_t id;
> +        int irq;
> +
> +        if (smmuv3_accel_cmdqv_type(obj) != SMMUV3_CMDQV_TEGRA241) {
> +            continue;
> +        }
> +        id = object_property_get_uint(obj, "identifier", &error_abort);
> +        pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
> +        sbdev = SYS_BUS_DEVICE(obj);
> +        base = platform_bus_get_mmio_addr(pbus, sbdev, 1);
> +        base += vms->memmap[VIRT_PLATFORM_BUS].base;
> +        irq = platform_bus_get_irqn(pbus, sbdev, NUM_SMMU_IRQS);
> +        irq += vms->irqmap[VIRT_PLATFORM_BUS];
> +        irq += ARM_SPI_BASE;
> +
> +        dev = aml_device("CV%.02u", id);
> +        aml_append(dev, aml_name_decl("_HID", aml_string("NVDA200C")));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(id)));
> +        aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
> +
> +        crs = aml_resource_template();
> +        addr = aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED, AML_MAX_FIXED,
> +                                AML_CACHEABLE, AML_READ_WRITE, 0x0, base,
> +                                base + TEGRA241_CMDQV_IO_LEN - 0x1, 0x0,
> +                                TEGRA241_CMDQV_IO_LEN);
> +        aml_append(crs, addr);
> +        aml_append(crs, aml_interrupt(AML_CONSUMER, AML_EDGE,
> +                                      AML_ACTIVE_HIGH, AML_EXCLUSIVE,
> +                                      (uint32_t *)&irq, 1));
> +        aml_append(dev, aml_name_decl("_CRS", crs));
> +
> +        aml_append(scope, dev);
> +
> +        trace_virt_acpi_dsdt_tegra241_cmdqv(id, base, irq);
> +    }
> +}
> +
>  /* DSDT */
>  static void
>  build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> @@ -1178,6 +1226,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>      acpi_dsdt_add_tpm(scope, vms);
>  #endif
>  
> +    if (!vms->legacy_smmuv3_present) {
> +        acpi_dsdt_add_tegra241_cmdqv(scope, vms);
> +    }
> +
>      aml_append(dsdt, scope);
>  
>      pci0_scope = aml_scope("\\_SB.PCI0");
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 6f602b9eda..e5e4e93324 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -9,6 +9,7 @@ omap1_lpg_led(const char *onoff) "omap1 LPG: LED is %s"
>  
>  # virt-acpi-build.c
>  virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
> +virt_acpi_dsdt_tegra241_cmdqv(int smmu_id, uint64_t base, uint32_t irq) "DSDT: add cmdqv node for (id=%d), base=0x%" PRIx64 ", irq=%d"
>  
>  # smmu-common.c
>  smmu_add_mr(const char *name) "%s"



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 30/31] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active
  2026-04-15 10:55 ` [PATCH v4 30/31] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active Shameer Kolothum
  2026-05-05  1:35   ` Nicolin Chen
@ 2026-05-07 17:36   ` Eric Auger
  2026-05-08 14:36     ` Shameer Kolothum Thodi
  1 sibling, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-07 17:36 UTC (permalink / raw)
  To: Shameer Kolothum, qemu-arm, qemu-devel
  Cc: peter.maydell, clg, alex, nicolinc, nathanc, mochs, jan, jgg,
	jonathan.cameron, zhenzhong.duan, kjaju, phrdina



On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> When CMDQV is active, the first cold-plugged VFIO device establishes the
> viommu to host SMMUv3 association. Block its hot-unplug to preserve this
> association and the guest's boot time CMDQV configuration.
>
> Also abort at machine_done if cmdqv=on is requested but no cold-plugged
> VFIO device was present to initialize it.
>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/smmuv3-accel.h |  1 +
>  hw/arm/smmuv3-accel.c | 12 ++++++++++++
>  hw/arm/smmuv3.c       |  6 ++++++
>  3 files changed, 19 insertions(+)
>
> diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
> index 3ed94ed05c..c4441d5b3f 100644
> --- a/hw/arm/smmuv3-accel.h
> +++ b/hw/arm/smmuv3-accel.h
> @@ -65,6 +65,7 @@ typedef struct SMMUv3AccelDevice {
>      IOMMUFDVdev *vdev;
>      QLIST_ENTRY(SMMUv3AccelDevice) next;
>      SMMUv3AccelState *s_accel;
> +    Error *unplug_blocker; /* set when CMDQV is active to block hot-unplug */
>  } SMMUv3AccelDevice;
>  
>  bool smmuv3_accel_init(SMMUv3State *s, Error **errp);
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index a58815ded2..f381702a08 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -754,6 +754,18 @@ static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>          return false;
>      }
>  
> +    /*
> +     * CMDQV is active: block hot-unplug of the device that established the
> +     * viommu association. Removing it would cause the vIOMMU to host SMMUv3
> +     * association be changed via device hot-plug.
> +     */
> +    if (s->s_accel->cmdqv_ops) {
> +        PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
> +        error_setg(&accel_dev->unplug_blocker,
> +                   "CMDQV is active: removing the device that established the "
> +                   "viommu association would break the guest CMDQV");
Is it something that is planned to be fixed later on?

Can you provide more details about the nature of the breakage?

Thanks

Eric
> +        qdev_add_unplug_blocker(DEVICE(pdev), accel_dev->unplug_blocker);
> +    }
>  done:
>      accel_dev->idev = idev;
>      accel_dev->s_accel = s->s_accel;
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 1d6fdd776c..c9ff6298f5 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -2020,6 +2020,12 @@ static void smmuv3_machine_done(Notifier *notifier, void *data)
>                       "at least one cold-plugged VFIO device");
>          exit(1);
>      }
> +
> +    if (s->cmdqv == ON_OFF_AUTO_ON && !accel->cmdqv) {
> +        error_report("arm-smmuv3 cmdqv=on requires at least one cold-plugged "
> +                     "VFIO device");
> +        exit(1);
> +    }
>  }
>  
>  static void smmu_realize(DeviceState *d, Error **errp)



^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space
  2026-05-07 16:24       ` Eric Auger
@ 2026-05-08  9:03         ` Shameer Kolothum Thodi
  2026-05-08 14:35           ` Eric Auger
  0 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-08  9:03 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com

Hi Eric,

> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 07 May 2026 17:24
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0
> into guest MMIO space
> 
> 
> On 5/6/26 4:24 PM, Shameer Kolothum Thodi wrote:
> > Hi Eric,
> >
> >> -----Original Message-----
> >> From: Eric Auger <eric.auger@redhat.com>
> >> Sent: 06 May 2026 13:45
> >> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> >> arm@nongnu.org; qemu-devel@nongnu.org
> >> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> >> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> >> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason
> Gunthorpe
> >> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> >> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> >> phrdina@redhat.com
> >> Subject: Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0
> >> into guest MMIO space
> >>
> >>
> >> Hi Shameer,
> >>
> >> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> >>> From: Nicolin Chen <nicolinc@nvidia.com>
> >>>
> >>> Once a VCMDQ is allocated, map the mmap'd vintf_page0 region directly
> >>> into the guest-visible MMIO space at offset 0x30000 as a RAM-backed
> >>> MemoryRegion. This eliminates QEMU trapping for hot-path CONS/PROD
> >>> index updates.
> >>>
> >>> After this patch, the two VCMDQ apertures use different access paths:
> >>> the direct aperture (0x10000) remains QEMU-trapped and writes via
> >>> vintf_ptr, while the VI aperture (0x30000) is a direct guest RAM
> >>> mapping. Both paths write to the same underlying vintf_page0 memory,
> >>> so no synchronisation between the apertures is needed.
> >> I fail to understand when the previous trapped path using ptr in
> >> tegra241_cmdqv_read/write_vcmdq gets used versus that path. Is it
> >> eventually used?
> > The spec does not prevent a guest from using the 0x10000 path for
> allocated
> > VCMDQs, so the trap path remains valid and QEMU forwards those accesses
> to
> > the mmap'd vintf_page0 via vintf_ptr.
> >
> > We cannot map 0x10000 directly to the guest as RAM because the kernel
> > mmap only backs VCMDQs actually allocated via HW_QUEUE ioctl. If the
> > guest allocates only 1 of 2 VCMDQs, exposing the full direct aperture page0
> > as RAM would give the guest access to unallocated VCMDQ slots. Hence, it
> > remains trapped and QEMU only fwds to vintf page0 for allocated queues.
> Actually my question rather was why don't we use a subregion_overlap for
> global vcmdq page0, mapping to vintf_page0 once cmdqv->vintf_page0 is
> set, just as we do for VINTF0 page0? I understand
> tegra241_cmdqv_read/write_vcmdq would return the same content anyway.

This is my understanding of why we cannot map the direct VCMDQ aperture
(0x10000) to vintf_page0:

 - Unallocated VCMDQs accessed via vintf_page0 get "access dropped with no
   Fault/Interrupt" behaviour per the spec (p.172).

 - Mapping the direct VCMDQ aperture (0x10000) to vintf_page0 would replicate
   this behaviour for unallocated VCMDQ accesses via the direct aperture too.

 - However, per spec p.176, software can program VCMDQs via the direct
   VCMDQ aperture without going through the VINTF. Mapping vintf_page0
   at 0x10000 violates this.

 - Partial allocation case (VCMDQ0 allocated, VCMDQ1 not): the trap correctly
   serves VCMDQ1 accesses from the QEMU cache while forwarding VCMDQ0 accesses
   to vintf_page0. A direct RAM mapping cannot make this distinction.
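
To make the partial allocation case concrete, here is a rough standalone
sketch of the per-queue dispatch that the trapped path performs. It is not
the code from the series: ToyCmdqv, toy_vcmdq_reg() and the 0x80 per-queue
stride are assumptions made up for this illustration, and the allocated[]
flag only stands in for the cmdqv->vcmdq[index] test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_VCMDQ    2     /* illustrative: the two-queue example above */
#define VCMDQ_STRIDE 0x80  /* assumed per-VCMDQ register stride */
#define VCMDQ_NREGS  (VCMDQ_STRIDE / 4)

/* Hypothetical model of the state the trap handler consults. */
typedef struct {
    bool allocated[NUM_VCMDQ];              /* stands in for vcmdq[index] != NULL */
    uint32_t cache[NUM_VCMDQ][VCMDQ_NREGS]; /* QEMU-side cached registers */
    uint32_t *vintf_page0;                  /* stands in for the mmap'd page */
} ToyCmdqv;

/* Pick the backing storage for a trapped direct-aperture access. */
static uint32_t *toy_vcmdq_reg(ToyCmdqv *c, uint32_t offset)
{
    uint32_t index = offset / VCMDQ_STRIDE;
    uint32_t reg = (offset % VCMDQ_STRIDE) / 4;

    assert(index < NUM_VCMDQ);
    if (c->vintf_page0 && c->allocated[index]) {
        /* Allocated queue: forward to the hardware-backed page. */
        return &c->vintf_page0[index * VCMDQ_NREGS + reg];
    }
    /* Unallocated queue: serve from the cache only. */
    return &c->cache[index][reg];
}

static void toy_vcmdq_write(ToyCmdqv *c, uint32_t offset, uint32_t val)
{
    *toy_vcmdq_reg(c, offset) = val;
}

static uint32_t toy_vcmdq_read(ToyCmdqv *c, uint32_t offset)
{
    return *toy_vcmdq_reg(c, offset);
}
```

A plain RAM mapping of the whole page has no place to make the
allocated/unallocated decision, which is why the direct aperture stays
trapped.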

> 
> Besides, I would recommend that all over the code, page0 and page1 are
> clearly differentiated in terms of accessors and comments and we use the
> same terminology through the code for global vcmdq and vintf vcmdqs.

Sure. I will address this.

Thanks,
Shameer


^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 24/31] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors
  2026-05-07 16:40   ` Eric Auger
@ 2026-05-08 10:52     ` Shameer Kolothum Thodi
  0 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-08 10:52 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 07 May 2026 17:41
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 24/31] hw/arm/tegra241-cmdqv: Read and propagate
> Tegra241 CMDQV errors
> 
> 
> Hi Shameer,
> 
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > Install an event handler on the CMDQV vEVENTQ fd to read and propagate
> > host received CMDQV errors to the guest.
> >
> > The handler runs in QEMU’s main loop, using a non-blocking fd
> > registered via qemu_set_fd_handler().
> >
> > Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > ---
> >  hw/arm/tegra241-cmdqv.c | 55 +++++++++++++++++++++++++++++++++++++++++
> >  hw/arm/trace-events     |  1 +
> >  2 files changed, 56 insertions(+)
> >
> > diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> > index bf989dd51f..9c2fc02b92 100644
> > --- a/hw/arm/tegra241-cmdqv.c
> > +++ b/hw/arm/tegra241-cmdqv.c
> > @@ -11,6 +11,7 @@
> >  #include "qemu/log.h"
> >
> >  #include "hw/arm/smmuv3.h"
> > +#include "hw/core/irq.h"
> >  #include "smmuv3-accel.h"
> >  #include "tegra241-cmdqv.h"
> >  #include "trace.h"
> > @@ -534,6 +535,43 @@ out:
> >      trace_tegra241_cmdqv_write_mmio(offset, value, size);
> >  }
> >
> > +static void tegra241_cmdqv_event_read(void *opaque)
> > +{
> > +    Tegra241CMDQV *cmdqv = opaque;
> > +    IOMMUFDVeventq *veventq = cmdqv->veventq;
> > +    struct {
> > +        struct iommufd_vevent_header hdr;
> > +        struct iommu_vevent_tegra241_cmdqv vevent;
> > +    } buf;
> > +    Error *local_err = NULL;
> > +
> > +    if (!smmuv3_accel_event_read_validate(veventq,
> > +                                          IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV,
> > +                                          &buf, sizeof(buf), &local_err)) {
> > +        warn_report_err_once(local_err);
> > +        return;
> > +    }
> > +
> > +    if (buf.vevent.lvcmdq_err_map[0] || buf.vevent.lvcmdq_err_map[1]) {
> > +        cmdqv->vintf_cmdq_err_map[0] =
> > +            buf.vevent.lvcmdq_err_map[0] & 0xffffffff;
> can't you use extract64() here and below?

Yes, that’s better.
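
For reference, the split could look roughly like the sketch below.
toy_extract64() is a local stand-in modelled on QEMU's extract64() from
qemu/bitops.h so that the snippet builds on its own; toy_split_err_map()
is a name made up for this example and is not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for extract64(): return `length` bits of `value` from bit `start`. */
static uint64_t toy_extract64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Split two 64-bit error-map words into four 32-bit register values. */
static void toy_split_err_map(const uint64_t src[2], uint32_t dst[4])
{
    for (int i = 0; i < 2; i++) {
        dst[2 * i]     = (uint32_t)toy_extract64(src[i], 0, 32);  /* low half */
        dst[2 * i + 1] = (uint32_t)toy_extract64(src[i], 32, 32); /* high half */
    }
}
```

The loop also replaces the four open-coded shift-and-mask assignments.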

> > +        cmdqv->vintf_cmdq_err_map[1] =
> > +            (buf.vevent.lvcmdq_err_map[0] >> 32) & 0xffffffff;
> > +        cmdqv->vintf_cmdq_err_map[2] =
> > +            buf.vevent.lvcmdq_err_map[1] & 0xffffffff;
> > +        cmdqv->vintf_cmdq_err_map[3] =
> > +            (buf.vevent.lvcmdq_err_map[1] >> 32) & 0xffffffff;
> > +        for (int i = 0; i < 4; i++) {
> > +            cmdqv->cmdq_err_map[i] = cmdqv->vintf_cmdq_err_map[i];
> if the cached regs are same to we need to have both?

They are the same for now because we only support a single VINTF (VINTF0).
I think it is better to keep them separate as per the spec. I will add a
comment.

Thanks,
Shameer


^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler
  2026-05-07 16:51   ` Eric Auger
@ 2026-05-08 11:19     ` Shameer Kolothum Thodi
  2026-05-08 14:39       ` Eric Auger
  0 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-08 11:19 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 07 May 2026 17:52
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler
> 
> 
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> >
> > Introduce a reset handler for the Tegra241 CMDQV and initialize its
> > register state.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > ---
> >  hw/arm/tegra241-cmdqv.h |  2 ++
> >  hw/arm/tegra241-cmdqv.c | 50 +++++++++++++++++++++++++++++++++++++++++
> >  hw/arm/trace-events     |  1 +
> >  3 files changed, 53 insertions(+)
> >
> > diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> > index 2befa6205e..b2a444daef 100644
> > --- a/hw/arm/tegra241-cmdqv.h
> > +++ b/hw/arm/tegra241-cmdqv.h
> > @@ -79,6 +79,8 @@ FIELD(CONFIG, CMDQ_MAX_CLK_BATCH, 4, 8)
> >  FIELD(CONFIG, CMDQ_MAX_CMD_BATCH, 12, 8)
> >  FIELD(CONFIG, CONS_DRAM_EN, 20, 1)
> >
> > +#define V_CONFIG_RESET 0x00020403
> > +
> >  REG32(PARAM, 0x4)
> >  FIELD(PARAM, CMDQV_VER, 0, 4)
> >  FIELD(PARAM, CMDQV_NUM_CMDQ_LOG2, 4, 4)
> > diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> > index 9c2fc02b92..af68add2f0 100644
> > --- a/hw/arm/tegra241-cmdqv.c
> > +++ b/hw/arm/tegra241-cmdqv.c
> > @@ -8,6 +8,7 @@
> >   */
> >
> >  #include "qemu/osdep.h"
> > +#include "qemu/error-report.h"
> >  #include "qemu/log.h"
> >
> >  #include "hw/arm/smmuv3.h"
> > @@ -645,8 +646,57 @@ free_viommu:
> >      return false;
> >  }
> >
> > +static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
> > +{
> > +    int i;
> > +
> > +    cmdqv->config = V_CONFIG_RESET;
> > +    cmdqv->param = FIELD_DP32(0, PARAM, CMDQV_VER, CMDQV_VER);
> hw/core/registerfields.h:#define FIELD_DP32(storage, reg, field, val)
> what does this 0 do?

I think it is the initial storage value that the field is deposited into.
Starting from 0 guarantees that the non-field bits are clean rather than
holding garbage.
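
A small standalone illustration of that behaviour follows.
toy_deposit32() is a local stand-in modelled on QEMU's deposit32(), which
is what FIELD_DP32() boils down to; the bit positions for CMDQV_VER and
CMDQV_NUM_CMDQ_LOG2 are taken from the FIELD() definitions quoted above,
while toy_build_param() and the values passed to it are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for deposit32(): write `fieldval` into bits [start, start+length). */
static uint32_t toy_deposit32(uint32_t value, int start, int length,
                              uint32_t fieldval)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length <= 32 - start);
    mask = (~0U >> (32 - length)) << start;
    /* Bits outside the field are copied through from `value` unchanged. */
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Build PARAM from a clean base of 0, then chain the next field onto it. */
static uint32_t toy_build_param(uint32_t ver, uint32_t num_cmdq_log2)
{
    uint32_t param = toy_deposit32(0, 0, 4, ver);     /* CMDQV_VER, bits [3:0] */

    return toy_deposit32(param, 4, 4, num_cmdq_log2); /* CMDQV_NUM_CMDQ_LOG2, bits [7:4] */
}
```

Since bits outside the field pass through from the storage argument, the
first deposit has to start from 0 (or the whole register has to be cleared
beforehand) for the result to be deterministic.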

> > +    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_CMDQ_LOG2,
> > +                              CMDQV_NUM_CMDQ_LOG2);
> I was a bit puzzled by the FIELD name being the value name but well.
> > +    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_SID_PER_VI_LOG2,
> > +                              CMDQV_NUM_SID_PER_VI_LOG2);
> > +    trace_tegra241_cmdqv_init_regs(cmdqv->param);
> > +    cmdqv->status = R_STATUS_CMDQV_ENABLED_MASK;
> > +    for (i = 0; i < 2; i++) {
> > +        cmdqv->vi_err_map[i] = 0;
> > +        cmdqv->vi_int_mask[i] = 0;
> > +        cmdqv->cmdq_err_map[i] = 0;
> > +    }
> > +    cmdqv->cmdq_err_map[2] = 0;
> > +    cmdqv->cmdq_err_map[3] = 0;
> the split looks pretty strange, ie. loop of 2 + continue with individual
> setting. why don't you do the init of
> 
> cmdqv->cmdq_err_map in the x4 loop below

😊 True, that does look strange. I will change it.

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size
  2026-05-07 17:23   ` Eric Auger
@ 2026-05-08 13:38     ` Shameer Kolothum Thodi
  2026-05-08 14:41       ` Eric Auger
  0 siblings, 1 reply; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-08 13:38 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 07 May 2026 18:24
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size
> based on backend page size
> 
> 
> Hi Shameer,
> 
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> >
> > CMDQV HW reads guest queue memory in its host physical address setup via
> > IOMMUFD. This requires the guest queue memory is not only contiguous in
> > guest PA space but also in host PA space. With Tegra241 CMDQV enabled,
> we
> > must only advertise a CMDQS that the host can safely back with physically
> s/a CMDQS/ a command queue size (CMDQS)
> > contiguous memory. Allowing a queue larger than the host page size could
> a queue size
> > cause the hardware to DMA across page boundaries, leading to faults.
> >
> > Walk the RAMBlock list to find the smallest memory-backend page size, then
> > limit IDR1.CMDQS so the guest cannot configure a command queue that
> exceeds
> what is the minimal CMDQS required?

AFAICS, the spec doesn't specify a minimum. In practical terms, I think it
will be CMDQS=8, i.e. a queue that fits in one 4K page (2^8 entries of 16
bytes each).
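
The arithmetic behind that number, as a standalone sketch:
toy_max_cmdqs() is a made-up helper, not the function from the patch, and
it assumes a power-of-two page size and 16-byte SMMUv3 command entries.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CMD_ENTRY_SHIFT 4 /* one SMMUv3 command is 16 bytes */

/*
 * Largest CMDQS (log2 of the number of entries) such that the whole
 * queue, (1 << CMDQS) * 16 bytes, fits in one page of `pagesize` bytes.
 */
static unsigned toy_max_cmdqs(size_t pagesize)
{
    unsigned log2 = 0;

    assert(pagesize >= 16 && (pagesize & (pagesize - 1)) == 0);
    while (pagesize >>= 1) {
        log2++;
    }
    return log2 - TOY_CMD_ENTRY_SHIFT;
}
```

With this, a 4K page gives CMDQS=8 and a 64K page gives CMDQS=12; a real
implementation would presumably also cap the result at whatever the
emulated SMMUv3 already advertises.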

> > that contiguous backing. Fall back to the real host page size if no
> > memory-backend RAM blocks are found.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > ---
> >  hw/arm/tegra241-cmdqv.c | 41 +++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 41 insertions(+)
> >
> > diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> > index af68add2f0..2870886783 100644
> > --- a/hw/arm/tegra241-cmdqv.c
> > +++ b/hw/arm/tegra241-cmdqv.c
> > @@ -14,6 +14,9 @@
> >  #include "hw/arm/smmuv3.h"
> >  #include "hw/core/irq.h"
> >  #include "smmuv3-accel.h"
> > +#include "smmuv3-internal.h"
> > +#include "system/ramblock.h"
> > +#include "system/ramlist.h"
> >  #include "tegra241-cmdqv.h"
> >  #include "trace.h"
> >
> > @@ -646,9 +649,38 @@ free_viommu:
> >      return false;
> >  }
> >
> > +static size_t tegra241_cmdqv_min_ram_pagesize(void)
> shouldn't we put that rather in system/physmem.c
> there we also have qemu_ram_pagesize_largest() for instance

Could do. But we may have to name it differently from
qemu_ram_pagesize_largest(), as we are checking the page size among
memory-backend RAM only.

Maybe qemu_ram_backend_pagesize_min()?
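
The intended semantics, in a standalone form, would be something like the
sketch below. ToyRamBlock and its from_memory_backend flag are
hypothetical and only model the "memory-backend RAM only" filter; the real
helper would walk the RAMBlock list instead of an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a RAM block for this example. */
typedef struct {
    bool from_memory_backend; /* block is owned by a memory-backend object */
    size_t pagesize;          /* page size backing this block */
} ToyRamBlock;

/*
 * Smallest page size among memory-backend blocks, falling back to the
 * host page size when no such block exists.
 */
static size_t toy_ram_backend_pagesize_min(const ToyRamBlock *blocks,
                                           size_t n, size_t host_pagesize)
{
    size_t min = 0;

    assert(host_pagesize != 0);
    for (size_t i = 0; i < n; i++) {
        if (!blocks[i].from_memory_backend) {
            continue; /* ignore ROMs, device RAM, etc. */
        }
        if (min == 0 || blocks[i].pagesize < min) {
            min = blocks[i].pagesize;
        }
    }
    return min ? min : host_pagesize;
}
```

The fallback mirrors what the commit message describes for the case where
no memory-backend RAM blocks are found.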

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space
  2026-05-08  9:03         ` Shameer Kolothum Thodi
@ 2026-05-08 14:35           ` Eric Auger
  2026-05-08 14:37             ` Shameer Kolothum Thodi
  0 siblings, 1 reply; 102+ messages in thread
From: Eric Auger @ 2026-05-08 14:35 UTC (permalink / raw)
  To: Shameer Kolothum Thodi, qemu-arm@nongnu.org,
	qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



On 5/8/26 11:03 AM, Shameer Kolothum Thodi wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Sent: 07 May 2026 17:24
>> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
>> arm@nongnu.org; qemu-devel@nongnu.org
>> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
>> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
>> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
>> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
>> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
>> phrdina@redhat.com
>> Subject: Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0
>> into guest MMIO space
>>
>>
>> On 5/6/26 4:24 PM, Shameer Kolothum Thodi wrote:
>>> Hi Eric,
>>>
>>>> -----Original Message-----
>>>> From: Eric Auger <eric.auger@redhat.com>
>>>> Sent: 06 May 2026 13:45
>>>> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
>>>> arm@nongnu.org; qemu-devel@nongnu.org
>>>> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
>>>> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
>>>> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason
>> Gunthorpe
>>>> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
>>>> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
>>>> phrdina@redhat.com
>>>> Subject: Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0
>>>> into guest MMIO space
>>>>
>>>>
>>>> Hi Shameer,
>>>>
>>>> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
>>>>> From: Nicolin Chen <nicolinc@nvidia.com>
>>>>>
>>>>> Once a VCMDQ is allocated, map the mmap'd vintf_page0 region directly
>>>>> into the guest-visible MMIO space at offset 0x30000 as a RAM-backed
>>>>> MemoryRegion. This eliminates QEMU trapping for hot-path CONS/PROD
>>>>> index updates.
>>>>>
>>>>> After this patch, the two VCMDQ apertures use different access paths:
>>>>> the direct aperture (0x10000) remains QEMU-trapped and writes via
>>>>> vintf_ptr, while the VI aperture (0x30000) is a direct guest RAM
>>>>> mapping. Both paths write to the same underlying vintf_page0 memory,
>>>>> so no synchronisation between the apertures is needed.
>>>> I fail to understand when the previous trapped path using ptr in
>>>> tegra241_cmdqv_read/write_vcmdq gets used versus that path. Is it
>>>> eventually used?
>>> The spec does not prevent a guest from using the 0x10000 path for
>> allocated
>>> VCMDQs, so the trap path remains valid and QEMU forwards those accesses
>> to
>>> the mmap'd vintf_page0 via vintf_ptr.
>>>
>>> We cannot map 0x10000 directly to the guest as RAM because the kernel
>>> mmap only backs VCMDQs actually allocated via HW_QUEUE ioctl. If the
>>> guest allocates only 1 of 2 VCMDQs, exposing the full direct aperture page0
>>> as RAM would give the guest access to unallocated VCMDQ slots. Hence, it
>>> remains trapped and QEMU only fwds to vintf page0 for allocated queues.
>> Actually my question rather was why don't we use a subregion_overlap for
>> global vcmdq page0, mapping to vintf_page0 once cmdqv->vintf_page0 is
>> set, just as we do for VINTF0 page0? I understand
>> tegra241_cmdqv_read/write_vcmdq would return the same content anyway.
> This is my understanding why we cannot map direct VCMDQ aperture(0x10000)
> to vintf_page0:
>
>  - Unallocated VCMDQs accessed via vintf_page0 get "access dropped with no
>     Fault/Interrupt" behaviour per spec((p.172)
>
>  - Mapping the direct VCMDQ aperture (0x10000) to vintf_page0 would replicate
>    this behaviour for unallocated VCMDQ accesses via the direct aperture too.
>
>  - However, per spec p.176, software can program VCMDQs via the direct
>    VCMDQ aperture without going through the VINTF. Mapping vintf_page0
>    at 0x10000 violates this. 
>
>   - Partial allocation case (VCMDQ0 allocated, VCMDQ1 not): the trap correctly
>     serves VCMDQ1 accesses from QEMU cache while forwarding VCMDQ0 accesses
>     to vintf_page0. A direct RAM mapping cannot make this distinction.

I agree that normally we shouldn't use the VINTF view for the global view.

When you say "the trap correctly serves VCMDQ1 accesses from QEMU cache
while forwarding VCMDQ0 accesses to vintf_page0", the filtering is done by
tegra241_cmdqv_vintf_ptr. Whichever queue is enabled,
cmdqv->vintf_page0 is allocated. So only the test on cmdqv->vcmdq[index]
does the filtering in that case, correct?

If so, I now share your point of view.

Eric
>
>> Besides, I would recommend that all over the code, page0 and page1 are
>> clearly differentiated in terms of accessors and comments and we use the
>> same terminology through the code for global vcmdq and vintf vcmdqs.
> Sure. I will address this.
>
> Thanks,
> Shameer
>



^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 30/31] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active
  2026-05-07 17:36   ` Eric Auger
@ 2026-05-08 14:36     ` Shameer Kolothum Thodi
  0 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-08 14:36 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 07 May 2026 18:36
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 30/31] hw/arm/smmuv3-accel: Enforce viommu
> association when CMDQV is active
> 
> 
> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> > When CMDQV is active, the first cold-plugged VFIO device establishes
> > the viommu to host SMMUv3 association. Block its hot-unplug to
> > preserve this association and the guest's boot time CMDQV configuration.
> >
> > Also abort at machine_done if cmdqv=on is requested but no
> > cold-plugged VFIO device was present to initialize it.
> >
> > Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> > ---
> >  hw/arm/smmuv3-accel.h |  1 +
> >  hw/arm/smmuv3-accel.c | 12 ++++++++++++
> >  hw/arm/smmuv3.c       |  6 ++++++
> >  3 files changed, 19 insertions(+)
> >
> > diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
> > index 3ed94ed05c..c4441d5b3f 100644
> > --- a/hw/arm/smmuv3-accel.h
> > +++ b/hw/arm/smmuv3-accel.h
> > @@ -65,6 +65,7 @@ typedef struct SMMUv3AccelDevice {
> >      IOMMUFDVdev *vdev;
> >      QLIST_ENTRY(SMMUv3AccelDevice) next;
> >      SMMUv3AccelState *s_accel;
> > +    Error *unplug_blocker; /* set when CMDQV is active to block hot-unplug */
> >  } SMMUv3AccelDevice;
> >
> >  bool smmuv3_accel_init(SMMUv3State *s, Error **errp);
> > diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> > index a58815ded2..f381702a08 100644
> > --- a/hw/arm/smmuv3-accel.c
> > +++ b/hw/arm/smmuv3-accel.c
> > @@ -754,6 +754,18 @@ static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
> >          return false;
> >      }
> >
> > +    /*
> > +     * CMDQV is active: block hot-unplug of the device that established the
> > +     * viommu association. Removing it would cause the vIOMMU to host SMMUv3
> > +     * association be changed via device hot-plug.
> > +     */
> > +    if (s->s_accel->cmdqv_ops) {
> > +        PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
> > +        error_setg(&accel_dev->unplug_blocker,
> > +                   "CMDQV is active: removing the device that established the "
> > +                   "viommu association would break the guest CMDQV");
> Is this something that is planned to be fixed later on?

Not really. It is not straightforward to support hot unplug of the device that
enabled the VCMDQ on the associated SMMUv3.

> Can you provide more details about the nature of the breakage?

One scenario could be:
- Consider two SMMUv3+CMDQV instances, each with a passthrough device:
   - dev_A → SMMUv3_A + CMDQV_A
   - dev_B → SMMUv3_B + CMDQV_B
- The guest boots with dev_A cold-plugged; CMDQV_A is probed, the viommu is
  allocated against SMMUv3_A/CMDQV_A, and the guest establishes its CMDQV
  configuration at boot time.
- If dev_A is hot-unplugged, QEMU would have to release the viommu and tear
  down all associated CMDQV state (VINTFs, VCMDQs).
- If dev_B is subsequently hot-plugged, QEMU allocates a new viommu against
  SMMUv3_B/CMDQV_B.
- The guest, however, still holds the boot-time SMMUv3_A/CMDQV_A configuration
  (VINTFs, VCMDQs) and will continue issuing commands against it.

Blocking hot-unplug of the establishing device prevents this viommu/CMDQV
teardown from happening while the guest is actively using it.
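
As a toy model of the mechanism (not QEMU code): the real patch relies on
qdev_add_unplug_blocker(), whereas ToyDevice and the toy_* helpers below
are invented here purely to show the effect of attaching a blocker to the
establishing device only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hot-pluggable device. */
typedef struct {
    const char *unplug_blocker; /* NULL when hot-unplug is allowed */
    bool plugged;
} ToyDevice;

/* Record why this device must stay plugged. */
static void toy_add_unplug_blocker(ToyDevice *dev, const char *reason)
{
    assert(reason != NULL);
    dev->unplug_blocker = reason;
}

/* Returns true if the device was unplugged; otherwise reports the reason. */
static bool toy_request_unplug(ToyDevice *dev, const char **reason)
{
    if (dev->unplug_blocker) {
        *reason = dev->unplug_blocker;
        return false; /* device and viommu association stay intact */
    }
    dev->plugged = false;
    return true;
}
```

Devices without a blocker remain freely hot-unpluggable; only the unplug
request for the device that established the association is refused.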

Supporting hot-unplug may be possible if we retain the SMMUv3-to-viommu
association even after the device is unplugged. However, that would
require different lifetime management in the SMMUv3 accel and VFIO code
paths, which would be non-trivial.

There is no strong user requirement for supporting hot unplug of the device
like in the above scenario on these platforms either.

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space
  2026-05-08 14:35           ` Eric Auger
@ 2026-05-08 14:37             ` Shameer Kolothum Thodi
  0 siblings, 0 replies; 102+ messages in thread
From: Shameer Kolothum Thodi @ 2026-05-08 14:37 UTC (permalink / raw)
  To: eric.auger@redhat.com, qemu-arm@nongnu.org, qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 08 May 2026 15:35
> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> phrdina@redhat.com
> Subject: Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0
> into guest MMIO space
> 
> 
> On 5/8/26 11:03 AM, Shameer Kolothum Thodi wrote:
> > Hi Eric,
> >
> >> -----Original Message-----
> >> From: Eric Auger <eric.auger@redhat.com>
> >> Sent: 07 May 2026 17:24
> >> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> >> arm@nongnu.org; qemu-devel@nongnu.org
> >> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
> >> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
> >> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
> >> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> >> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> >> phrdina@redhat.com
> >> Subject: Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF
> page0
> >> into guest MMIO space
> >>
> >>
> >> On 5/6/26 4:24 PM, Shameer Kolothum Thodi wrote:
> >>> Hi Eric,
> >>>
> >>>> -----Original Message-----
> >>>> From: Eric Auger <eric.auger@redhat.com>
> >>>> Sent: 06 May 2026 13:45
> >>>> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
> >>>> arm@nongnu.org; qemu-devel@nongnu.org
> >>>> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org;
> Nicolin
> >>>> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>;
> Matt
> >>>> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason
> >> Gunthorpe
> >>>> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
> >>>> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
> >>>> phrdina@redhat.com
> >>>> Subject: Re: [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF
> page0
> >>>> into guest MMIO space
> >>>>
> >>>>
> >>>> Hi Shameer,
> >>>>
> >>>> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
> >>>>> From: Nicolin Chen <nicolinc@nvidia.com>
> >>>>>
> >>>>> Once a VCMDQ is allocated, map the mmap'd vintf_page0 region
> directly
> >>>>> into the guest-visible MMIO space at offset 0x30000 as a RAM-backed
> >>>>> MemoryRegion. This eliminates QEMU trapping for hot-path
> CONS/PROD
> >>>>> index updates.
> >>>>>
> >>>>> After this patch, the two VCMDQ apertures use different access paths:
> >>>>> the direct aperture (0x10000) remains QEMU-trapped and writes via
> >>>>> vintf_ptr, while the VI aperture (0x30000) is a direct guest RAM
> >>>>> mapping. Both paths write to the same underlying vintf_page0 memory,
> >>>>> so no synchronisation between the apertures is needed.
> >>>> I fail to understand when the previous trapped path using ptr in
> >>>> tegra241_cmdqv_read/write_vcmdq gets used versus that path. Is it
> >>>> eventually used?
> >>> The spec does not prevent a guest from using the 0x10000 path for
> >>> allocated VCMDQs, so the trap path remains valid and QEMU forwards
> >>> those accesses to the mmap'd vintf_page0 via vintf_ptr.
> >>>
> >>> We cannot map 0x10000 directly to the guest as RAM because the kernel
> >>> mmap only backs VCMDQs actually allocated via HW_QUEUE ioctl. If the
> >>> guest allocates only 1 of 2 VCMDQs, exposing the full direct aperture
> >>> page0 as RAM would give the guest access to unallocated VCMDQ slots.
> >>> Hence, it remains trapped and QEMU only forwards to vintf page0 for
> >>> allocated queues.
> >> Actually my question rather was why don't we use a subregion_overlap for
> >> global vcmdq page0, mapping to vintf_page0 once cmdqv->vintf_page0 is
> >> set, just as we do for VINTF0 page0? I understand
> >> tegra241_cmdqv_read/write_vcmdq would return the same content anyway.
> > This is my understanding why we cannot map direct VCMDQ aperture
> > (0x10000) to vintf_page0:
> >
> >  - Unallocated VCMDQs accessed via vintf_page0 get "access dropped with
> >    no Fault/Interrupt" behaviour per spec (p.172).
> >
> >  - Mapping the direct VCMDQ aperture (0x10000) to vintf_page0 would
> >    replicate this behaviour for unallocated VCMDQ accesses via the
> >    direct aperture too.
> >
> >  - However, per spec p.176, software can program VCMDQs via the direct
> >    VCMDQ aperture without going through the VINTF. Mapping vintf_page0
> >    at 0x10000 violates this.
> >
> >  - Partial allocation case (VCMDQ0 allocated, VCMDQ1 not): the trap
> >    correctly serves VCMDQ1 accesses from QEMU cache while forwarding
> >    VCMDQ0 accesses to vintf_page0. A direct RAM mapping cannot make
> >    this distinction.
> 
> I agree that normally we shouldn't use VINTF view for the global view.
> 
> When you say "the trap correctly serves VCMDQ1 accesses from QEMU cache
> while forwarding VCMDQ0 accesses to vintf_page0", the filtering is done by
> tegra241_cmdqv_vintf_ptr. Regardless of which queue is enabled,
> cmdqv->vintf_page0 is allocated. So only the test on cmdqv->vcmdq[index]
> does the filtering in that case, correct?

Yes. That is correct.
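
For reference, the filtering discussed above can be sketched roughly as
follows. This is only an illustration under assumptions, not the code from
the series: the struct is a simplified stand-in for Tegra241CMDQV, MAX_CMDQ
and VCMDQ_STRIDE are hypothetical names, and the 0x80 per-VCMDQ stride is an
assumption about the page0 layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CMDQ 2          /* hypothetical stand-in for TEGRA241_CMDQV_MAX_CMDQ */
#define VCMDQ_STRIDE 0x80   /* assumed per-VCMDQ register stride within page0 */

/* Simplified stand-in for Tegra241CMDQV; field names mirror the discussion. */
typedef struct {
    void *vcmdq[MAX_CMDQ];  /* non-NULL once a HW queue is allocated */
    uint8_t *vintf_page0;   /* mmap'd VINTF page0, or NULL if not mapped */
} CmdqvSketch;

/*
 * Return a pointer into vintf_page0 for an allocated VCMDQ, or NULL so that
 * the caller falls back to the QEMU register cache.
 */
static uint32_t *vintf_ptr(CmdqvSketch *c, int index, size_t reg_off)
{
    assert(index >= 0 && index < MAX_CMDQ);
    if (!c->vintf_page0 || !c->vcmdq[index]) {
        return NULL;
    }
    return (uint32_t *)(c->vintf_page0 + index * VCMDQ_STRIDE + reg_off);
}
```

With vintf_page0 mapped, the per-index NULL check on vcmdq[] is the only
thing that separates "forward to page0" from "serve from cache", which is
the point confirmed above.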

Thanks,
Shameer


* Re: [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler
  2026-05-08 11:19     ` Shameer Kolothum Thodi
@ 2026-05-08 14:39       ` Eric Auger
  0 siblings, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-08 14:39 UTC (permalink / raw)
  To: Shameer Kolothum Thodi, qemu-arm@nongnu.org,
	qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



On 5/8/26 1:19 PM, Shameer Kolothum Thodi wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Sent: 07 May 2026 17:52
>> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
>> arm@nongnu.org; qemu-devel@nongnu.org
>> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
>> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
>> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
>> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
>> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
>> phrdina@redhat.com
>> Subject: Re: [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler
>>
>>
>> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
>>> From: Nicolin Chen <nicolinc@nvidia.com>
>>>
>>> Introduce a reset handler for the Tegra241 CMDQV and initialize its
>>> register state.
>>>
>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
>>> ---
>>>  hw/arm/tegra241-cmdqv.h |  2 ++
>>>  hw/arm/tegra241-cmdqv.c | 50 +++++++++++++++++++++++++++++++++++++++++
>>>  hw/arm/trace-events     |  1 +
>>>  3 files changed, 53 insertions(+)
>>>
>>> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
>>> index 2befa6205e..b2a444daef 100644
>>> --- a/hw/arm/tegra241-cmdqv.h
>>> +++ b/hw/arm/tegra241-cmdqv.h
>>> @@ -79,6 +79,8 @@ FIELD(CONFIG, CMDQ_MAX_CLK_BATCH, 4, 8)
>>>  FIELD(CONFIG, CMDQ_MAX_CMD_BATCH, 12, 8)
>>>  FIELD(CONFIG, CONS_DRAM_EN, 20, 1)
>>>
>>> +#define V_CONFIG_RESET 0x00020403
>>> +
>>>  REG32(PARAM, 0x4)
>>>  FIELD(PARAM, CMDQV_VER, 0, 4)
>>>  FIELD(PARAM, CMDQV_NUM_CMDQ_LOG2, 4, 4)
>>> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
>>> index 9c2fc02b92..af68add2f0 100644
>>> --- a/hw/arm/tegra241-cmdqv.c
>>> +++ b/hw/arm/tegra241-cmdqv.c
>>> @@ -8,6 +8,7 @@
>>>   */
>>>
>>>  #include "qemu/osdep.h"
>>> +#include "qemu/error-report.h"
>>>  #include "qemu/log.h"
>>>
>>>  #include "hw/arm/smmuv3.h"
>>> @@ -645,8 +646,57 @@ free_viommu:
>>>      return false;
>>>  }
>>>
>>> +static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
>>> +{
>>> +    int i;
>>> +
>>> +    cmdqv->config = V_CONFIG_RESET;
>>> +    cmdqv->param = FIELD_DP32(0, PARAM, CMDQV_VER, CMDQV_VER);
>> hw/core/registerfields.h:#define FIELD_DP32(storage, reg, field, val)
>> What does this 0 do?
> I think it is to make sure we have no garbage set for non-field bits.
> Starting from 0 guarantees non-field bits are clean.

OK thanks ;-)
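
To illustrate the point about starting from 0, here is a small standalone
sketch of a field deposit. deposit_field32() is a local helper written for
this example only (it is not the QEMU macro), and the field values are
made up; the shifts and lengths follow the PARAM FIELD() definitions quoted
above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Deposit val into the field [shift, shift + length) of storage, leaving
 * all other bits of storage untouched. Illustration only.
 */
static uint32_t deposit_field32(uint32_t storage, unsigned shift,
                                unsigned length, uint32_t val)
{
    assert(shift < 32 && length > 0 && length <= 32 - shift);
    uint32_t mask = (length == 32 ? UINT32_MAX : ((1u << length) - 1)) << shift;
    return (storage & ~mask) | ((val << shift) & mask);
}
```

Since bits outside the field are preserved, depositing CMDQV_VER (shift 0,
length 4) into 0 yields a value with only that field set, whereas
depositing into an uninitialised or stale cmdqv->param would carry the old
non-field bits along. Later deposits can then safely chain on the result.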

Eric
>
>>> +    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_CMDQ_LOG2,
>>> +                              CMDQV_NUM_CMDQ_LOG2);
>> I was a bit puzzled by the FIELD name being the value name but well.
>>> +    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_SID_PER_VI_LOG2,
>>> +                              CMDQV_NUM_SID_PER_VI_LOG2);
>>> +    trace_tegra241_cmdqv_init_regs(cmdqv->param);
>>> +    cmdqv->status = R_STATUS_CMDQV_ENABLED_MASK;
>>> +    for (i = 0; i < 2; i++) {
>>> +        cmdqv->vi_err_map[i] = 0;
>>> +        cmdqv->vi_int_mask[i] = 0;
>>> +        cmdqv->cmdq_err_map[i] = 0;
>>> +    }
>>> +    cmdqv->cmdq_err_map[2] = 0;
>>> +    cmdqv->cmdq_err_map[3] = 0;
>> The split looks pretty strange, i.e. a loop of 2 followed by individual
>> assignments. Why don't you do the init of cmdqv->cmdq_err_map in the x4
>> loop below?
> 😊. True, that does look strange. I will change.
>
> Thanks,
> Shameer




* Re: [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size
  2026-05-08 13:38     ` Shameer Kolothum Thodi
@ 2026-05-08 14:41       ` Eric Auger
  0 siblings, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-08 14:41 UTC (permalink / raw)
  To: Shameer Kolothum Thodi, qemu-arm@nongnu.org,
	qemu-devel@nongnu.org
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nicolin Chen, Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



On 5/8/26 3:38 PM, Shameer Kolothum Thodi wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Sent: 07 May 2026 18:24
>> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
>> arm@nongnu.org; qemu-devel@nongnu.org
>> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
>> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
>> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
>> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
>> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
>> phrdina@redhat.com
>> Subject: Re: [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size
>> based on backend page size
>>
>>
>> Hi Shameer,
>>
>> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
>>> From: Nicolin Chen <nicolinc@nvidia.com>
>>>
>>> CMDQV HW reads guest queue memory in its host physical address setup via
>>> IOMMUFD. This requires the guest queue memory is not only contiguous in
>>> guest PA space but also in host PA space. With Tegra241 CMDQV enabled, we
>>> must only advertise a CMDQS that the host can safely back with physically
>> s/a CMDQS/a command queue size (CMDQS)/
>>> contiguous memory. Allowing a queue larger than the host page size could
>> a queue size
>>> cause the hardware to DMA across page boundaries, leading to faults.
>>>
>>> Walk the RAMBlock list to find the smallest memory-backend page size, then
>>> limit IDR1.CMDQS so the guest cannot configure a command queue that exceeds
>> What is the minimal CMDQS required?
> AFAICS, the spec doesn't specify a minimum. But in practical terms, I think
> it will be a 4K page size (CMDQS=8).

OK
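
As a worked example of the 4K => CMDQS=8 figure: with 16-byte command
entries, a queue of 2^CMDQS entries occupies 2^(CMDQS + 4) bytes, so the
largest queue that fits in one page has CMDQS = log2(pagesize) - 4. A
minimal sketch follows; cmdqs_for_pagesize() is a hypothetical helper, not
the function from the patch, and the upper clamp of 19 is an assumption
about the architectural limit of IDR1.CMDQS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_ENTRY_SHIFT 4   /* each command entry is 16 bytes */
#define CMDQS_MAX 19        /* assumed upper bound for IDR1.CMDQS */

/* Largest log2 entry count whose queue fits in one page of pagesize bytes. */
static unsigned cmdqs_for_pagesize(size_t pagesize)
{
    unsigned log2 = 0;

    /* Expect a power-of-two page size holding at least one entry. */
    assert(pagesize >= 16 && (pagesize & (pagesize - 1)) == 0);
    while ((pagesize >> log2) > 1) {
        log2++;
    }
    log2 -= CMD_ENTRY_SHIFT;
    return log2 > CMDQS_MAX ? CMDQS_MAX : log2;
}
```

Under these assumptions a 64K backing page would allow CMDQS=12 and a 2M
hugepage CMDQS=17, while 4K backing gives the CMDQS=8 mentioned above.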
>
>>> that contiguous backing. Fall back to the real host page size if no
>>> memory-backend RAM blocks are found.
>>>
>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
>>> ---
>>>  hw/arm/tegra241-cmdqv.c | 41 +++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 41 insertions(+)
>>>
>>> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
>>> index af68add2f0..2870886783 100644
>>> --- a/hw/arm/tegra241-cmdqv.c
>>> +++ b/hw/arm/tegra241-cmdqv.c
>>> @@ -14,6 +14,9 @@
>>>  #include "hw/arm/smmuv3.h"
>>>  #include "hw/core/irq.h"
>>>  #include "smmuv3-accel.h"
>>> +#include "smmuv3-internal.h"
>>> +#include "system/ramblock.h"
>>> +#include "system/ramlist.h"
>>>  #include "tegra241-cmdqv.h"
>>>  #include "trace.h"
>>>
>>> @@ -646,9 +649,38 @@ free_viommu:
>>>      return false;
>>>  }
>>>
>>> +static size_t tegra241_cmdqv_min_ram_pagesize(void)
>> shouldn't we put that rather in system/physmem.c
>> there we also have qemu_ram_pagesize_largest() for instance
> Could do. But we may have to name it differently from
> qemu_ram_pagesize_largest(), as we are checking the size among
> memory-backend RAM only.
>
> Maybe qemu_ram_backend_pagesize_min()?

Yes sounds reasonable to me. Let's add the physmem.c maintainers in cc.

Eric
>
> Thanks,
> Shameer




* Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming
  2026-05-05 14:26     ` Shameer Kolothum Thodi
  2026-05-06 17:49       ` Nicolin Chen
@ 2026-05-08 14:50       ` Eric Auger
  1 sibling, 0 replies; 102+ messages in thread
From: Eric Auger @ 2026-05-08 14:50 UTC (permalink / raw)
  To: Shameer Kolothum Thodi, qemu-arm@nongnu.org,
	qemu-devel@nongnu.org, Nicolin Chen
  Cc: peter.maydell@linaro.org, clg@redhat.com, alex@shazbot.org,
	Nathan Chen, Matt Ochs, Jiandi An, Jason Gunthorpe,
	jonathan.cameron@huawei.com, zhenzhong.duan@intel.com,
	Krishnakant Jaju, phrdina@redhat.com



On 5/5/26 4:26 PM, Shameer Kolothum Thodi wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Sent: 05 May 2026 14:26
>> To: Shameer Kolothum Thodi <skolothumtho@nvidia.com>; qemu-
>> arm@nongnu.org; qemu-devel@nongnu.org
>> Cc: peter.maydell@linaro.org; clg@redhat.com; alex@shazbot.org; Nicolin
>> Chen <nicolinc@nvidia.com>; Nathan Chen <nathanc@nvidia.com>; Matt
>> Ochs <mochs@nvidia.com>; Jiandi An <jan@nvidia.com>; Jason Gunthorpe
>> <jgg@nvidia.com>; jonathan.cameron@huawei.com;
>> zhenzhong.duan@intel.com; Krishnakant Jaju <kjaju@nvidia.com>;
>> phrdina@redhat.com
>> Subject: Re: [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW
>> VCMDQs on base register programming
>>
>>
>> Hi,
>>
>> On 4/15/26 12:55 PM, Shameer Kolothum wrote:
>>> From: Nicolin Chen <nicolinc@nvidia.com>
>>>
>>> Add support for allocating IOMMUFD hardware queues when the guest
>>> programs the VCMDQ BASE registers.
>>>
>>> VCMDQ_EN is part of the VCMDQ_CONFIG register, which is accessed
>>> through the VINTF Page0 region. A subsequent patch maps this region
>>> directly into the guest address space, so QEMU does not trap writes
>>> to VCMDQ_CONFIG.
>>>
>>> Since VCMDQ_EN writes are not trapped, QEMU cannot allocate the
>>> hardware queue based on that bit. Instead, allocate the IOMMUFD
>>> hardware queue when the guest writes a VCMDQ BASE register with a
>>> valid RAM-backed address and when CMDQV and VINTF are enabled.
>>> If a hardware queue was previously allocated for the same VCMDQ,
>>> free it before reallocation.
>> The asymmetric alloc/free sounds unusual. Are there other alternatives?
> Nothing comes to mind right now. There is no "update hw_queue address"
> operation, hence we have to free and allocate. Maybe we can add a small
> optimisation and skip free + alloc if vcmdq_base[index] hasn't changed
> since the previous alloc.

OK
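
The optimisation suggested above could look roughly like the sketch below.
It is only an illustration under assumptions: VcmdqSketch, alloc_base[] and
realloc_count are hypothetical names, and the actual free and
IOMMU_HW_QUEUE_ALLOC steps are replaced by simple bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CMDQ 2  /* hypothetical stand-in for TEGRA241_CMDQV_MAX_CMDQ */

typedef struct {
    bool allocated[MAX_CMDQ];      /* stands in for cmdqv->vcmdq[i] != NULL */
    uint64_t alloc_base[MAX_CMDQ]; /* base value used for the last allocation */
    unsigned realloc_count;        /* counts free + alloc cycles, for the demo */
} VcmdqSketch;

/* Returns true if a (re)allocation was performed, false if it was skipped. */
static bool setup_vcmdq(VcmdqSketch *s, int index, uint64_t new_base)
{
    assert(index >= 0 && index < MAX_CMDQ);
    if (s->allocated[index] && s->alloc_base[index] == new_base) {
        return false;   /* same base as before: keep the existing HW queue */
    }
    /* A real implementation would free the old queue and allocate here. */
    s->allocated[index] = true;
    s->alloc_base[index] = new_base;
    s->realloc_count++;
    return true;
}
```

Remembering the base used for the last successful allocation, rather than
comparing against the live register cache, keeps the check valid even if
the guest rewrites the BASE register halves one at a time.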

Thanks

Eric
>
>>> Writes with invalid addresses are ignored.
>>>
>>> All allocated VCMDQs are freed when CMDQV or VINTF is disabled.
>>>
>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
>>> ---
>>>  hw/arm/tegra241-cmdqv.h | 11 +++++++
>>>  hw/arm/tegra241-cmdqv.c | 70 +++++++++++++++++++++++++++++++++++++++--
>>>  2 files changed, 78 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
>>> index 88572ad939..039d86374f 100644
>>> --- a/hw/arm/tegra241-cmdqv.h
>>> +++ b/hw/arm/tegra241-cmdqv.h
>>> @@ -44,6 +44,7 @@ typedef struct Tegra241CMDQV {
>>>      MemoryRegion mmio_cmdqv;
>>>      qemu_irq irq;
>>>      IOMMUFDVeventq *veventq;
>>> +    IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
>>>      void *vintf_page0;
>>>
>>>      /* Register Cache */
>>> @@ -348,6 +349,16 @@ A_VI_VCMDQi_CONS_INDX_BASE_DRAM_L(1)
>>>  A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(0)
>>>  A_VI_VCMDQi_CONS_INDX_BASE_DRAM_H(1)
>>>
>>> +static inline bool tegra241_cmdq_enabled(Tegra241CMDQV *cmdqv)
>>> +{
>>> +    return cmdqv->status & R_STATUS_CMDQV_ENABLED_MASK;
>>> +}
>>> +
>>> +static inline bool tegra241_vintf_enabled(Tegra241CMDQV *cmdqv)
>>> +{
>>> +    return cmdqv->vintf_status & R_VINTF0_STATUS_ENABLE_OK_MASK;
>>> +}
>>> +
>>>  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
>>>
>>>  #endif /* HW_ARM_TEGRA241_CMDQV_H */
>>> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
>>> index cdd941cec9..b5f2f74cf2 100644
>>> --- a/hw/arm/tegra241-cmdqv.c
>>> +++ b/hw/arm/tegra241-cmdqv.c
>>> @@ -15,6 +15,66 @@
>>>  #include "tegra241-cmdqv.h"
>>>  #include "trace.h"
>>>
>>> +static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
>>> +{
>>> +    IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
>>> +    IOMMUFDHWqueue *vcmdq = cmdqv->vcmdq[index];
>>> +
>>> +    if (!vcmdq) {
>>> +        return;
>>> +    }
>>> +    iommufd_backend_free_id(viommu->iommufd, vcmdq->hw_queue_id);
>>> +    g_free(vcmdq);
>>> +    cmdqv->vcmdq[index] = NULL;
>>> +}
>>> +
>>> +static void tegra241_cmdqv_free_all_vcmdq(Tegra241CMDQV *cmdqv)
>>> +{
>>> +    /* Free in reverse order to avoid "resource busy" error */
>> Can you provide additional details about the above problem? Is it
>> documented in the spec?
> See p.176:
>
> Deallocate a VCMDQ from a Virtual Interface 
>     Logical CMDQ being deallocated for a Guest must be in decreasing order
>     starting from the highest numbered LVCMDQ.
>
>>> +    for (int i = (TEGRA241_CMDQV_MAX_CMDQ - 1); i >= 0; i--) {
>>> +        tegra241_cmdqv_free_vcmdq(cmdqv, i);
>>> +    }
>>> +}
>>> +
>>> +static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
>>> +                                       Error **errp)
>>> +{
>>> +    SMMUv3AccelState *accel = cmdqv->s_accel;
>>> +    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
>>> +                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
>>> +    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
>>> +    uint64_t log2 = cmdqv->vcmdq_base[index] & R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
>>> +    uint64_t size = 1ULL << (log2 + 4);
>>> +    IOMMUFDViommu *viommu = accel->viommu;
>>> +    IOMMUFDHWqueue *hw_queue;
>>> +    uint32_t hw_queue_id;
>>> +
>>> +    /* Ignore any invalid address. This may come as part of reset etc. */
>>> +    if (!address_space_is_ram(&address_space_memory, addr) ||
>>> +        !address_space_is_ram(&address_space_memory, addr + size - 1)) {
>> this check looks a little bit risky, no? Don't we have a better way to
>> test the address has been properly set?
> I think eventually the kernel will handle any attempt to use an invalid
> address through the IOMMU_HW_QUEUE_ALLOC ioctl:
>      iommufd_hw_queue_alloc_phys()/iommufd_access_pin_pages() etc.
>
> Any attempt to pass an invalid address will return an error, I think.
>
> @Nicolin, is that a safe assumption to make?
>
>>> +        return true;
>>> +    }
>>> +
>>> +    if (!tegra241_cmdq_enabled(cmdqv) || !tegra241_vintf_enabled(cmdqv)) {
>>> +        return true;
>>> +    }
>>> +
>>> +    tegra241_cmdqv_free_vcmdq(cmdqv, index);
>> would deserve a comment also here.
> Sure.
>
> Thanks,
> Shameer




Thread overview: 102+ messages
2026-04-15 10:55 [PATCH v4 00/31] hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3 Shameer Kolothum
2026-04-15 10:55 ` [PATCH v4 01/31] backends/iommufd: Update iommufd_backend_get_device_info Shameer Kolothum
2026-04-15 10:55 ` [PATCH v4 02/31] backends/iommufd: Update iommufd_backend_alloc_viommu to allow user ptr Shameer Kolothum
2026-04-15 10:55 ` [PATCH v4 03/31] backends/iommufd: Introduce iommufd_backend_alloc_hw_queue Shameer Kolothum
2026-04-15 10:55 ` [PATCH v4 04/31] backends/iommufd: Introduce iommufd_backend_viommu_mmap Shameer Kolothum
2026-04-15 10:55 ` [PATCH v4 05/31] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq Shameer Kolothum
2026-05-04 15:00   ` Eric Auger
2026-05-04 18:16   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 06/31] hw/arm/smmuv3-accel: Introduce CMDQV ops interface Shameer Kolothum
2026-05-04 15:19   ` Eric Auger
2026-05-04 18:28   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 07/31] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub Shameer Kolothum
2026-05-04 15:19   ` Eric Auger
2026-05-04 18:23   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 08/31] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle Shameer Kolothum
2026-05-04 15:33   ` Eric Auger
2026-05-05  7:47     ` Shameer Kolothum Thodi
2026-05-04 18:38   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 09/31] hw/arm/virt: Use stored SMMUv3 device list for IORT build Shameer Kolothum
2026-05-04 18:46   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 10/31] hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support Shameer Kolothum
2026-05-04 18:49   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 11/31] hw/arm/tegra241-cmdqv: Implement CMDQV init Shameer Kolothum
2026-04-15 10:55 ` [PATCH v4 12/31] hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus Shameer Kolothum
2026-05-04 18:57   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 13/31] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free Shameer Kolothum
2026-05-04 16:01   ` Eric Auger
2026-05-04 19:54   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 14/31] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region Shameer Kolothum
2026-05-05  0:09   ` Nicolin Chen
2026-05-05  7:26   ` Eric Auger
2026-05-05 10:28     ` Shameer Kolothum Thodi
2026-04-15 10:55 ` [PATCH v4 15/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads Shameer Kolothum
2026-05-05 10:12   ` Eric Auger
2026-05-05 13:27     ` Shameer Kolothum Thodi
2026-05-06 11:14       ` Eric Auger
2026-04-15 10:55 ` [PATCH v4 16/31] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes Shameer Kolothum
2026-05-05 10:42   ` Eric Auger
2026-05-05 13:49     ` Shameer Kolothum Thodi
2026-04-15 10:55 ` [PATCH v4 17/31] hw/arm/tegra241-cmdqv: mmap VINTF Page0 for CMDQV Shameer Kolothum
2026-04-15 10:55 ` [PATCH v4 18/31] system/physmem: Add address_space_is_ram() helper Shameer Kolothum
2026-05-05  0:24   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 19/31] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs on base register programming Shameer Kolothum
2026-05-05  0:40   ` Nicolin Chen
2026-05-05  9:59     ` Shameer Kolothum Thodi
2026-05-05 19:38       ` Nicolin Chen
2026-05-06  8:18         ` Shameer Kolothum Thodi
2026-05-06 18:18           ` Nicolin Chen
2026-05-05 13:25   ` Eric Auger
2026-05-05 14:26     ` Shameer Kolothum Thodi
2026-05-06 17:49       ` Nicolin Chen
2026-05-08 14:50       ` Eric Auger
2026-05-06 16:51   ` Eric Auger
2026-05-06 18:21     ` Nicolin Chen via
2026-05-06 18:21       ` Nicolin Chen via qemu development
2026-04-15 10:55 ` [PATCH v4 20/31] hw/arm/tegra241-cmdqv: Use mmap'd VINTF page0 as VCMDQ backing Shameer Kolothum
2026-05-05  0:50   ` Nicolin Chen
2026-05-05 15:13     ` Shameer Kolothum Thodi
2026-05-05 19:52       ` Nicolin Chen
2026-05-06 13:16         ` Shameer Kolothum Thodi
2026-05-06 18:34           ` Nicolin Chen
2026-05-06 20:13             ` Shameer Kolothum Thodi
2026-05-06 20:55               ` Nicolin Chen
2026-05-06 12:27   ` Eric Auger
2026-04-15 10:55 ` [PATCH v4 21/31] memory: Allow RAM device regions to skip IOMMU mapping Shameer Kolothum
2026-05-06 12:39   ` Eric Auger
2026-04-15 10:55 ` [PATCH v4 22/31] hw/arm/tegra241-cmdqv: Map VINTF page0 into guest MMIO space Shameer Kolothum
2026-05-06 12:44   ` Eric Auger
2026-05-06 14:24     ` Shameer Kolothum Thodi
2026-05-07 16:24       ` Eric Auger
2026-05-08  9:03         ` Shameer Kolothum Thodi
2026-05-08 14:35           ` Eric Auger
2026-05-08 14:37             ` Shameer Kolothum Thodi
2026-04-15 10:55 ` [PATCH v4 23/31] hw/arm/smmuv3-accel: Introduce common helper for veventq read Shameer Kolothum
2026-05-05  1:07   ` Nicolin Chen
2026-05-06 12:49   ` Eric Auger
2026-04-15 10:55 ` [PATCH v4 24/31] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors Shameer Kolothum
2026-05-05  1:13   ` Nicolin Chen
2026-05-07 16:40   ` Eric Auger
2026-05-08 10:52     ` Shameer Kolothum Thodi
2026-04-15 10:55 ` [PATCH v4 25/31] hw/arm/tegra241-cmdqv: Add reset handler Shameer Kolothum
2026-05-07 16:51   ` Eric Auger
2026-05-08 11:19     ` Shameer Kolothum Thodi
2026-05-08 14:39       ` Eric Auger
2026-05-07 17:03   ` Eric Auger
2026-04-15 10:55 ` [PATCH v4 26/31] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size Shameer Kolothum
2026-05-05  1:26   ` Nicolin Chen
2026-05-07 17:23   ` Eric Auger
2026-05-08 13:38     ` Shameer Kolothum Thodi
2026-05-08 14:41       ` Eric Auger
2026-04-15 10:55 ` [PATCH v4 27/31] hw/arm/smmuv3: Add per-device identifier property Shameer Kolothum
2026-05-05  1:30   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 28/31] hw/arm/smmuv3-accel: Introduce helper to query CMDQV type Shameer Kolothum
2026-05-05  1:32   ` Nicolin Chen
2026-04-15 10:55 ` [PATCH v4 29/31] hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT Shameer Kolothum
2026-05-07 17:32   ` Eric Auger
2026-04-15 10:55 ` [PATCH v4 30/31] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active Shameer Kolothum
2026-05-05  1:35   ` Nicolin Chen
2026-05-07 17:36   ` Eric Auger
2026-05-08 14:36     ` Shameer Kolothum Thodi
2026-04-15 10:55 ` [PATCH v4 31/31] hw/arm/smmuv3: Add cmdqv property for SMMUv3 device Shameer Kolothum
2026-05-07 17:28   ` Eric Auger
