* [PULL 00/61] target-arm queue
@ 2022-04-22 10:03 Peter Maydell
  2022-04-22 11:41 ` Richard Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Peter Maydell @ 2022-04-22 10:03 UTC (permalink / raw)
  To: qemu-devel

This pullreq is (1) my GICv4 patches (2) most of the first third of RTH's
cleanup patchset (3) one patch fixing an smmuv3 bug...

thanks
-- PMM

The following changes since commit a74782936dc6e979ce371dabda4b1c05624ea87f:

  Merge tag 'pull-migration-20220421a' of https://gitlab.com/dagrh/qemu into staging (2022-04-21 18:48:18 -0700)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20220422

for you to fetch changes up to 9792130613191c1e0c34109918c5e07b9f1429a5:

  hw/arm/smmuv3: Pass the actual perm to returned IOMMUTLBEntry in smmuv3_translate() (2022-04-22 10:19:15 +0100)

----------------------------------------------------------------
target-arm queue:
 * Implement GICv4 emulation
 * Some cleanup patches in target/arm
 * hw/arm/smmuv3: Pass the actual perm to returned IOMMUTLBEntry in smmuv3_translate()

----------------------------------------------------------------
Peter Maydell (41):
      hw/intc/arm_gicv3_its: Add missing blank line
      hw/intc/arm_gicv3: Sanity-check num-cpu property
      hw/intc/arm_gicv3: Insist that redist region capacity matches CPU count
      hw/intc/arm_gicv3: Report correct PIDR0 values for ID registers
      target/arm/cpu.c: ignore VIRQ and VFIQ if no EL2
      hw/intc/arm_gicv3_its: Factor out "is intid a valid LPI ID?"
      hw/intc/arm_gicv3_its: Implement GITS_BASER2 for GICv4
      hw/intc/arm_gicv3_its: Implement VMAPI and VMAPTI
      hw/intc/arm_gicv3_its: Implement VMAPP
      hw/intc/arm_gicv3_its: Distinguish success and error cases of CMD_CONTINUE
      hw/intc/arm_gicv3_its: Factor out "find ITE given devid, eventid"
      hw/intc/arm_gicv3_its: Factor out CTE lookup sequence
      hw/intc/arm_gicv3_its: Split out process_its_cmd() physical interrupt code
      hw/intc/arm_gicv3_its: Handle virtual interrupts in process_its_cmd()
      hw/intc/arm_gicv3: Keep pointers to every connected ITS
      hw/intc/arm_gicv3_its: Implement VMOVP
      hw/intc/arm_gicv3_its: Implement VSYNC
      hw/intc/arm_gicv3_its: Implement INV command properly
      hw/intc/arm_gicv3_its: Implement INV for virtual interrupts
      hw/intc/arm_gicv3_its: Implement VMOVI
      hw/intc/arm_gicv3_its: Implement VINVALL
      hw/intc/arm_gicv3: Implement GICv4's new redistributor frame
      hw/intc/arm_gicv3: Implement new GICv4 redistributor registers
      hw/intc/arm_gicv3_cpuif: Split "update vIRQ/vFIQ" from gicv3_cpuif_virt_update()
      hw/intc/arm_gicv3_cpuif: Support vLPIs
      hw/intc/arm_gicv3_cpuif: Don't recalculate maintenance irq unnecessarily
      hw/intc/arm_gicv3_redist: Factor out "update hpplpi for one LPI" logic
      hw/intc/arm_gicv3_redist: Factor out "update hpplpi for all LPIs" logic
      hw/intc/arm_gicv3_redist: Recalculate hppvlpi on VPENDBASER writes
      hw/intc/arm_gicv3_redist: Factor out "update bit in pending table" code
      hw/intc/arm_gicv3_redist: Implement gicv3_redist_process_vlpi()
      hw/intc/arm_gicv3_redist: Implement gicv3_redist_vlpi_pending()
      hw/intc/arm_gicv3_redist: Use set_pending_table_bit() in mov handling
      hw/intc/arm_gicv3_redist: Implement gicv3_redist_mov_vlpi()
      hw/intc/arm_gicv3_redist: Implement gicv3_redist_vinvall()
      hw/intc/arm_gicv3_redist: Implement gicv3_redist_inv_vlpi()
      hw/intc/arm_gicv3: Update ID and feature registers for GICv4
      hw/intc/arm_gicv3: Allow 'revision' property to be set to 4
      hw/arm/virt: Use VIRT_GIC_VERSION_* enum values in create_gic()
      hw/arm/virt: Abstract out calculation of redistributor region capacity
      hw/arm/virt: Support TCG GICv4

Richard Henderson (19):
      target/arm: Update ISAR fields for ARMv8.8
      target/arm: Update SCR_EL3 bits to ARMv8.8
      target/arm: Update SCTLR bits to ARMv9.2
      target/arm: Change DisasContext.aarch64 to bool
      target/arm: Change CPUArchState.aarch64 to bool
      target/arm: Extend store_cpu_offset to take field size
      target/arm: Change DisasContext.thumb to bool
      target/arm: Change CPUArchState.thumb to bool
      target/arm: Remove fpexc32_access
      target/arm: Split out set_btype_raw
      target/arm: Split out gen_rebuild_hflags
      target/arm: Simplify GEN_SHIFT in translate.c
      target/arm: Simplify gen_sar
      target/arm: Simplify aa32 DISAS_WFI
      target/arm: Use tcg_constant in translate-m-nocp.c
      target/arm: Use tcg_constant in translate-neon.c
      target/arm: Use smin/smax for do_sat_addsub_32
      target/arm: Use tcg_constant in translate-vfp.c
      target/arm: Use tcg_constant_i32 in translate.h

Xiang Chen (1):
      hw/arm/smmuv3: Pass the actual perm to returned IOMMUTLBEntry in smmuv3_translate()

 docs/system/arm/virt.rst               |   5 +-
 hw/intc/gicv3_internal.h               | 231 ++++++++-
 include/hw/arm/virt.h                  |  19 +-
 include/hw/intc/arm_gicv3_common.h     |  13 +
 include/hw/intc/arm_gicv3_its_common.h |   1 +
 target/arm/cpu.h                       |  59 ++-
 target/arm/translate-a32.h             |  13 +-
 target/arm/translate.h                 |  17 +-
 hw/arm/smmuv3.c                        |   2 +-
 hw/arm/virt.c                          | 102 +++-
 hw/intc/arm_gicv3_common.c             |  54 +-
 hw/intc/arm_gicv3_cpuif.c              | 195 ++++++--
 hw/intc/arm_gicv3_dist.c               |   7 +-
 hw/intc/arm_gicv3_its.c                | 876 +++++++++++++++++++++++++++------
 hw/intc/arm_gicv3_its_kvm.c            |   2 +
 hw/intc/arm_gicv3_kvm.c                |   5 +
 hw/intc/arm_gicv3_redist.c             | 480 +++++++++++++++---
 linux-user/arm/cpu_loop.c              |   2 +-
 target/arm/cpu.c                       |  16 +-
 target/arm/helper-a64.c                |   4 +-
 target/arm/helper.c                    |  19 +-
 target/arm/hvf/hvf.c                   |   2 +-
 target/arm/m_helper.c                  |   6 +-
 target/arm/op_helper.c                 |  13 -
 target/arm/translate-a64.c             |  50 +-
 target/arm/translate-m-nocp.c          |  12 +-
 target/arm/translate-neon.c            |  21 +-
 target/arm/translate-sve.c             |   9 +-
 target/arm/translate-vfp.c             |  76 +--
 target/arm/translate.c                 | 101 ++--
 hw/intc/trace-events                   |  18 +-
 31 files changed, 1890 insertions(+), 540 deletions(-)



* Re: [PULL 00/61] target-arm queue
  2022-04-22 10:03 Peter Maydell
@ 2022-04-22 11:41 ` Richard Henderson
  2022-04-22 13:48   ` Peter Maydell
  0 siblings, 1 reply; 66+ messages in thread
From: Richard Henderson @ 2022-04-22 11:41 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel

On 4/22/22 03:03, Peter Maydell wrote:
> This pullreq is (1) my GICv4 patches (2) most of the first third of RTH's
> cleanup patchset (3) one patch fixing an smmuv3 bug...
> 
> thanks
> -- PMM
> 

Fails cross-arm64-system:

../hw/intc/arm_gicv3_its_kvm.c: In function ‘kvm_arm_its_realize’:
../hw/intc/arm_gicv3_its_kvm.c:109:5: error: implicit declaration of function ‘gicv3_add_its’ [-Werror=implicit-function-declaration]
   109 |     gicv3_add_its(s->gicv3, dev);
       |     ^~~~~~~~~~~~~
../hw/intc/arm_gicv3_its_kvm.c:109:5: error: nested extern declaration of ‘gicv3_add_its’ [-Werror=nested-externs]
cc1: all warnings being treated as errors

https://gitlab.com/qemu-project/qemu/-/jobs/2365050344

r~



* Re: [PULL 00/61] target-arm queue
  2022-04-22 11:41 ` Richard Henderson
@ 2022-04-22 13:48   ` Peter Maydell
  0 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2022-04-22 13:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Fri, 22 Apr 2022 at 12:41, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 4/22/22 03:03, Peter Maydell wrote:
> > This pullreq is (1) my GICv4 patches (2) most of the first third of RTH's
> > cleanup patchset (3) one patch fixing an smmuv3 bug...

> Fails cross-arm64-system:
>
> ../hw/intc/arm_gicv3_its_kvm.c: In function ‘kvm_arm_its_realize’:
> ../hw/intc/arm_gicv3_its_kvm.c:109:5: error: implicit declaration of function ‘gicv3_add_its’ [-Werror=implicit-function-declaration]
>    109 |     gicv3_add_its(s->gicv3, dev);
>        |     ^~~~~~~~~~~~~
> ../hw/intc/arm_gicv3_its_kvm.c:109:5: error: nested extern declaration of ‘gicv3_add_its’ [-Werror=nested-externs]
> cc1: all warnings being treated as errors

Oops. Just sent a v2 that fixes that.

-- PMM



* [PULL 00/61] target-arm queue
@ 2026-06-16 19:05 Peter Maydell
  2026-06-16 19:05 ` [PULL 01/61] hw/arm/smmuv3: Update ATC invalidation check Peter Maydell
                   ` (61 more replies)
  0 siblings, 62 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

Hi; here's this week's arm pullreq; lots of smmuv3 related work here,
plus some initial work towards emulating FEAT_SVE2p2.

thanks
-- PMM

The following changes since commit 2f28d34ea0aead9830478cd1d3d0dd9d9191d82e:

  Merge tag 'pull-tcg-20260612' of https://gitlab.com/rth7680/qemu into staging (2026-06-13 14:02:34 -0400)

are available in the Git repository at:

  https://gitlab.com/pm215/qemu.git tags/pull-target-arm-20260616

for you to fetch changes up to 8de1ba58af636d1844be43b9ea6e34ced302a057:

  target/arm: Implement floating-point log and convert to integer (zeroing) (2026-06-16 16:54:25 +0100)

----------------------------------------------------------------
target-arm queue:
 * Implementation of various insns preparatory to FEAT_SVE2p2
 * hw/arm/smmuv3: Make smmuv3 ATS, RIL, SSIDSIZE, and OAS 'auto' properties work
 * hw/pci/pci: Enforce pci_setup_iommu_per_bus() is called only once per bus
 * hw/arm/virt: Introduce Tegra241 CMDQV support for accelerated SMMUv3
 * target/arm: honour CCR.BFHFNMIGN for probed data BusFaults
 * hw/arm/bcm2838: Route I2C interrupts to GIC

----------------------------------------------------------------
Eric Auger (1):
      hw/pci/pci: Enforce pci_setup_iommu_per_bus() is called only once per bus

Kyle Fox (1):
      target/arm: honour CCR.BFHFNMIGN for probed data BusFaults

Nathan Chen (9):
      hw/arm/smmuv3: Update ATC invalidation check
      hw/arm/smmuv3: Improve accel SMMUv3 usage documentation
      hw/arm/smmuv3-accel: Add helper for resolving auto parameters
      hw/arm/smmuv3-accel: Implement "auto" value for "ats"
      hw/arm/smmuv3-accel: Implement "auto" value for "ril"
      hw/arm/smmuv3-accel: Implement "auto" value for "ssidsize"
      hw/arm/smmuv3-accel: Implement "auto" value for "oas"
      hw/arm/smmuv3: Set default ats, ril, ssidsize, oas to auto
      qemu-options.hx: Support "auto" for accel SMMUv3 properties

Nicholas Righi (1):
      hw/arm/bcm2838: Route I2C interrupts to GIC

Nicolin Chen (15):
      backends/iommufd: Update iommufd_backend_get_device_info
      backends/iommufd: Update iommufd_backend_alloc_viommu to allow user ptr
      backends/iommufd: Introduce iommufd_backend_alloc_hw_queue
      backends/iommufd: Introduce iommufd_backend_viommu_mmap
      hw/arm/tegra241-cmdqv: Implement CMDQV init
      hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free
      hw/arm/tegra241-cmdqv: mmap host VINTF Page0 for CMDQV
      hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region
      hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads
      hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes
      hw/arm/tegra241-cmdqv: Allocate HW VCMDQs once configured
      hw/arm/tegra241-cmdqv: Use mmap'd host VINTF page0 for virtual VINTF page0
      hw/arm/tegra241-cmdqv: Initialize register state on reset
      hw/arm/tegra241-cmdqv: Limit queue size based on backend page size
      hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT

Richard Henderson (18):
      target/arm: Add feature predicates for SVE2.2 and SME2.2
      target/arm: Rename sve unary predicated patterns
      target/arm: Enable zeroing in DO_ZPZ macros in sve_helper.c
      target/arm: Expand DO_ZPZ in translate-sve.c
      target/arm: Implement SVE integer unary operations (predicated, zeroing)
      target/arm: Implement SVE bitwise unary operations (predicated, zeroing)
      target/arm: Implement SVE reverse within elements (zeroing)
      target/arm: Implement SVE reverse doublewords (zeroing)
      target/arm: Implement SVE2 integer unary operations (predicated, zeroing)
      target/arm: Add data argument to do_frint_mode
      target/arm: Implement Floating-point round to integral value (predicated, zeroing)
      target/arm: Implement Floating-point convert (predicated, zeroing)
      target/arm: Implement Floating-point square root (predicated, zeroing)
      target/arm: Implement SCVTF, UCVTF (predicated, zeroing)
      target/arm: Implement FRINT{32,64}{X,Z}
      target/arm: Enable zeroing in DO_FCVT{N, L}T macros in sve_helper.c
      target/arm: Implement SVE floating-point convert (top, predicated, zeroing)
      target/arm: Implement floating-point log and convert to integer (zeroing)

Shameer Kolothum (16):
      system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq
      hw/arm/smmuv3-accel: Introduce CMDQV ops interface
      hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub
      hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle
      hw/arm/virt: Use stored SMMUv3 device list for IORT build
      hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support
      hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus
      hw/arm/tegra241-cmdqv: Route allocated VCMDQ Page0 accesses to the mmap'd host VINTF page0
      memory: Allow RAM device regions to skip IOMMU mapping
      hw/arm/smmuv3-accel: Introduce common helper for veventq read
      hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors
      hw/arm/smmuv3: Add per-device identifier property
      hw/arm/smmuv3-accel: Introduce helper to query CMDQV type
      hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active
      hw/arm/tegra241-cmdqv: Document the CMDQV design and lifecycle
      hw/arm/smmuv3: Add cmdqv property for SMMUv3 device

 backends/iommufd.c                   |   64 ++
 backends/trace-events                |    4 +-
 hw/arm/Kconfig                       |    5 +
 hw/arm/bcm2835_peripherals.c         |    9 +
 hw/arm/bcm2838.c                     |    4 +
 hw/arm/meson.build                   |    2 +
 hw/arm/smmu-common.c                 |    4 +-
 hw/arm/smmuv3-accel-stubs.c          |   12 +
 hw/arm/smmuv3-accel.c                |  277 +++++++--
 hw/arm/smmuv3-accel.h                |   50 ++
 hw/arm/smmuv3.c                      |   91 +--
 hw/arm/tegra241-cmdqv-stubs.c        |   16 +
 hw/arm/tegra241-cmdqv.c              | 1119 ++++++++++++++++++++++++++++++++++
 hw/arm/tegra241-cmdqv.h              |  384 ++++++++++++
 hw/arm/trace-events                  |   11 +
 hw/arm/virt-acpi-build.c             |  127 ++--
 hw/arm/virt.c                        |   37 ++
 hw/core/machine.c                    |    5 +
 hw/pci/pci.c                         |    9 +-
 hw/vfio/iommufd.c                    |    4 +-
 hw/vfio/listener.c                   |    6 +
 hw/vfio/trace-events                 |    1 +
 include/hw/arm/bcm2835_peripherals.h |    2 +
 include/hw/arm/bcm2838_peripherals.h |    1 +
 include/hw/arm/smmuv3.h              |    6 +
 include/hw/arm/virt.h                |    1 +
 include/hw/pci/pci.h                 |   16 +-
 include/system/iommufd.h             |   17 +-
 include/system/memory.h              |   21 +
 qemu-options.hx                      |   33 +-
 system/memory.c                      |   10 +
 target/arm/cpu-features.h            |   17 +-
 target/arm/tcg/helper-sve-defs.h     |    9 +
 target/arm/tcg/m_helper.c            |   16 +-
 target/arm/tcg/sve.decode            |  261 +++++---
 target/arm/tcg/sve_helper.c          |   30 +-
 target/arm/tcg/tlb_helper.c          |   24 +
 target/arm/tcg/translate-sve.c       |  471 ++++++++++----
 38 files changed, 2861 insertions(+), 315 deletions(-)
 create mode 100644 hw/arm/tegra241-cmdqv-stubs.c
 create mode 100644 hw/arm/tegra241-cmdqv.c
 create mode 100644 hw/arm/tegra241-cmdqv.h



* [PULL 01/61] hw/arm/smmuv3: Update ATC invalidation check
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 02/61] hw/arm/smmuv3: Improve accel SMMUv3 usage documentation Peter Maydell
                   ` (60 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nathan Chen <nathanc@nvidia.com>

Use smmuv3_ats_enabled() to determine whether ATS is enabled for the
guest when handling an ATC invalidation command. Without this, setting
the ATS property value to 'auto' results in ATS being detected as
enabled in the ATC invalidation check.

Fixes: f7f5013a55a3 ("hw/arm/smmuv3-accel: Add support for ATS")
Reported-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260608174900.2227340-2-nathanc@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 5c2855c377..0c65dc8c91 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1528,7 +1528,7 @@ static int smmuv3_cmdq_consume(SMMUv3State *s, Error **errp)
         {
             SMMUDevice *sdev = smmu_find_sdev(bs, CMD_SID(&cmd));
 
-            if (!sdev || !s->ats) {
+            if (!sdev || !smmuv3_ats_enabled(s)) {
                 trace_smmuv3_unhandled_cmd(type);
                 break;
             }
-- 
2.43.0
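
As background for the one-line change above, here is a minimal sketch of why a plain truthiness test is fragile once a property can be on, off, or auto. Everything in it is hypothetical: the `TriState` enum, its value order, and both helpers are illustrations only. They are not QEMU's `OnOffAuto` type or the real `smmuv3_ats_enabled()`, whose body is not shown in this patch.

```c
#include <stdbool.h>

/*
 * Hypothetical tri-state property. The value order is chosen for
 * illustration and is not claimed to match QEMU's OnOffAuto.
 */
typedef enum { TRI_OFF, TRI_ON, TRI_AUTO } TriState;

/* Naive check: only asks whether the raw enum value is non-zero. */
static bool naive_enabled(TriState v)
{
    return v != 0;  /* TRI_AUTO is non-zero, so it always counts as enabled */
}

/*
 * Explicit check: 'auto' defers to what the host reports, which this
 * sketch receives as a plain bool.
 */
static bool tri_enabled(TriState v, bool host_supports)
{
    switch (v) {
    case TRI_ON:
        return true;
    case TRI_OFF:
        return false;
    default:
        return host_supports;
    }
}
```

With these definitions, `naive_enabled(TRI_AUTO)` is true even when the host lacks the feature, while `tri_enabled(TRI_AUTO, false)` is false. Routing every caller through one helper keeps that decision in a single place.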




* [PULL 02/61] hw/arm/smmuv3: Improve accel SMMUv3 usage documentation
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
  2026-06-16 19:05 ` [PULL 01/61] hw/arm/smmuv3: Update ATC invalidation check Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 03/61] hw/arm/smmuv3-accel: Add helper for resolving auto parameters Peter Maydell
                   ` (59 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nathan Chen <nathanc@nvidia.com>

Add a statement to clarify that the host SMMUv3 must support HW-accelerated
vfio-pci device assignment when setting accel=on.

Reported-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Message-id: 20260608174900.2227340-3-nathanc@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 0c65dc8c91..6b5035e11f 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -2165,7 +2165,9 @@ static void smmuv3_class_init(ObjectClass *klass, const void *data)
 
     object_class_property_set_description(klass, "accel",
         "Enable SMMUv3 accelerator support. Allows host SMMUv3 to be "
-        "configured in nested mode for vfio-pci dev assignment");
+        "configured in nested mode for vfio-pci dev assignment. Please "
+        "ensure the host SMMUv3 supports nested translation before "
+        "enabling.");
     object_class_property_set_description(klass, "ril",
         "Disable range invalidation support (for accel=on). ril=auto "
         "is not supported.");
-- 
2.43.0




* [PULL 03/61] hw/arm/smmuv3-accel: Add helper for resolving auto parameters
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
  2026-06-16 19:05 ` [PULL 01/61] hw/arm/smmuv3: Update ATC invalidation check Peter Maydell
  2026-06-16 19:05 ` [PULL 02/61] hw/arm/smmuv3: Improve accel SMMUv3 usage documentation Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 04/61] hw/arm/smmuv3-accel: Implement "auto" value for "ats" Peter Maydell
                   ` (58 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nathan Chen <nathanc@nvidia.com>

Introduce smmuv3_accel_auto_finalise() to resolve properties that are
set to 'auto' for accelerated SMMUv3. This helper function allows
properties such as ats, ril, ssidsize, and oas support to be resolved
from host IOMMU capabilities via IOMMU_GET_HW_INFO.

The later commits in this series set the auto_mode flag to true when
an accel SMMUv3 property value is explicitly set to 'auto', or if the
property value is not set and defaults to auto mode.

Setting these property values to 'auto' requires at least one
cold-plugged device to retrieve and finalise these properties. If the
auto_mode flag is true, register a machine_init_done notifier to
verify this requirement and fail boot if it is not met.

Devices hot-plugged into an accel SMMUv3-associated bus will re-use
the resolved host values from the initial cold-plug.

Subsequent patches will make use of this helper to resolve 'auto' to
what is reported by host IOMMU capabilities.

Suggested-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260608174900.2227340-4-nathanc@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel.c   | 41 +++++++++++++++++++++++++++++++++++++++++
 hw/arm/smmuv3-accel.h   |  2 ++
 include/hw/arm/smmuv3.h |  3 +++
 3 files changed, 46 insertions(+)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 862be814a0..fa0a97cece 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -18,6 +18,7 @@
 
 #include "smmuv3-internal.h"
 #include "smmuv3-accel.h"
+#include "system/system.h"
 
 /*
  * The root region aliases the global system memory, and shared_as_sysmem
@@ -35,11 +36,33 @@ static int smmuv3_oas_bits(uint32_t oas)
     return map[oas];
 }
 
+static void smmuv3_accel_auto_finalise(SMMUv3State *s,
+                                       struct iommu_hw_info_arm_smmuv3 *info)
+{
+    SMMUv3AccelState *accel = s->s_accel;
+
+    /*
+     * Return if 'auto' was not set for any accel SMMUv3 property, or
+     * if property values were already resolved from a previous call
+     * to this function (e.g. if this function was called again after
+     * VM boot during device hot plug). We do not accept new property
+     * values in this case where auto_finalised == true, and we re-use
+     * the values determined from the initial cold plug.
+     */
+    if (!accel->auto_mode || accel->auto_finalised) {
+        return;
+    }
+
+    accel->auto_finalised = true;
+}
+
 static bool
 smmuv3_accel_check_hw_compatible(SMMUv3State *s,
                                  struct iommu_hw_info_arm_smmuv3 *info,
                                  Error **errp)
 {
+    smmuv3_accel_auto_finalise(s, info);
+
     /* QEMU SMMUv3 supports both linear and 2-level stream tables */
     if (FIELD_EX32(info->idr[0], IDR0, STLEVEL) !=
                 FIELD_EX32(s->idr[0], IDR0, STLEVEL)) {
@@ -918,6 +941,18 @@ static void smmuv3_accel_as_init(SMMUv3State *s)
     address_space_init(shared_as_sysmem, &root, "smmuv3-accel-as-sysmem");
 }
 
+static void smmuv3_accel_machine_done(Notifier *notifier, void *data)
+{
+    SMMUv3State *s = container_of(notifier, SMMUv3State, machine_done);
+    SMMUv3AccelState *accel = s->s_accel;
+
+    if (accel->auto_mode && !accel->auto_finalised) {
+        error_report("arm-smmuv3 accel=on with 'auto' properties requires "
+                     "at least one cold-plugged VFIO device");
+        exit(1);
+    }
+}
+
 bool smmuv3_accel_init(SMMUv3State *s, Error **errp)
 {
     SMMUState *bs = ARM_SMMU(s);
@@ -925,5 +960,11 @@ bool smmuv3_accel_init(SMMUv3State *s, Error **errp)
     s->s_accel = g_new0(SMMUv3AccelState, 1);
     bs->iommu_ops = &smmuv3_accel_ops;
     smmuv3_accel_as_init(s);
+
+    if (s->s_accel->auto_mode) {
+        s->machine_done.notify = smmuv3_accel_machine_done;
+        qemu_add_machine_init_done_notifier(&s->machine_done);
+    }
+
     return true;
 }
diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index 407940616c..87fecb5c68 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -25,6 +25,8 @@ typedef struct SMMUv3AccelState {
     uint32_t bypass_hwpt_id;
     uint32_t abort_hwpt_id;
     QLIST_HEAD(, SMMUv3AccelDevice) device_list;
+    bool auto_mode;
+    bool auto_finalised;
 } SMMUv3AccelState;
 
 typedef struct SMMUS1Hwpt {
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index 82f18eb090..85be3d7467 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -22,6 +22,7 @@
 #include "hw/arm/smmu-common.h"
 #include "qom/object.h"
 #include "qapi/qapi-types-misc-arm.h"
+#include "qemu/notify.h"
 
 #define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-memory-region"
 
@@ -74,6 +75,8 @@ struct SMMUv3State {
     OnOffAuto ats;
     OasMode oas;
     SsidSizeMode ssidsize;
+
+    Notifier machine_done;
 };
 
 typedef enum {
-- 
2.43.0




* [PULL 04/61] hw/arm/smmuv3-accel: Implement "auto" value for "ats"
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (2 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 03/61] hw/arm/smmuv3-accel: Add helper for resolving auto parameters Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 05/61] hw/arm/smmuv3-accel: Implement "auto" value for "ril" Peter Maydell
                   ` (57 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nathan Chen <nathanc@nvidia.com>

Allow the accelerated SMMUv3 "ats" (Address Translation Services)
property to be derived from host IOMMU capabilities. When ats=auto,
query the host with IOMMU_GET_HW_INFO and copy the ATS field of the
host's IDR0 into the guest-visible IDR0.
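
For readers unfamiliar with the FIELD_EX32()/FIELD_DP32() idiom used in
the diff below, here is a minimal standalone sketch of "extract a field
from the host register, deposit it into the guest register". The helper
names and the field position (a 1-bit ATS field at bit 10 of IDR0) are
assumptions made for illustration; QEMU's real definitions live in its
own headers and should be treated as authoritative.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: ATS is a 1-bit field at bit 10 of IDR0. */
#define DEMO_IDR0_ATS_SHIFT  10
#define DEMO_IDR0_ATS_LENGTH 1

/* Return `length` bits of `reg`, starting at bit `shift`. */
static inline uint32_t demo_extract32(uint32_t reg, unsigned shift,
                                      unsigned length)
{
    assert(shift < 32 && length > 0 && length <= 32 - shift);
    return (reg >> shift) & (~0U >> (32 - length));
}

/* Return `reg` with the field at [shift, shift + length) set to `val`. */
static inline uint32_t demo_deposit32(uint32_t reg, unsigned shift,
                                      unsigned length, uint32_t val)
{
    assert(shift < 32 && length > 0 && length <= 32 - shift);
    uint32_t mask = (~0U >> (32 - length)) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Copy the host's ATS capability bit into the guest-visible IDR0. */
static inline uint32_t demo_inherit_ats(uint32_t guest_idr0,
                                        uint32_t host_idr0)
{
    uint32_t host_ats = demo_extract32(host_idr0, DEMO_IDR0_ATS_SHIFT,
                                       DEMO_IDR0_ATS_LENGTH);
    return demo_deposit32(guest_idr0, DEMO_IDR0_ATS_SHIFT,
                          DEMO_IDR0_ATS_LENGTH, host_ats);
}
```

Only the ATS bit of the guest value changes; every other bit of the
guest IDR0 is preserved, which is why the same pattern can be repeated
per property without the properties interfering with each other.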

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260608174900.2227340-5-nathanc@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel.c |  9 +++++++++
 hw/arm/smmuv3.c       | 11 ++++-------
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index fa0a97cece..4c2acbbedf 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -53,6 +53,11 @@ static void smmuv3_accel_auto_finalise(SMMUv3State *s,
         return;
     }
 
+    if (s->ats == ON_OFF_AUTO_AUTO) {
+        s->idr[0] = FIELD_DP32(s->idr[0], IDR0, ATS,
+                               FIELD_EX32(info->idr[0], IDR0, ATS));
+    }
+
     accel->auto_finalised = true;
 }
 
@@ -961,6 +966,10 @@ bool smmuv3_accel_init(SMMUv3State *s, Error **errp)
     bs->iommu_ops = &smmuv3_accel_ops;
     smmuv3_accel_as_init(s);
 
+    if (s->ats == ON_OFF_AUTO_AUTO) {
+        s->s_accel->auto_mode = true;
+    }
+
     if (s->s_accel->auto_mode) {
         s->machine_done.notify = smmuv3_accel_machine_done;
         qemu_add_machine_init_done_notifier(&s->machine_done);
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 6b5035e11f..436155d883 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1965,10 +1965,6 @@ static void smmu_reset_exit(Object *obj, ResetType type)
 
 static bool smmu_validate_property(SMMUv3State *s, Error **errp)
 {
-    if (s->ats == ON_OFF_AUTO_AUTO) {
-        error_setg(errp, "ats auto mode is not supported");
-        return false;
-    }
     if (s->ril == ON_OFF_AUTO_AUTO) {
         error_setg(errp, "ril auto mode is not supported");
         return false;
@@ -2172,9 +2168,10 @@ static void smmuv3_class_init(ObjectClass *klass, const void *data)
         "Disable range invalidation support (for accel=on). ril=auto "
         "is not supported.");
     object_class_property_set_description(klass, "ats",
-        "Enable/disable ATS support (for accel=on). Please ensure host "
-        "platform has ATS support before enabling this. ats=auto is not "
-        "supported.");
+        "Enable/disable ATS support (for accel=on). "
+        "Valid values are on, off, and auto. Defaults to off. "
+        "Please ensure host platform supports ATS before setting it "
+        "to on.");
     object_class_property_set_description(klass, "oas",
         "Specify Output Address Size (for accel=on). Supported values "
         "are 44 or 48 bits. Defaults to 44 bits. oas=auto is not "
-- 
2.43.0




* [PULL 05/61] hw/arm/smmuv3-accel: Implement "auto" value for "ril"
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (3 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 04/61] hw/arm/smmuv3-accel: Implement "auto" value for "ats" Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 06/61] hw/arm/smmuv3-accel: Implement "auto" value for "ssidsize" Peter Maydell
                   ` (56 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nathan Chen <nathanc@nvidia.com>

Allow the accelerated SMMUv3 "ril" (Range Invalidation) property to be
derived from host IOMMU capabilities. When ril=auto, query the host
with IOMMU_GET_HW_INFO and copy the RIL field of the host's IDR3 into
the guest-visible IDR3.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260608174900.2227340-6-nathanc@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel.c |  8 +++++++-
 hw/arm/smmuv3.c       | 10 ++++------
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 4c2acbbedf..25699a1317 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -58,6 +58,11 @@ static void smmuv3_accel_auto_finalise(SMMUv3State *s,
                                FIELD_EX32(info->idr[0], IDR0, ATS));
     }
 
+    if (s->ril == ON_OFF_AUTO_AUTO) {
+        s->idr[3] = FIELD_DP32(s->idr[3], IDR3, RIL,
+                               FIELD_EX32(info->idr[3], IDR3, RIL));
+    }
+
     accel->auto_finalised = true;
 }
 
@@ -966,7 +971,8 @@ bool smmuv3_accel_init(SMMUv3State *s, Error **errp)
     bs->iommu_ops = &smmuv3_accel_ops;
     smmuv3_accel_as_init(s);
 
-    if (s->ats == ON_OFF_AUTO_AUTO) {
+    if (s->ats == ON_OFF_AUTO_AUTO ||
+        s->ril == ON_OFF_AUTO_AUTO) {
         s->s_accel->auto_mode = true;
     }
 
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 436155d883..c183964b2e 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1965,10 +1965,6 @@ static void smmu_reset_exit(Object *obj, ResetType type)
 
 static bool smmu_validate_property(SMMUv3State *s, Error **errp)
 {
-    if (s->ril == ON_OFF_AUTO_AUTO) {
-        error_setg(errp, "ril auto mode is not supported");
-        return false;
-    }
     if (s->ssidsize == SSID_SIZE_MODE_AUTO) {
         error_setg(errp, "ssidsize auto mode is not supported");
         return false;
@@ -2165,8 +2161,10 @@ static void smmuv3_class_init(ObjectClass *klass, const void *data)
         "ensure the host SMMUv3 supports nested translation before "
         "enabling.");
     object_class_property_set_description(klass, "ril",
-        "Disable range invalidation support (for accel=on). ril=auto "
-        "is not supported.");
+        "Enable/disable range invalidation support (for accel=on). "
+        "Valid values are on, off, and auto. Defaults to on. "
+        "Any attempt to turn it 'on' while the host does not support "
+        "it would fail.");
     object_class_property_set_description(klass, "ats",
         "Enable/disable ATS support (for accel=on). "
         "Valid values are on, off, and auto. Defaults to off. "
-- 
2.43.0




* [PULL 06/61] hw/arm/smmuv3-accel: Implement "auto" value for "ssidsize"
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (4 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 05/61] hw/arm/smmuv3-accel: Implement "auto" value for "ril" Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 07/61] hw/arm/smmuv3-accel: Implement "auto" value for "oas" Peter Maydell
                   ` (55 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nathan Chen <nathanc@nvidia.com>

Allow the accelerated SMMUv3 "ssidsize" property to be derived from
host IOMMU capabilities. When ssidsize=auto, query the host with
IOMMU_GET_HW_INFO and copy the SSIDSIZE field of the host's IDR1 into
the guest-visible IDR1. When the resolved SSID size is non-zero, PASID
capability is advertised to the vIOMMU and accelerated use cases such
as Shared Virtual Addressing (SVA) are supported.
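
The PASID decision described above can be sketched in isolation as
follows. This is an illustration only: the enum is a hypothetical
stand-in for the real property type (it assumes "auto" sorts below the
"0 bits" value, as the comparison in the diff implies), and the
SSIDSIZE field position (5 bits starting at bit 6 of IDR1) is an
assumption to be checked against the SMMUv3 register definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the property enum; "auto" sorts first. */
typedef enum {
    DEMO_SSID_AUTO = 0,
    DEMO_SSID_BITS_0 = 1,   /* SubstreamIDs disabled */
    DEMO_SSID_BITS_1 = 2,   /* larger sizes would follow in order */
} DemoSsidSetting;

/* Assumed for illustration: SSIDSIZE occupies bits [10:6] of IDR1. */
#define DEMO_IDR1_SSIDSIZE_SHIFT 6
#define DEMO_IDR1_SSIDSIZE_MASK  0x1fu

static inline uint32_t demo_idr1_ssidsize(uint32_t idr1)
{
    return (idr1 >> DEMO_IDR1_SSIDSIZE_SHIFT) & DEMO_IDR1_SSIDSIZE_MASK;
}

/*
 * PASID is supported when an explicit size above 0 bits was requested,
 * or when "auto" resolved to a non-zero SSIDSIZE in the guest IDR1.
 */
static inline bool demo_pasid_supported(DemoSsidSetting setting,
                                        uint32_t resolved_idr1)
{
    if (setting == DEMO_SSID_AUTO) {
        return demo_idr1_ssidsize(resolved_idr1) != 0;
    }
    return setting > DEMO_SSID_BITS_0;
}
```

Note that with an explicit setting the register contents are not
consulted at all; only the "auto" case depends on what was resolved
from the host.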

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260608174900.2227340-7-nathanc@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel.c | 17 +++++++++++++++--
 hw/arm/smmuv3.c       | 22 ++++++++++++----------
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 25699a1317..3f141991eb 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -63,6 +63,11 @@ static void smmuv3_accel_auto_finalise(SMMUv3State *s,
                                FIELD_EX32(info->idr[3], IDR3, RIL));
     }
 
+    if (s->ssidsize == SSID_SIZE_MODE_AUTO) {
+        s->idr[1] = FIELD_DP32(s->idr[1], IDR1, SSIDSIZE,
+                               FIELD_EX32(info->idr[1], IDR1, SSIDSIZE));
+    }
+
     accel->auto_finalised = true;
 }
 
@@ -823,6 +828,13 @@ static AddressSpace *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
     }
 }
 
+static inline bool smmuv3_pasid_supported(SMMUv3State *s)
+{
+    return s->ssidsize > SSID_SIZE_MODE_0 ||
+           (s->ssidsize == SSID_SIZE_MODE_AUTO &&
+            FIELD_EX32(s->idr[1], IDR1, SSIDSIZE));
+}
+
 static uint64_t smmuv3_accel_get_viommu_flags(void *opaque)
 {
     /*
@@ -835,7 +847,7 @@ static uint64_t smmuv3_accel_get_viommu_flags(void *opaque)
     SMMUState *bs = opaque;
     SMMUv3State *s = ARM_SMMUV3(bs);
 
-    if (s->ssidsize > SSID_SIZE_MODE_0) {
+    if (smmuv3_pasid_supported(s)) {
         flags |= VIOMMU_FLAG_PASID_SUPPORTED;
     }
     return flags;
@@ -972,7 +984,8 @@ bool smmuv3_accel_init(SMMUv3State *s, Error **errp)
     smmuv3_accel_as_init(s);
 
     if (s->ats == ON_OFF_AUTO_AUTO ||
-        s->ril == ON_OFF_AUTO_AUTO) {
+        s->ril == ON_OFF_AUTO_AUTO ||
+        s->ssidsize == SSID_SIZE_MODE_AUTO) {
         s->s_accel->auto_mode = true;
     }
 
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index c183964b2e..d59931d1a7 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -626,7 +626,10 @@ static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg,
     }
 
     /* Multiple context descriptors require SubstreamID support */
-    if (s->ssidsize == SSID_SIZE_MODE_0 && STE_S1CDMAX(ste) != 0) {
+    if ((s->ssidsize == SSID_SIZE_MODE_0 ||
+         (s->ssidsize == SSID_SIZE_MODE_AUTO &&
+          !FIELD_EX32(s->idr[1], IDR1, SSIDSIZE))) &&
+        STE_S1CDMAX(ste) != 0) {
         qemu_log_mask(LOG_UNIMP,
                 "SMMUv3: multiple S1 context descriptors require SubstreamID support. "
                 "Configure ssidsize > 0 (requires accel=on)\n");
@@ -1965,10 +1968,6 @@ static void smmu_reset_exit(Object *obj, ResetType type)
 
 static bool smmu_validate_property(SMMUv3State *s, Error **errp)
 {
-    if (s->ssidsize == SSID_SIZE_MODE_AUTO) {
-        error_setg(errp, "ssidsize auto mode is not supported");
-        return false;
-    }
     if (s->oas != OAS_MODE_44 && s->oas != OAS_MODE_48) {
         error_setg(errp, "QEMU SMMUv3 model only implements 44 and 48 bit"
                    "OAS; other OasMode values are not supported");
@@ -1989,7 +1988,8 @@ static bool smmu_validate_property(SMMUv3State *s, Error **errp)
             return false;
         }
         if (s->ssidsize > SSID_SIZE_MODE_0) {
-            error_setg(errp, "ssidsize can only be set if accel=on");
+            error_setg(errp, "ssidsize can only be greater than 0 "
+                       "bits if accel=on");
             return false;
         }
         return true;
@@ -2175,11 +2175,13 @@ static void smmuv3_class_init(ObjectClass *klass, const void *data)
         "are 44 or 48 bits. Defaults to 44 bits. oas=auto is not "
         "supported.");
     object_class_property_set_description(klass, "ssidsize",
-        "Number of bits used to represent SubstreamIDs (SSIDs). "
+        "Set number of bits used to represent SubstreamIDs (SSIDs). "
+        "Valid values are 0-20 and auto. Defaults to 0. "
         "A value of N allows SSIDs in the range [0 .. 2^N - 1]. "
-        "Valid range is 0-20, where 0 disables SubstreamID support. "
-        "Defaults to 0. A value greater than 0 is required to enable "
-        "PASID support. ssidsize=auto is not supported.");
+        "A value of 0 disables SubstreamID support. A value greater "
+        "than 0 is required to enable PASID support. "
+        "Please ensure the value does not exceed the maximum "
+        "SubstreamID size supported by the host platform.");
 }
 
 static int smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
-- 
2.43.0




* [PULL 07/61] hw/arm/smmuv3-accel: Implement "auto" value for "oas"
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (5 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 06/61] hw/arm/smmuv3-accel: Implement "auto" value for "ssidsize" Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 08/61] hw/arm/smmuv3: Set default ats, ril, ssidsize, oas to auto Peter Maydell
                   ` (54 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nathan Chen <nathanc@nvidia.com>

Allow accelerated SMMUv3 OAS property to be derived from host IOMMU
capabilities. Derive host values using IOMMU_GET_HW_INFO, retrieving
OAS from IDR5.

This keeps the OAS value advertised by the virtual SMMU compatible with
the capabilities of the host SMMUv3, so that the intermediate physical
addresses (IPA) consumed by host SMMU for stage-2 translation do not
exceed the host's max supported IPA size.
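
As a rough illustration of the compatibility constraint described
above, the sketch below decodes an OAS field into an address width and
checks that the guest value does not exceed the host value. The
encoding table (0..6 mapping to 32, 36, 40, 42, 44, 48 and 52 bits) and
the field position (low three bits of IDR5) are taken from my reading
of the SMMUv3 architecture and should be verified against the
specification; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration: OAS occupies bits [2:0] of IDR5. */
#define DEMO_IDR5_OAS_MASK 0x7u

/* Decode an OAS field value into bits; 0 means "reserved encoding". */
static inline unsigned demo_oas_field_to_bits(uint32_t oas_field)
{
    static const unsigned bits[] = { 32, 36, 40, 42, 44, 48, 52 };
    size_t count = sizeof(bits) / sizeof(bits[0]);

    return oas_field < count ? bits[oas_field] : 0;
}

/* True when the guest-advertised OAS fits within the host's OAS. */
static inline bool demo_guest_oas_fits_host(uint32_t guest_idr5,
                                            uint32_t host_idr5)
{
    unsigned guest = demo_oas_field_to_bits(guest_idr5 & DEMO_IDR5_OAS_MASK);
    unsigned host = demo_oas_field_to_bits(host_idr5 & DEMO_IDR5_OAS_MASK);

    return guest != 0 && host != 0 && guest <= host;
}
```

With oas=auto the guest field is simply copied from the host, so the
check holds trivially; it matters when the user picks 44 or 48 by hand.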

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260608174900.2227340-8-nathanc@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel.c |  8 +++++++-
 hw/arm/smmuv3.c       | 17 ++++++++++-------
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 3f141991eb..06dadf9cc7 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -68,6 +68,11 @@ static void smmuv3_accel_auto_finalise(SMMUv3State *s,
                                FIELD_EX32(info->idr[1], IDR1, SSIDSIZE));
     }
 
+    if (s->oas == OAS_MODE_AUTO) {
+        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, OAS,
+                               FIELD_EX32(info->idr[5], IDR5, OAS));
+    }
+
     accel->auto_finalised = true;
 }
 
@@ -985,7 +990,8 @@ bool smmuv3_accel_init(SMMUv3State *s, Error **errp)
 
     if (s->ats == ON_OFF_AUTO_AUTO ||
         s->ril == ON_OFF_AUTO_AUTO ||
-        s->ssidsize == SSID_SIZE_MODE_AUTO) {
+        s->ssidsize == SSID_SIZE_MODE_AUTO ||
+        s->oas == OAS_MODE_AUTO) {
         s->s_accel->auto_mode = true;
     }
 
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index d59931d1a7..89289ab6d2 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1968,9 +1968,11 @@ static void smmu_reset_exit(Object *obj, ResetType type)
 
 static bool smmu_validate_property(SMMUv3State *s, Error **errp)
 {
-    if (s->oas != OAS_MODE_44 && s->oas != OAS_MODE_48) {
-        error_setg(errp, "QEMU SMMUv3 model only implements 44 and 48 bit"
-                   "OAS; other OasMode values are not supported");
+    if (s->oas != OAS_MODE_44 && s->oas != OAS_MODE_48 &&
+        s->oas != OAS_MODE_AUTO) {
+        error_setg(errp, "QEMU SMMUv3 model only implements auto, "
+                   "44 bit, or 48 bit OAS. Other OasMode values are "
+                   "not supported.");
         return false;
     }
 
@@ -1984,7 +1986,7 @@ static bool smmu_validate_property(SMMUv3State *s, Error **errp)
             return false;
         }
         if (s->oas > OAS_MODE_44) {
-            error_setg(errp, "OAS must be 44 bits when accel=off");
+            error_setg(errp, "oas must be 44 bits when accel=off");
             return false;
         }
         if (s->ssidsize > SSID_SIZE_MODE_0) {
@@ -2171,9 +2173,10 @@ static void smmuv3_class_init(ObjectClass *klass, const void *data)
         "Please ensure host platform supports ATS before setting it "
         "to on.");
     object_class_property_set_description(klass, "oas",
-        "Specify Output Address Size (for accel=on). Supported values "
-        "are 44 or 48 bits. Defaults to 44 bits. oas=auto is not "
-        "supported.");
+        "Set Output Address Size in bits (for accel=on). "
+        "Valid values are 44, 48, and auto. Defaults to 44 bits."
+        "Please ensure the value does not exceed the maximum "
+        "Output Address Size supported by the host platform.");
     object_class_property_set_description(klass, "ssidsize",
         "Set number of bits used to represent SubstreamIDs (SSIDs). "
         "Valid values are 0-20 and auto. Defaults to 0. "
-- 
2.43.0




* [PULL 08/61] hw/arm/smmuv3: Set default ats, ril, ssidsize, oas to auto
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (6 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 07/61] hw/arm/smmuv3-accel: Implement "auto" value for "oas" Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 09/61] qemu-options.hx: Support "auto" for accel SMMUv3 properties Peter Maydell
                   ` (53 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nathan Chen <nathanc@nvidia.com>

Set the default value of ATS, RIL, SSIDSIZE, and OAS to auto, in order
to match the host IOMMU properties when accel=on.

If accel=off and these property values are set to auto, the default
property values defined in smmuv3_init_id_regs() for OAS and RIL will
remain unchanged, while SSIDSIZE and ATS values will remain initialized
at 0.
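
The resolution rule for the two on/off/auto properties can be sketched
as below. This is an illustration under assumptions, not the code in
this series: the type and function names are hypothetical, and the
"fallback" argument stands for either the host capability (accel=on)
or the built-in default listed above (accel=off: ril on, ats off).

```c
#include <stdbool.h>

/* Hypothetical stand-in for an on/off/auto property value. */
typedef enum {
    DEMO_SETTING_AUTO,
    DEMO_SETTING_ON,
    DEMO_SETTING_OFF,
} DemoTristate;

/*
 * Resolve a tristate to a concrete boolean. An explicit on/off always
 * wins; "auto" takes whatever the fallback says.
 */
static inline bool demo_resolve(DemoTristate requested, bool fallback)
{
    switch (requested) {
    case DEMO_SETTING_ON:
        return true;
    case DEMO_SETTING_OFF:
        return false;
    case DEMO_SETTING_AUTO:
    default:
        return fallback;
    }
}
```

For example, with accel=off, ril=auto would resolve through a fallback
of true and ats=auto through a fallback of false, matching the values
stated in the commit message.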

Add hw_compat_11_0 entries so that machine types up to 11.0 keep the
previous defaults (ats=off, ril=on, ssidsize=0, oas=44).

Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260608174900.2227340-9-nathanc@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3.c   | 23 +++++++++++++++--------
 hw/core/machine.c |  5 +++++
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 89289ab6d2..83fa6468fd 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -2129,12 +2129,19 @@ static const Property smmuv3_properties[] = {
     DEFINE_PROP_BOOL("accel", SMMUv3State, accel, false),
     /* GPA of MSI doorbell, for SMMUv3 accel use. */
     DEFINE_PROP_UINT64("msi-gpa", SMMUv3State, msi_gpa, 0),
+    /*
+     * AUTO values for accel=off will resolve to:
+     * ril: on
+     * ats: off
+     * oas: 44
+     * ssidsize: 0
+     */
     /* RIL can be turned off for accel cases */
-    DEFINE_PROP_ON_OFF_AUTO("ril", SMMUv3State, ril, ON_OFF_AUTO_ON),
-    DEFINE_PROP_ON_OFF_AUTO("ats", SMMUv3State, ats, ON_OFF_AUTO_OFF),
-    DEFINE_PROP_OAS_MODE("oas", SMMUv3State, oas, OAS_MODE_44),
+    DEFINE_PROP_ON_OFF_AUTO("ril", SMMUv3State, ril, ON_OFF_AUTO_AUTO),
+    DEFINE_PROP_ON_OFF_AUTO("ats", SMMUv3State, ats, ON_OFF_AUTO_AUTO),
+    DEFINE_PROP_OAS_MODE("oas", SMMUv3State, oas, OAS_MODE_AUTO),
     DEFINE_PROP_SSIDSIZE_MODE("ssidsize", SMMUv3State, ssidsize,
-                              SSID_SIZE_MODE_0),
+                              SSID_SIZE_MODE_AUTO),
 };
 
 static void smmuv3_instance_init(Object *obj)
@@ -2164,22 +2171,22 @@ static void smmuv3_class_init(ObjectClass *klass, const void *data)
         "enabling.");
     object_class_property_set_description(klass, "ril",
         "Enable/disable range invalidation support (for accel=on). "
-        "Valid values are on, off, and auto. Defaults to on. "
+        "Valid values are on, off, and auto. Defaults to auto. "
         "Any attempt to turn it 'on' while the host does not support "
         "it would fail.");
     object_class_property_set_description(klass, "ats",
         "Enable/disable ATS support (for accel=on). "
-        "Valid values are on, off, and auto. Defaults to off. "
+        "Valid values are on, off, and auto. Defaults to auto. "
         "Please ensure host platform supports ATS before setting it "
         "to on.");
     object_class_property_set_description(klass, "oas",
         "Set Output Address Size in bits (for accel=on). "
-        "Valid values are 44, 48, and auto. Defaults to 44 bits."
+        "Valid values are 44, 48, and auto. Defaults to auto. "
         "Please ensure the value does not exceed the maximum "
         "Output Address Size supported by the host platform.");
     object_class_property_set_description(klass, "ssidsize",
         "Set number of bits used to represent SubstreamIDs (SSIDs). "
-        "Valid values are 0-20 and auto. Defaults to 0. "
+        "Valid values are 0-20 and auto. Defaults to auto. "
         "A value of N allows SSIDs in the range [0 .. 2^N - 1]. "
         "A value of 0 disables SubstreamID support. A value greater "
         "than 0 is required to enable PASID support."
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 4d8b15d99e..9a10e45aab 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -38,11 +38,16 @@
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/acpi/generic_event_device.h"
 #include "qemu/audio.h"
+#include "hw/arm/smmuv3.h"
 
 GlobalProperty hw_compat_11_0[] = {
     { "chardev-vc", "encoding", "cp437" },
     { "tpm-crb", "cap-chunk", "off" },
     { "tpm-crb", "x-allow-chunk-migration", "off" },
+    { TYPE_ARM_SMMUV3, "ats", "off" },
+    { TYPE_ARM_SMMUV3, "ril", "on" },
+    { TYPE_ARM_SMMUV3, "ssidsize", "0" },
+    { TYPE_ARM_SMMUV3, "oas", "44" },
 };
 const size_t hw_compat_11_0_len = G_N_ELEMENTS(hw_compat_11_0);
 
-- 
2.43.0




* [PULL 09/61] qemu-options.hx: Support "auto" for accel SMMUv3 properties
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (7 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 08/61] hw/arm/smmuv3: Set default ats, ril, ssidsize, oas to auto Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 10/61] hw/pci/pci: Enforce pci_setup_iommu_per_bus() is called only once per bus Peter Maydell
                   ` (52 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nathan Chen <nathanc@nvidia.com>

Update documentation now that "auto" is supported for accelerated SMMUv3
properties.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260608174900.2227340-10-nathanc@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 qemu-options.hx | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 2fd21519b2..a5979d0a5b 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1291,31 +1291,44 @@ SRST
         Enabling accel configures the host SMMUv3 in nested mode to support
         vfio-pci passthrough.
 
-     The following options are available when accel=on.
-     Note: 'auto' mode is not currently supported.
+    The following options will be set to auto by default if not manually
+    set. When accel=on and these properties are set to auto, the value is
+    derived from the host SMMUv3 capabilities via IOMMU_GET_HW_INFO. With
+    accel=on, this requires at least one cold-plugged vfio-pci device; if
+    none is present at machine init, QEMU will abort.
 
-    ``ril=on|off`` (default: on)
+    If accel=off, auto values resolve to the non-accel defaults given below.
+
+    ``ril=on|off|auto`` (default: auto)
         Support for Range Invalidation, which allows the SMMUv3 driver to
         invalidate TLB entries for a range of IOVAs at once instead of issuing
         separate commands to invalidate each page. Must match with host SMMUv3
         Range Invalidation support.
 
-    ``ats=on|off`` (default: off)
+        - With accel=off, auto is resolved to 'on'.
+
+    ``ats=on|off|auto`` (default: auto)
         Support for Address Translation Services, which enables PCIe devices to
         cache address translations in their local TLB and reduce latency. Host
         SMMUv3 must support ATS in order to enable this feature for the vIOMMU.
 
-    ``oas=val`` (supported values are 44 and 48. default: 44)
+        - With accel=off, auto is resolved to 'off'.
+
+    ``oas=val|auto`` (supported values are 44 and 48. default: auto)
         Sets the Output Address Size in bits. The value set here must be less
         than or equal to the host SMMUv3's supported OAS, so that the
         intermediate physical addresses (IPA) consumed by host SMMU for stage-2
         translation do not exceed the host's max supported IPA size.
 
-    ``ssidsize=val`` (val between 0 and 20. default: 0)
+        - With accel=off, auto is resolved to 44.
+
+    ``ssidsize=val|auto`` (val between 0 and 20. default: auto)
         Sets the Substream ID size in bits. When set to a non-zero value,
         PASID capability is advertised to the vIOMMU and accelerated use cases
         such as Shared Virtual Addressing (SVA) are supported.
 
+        - With accel=off, auto is resolved to 0.
+
 ``-device amd-iommu[,option=...]``
     Enables emulation of an AMD-Vi I/O Memory Management Unit (IOMMU).
     Only available with ``-machine q35``, it supports the following options:
-- 
2.43.0




* [PULL 10/61] hw/pci/pci: Enforce pci_setup_iommu_per_bus() is called only once per bus
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (8 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 09/61] qemu-options.hx: Support "auto" for accel SMMUv3 properties Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 11/61] backends/iommufd: Update iommufd_backend_get_device_info Peter Maydell
                   ` (51 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Eric Auger <eric.auger@redhat.com>

Currently it is possible to attach several arm-smmuv3 devices to the
same bus, although this is an invalid setup.

Change the prototype of pci_setup_iommu_per_bus() to take an Error **
argument and return a bool. The function now fails with the error set
when iommu_per_bus is already set for the bus, and its single caller
(smmu_base_realize) uses that to report a useful error to the end
user.

While at it, document pci_setup_iommu_per_bus() in the header.
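
The "fail on second registration" pattern can be shown in isolation
with the sketch below. It deliberately avoids QEMU's PCIBus and Error
types: DemoBus and the plain string out-parameter are hypothetical
stand-ins, used only so the example is self-contained.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bus state relevant to this check. */
typedef struct {
    const void *iommu_ops;
    void *iommu_opaque;
    bool iommu_per_bus;
} DemoBus;

/*
 * Register per-bus IOMMU handlers exactly once. On a second call the
 * bus is left untouched, *errmsg (if provided) is set, and false is
 * returned so the caller can abort its own setup.
 */
static bool demo_setup_iommu_per_bus(DemoBus *bus, const void *ops,
                                     void *opaque, const char **errmsg)
{
    if (bus->iommu_per_bus) {
        if (errmsg) {
            *errmsg = "An iommu is already attached to this bus";
        }
        return false;
    }
    bus->iommu_ops = ops;
    bus->iommu_opaque = opaque;
    bus->iommu_per_bus = true;
    return true;
}
```

Checking the flag before touching any state is what keeps the first
registration intact when a second device is wrongly pointed at the
same bus.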

Fixes: 66d2f665e163 ("hw/arm/virt: Allow user-creatable SMMUv3 dev instantiation")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nathan Chen <nathanc@nvidia.com>
Reviewed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@mailo.com>
Message-id: 20260603124415.1120808-1-eric.auger@redhat.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmu-common.c |  4 +++-
 hw/pci/pci.c         |  9 +++++++--
 include/hw/pci/pci.h | 16 +++++++++++++++-
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 58c4452b1f..8e40ba603d 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -981,7 +981,9 @@ static void smmu_base_realize(DeviceState *dev, Error **errp)
         }
 
         if (s->smmu_per_bus) {
-            pci_setup_iommu_per_bus(pci_bus, s->iommu_ops, s);
+            if (!pci_setup_iommu_per_bus(pci_bus, s->iommu_ops, s, errp)) {
+                return;
+            }
         } else {
             pci_setup_iommu(pci_bus, s->iommu_ops, s);
         }
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index cec065d108..d3191609e2 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -3314,11 +3314,16 @@ void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
  * IOMMU ops are returned, avoiding the use of the parent’s IOMMU when
  * it's not appropriate.
  */
-void pci_setup_iommu_per_bus(PCIBus *bus, const PCIIOMMUOps *ops,
-                             void *opaque)
+bool pci_setup_iommu_per_bus(PCIBus *bus, const PCIIOMMUOps *ops,
+                             void *opaque, Error **errp)
 {
+    if (bus->iommu_per_bus) {
+        error_setg(errp, "An iommu is already attached to this bus");
+        return false;
+    }
     pci_setup_iommu(bus, ops, opaque);
     bus->iommu_per_bus = true;
+    return true;
 }
 
 static void pci_dev_get_w64(PCIBus *b, PCIDevice *dev, void *opaque)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 5b179091de..f2448e941a 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -863,7 +863,21 @@ int pci_iommu_unregister_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
  */
 void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque);
 
-void pci_setup_iommu_per_bus(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque);
+/**
+ * pci_setup_iommu_per_bus: Initialize specific IOMMU handlers for a PCIBus
+ *
+ * Similar to pci_setup_iommu but enforces that the iommu only protects
+ * @bus downstream end points and no other bus hierarchy
+ *
+ * @bus: the #PCIBus being updated.
+ * @ops: the #PCIIOMMUOps
+ * @opaque: passed to callbacks of the @ops structure.
+ * @errp: error handle
+ *
+ * Returns false on failure with @errp set, true on success
+ */
+bool pci_setup_iommu_per_bus(PCIBus *bus, const PCIIOMMUOps *ops,
+                             void *opaque, Error **errp);
 
 pcibus_t pci_bar_address(PCIDevice *d,
                          int reg, uint8_t type, pcibus_t size);
-- 
2.43.0




* [PULL 11/61] backends/iommufd: Update iommufd_backend_get_device_info
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (9 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 10/61] hw/pci/pci: Enforce pci_setup_iommu_per_bus() is called only once per bus Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 12/61] backends/iommufd: Update iommufd_backend_alloc_viommu to allow user ptr Peter Maydell
                   ` (50 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

The updated IOMMUFD uAPI introduces the ability for userspace to request
a specific hardware info data type via IOMMU_GET_HW_INFO. Update
iommufd_backend_get_device_info() to set IOMMU_HW_INFO_FLAG_INPUT_TYPE
when a non-zero type is supplied, and adjust all callers to pass a type
value explicitly initialised to zero (IOMMU_HW_INFO_TYPE_DEFAULT) when
no specific type is requested.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-2-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 backends/iommufd.c    | 7 +++++++
 hw/arm/smmuv3-accel.c | 2 +-
 hw/vfio/iommufd.c     | 4 ++--
 3 files changed, 10 insertions(+), 3 deletions(-)
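
A minimal sketch of the flag selection described in the commit message above, for readers without the updated uAPI headers to hand. The struct, the constants, and sketch_build_hw_info() are hypothetical stand-ins (the real request is struct iommu_hw_info, and the constant values here are placeholders); only the rule "non-zero type sets the input-type flag, zero asks for the default" is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the uAPI names; the values are placeholders. */
#define SKETCH_HW_INFO_TYPE_DEFAULT    0u
#define SKETCH_HW_INFO_FLAG_INPUT_TYPE (1u << 0)

/* Hypothetical, trimmed-down request; not the real struct iommu_hw_info. */
struct sketch_hw_info {
    uint32_t flags;
    uint32_t dev_id;
    uint32_t in_data_type;
};

/*
 * Build the request from a caller-supplied type. A zero type means
 * "let the kernel pick the default", so no input-type flag is set.
 * The type must be initialised by the caller before this is called.
 */
struct sketch_hw_info sketch_build_hw_info(uint32_t dev_id,
                                           const uint32_t *type)
{
    assert(type);

    struct sketch_hw_info info = {
        .flags = *type ? SKETCH_HW_INFO_FLAG_INPUT_TYPE : 0,
        .dev_id = dev_id,
        .in_data_type = *type,
    };
    return info;
}
```

Because the helper now reads the type before the ioctl, a caller that leaves the variable uninitialised would send an arbitrary value to the kernel, which is why each call site in the diff below gains an explicit initialiser.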

diff --git a/backends/iommufd.c b/backends/iommufd.c
index 410b044370..8c3a981392 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -390,16 +390,23 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
     return true;
 }
 
+/*
+ * @type can carry a desired HW info type defined in the uapi headers. If caller
+ * doesn't have one, indicating it wants the default type, then @type should be
+ * zeroed (i.e. IOMMU_HW_INFO_TYPE_DEFAULT).
+ */
 bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
                                      uint32_t *type, void *data, uint32_t len,
                                      uint64_t *caps, uint8_t *max_pasid_log2,
                                      Error **errp)
 {
     struct iommu_hw_info info = {
+        .flags = (*type) ? IOMMU_HW_INFO_FLAG_INPUT_TYPE : 0,
         .size = sizeof(info),
         .dev_id = devid,
         .data_len = len,
         .data_uptr = (uintptr_t)data,
+        .in_data_type = *type,
     };
 
     if (ioctl(be->fd, IOMMU_GET_HW_INFO, &info)) {
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 06dadf9cc7..163d3e7279 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -176,7 +176,7 @@ smmuv3_accel_hw_compatible(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *hiodi,
                            Error **errp)
 {
     struct iommu_hw_info_arm_smmuv3 info;
-    uint32_t data_type;
+    uint32_t data_type = IOMMU_HW_INFO_TYPE_DEFAULT;
     uint64_t caps;
 
     if (!iommufd_backend_get_device_info(hiodi->iommufd, hiodi->devid,
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index df148a49a7..495a59cc2e 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -353,7 +353,7 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
     IOMMUFDBackend *iommufd = vbasedev->iommufd;
     VFIOContainer *bcontainer = VFIO_IOMMU(container);
     bool viommu_nesting, viommu_nesting_dirty;
-    uint32_t type, flags = 0;
+    uint32_t type = IOMMU_HW_INFO_TYPE_DEFAULT, flags = 0;
     uint64_t hw_caps;
     VendorCaps caps;
     VFIOIOASHwpt *hwpt;
@@ -948,7 +948,7 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
     HostIOMMUDeviceIOMMUFD *hiodi;
     HostIOMMUDeviceCaps *caps = &hiod->caps;
     VendorCaps *vendor_caps = &caps->vendor_caps;
-    enum iommu_hw_info_type type;
+    uint32_t type = IOMMU_HW_INFO_TYPE_DEFAULT;
     uint8_t max_pasid_log2;
     uint64_t hw_caps;
 
-- 
2.43.0




* [PULL 12/61] backends/iommufd: Update iommufd_backend_alloc_viommu to allow user ptr
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (10 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 11/61] backends/iommufd: Update iommufd_backend_get_device_info Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 13/61] backends/iommufd: Introduce iommufd_backend_alloc_hw_queue Peter Maydell
                   ` (49 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

The updated IOMMUFD VIOMMU_ALLOC uAPI allows userspace to provide a data
buffer when creating a vIOMMU (e.g. for Tegra241 CMDQV). Extend
iommufd_backend_alloc_viommu() to pass a user pointer and size to the
kernel.

Update the caller accordingly.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-3-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 backends/iommufd.c       | 4 ++++
 backends/trace-events    | 2 +-
 hw/arm/smmuv3-accel.c    | 4 ++--
 include/system/iommufd.h | 1 +
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/backends/iommufd.c b/backends/iommufd.c
index 8c3a981392..c745dbd2b1 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -463,6 +463,7 @@ bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
 
 bool iommufd_backend_alloc_viommu(IOMMUFDBackend *be, uint32_t dev_id,
                                   uint32_t viommu_type, uint32_t hwpt_id,
+                                  void *data_ptr, uint32_t data_len,
                                   uint32_t *out_viommu_id, Error **errp)
 {
     int ret;
@@ -471,11 +472,14 @@ bool iommufd_backend_alloc_viommu(IOMMUFDBackend *be, uint32_t dev_id,
         .type = viommu_type,
         .dev_id = dev_id,
         .hwpt_id = hwpt_id,
+        .data_len = data_len,
+        .data_uptr = (uintptr_t)data_ptr,
     };
 
     ret = ioctl(be->fd, IOMMU_VIOMMU_ALLOC, &alloc_viommu);
 
     trace_iommufd_backend_alloc_viommu(be->fd, dev_id, viommu_type, hwpt_id,
+                                       (uintptr_t)data_ptr, data_len,
                                        alloc_viommu.out_viommu_id, ret);
     if (ret) {
         error_setg_errno(errp, errno, "IOMMU_VIOMMU_ALLOC failed");
diff --git a/backends/trace-events b/backends/trace-events
index b9365113e7..3ba0c3503c 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -21,7 +21,7 @@ iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%
 iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
 iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t flags, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" flags=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
 iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
-iommufd_backend_alloc_viommu(int iommufd, uint32_t dev_id, uint32_t type, uint32_t hwpt_id, uint32_t viommu_id, int ret) " iommufd=%d type=%u dev_id=%u hwpt_id=%u viommu_id=%u (%d)"
+iommufd_backend_alloc_viommu(int iommufd, uint32_t dev_id, uint32_t type, uint32_t hwpt_id, uint64_t data_ptr, uint32_t data_len, uint32_t viommu_id, int ret) " iommufd=%d type=%u dev_id=%u hwpt_id=%u data_ptr=0x%"PRIx64" data_len=0x%x viommu_id=%u (%d)"
 iommufd_backend_alloc_vdev(int iommufd, uint32_t dev_id, uint32_t viommu_id, uint64_t virt_id, uint32_t vdev_id, int ret) " iommufd=%d dev_id=%u viommu_id=%u virt_id=0x%"PRIx64" vdev_id=%u (%d)"
 iommufd_viommu_alloc_eventq(int iommufd, uint32_t viommu_id, uint32_t type, uint32_t veventq_id, uint32_t veventq_fd, int ret) " iommufd=%d viommu_id=%u type=%u veventq_id=%u veventq_fd=%u (%d)"
 
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 163d3e7279..4cef487679 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -582,8 +582,8 @@ smmuv3_accel_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *hiodi,
     IOMMUFDViommu *viommu;
 
     if (!iommufd_backend_alloc_viommu(hiodi->iommufd, hiodi->devid,
-                                      IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
-                                      s2_hwpt_id, &viommu_id, errp)) {
+                                      IOMMU_VIOMMU_TYPE_ARM_SMMUV3, s2_hwpt_id,
+                                      NULL, 0, &viommu_id, errp)) {
         return false;
     }
 
diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index 2925d116ac..1b0266f660 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -89,6 +89,7 @@ bool iommufd_backend_alloc_hwpt(IOMMUFDBackend *be, uint32_t dev_id,
                                 Error **errp);
 bool iommufd_backend_alloc_viommu(IOMMUFDBackend *be, uint32_t dev_id,
                                   uint32_t viommu_type, uint32_t hwpt_id,
+                                  void *data_ptr, uint32_t data_len,
                                   uint32_t *out_hwpt, Error **errp);
 
 bool iommufd_backend_alloc_vdev(IOMMUFDBackend *be, uint32_t dev_id,
-- 
2.43.0




* [PULL 13/61] backends/iommufd: Introduce iommufd_backend_alloc_hw_queue
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (11 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 12/61] backends/iommufd: Update iommufd_backend_alloc_viommu to allow user ptr Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 14/61] backends/iommufd: Introduce iommufd_backend_viommu_mmap Peter Maydell
                   ` (48 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

Add a helper to allocate an iommufd-backed HW queue for a vIOMMU.

While at it, define a struct IOMMUFDHWqueue for use by vendor
implementations.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-4-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 backends/iommufd.c       | 31 +++++++++++++++++++++++++++++++
 backends/trace-events    |  1 +
 include/system/iommufd.h | 11 +++++++++++
 3 files changed, 43 insertions(+)
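
The helper added by this patch follows the usual backend pattern: fill a request, issue the call, report failure through a bool, and write the output ID only on success. A rough sketch of that control flow follows. Everything named sketch_* is hypothetical; in particular sketch_fake_alloc() merely stands in for the IOMMU_HW_QUEUE_ALLOC ioctl, and its "reject zero length" rule and ID numbering are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down request; not the real uAPI struct. */
struct sketch_queue_req {
    uint32_t viommu_id;
    uint32_t type;
    uint32_t index;
    uint64_t addr;
    uint64_t length;
    uint32_t out_id;
};

/* Stand-in for the ioctl: invented behaviour, for illustration only. */
int sketch_fake_alloc(struct sketch_queue_req *req)
{
    if (req->length == 0) {
        return -1;
    }
    req->out_id = 100 + req->index;
    return 0;
}

bool sketch_alloc_hw_queue(uint32_t viommu_id, uint32_t type, uint32_t index,
                           uint64_t addr, uint64_t length, uint32_t *out_id)
{
    struct sketch_queue_req req = {
        .viommu_id = viommu_id,
        .type = type,
        .index = index,
        .addr = addr,
        .length = length,
    };

    if (sketch_fake_alloc(&req)) {
        /* On failure the caller's output variable is left untouched. */
        return false;
    }

    assert(out_id);
    *out_id = req.out_id;
    return true;
}
```

Leaving the output untouched on failure means a caller can keep a previously valid ID in the same variable across a failed retry.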

diff --git a/backends/iommufd.c b/backends/iommufd.c
index c745dbd2b1..8eaaf456e8 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -549,6 +549,37 @@ bool iommufd_backend_alloc_veventq(IOMMUFDBackend *be, uint32_t viommu_id,
     return true;
 }
 
+bool iommufd_backend_alloc_hw_queue(IOMMUFDBackend *be, uint32_t viommu_id,
+                                    uint32_t queue_type, uint32_t index,
+                                    uint64_t addr, uint64_t length,
+                                    uint32_t *out_hw_queue_id, Error **errp)
+{
+    int ret;
+    struct iommu_hw_queue_alloc alloc_hw_queue = {
+        .size = sizeof(alloc_hw_queue),
+        .flags = 0,
+        .viommu_id = viommu_id,
+        .type = queue_type,
+        .index = index,
+        .nesting_parent_iova = addr,
+        .length = length,
+    };
+
+    ret = ioctl(be->fd, IOMMU_HW_QUEUE_ALLOC, &alloc_hw_queue);
+
+    trace_iommufd_backend_alloc_hw_queue(be->fd, viommu_id, queue_type,
+                                         index, addr, length,
+                                         alloc_hw_queue.out_hw_queue_id, ret);
+    if (ret) {
+        error_setg_errno(errp, errno, "IOMMU_HW_QUEUE_ALLOC failed");
+        return false;
+    }
+
+    g_assert(out_hw_queue_id);
+    *out_hw_queue_id = alloc_hw_queue.out_hw_queue_id;
+    return true;
+}
+
 bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *hiodi,
                                            uint32_t hwpt_id, Error **errp)
 {
diff --git a/backends/trace-events b/backends/trace-events
index 3ba0c3503c..c5c1d95aad 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -24,6 +24,7 @@ iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, u
 iommufd_backend_alloc_viommu(int iommufd, uint32_t dev_id, uint32_t type, uint32_t hwpt_id, uint64_t data_ptr, uint32_t data_len, uint32_t viommu_id, int ret) " iommufd=%d type=%u dev_id=%u hwpt_id=%u data_ptr=0x%"PRIx64" data_len=0x%x viommu_id=%u (%d)"
 iommufd_backend_alloc_vdev(int iommufd, uint32_t dev_id, uint32_t viommu_id, uint64_t virt_id, uint32_t vdev_id, int ret) " iommufd=%d dev_id=%u viommu_id=%u virt_id=0x%"PRIx64" vdev_id=%u (%d)"
 iommufd_viommu_alloc_eventq(int iommufd, uint32_t viommu_id, uint32_t type, uint32_t veventq_id, uint32_t veventq_fd, int ret) " iommufd=%d viommu_id=%u type=%u veventq_id=%u veventq_fd=%u (%d)"
+iommufd_backend_alloc_hw_queue(int iommufd, uint32_t viommu_id, uint32_t queue_type, uint32_t index, uint64_t addr, uint64_t size, uint32_t queue_id, int ret) " iommufd=%d viommu_id=%u queue_type=%u index=%u addr=0x%"PRIx64" size=0x%"PRIx64" queue_id=%u (%d)"
 
 # igvm-cfg.c
 igvm_reset_enter(int type) "type=%u"
diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index 1b0266f660..c6f2e87a7e 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -65,6 +65,12 @@ typedef struct IOMMUFDVeventq {
     bool event_start; /* True after first valid event; cleared on overflow */
 } IOMMUFDVeventq;
 
+/* HW queue object for a vIOMMU-specific HW-accelerated queue */
+typedef struct IOMMUFDHWqueue {
+    IOMMUFDViommu *viommu;
+    uint32_t hw_queue_id;
+} IOMMUFDHWqueue;
+
 bool iommufd_backend_connect(IOMMUFDBackend *be, Error **errp);
 void iommufd_backend_disconnect(IOMMUFDBackend *be);
 
@@ -101,6 +107,11 @@ bool iommufd_backend_alloc_veventq(IOMMUFDBackend *be, uint32_t viommu_id,
                                    uint32_t *out_veventq_id,
                                    uint32_t *out_veventq_fd, Error **errp);
 
+bool iommufd_backend_alloc_hw_queue(IOMMUFDBackend *be, uint32_t viommu_id,
+                                    uint32_t queue_type, uint32_t index,
+                                    uint64_t addr, uint64_t length,
+                                    uint32_t *out_hw_queue_id, Error **errp);
+
 bool iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t hwpt_id,
                                         bool start, Error **errp);
 bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
-- 
2.43.0




* [PULL 14/61] backends/iommufd: Introduce iommufd_backend_viommu_mmap
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (12 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 13/61] backends/iommufd: Introduce iommufd_backend_alloc_hw_queue Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 15/61] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq Peter Maydell
                   ` (47 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

Add a backend helper to mmap hardware MMIO regions exposed via iommufd for
a vIOMMU instance. This allows user space to access HW-accelerated MMIO
pages provided by the vIOMMU.

The caller is responsible for unmapping the returned region.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-5-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 backends/iommufd.c       | 22 ++++++++++++++++++++++
 backends/trace-events    |  1 +
 include/system/iommufd.h |  4 ++++
 3 files changed, 27 insertions(+)
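
Since the caller owns the mapping, the lifecycle is map, use, then munmap() with the same size. The sketch below shows that lifecycle against an ordinary temporary file rather than an iommufd file descriptor. The sketch_* names are hypothetical and are not part of the patch. One deliberate difference from the patch: on failure this sketch stores NULL in the output pointer, whereas the helper in the diff leaves MAP_FAILED there.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper with the same shape as the one in the patch. */
bool sketch_region_mmap(int fd, size_t size, off_t offset, void **out_ptr)
{
    assert(out_ptr);

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
    if (p == MAP_FAILED) {
        *out_ptr = NULL; /* assumption: normalise failure to NULL */
        return false;
    }
    *out_ptr = p;
    return true;
}

/* The caller releases the mapping, passing the size it mapped. */
bool sketch_region_unmap(void *ptr, size_t size)
{
    return munmap(ptr, size) == 0;
}

/* Map one page of a temporary file, write to it, read it back, unmap. */
bool sketch_demo_roundtrip(void)
{
    FILE *f = tmpfile();
    if (!f) {
        return false;
    }

    long page = sysconf(_SC_PAGESIZE);
    void *ptr = NULL;
    bool ok = false;

    if (page > 0 && ftruncate(fileno(f), page) == 0 &&
        sketch_region_mmap(fileno(f), (size_t)page, 0, &ptr)) {
        memcpy(ptr, "ok", 3);
        ok = memcmp(ptr, "ok", 3) == 0;
        ok = sketch_region_unmap(ptr, (size_t)page) && ok;
    }

    fclose(f);
    return ok;
}
```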

diff --git a/backends/iommufd.c b/backends/iommufd.c
index 8eaaf456e8..440c1b82bc 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -580,6 +580,28 @@ bool iommufd_backend_alloc_hw_queue(IOMMUFDBackend *be, uint32_t viommu_id,
     return true;
 }
 
+/*
+ * Helper to mmap HW MMIO regions exposed via iommufd for a vIOMMU instance.
+ * The caller is responsible for unmapping the mapped region.
+ */
+bool iommufd_backend_viommu_mmap(IOMMUFDBackend *be, uint32_t viommu_id,
+                                 uint64_t size, off_t offset, void **out_ptr,
+                                 Error **errp)
+{
+    g_assert(viommu_id);
+    g_assert(out_ptr);
+
+    *out_ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, be->fd,
+                   offset);
+    trace_iommufd_backend_viommu_mmap(be->fd, viommu_id, size, offset);
+    if (*out_ptr == MAP_FAILED) {
+        error_setg_errno(errp, errno, "IOMMUFD vIOMMU mmap failed");
+        return false;
+    }
+
+    return true;
+}
+
 bool host_iommu_device_iommufd_attach_hwpt(HostIOMMUDeviceIOMMUFD *hiodi,
                                            uint32_t hwpt_id, Error **errp)
 {
diff --git a/backends/trace-events b/backends/trace-events
index c5c1d95aad..b63420b73e 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -25,6 +25,7 @@ iommufd_backend_alloc_viommu(int iommufd, uint32_t dev_id, uint32_t type, uint32
 iommufd_backend_alloc_vdev(int iommufd, uint32_t dev_id, uint32_t viommu_id, uint64_t virt_id, uint32_t vdev_id, int ret) " iommufd=%d dev_id=%u viommu_id=%u virt_id=0x%"PRIx64" vdev_id=%u (%d)"
 iommufd_viommu_alloc_eventq(int iommufd, uint32_t viommu_id, uint32_t type, uint32_t veventq_id, uint32_t veventq_fd, int ret) " iommufd=%d viommu_id=%u type=%u veventq_id=%u veventq_fd=%u (%d)"
 iommufd_backend_alloc_hw_queue(int iommufd, uint32_t viommu_id, uint32_t queue_type, uint32_t index, uint64_t addr, uint64_t size, uint32_t queue_id, int ret) " iommufd=%d viommu_id=%u queue_type=%u index=%u addr=0x%"PRIx64" size=0x%"PRIx64" queue_id=%u (%d)"
+iommufd_backend_viommu_mmap(int iommufd, uint32_t viommu_id, uint64_t size, uint64_t offset) " iommufd=%d viommu_id=%u size=0x%"PRIx64" offset=0x%"PRIx64
 
 # igvm-cfg.c
 igvm_reset_enter(int type) "type=%u"
diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index c6f2e87a7e..bb8d5081d7 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -112,6 +112,10 @@ bool iommufd_backend_alloc_hw_queue(IOMMUFDBackend *be, uint32_t viommu_id,
                                     uint64_t addr, uint64_t length,
                                     uint32_t *out_hw_queue_id, Error **errp);
 
+bool iommufd_backend_viommu_mmap(IOMMUFDBackend *be, uint32_t viommu_id,
+                                 uint64_t size, off_t offset, void **out_ptr,
+                                 Error **errp);
+
 bool iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t hwpt_id,
                                         bool start, Error **errp);
 bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
-- 
2.43.0




* [PULL 15/61] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (13 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 14/61] backends/iommufd: Introduce iommufd_backend_viommu_mmap Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 16/61] hw/arm/smmuv3-accel: Introduce CMDQV ops interface Peter Maydell
                   ` (46 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

The viommu field is assigned but never used. Callers freeing the
veventq already have access to the IOMMUFDViommu object through other
references, so this field is redundant.

Removing it also simplifies upcoming changes where veventq is
allocated based on the viommu id before the IOMMUFDViommu object is
created (e.g. vendor CMDQV-based veventq allocation).

No functional change.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-6-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel.c    | 1 -
 include/system/iommufd.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 4cef487679..70b2581c17 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -553,7 +553,6 @@ bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp)
     veventq = g_new0(IOMMUFDVeventq, 1);
     veventq->veventq_id = veventq_id;
     veventq->veventq_fd = veventq_fd;
-    veventq->viommu = accel->viommu;
     accel->veventq = veventq;
 
     /* Set up event handler for veventq fd */
diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index bb8d5081d7..da68ba0037 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -58,7 +58,6 @@ typedef struct IOMMUFDVdev {
 
 /* Virtual event queue interface for a vIOMMU */
 typedef struct IOMMUFDVeventq {
-    IOMMUFDViommu *viommu;
     uint32_t veventq_id;
     uint32_t veventq_fd;
     uint32_t last_event_seq; /* Sequence number of last processed event */
-- 
2.43.0




* [PULL 16/61] hw/arm/smmuv3-accel: Introduce CMDQV ops interface
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (14 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 15/61] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 17/61] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub Peter Maydell
                   ` (45 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Command Queue Virtualization (CMDQV) is a hardware extension available
on certain platforms that allows the SMMUv3 command queue to be
virtualized and passed through to a VM, improving performance.

For example, NVIDIA Tegra241 implements CMDQV to support virtualization
of multiple command queues (VCMDQs).

The term CMDQV is used here generically to refer to any platform that
provides hardware support to virtualize the SMMUv3 command queue.

CMDQV support is a specialization of the IOMMUFD-backed accelerated
SMMUv3 path. Introduce an ops interface to factor out CMDQV-specific
probe, initialization, and vIOMMU allocation logic from the base
implementation. The ops pointer and associated state are stored in
the accelerated SMMUv3 state.

This provides an extensible design to support future vendor-specific
CMDQV implementations.

No functional change.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-7-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)
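
To illustrate how an ops table with mandatory and optional callbacks is typically consumed, here is a small sketch. It does not use the real SMMUv3AccelCmdqvOps or SMMUv3State types; SketchOps, struct sketch_dev, and every function below are hypothetical, and the "generic free path" is reduced to setting a flag. The point is only that optional callbacks are NULL-checked while mandatory ones are asserted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state, recording what the callbacks did. */
struct sketch_dev {
    bool probed;
    bool freed_by_default;
};

/* Hypothetical ops table: probe is mandatory, the rest may be NULL. */
typedef struct SketchOps {
    bool (*probe)(struct sketch_dev *d);
    bool (*init)(struct sketch_dev *d);
    void (*free_res)(struct sketch_dev *d);
} SketchOps;

bool sketch_probe_ok(struct sketch_dev *d)
{
    d->probed = true;
    return true;
}

/* A backend that implements only the mandatory callback. */
const SketchOps sketch_minimal_ops = {
    .probe = sketch_probe_ok,
};

bool sketch_setup(struct sketch_dev *d, const SketchOps *ops)
{
    assert(ops && ops->probe);

    if (!ops->probe(d)) {
        return false;
    }
    /* Optional callback: skipped when the backend does not provide it. */
    if (ops->init && !ops->init(d)) {
        return false;
    }
    return true;
}

void sketch_teardown(struct sketch_dev *d, const SketchOps *ops)
{
    if (ops->free_res) {
        ops->free_res(d);
    } else {
        /* Stands in for freeing the ID through the generic path. */
        d->freed_by_default = true;
    }
}
```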

diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index 87fecb5c68..b45f25ad03 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -10,11 +10,44 @@
 #define HW_ARM_SMMUV3_ACCEL_H
 
 #include "hw/arm/smmu-common.h"
+#include "hw/arm/smmuv3.h"
 #include "system/iommufd.h"
 #ifdef CONFIG_LINUX
 #include <linux/iommufd.h>
 #endif
 
+/*
+ * CMDQ-Virtualization (CMDQV) hardware support extends the SMMUv3 to
+ * support multiple VCMDQs with virtualization capabilities.
+ * CMDQV specific behavior is factored behind this ops interface.
+ */
+typedef struct SMMUv3AccelCmdqvOps {
+    /**
+     * @probe: Mandatory. Vendor-specific device probing.
+     */
+    bool (*probe)(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev, Error **errp);
+    /**
+     * @init: Optional callback. Initialize CMDQV hardware.
+     */
+    bool (*init)(SMMUv3State *s, Error **errp);
+    /**
+     * @alloc_viommu: Mandatory. Allocate CMDQV viommu resources.
+     */
+    bool (*alloc_viommu)(SMMUv3State *s,
+                         HostIOMMUDeviceIOMMUFD *idev,
+                         uint32_t *out_viommu_id,
+                         Error **errp);
+    /**
+     * @free_viommu: Optional callback. Free CMDQV viommu resources.
+     * If NULL, the viommu_id is freed directly via iommufd_backend_free_id().
+     */
+    void (*free_viommu)(SMMUv3State *s);
+    /**
+     * @reset: Optional callback. Reset CMDQV state.
+     */
+    void (*reset)(SMMUv3State *s);
+} SMMUv3AccelCmdqvOps;
+
 /*
  * Represents an accelerated SMMU instance backed by an iommufd vIOMMU object.
  * Holds bypass and abort proxy HWPT IDs used for device attachment.
@@ -27,6 +60,7 @@ typedef struct SMMUv3AccelState {
     QLIST_HEAD(, SMMUv3AccelDevice) device_list;
     bool auto_mode;
     bool auto_finalised;
+    const SMMUv3AccelCmdqvOps *cmdqv_ops;
 } SMMUv3AccelState;
 
 typedef struct SMMUS1Hwpt {
-- 
2.43.0




* [PULL 17/61] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (15 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 16/61] hw/arm/smmuv3-accel: Introduce CMDQV ops interface Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 18/61] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle Peter Maydell
                   ` (44 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Introduce a Tegra241 CMDQV backend that plugs into the SMMUv3 accelerated
CMDQV ops interface.

This patch wires up the Tegra241 CMDQV backend and provides a stub
implementation for CMDQV probe, initialization, vIOMMU allocation
and reset handling.

Functional CMDQV support is added in follow-up patches.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-8-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/Kconfig                |  5 ++++
 hw/arm/meson.build            |  2 ++
 hw/arm/tegra241-cmdqv-stubs.c | 16 ++++++++++
 hw/arm/tegra241-cmdqv.c       | 56 +++++++++++++++++++++++++++++++++++
 hw/arm/tegra241-cmdqv.h       | 15 ++++++++++
 5 files changed, 94 insertions(+)
 create mode 100644 hw/arm/tegra241-cmdqv-stubs.c
 create mode 100644 hw/arm/tegra241-cmdqv.c
 create mode 100644 hw/arm/tegra241-cmdqv.h

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index fb798ccbee..500bfdfe2a 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -643,6 +643,10 @@ config FSL_IMX8MM_EVK
     depends on TCG
     select FSL_IMX8MM
 
+config TEGRA241_CMDQV
+    bool
+    depends on ARM_SMMUV3_ACCEL
+
 config ARM_SMMUV3_ACCEL
     bool
     depends on ARM_SMMUV3
@@ -650,6 +654,7 @@ config ARM_SMMUV3_ACCEL
 config ARM_SMMUV3
     bool
     select ARM_SMMUV3_ACCEL if IOMMUFD
+    imply TEGRA241_CMDQV
 
 config FSL_IMX6UL
     bool
diff --git a/hw/arm/meson.build b/hw/arm/meson.build
index 8f66a80e10..4233a800be 100644
--- a/hw/arm/meson.build
+++ b/hw/arm/meson.build
@@ -89,6 +89,8 @@ arm_common_ss.add(when: 'CONFIG_FSL_IMX8MM_EVK', if_true: files('imx8mm-evk.c'))
 arm_common_ss.add(when: 'CONFIG_ARM_SMMUV3', if_true: files('smmuv3.c'))
 arm_common_ss.add(when: 'CONFIG_ARM_SMMUV3_ACCEL', if_true: files('smmuv3-accel.c'))
 stub_ss.add(files('smmuv3-accel-stubs.c'))
+arm_common_ss.add(when: 'CONFIG_TEGRA241_CMDQV', if_true: files('tegra241-cmdqv.c'))
+stub_ss.add(files('tegra241-cmdqv-stubs.c'))
 arm_common_ss.add(when: 'CONFIG_FSL_IMX6UL', if_true: files('fsl-imx6ul.c', 'mcimx6ul-evk.c'))
 arm_common_ss.add(when: 'CONFIG_NRF51_SOC', if_true: files('nrf51_soc.c'))
 arm_common_ss.add(when: 'CONFIG_XEN', if_true: files(
diff --git a/hw/arm/tegra241-cmdqv-stubs.c b/hw/arm/tegra241-cmdqv-stubs.c
new file mode 100644
index 0000000000..4669f5c5f5
--- /dev/null
+++ b/hw/arm/tegra241-cmdqv-stubs.c
@@ -0,0 +1,16 @@
+/*
+ * Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ *
+ * Stubs for Tegra241 CMDQ-Virtualization extension for SMMUv3
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "smmuv3-accel.h"
+#include "hw/arm/tegra241-cmdqv.h"
+
+const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void)
+{
+    return NULL;
+}
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
new file mode 100644
index 0000000000..ad5a0d4611
--- /dev/null
+++ b/hw/arm/tegra241-cmdqv.c
@@ -0,0 +1,56 @@
+/*
+ * Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ * NVIDIA Tegra241 CMDQ-Virtualization extension for SMMUv3
+ *
+ * Written by Nicolin Chen, Shameer Kolothum
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+
+#include "hw/arm/smmuv3.h"
+#include "smmuv3-accel.h"
+#include "tegra241-cmdqv.h"
+
+static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
+{
+}
+
+static bool
+tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
+                            uint32_t *out_viommu_id, Error **errp)
+{
+    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
+    return false;
+}
+
+static void tegra241_cmdqv_reset(SMMUv3State *s)
+{
+}
+
+static bool tegra241_cmdqv_init(SMMUv3State *s, Error **errp)
+{
+    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
+    return false;
+}
+
+static bool tegra241_cmdqv_probe(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
+                                 Error **errp)
+{
+    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
+    return false;
+}
+
+static const SMMUv3AccelCmdqvOps tegra241_cmdqv_ops = {
+    .probe = tegra241_cmdqv_probe,
+    .init = tegra241_cmdqv_init,
+    .alloc_viommu = tegra241_cmdqv_alloc_viommu,
+    .free_viommu = tegra241_cmdqv_free_viommu,
+    .reset = tegra241_cmdqv_reset,
+};
+
+const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void)
+{
+    return &tegra241_cmdqv_ops;
+}
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
new file mode 100644
index 0000000000..74a6954017
--- /dev/null
+++ b/hw/arm/tegra241-cmdqv.h
@@ -0,0 +1,15 @@
+/*
+ * Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ * NVIDIA Tegra241 CMDQ-Virtualization extension for SMMUv3
+ *
+ * Written by Nicolin Chen, Shameer Kolothum
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HW_ARM_TEGRA241_CMDQV_H
+#define HW_ARM_TEGRA241_CMDQV_H
+
+const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
+
+#endif /* HW_ARM_TEGRA241_CMDQV_H */
-- 
2.43.0




* [PULL 18/61] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (16 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 17/61] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 19/61] hw/arm/virt: Use stored SMMUv3 device list for IORT build Peter Maydell
                   ` (43 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Add support for selecting and initializing a CMDQV backend based on the
cmdqv OnOffAuto property.

If set to OFF, CMDQV is not used and the default IOMMUFD-backed allocation
path is taken.

If set to AUTO, QEMU attempts to probe a CMDQV backend during device setup.
If probing succeeds, the selected ops are stored in the accelerated SMMUv3
state and used. If probing fails, QEMU silently falls back to the default
path.

If set to ON, QEMU requires CMDQV support. Probing is performed during
setup and failure results in an error.
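
The three policies above can be sketched as a small standalone C fragment.
This is only an illustration of the decision table, not the QEMU code: the
Mode enum, the Ops type, demo_ops, probe() and select_backend() are all
hypothetical stand-ins, and the host_supported flag replaces the real probe
against the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's OnOffAuto and the CMDQV ops table. */
typedef enum { MODE_AUTO, MODE_ON, MODE_OFF } Mode;
typedef struct { const char *name; } Ops;

static const Ops demo_ops = { "demo-cmdqv" };

/* Pretend probe: succeeds only when the host reports support. */
static const Ops *probe(bool host_supported)
{
    return host_supported ? &demo_ops : NULL;
}

/*
 * Returns false only when the configuration cannot be honoured (ON
 * without host support). *out stays NULL when the default path is used.
 */
static bool select_backend(Mode mode, bool host_supported, const Ops **out)
{
    *out = NULL;
    switch (mode) {
    case MODE_OFF:
        return true;                    /* never probe */
    case MODE_AUTO:
        *out = probe(host_supported);   /* probe failure is not an error */
        return true;
    case MODE_ON:
        *out = probe(host_supported);
        return *out != NULL;            /* probe failure is fatal */
    }
    assert(!"unreachable");
    return false;
}
```

A caller would treat a false return as a setup error and a NULL *out as
"use the default IOMMUFD-backed path".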

When a CMDQV backend is active, its callbacks are used for vIOMMU
allocation, free, and reset handling. Otherwise, the base implementation
is used.

The current implementation wires up the Tegra241 CMDQV backend through the
generic ops interface. Functional CMDQV behaviour is added in subsequent
patches.

No functional change.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-9-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel.c   | 94 ++++++++++++++++++++++++++++++++++++++---
 include/hw/arm/smmuv3.h |  2 +
 2 files changed, 89 insertions(+), 7 deletions(-)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 70b2581c17..3ceca56a67 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -19,6 +19,7 @@
 #include "smmuv3-internal.h"
 #include "smmuv3-accel.h"
 #include "system/system.h"
+#include "tegra241-cmdqv.h"
 
 /*
  * The root region aliases the global system memory, and shared_as_sysmem
@@ -570,6 +571,7 @@ smmuv3_accel_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *hiodi,
                           Error **errp)
 {
     SMMUv3AccelState *accel = s->s_accel;
+    const SMMUv3AccelCmdqvOps *cmdqv_ops = accel->cmdqv_ops;
     struct iommu_hwpt_arm_smmuv3 bypass_data = {
         .ste = { SMMU_STE_CFG_BYPASS | SMMU_STE_VALID, 0x0ULL },
     };
@@ -580,10 +582,17 @@ smmuv3_accel_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *hiodi,
     uint32_t viommu_id, hwpt_id;
     IOMMUFDViommu *viommu;
 
-    if (!iommufd_backend_alloc_viommu(hiodi->iommufd, hiodi->devid,
-                                      IOMMU_VIOMMU_TYPE_ARM_SMMUV3, s2_hwpt_id,
-                                      NULL, 0, &viommu_id, errp)) {
-        return false;
+    if (cmdqv_ops) {
+        if (!cmdqv_ops->alloc_viommu(s, hiodi, &viommu_id, errp)) {
+            return false;
+        }
+    } else {
+        if (!iommufd_backend_alloc_viommu(hiodi->iommufd, hiodi->devid,
+                                          IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
+                                          s2_hwpt_id, NULL, 0, &viommu_id,
+                                          errp)) {
+            return false;
+        }
     }
 
     viommu = g_new0(IOMMUFDViommu, 1);
@@ -629,12 +638,70 @@ free_bypass_hwpt:
 free_abort_hwpt:
     iommufd_backend_free_id(hiodi->iommufd, accel->abort_hwpt_id);
 free_viommu:
-    iommufd_backend_free_id(hiodi->iommufd, viommu->viommu_id);
+    if (cmdqv_ops && cmdqv_ops->free_viommu) {
+        cmdqv_ops->free_viommu(s);
+    } else {
+        iommufd_backend_free_id(hiodi->iommufd, viommu->viommu_id);
+    }
     g_free(viommu);
     accel->viommu = NULL;
     return false;
 }
 
+static const SMMUv3AccelCmdqvOps *
+smmuv3_accel_probe_cmdqv(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
+                         Error **errp)
+{
+    const SMMUv3AccelCmdqvOps *ops = tegra241_cmdqv_get_ops();
+
+    if (!ops) {
+        error_setg(errp, "No CMDQV ops found");
+        return NULL;
+    }
+    g_assert(ops->probe);
+    g_assert(ops->alloc_viommu);
+
+    if (!ops->probe(s, idev, errp)) {
+        return NULL;
+    }
+    return ops;
+}
+
+static bool
+smmuv3_accel_select_cmdqv(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
+                          Error **errp)
+{
+    const SMMUv3AccelCmdqvOps *ops = NULL;
+
+    if (s->s_accel->cmdqv_ops) {
+        return true;
+    }
+
+    switch (s->cmdqv) {
+    case ON_OFF_AUTO_OFF:
+        s->s_accel->cmdqv_ops = NULL;
+        return true;
+    case ON_OFF_AUTO_AUTO:
+        ops = smmuv3_accel_probe_cmdqv(s, idev, NULL);
+        break;
+    case ON_OFF_AUTO_ON:
+        ops = smmuv3_accel_probe_cmdqv(s, idev, errp);
+        if (!ops) {
+            error_append_hint(errp, "CMDQV requested but not supported\n");
+            return false;
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    if (ops && ops->init && !ops->init(s, errp)) {
+        return false;
+    }
+    s->s_accel->cmdqv_ops = ops;
+    return true;
+}
+
 static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
                                           HostIOMMUDevice *hiod, Error **errp)
 {
@@ -669,6 +736,10 @@ static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
         goto done;
     }
 
+    if (!smmuv3_accel_select_cmdqv(s, hiodi, errp)) {
+        return false;
+    }
+
     if (!smmuv3_accel_alloc_viommu(s, hiodi, errp)) {
         error_append_hint(errp, "Unable to alloc vIOMMU: hiodi devid 0x%x: ",
                           hiodi->devid);
@@ -946,8 +1017,17 @@ bool smmuv3_accel_attach_gbpa_hwpt(SMMUv3State *s, Error **errp)
 
 void smmuv3_accel_reset(SMMUv3State *s)
 {
-     /* Attach a HWPT based on GBPA reset value */
-     smmuv3_accel_attach_gbpa_hwpt(s, NULL);
+    SMMUv3AccelState *accel = s->s_accel;
+
+    if (!accel) {
+        return;
+    }
+    /* Attach a HWPT based on GBPA reset value */
+    smmuv3_accel_attach_gbpa_hwpt(s, NULL);
+
+    if (accel->cmdqv_ops && accel->cmdqv_ops->reset) {
+        accel->cmdqv_ops->reset(s);
+    }
 }
 
 static void smmuv3_accel_as_init(SMMUv3State *s)
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index 85be3d7467..34d0f65eaa 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -75,6 +75,8 @@ struct SMMUv3State {
     OnOffAuto ats;
     OasMode oas;
     SsidSizeMode ssidsize;
+    /* SMMU CMDQV extension */
+    OnOffAuto cmdqv;
 
     Notifier machine_done;
 };
-- 
2.43.0




* [PULL 19/61] hw/arm/virt: Use stored SMMUv3 device list for IORT build
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (17 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 18/61] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 20/61] hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support Peter Maydell
                   ` (42 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Introduce a GPtrArray in VirtMachineState to track all SMMUv3 devices
created on the virt machine, and use it when building the IORT table
instead of relying on object_child_foreach_recursive() walks of the
object tree.

This avoids recursive object traversal and provides a foundation for
subsequent patches that need direct access to SMMUv3 instances for
CMDQV-related handling.
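
As a rough illustration of the idea (record each device once when it is
plugged, then iterate a flat list when building the table), here is a
minimal sketch in plain C. PtrList is a hypothetical stand-in for GLib's
GPtrArray, and Dev and total_ids() are invented for the example; only the
(max_bus - min_bus + 1) << 8 arithmetic is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GPtrArray of device pointers. */
typedef struct {
    void **items;
    size_t len, cap;
} PtrList;

/* Append one pointer, growing the backing storage geometrically. */
static void ptr_list_add(PtrList *l, void *item)
{
    if (l->len == l->cap) {
        size_t cap = l->cap ? l->cap * 2 : 4;
        void **items = realloc(l->items, cap * sizeof(*items));
        assert(items);
        l->items = items;
        l->cap = cap;
    }
    l->items[l->len++] = item;
}

/* Hypothetical device: only the PCI bus range matters here. */
typedef struct { int min_bus, max_bus; } Dev;

/* Walk the tracked devices and sum their ID counts, as the IORT build does. */
static int total_ids(const PtrList *l)
{
    int total = 0;
    for (size_t i = 0; i < l->len; i++) {
        const Dev *d = l->items[i];
        total += (d->max_bus - d->min_bus + 1) << 8;
    }
    return total;
}
```

The list does not own the devices, which matches the patch creating the
array with a NULL free function.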

No functional change. No bios-tables qtest failures observed.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-10-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/virt-acpi-build.c | 70 ++++++++++++++++++----------------------
 hw/arm/virt.c            |  3 ++
 include/hw/arm/virt.h    |  1 +
 3 files changed, 35 insertions(+), 39 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 3f285ff6c7..b00f3477ca 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -392,49 +392,41 @@ static int smmuv3_dev_idmap_compare(gconstpointer a, gconstpointer b)
     return map_a->input_base - map_b->input_base;
 }
 
-static int iort_smmuv3_devices(Object *obj, void *opaque)
-{
-    VirtMachineState *vms = VIRT_MACHINE(qdev_get_machine());
-    AcpiIortSMMUv3Dev sdev = {0};
-    GArray *sdev_blob = opaque;
-    AcpiIortIdMapping idmap;
-    PlatformBusDevice *pbus;
-    int min_bus, max_bus;
-    SysBusDevice *sbdev;
-    PCIBus *bus;
-
-    if (!object_dynamic_cast(obj, TYPE_ARM_SMMUV3)) {
-        return 0;
-    }
-
-    bus = PCI_BUS(object_property_get_link(obj, "primary-bus", &error_abort));
-    sdev.accel = object_property_get_bool(obj, "accel", &error_abort);
-    sdev.ats = smmuv3_ats_enabled(ARM_SMMUV3(obj));
-    pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
-    sbdev = SYS_BUS_DEVICE(obj);
-    sdev.base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
-    sdev.base += vms->memmap[VIRT_PLATFORM_BUS].base;
-    sdev.irq = platform_bus_get_irqn(pbus, sbdev, 0);
-    sdev.irq += vms->irqmap[VIRT_PLATFORM_BUS];
-    sdev.irq += ARM_SPI_BASE;
-
-    pci_bus_range(bus, &min_bus, &max_bus);
-    sdev.rc_smmu_idmaps = g_array_new(false, true, sizeof(AcpiIortIdMapping));
-    idmap.input_base = min_bus << 8,
-    idmap.id_count = (max_bus - min_bus + 1) << 8,
-    g_array_append_val(sdev.rc_smmu_idmaps, idmap);
-    g_array_append_val(sdev_blob, sdev);
-    return 0;
-}
-
 /*
  * Populate the struct AcpiIortSMMUv3Dev for all SMMUv3 devices and
  * return the total number of idmaps.
  */
-static int populate_smmuv3_dev(GArray *sdev_blob)
+static int populate_smmuv3_dev(VirtMachineState *vms, GArray *sdev_blob)
 {
-    object_child_foreach_recursive(object_get_root(),
-                                   iort_smmuv3_devices, sdev_blob);
+    for (int i = 0; i < vms->smmuv3_devices->len; i++) {
+        Object *obj = OBJECT(g_ptr_array_index(vms->smmuv3_devices, i));
+        AcpiIortSMMUv3Dev sdev = {0};
+        AcpiIortIdMapping idmap;
+        PlatformBusDevice *pbus;
+        int min_bus, max_bus;
+        SysBusDevice *sbdev;
+        PCIBus *bus;
+
+        bus = PCI_BUS(object_property_get_link(obj, "primary-bus",
+                                               &error_abort));
+        sdev.accel = object_property_get_bool(obj, "accel", &error_abort);
+        sdev.ats = smmuv3_ats_enabled(ARM_SMMUV3(obj));
+        pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
+        sbdev = SYS_BUS_DEVICE(obj);
+        sdev.base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
+        sdev.base += vms->memmap[VIRT_PLATFORM_BUS].base;
+        sdev.irq = platform_bus_get_irqn(pbus, sbdev, 0);
+        sdev.irq += vms->irqmap[VIRT_PLATFORM_BUS];
+        sdev.irq += ARM_SPI_BASE;
+
+        pci_bus_range(bus, &min_bus, &max_bus);
+        sdev.rc_smmu_idmaps = g_array_new(false, true,
+                                          sizeof(AcpiIortIdMapping));
+        idmap.input_base = min_bus << 8;
+        idmap.id_count = (max_bus - min_bus + 1) << 8;
+        g_array_append_val(sdev.rc_smmu_idmaps, idmap);
+        g_array_append_val(sdev_blob, sdev);
+    }
     /* Sort the smmuv3 devices(if any) by smmu idmap input_base */
     g_array_sort(sdev_blob, smmuv3_dev_idmap_compare);
     /*
@@ -568,7 +560,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     if (vms->legacy_smmuv3_present) {
         rc_smmu_idmaps_len = populate_smmuv3_legacy_dev(smmuv3_devs);
     } else {
-        rc_smmu_idmaps_len = populate_smmuv3_dev(smmuv3_devs);
+        rc_smmu_idmaps_len = populate_smmuv3_dev(vms, smmuv3_devs);
     }
 
     num_smmus = smmuv3_devs->len;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b090233893..ac0606fe87 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3865,6 +3865,7 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
             }
 
             create_smmuv3_dev_dtb(vms, dev, bus, errp);
+            g_ptr_array_add(vms->smmuv3_devices, dev);
         }
     }
 
@@ -4319,6 +4320,8 @@ static void virt_instance_init(Object *obj)
     vms->oem_id = g_strndup(ACPI_BUILD_APPNAME6, 6);
     vms->oem_table_id = g_strndup(ACPI_BUILD_APPNAME8, 8);
     cxl_machine_init(obj, &vms->cxl_devices_state);
+
+    vms->smmuv3_devices = g_ptr_array_new_with_free_func(NULL);
 }
 
 static void virt_instance_finalize(Object *obj)
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 3ba33b4bd2..171d44c644 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -207,6 +207,7 @@ struct VirtMachineState {
     MemoryRegion *sysmem;
     MemoryRegion *secure_sysmem;
     bool pci_preserve_config;
+    GPtrArray *smmuv3_devices;
 };
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
-- 
2.43.0




* [PULL 20/61] hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (18 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 19/61] hw/arm/virt: Use stored SMMUv3 device list for IORT build Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 21/61] hw/arm/tegra241-cmdqv: Implement CMDQV init Peter Maydell
                   ` (41 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Use IOMMU_GET_HW_INFO to query host support for Tegra241 CMDQV.

Validate the returned data type, version, and minimum number of vCMDQs and
SIDs per Tegra241 CMDQ Virtual Interface (VI). Fail the probe if the host
does not meet these requirements.

The QEMU model supports one Virtual Interface (VI) per VM with 2 vCMDQs and
16 SIDs per VI, so the probe ensures the host implementation is compatible
with these limits.
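
The limits are expressed as log2 values, so 2 vCMDQs and 16 SIDs become 1
and 4. The following sketch shows the shape of the compatibility check
under that assumption. HostInfo and host_is_compatible() are hypothetical
names for illustration, and the field widths are not meant to match the
kernel UAPI structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_VER          1  /* mirrors CMDQV_VER in the patch */
#define MODEL_VCMDQS_LOG2  1  /* 2 vCMDQs */
#define MODEL_SIDS_LOG2    4  /* 16 SIDs per VI */

/* Hypothetical subset of what the host reports. */
typedef struct {
    uint32_t version;
    uint32_t log2vcmdqs;
    uint32_t log2vsids;
} HostInfo;

/* The version must match exactly; the capacities are minimums. */
static bool host_is_compatible(const HostInfo *h)
{
    return h->version == MODEL_VER &&
           h->log2vcmdqs >= MODEL_VCMDQS_LOG2 &&
           h->log2vsids >= MODEL_SIDS_LOG2;
}
```

A host that offers more vCMDQs or SIDs than the model needs still passes,
while a version mismatch fails regardless of capacity.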

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-11-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 32 ++++++++++++++++++++++++++++++--
 hw/arm/tegra241-cmdqv.h |  4 ++++
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index ad5a0d4611..3a19a1af56 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -38,8 +38,36 @@ static bool tegra241_cmdqv_init(SMMUv3State *s, Error **errp)
 static bool tegra241_cmdqv_probe(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
                                  Error **errp)
 {
-    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
-    return false;
+    uint32_t data_type = IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV;
+    struct iommu_hw_info_tegra241_cmdqv cmdqv_info;
+    uint64_t caps;
+
+    if (!iommufd_backend_get_device_info(idev->iommufd, idev->devid, &data_type,
+                                         &cmdqv_info, sizeof(cmdqv_info), &caps,
+                                         NULL, errp)) {
+        return false;
+    }
+    if (data_type != IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV) {
+        error_setg(errp, "Host CMDQV: unexpected data type %u (expected %u)",
+                   data_type, IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV);
+        return false;
+    }
+    if (cmdqv_info.version != CMDQV_VER) {
+        error_setg(errp, "Host CMDQV: unsupported version %u (expected %u)",
+                   cmdqv_info.version, CMDQV_VER);
+        return false;
+    }
+    if (cmdqv_info.log2vcmdqs < CMDQV_NUM_CMDQ_LOG2) {
+        error_setg(errp, "Host CMDQV: insufficient vCMDQs log2=%u (need >= %u)",
+                   cmdqv_info.log2vcmdqs, CMDQV_NUM_CMDQ_LOG2);
+        return false;
+    }
+    if (cmdqv_info.log2vsids < CMDQV_NUM_SID_PER_VI_LOG2) {
+        error_setg(errp, "Host CMDQV: insufficient SIDs log2=%u (need >= %u)",
+                   cmdqv_info.log2vsids, CMDQV_NUM_SID_PER_VI_LOG2);
+        return false;
+    }
+    return true;
 }
 
 static const SMMUv3AccelCmdqvOps tegra241_cmdqv_ops = {
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 74a6954017..38c8b27b4d 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -10,6 +10,10 @@
 #ifndef HW_ARM_TEGRA241_CMDQV_H
 #define HW_ARM_TEGRA241_CMDQV_H
 
+#define CMDQV_VER                 1
+#define CMDQV_NUM_CMDQ_LOG2       1
+#define CMDQV_NUM_SID_PER_VI_LOG2 4
+
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
 
 #endif /* HW_ARM_TEGRA241_CMDQV_H */
-- 
2.43.0




* [PULL 21/61] hw/arm/tegra241-cmdqv: Implement CMDQV init
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (19 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 20/61] hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:05 ` [PULL 22/61] hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus Peter Maydell
                   ` (40 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

Tegra241 CMDQV extends SMMUv3 with support for virtual command queues
(VCMDQs) exposed via a CMDQV MMIO region. The CMDQV MMIO space is split
into 64KB pages:

0x00000  (CMDQ-V Config page)
0x10000  (CMDQ-V CMDQ Page0)
0x20000  (CMDQ-V CMDQ Page1)
0x30000  (Virtual Interface Page0)
0x40000  (Virtual Interface Page1)
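
With 64KB pages, the page an access falls into is simply offset >> 16. A
minimal sketch of such a decode follows, assuming the five-page layout
listed above. The Page enum, page_of() and offset_in_page() are
hypothetical helpers for illustration and are not part of this patch,
whose MMIO handlers are still stubs.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 16        /* 64KB pages */
#define IO_LEN     0x50000u  /* five pages, as in TEGRA241_CMDQV_IO_LEN */

typedef enum {
    PAGE_CONFIG,   /* 0x00000 */
    PAGE_CMDQ0,    /* 0x10000 */
    PAGE_CMDQ1,    /* 0x20000 */
    PAGE_VINTF0,   /* 0x30000 */
    PAGE_VINTF1,   /* 0x40000 */
    PAGE_INVALID
} Page;

/* Map an MMIO offset to the page that contains it. */
static Page page_of(uint64_t offset)
{
    return offset < IO_LEN ? (Page)(offset >> PAGE_SHIFT) : PAGE_INVALID;
}

/* Offset of the access relative to the start of its page. */
static uint32_t offset_in_page(uint64_t offset)
{
    return (uint32_t)(offset & ((1u << PAGE_SHIFT) - 1));
}
```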

This patch wires up the Tegra241 CMDQV init callback and allocates
vendor-specific CMDQV state. The state pointer is stored in
SMMUv3AccelState for use by subsequent CMDQV operations.

The CMDQV MMIO region and a dedicated IRQ line are registered with the
SMMUv3 device. The MMIO read/write handlers are currently stubs and will
be implemented in later patches.

The CMDQV interrupt is edge-triggered and indicates VCMDQ or VINTF
error conditions. This patch only registers the IRQ line. Interrupt
generation and propagation to the guest will be added in a subsequent
patch.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Message-id: 20260609112552.378999-12-skolothumtho@nvidia.com
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel.h   |  1 +
 hw/arm/tegra241-cmdqv.c | 40 ++++++++++++++++++++++++++++++++++++++--
 hw/arm/tegra241-cmdqv.h | 20 ++++++++++++++++++++
 3 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index b45f25ad03..e0bbec8581 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -61,6 +61,7 @@ typedef struct SMMUv3AccelState {
     bool auto_mode;
     bool auto_finalised;
     const SMMUv3AccelCmdqvOps *cmdqv_ops;
+    void *cmdqv;  /* vendor specific CMDQV state */
 } SMMUv3AccelState;
 
 typedef struct SMMUS1Hwpt {
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 3a19a1af56..2875affc8d 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -13,6 +13,17 @@
 #include "smmuv3-accel.h"
 #include "tegra241-cmdqv.h"
 
+static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
+                                         unsigned size)
+{
+    return 0;
+}
+
+static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
+                                      uint64_t value, unsigned size)
+{
+}
+
 static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
 {
 }
@@ -29,10 +40,35 @@ static void tegra241_cmdqv_reset(SMMUv3State *s)
 {
 }
 
+static const MemoryRegionOps mmio_cmdqv_ops = {
+    .read = tegra241_cmdqv_read_mmio,
+    .write = tegra241_cmdqv_write_mmio,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
 static bool tegra241_cmdqv_init(SMMUv3State *s, Error **errp)
 {
-    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
-    return false;
+    SysBusDevice *sbd = SYS_BUS_DEVICE(OBJECT(s));
+    SMMUv3AccelState *accel = s->s_accel;
+    Tegra241CMDQV *cmdqv;
+
+    cmdqv = g_new0(Tegra241CMDQV, 1);
+    cmdqv->cmdqv_data = g_new0(struct iommu_viommu_tegra241_cmdqv, 1);
+    memory_region_init_io(&cmdqv->mmio_cmdqv, OBJECT(s), &mmio_cmdqv_ops, cmdqv,
+                          "tegra241-cmdqv", TEGRA241_CMDQV_IO_LEN);
+    sysbus_init_mmio(sbd, &cmdqv->mmio_cmdqv);
+    sysbus_init_irq(sbd, &cmdqv->irq);
+    cmdqv->s_accel = accel;
+    accel->cmdqv = cmdqv;
+    return true;
 }
 
 static bool tegra241_cmdqv_probe(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 38c8b27b4d..a5f65a8991 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -14,6 +14,26 @@
 #define CMDQV_NUM_CMDQ_LOG2       1
 #define CMDQV_NUM_SID_PER_VI_LOG2 4
 
+/*
+ * Tegra241 CMDQV MMIO layout (64KB pages)
+ *
+ * 0x00000  (CMDQ-V Config page)
+ * 0x10000  (CMDQ-V CMDQ Page0)
+ * 0x20000  (CMDQ-V CMDQ Page1)
+ * 0x30000  (Virtual Interface Page0)
+ * 0x40000  (Virtual Interface Page1)
+ */
+#define TEGRA241_CMDQV_IO_LEN 0x50000
+
+struct iommu_viommu_tegra241_cmdqv;
+
+typedef struct Tegra241CMDQV {
+    struct iommu_viommu_tegra241_cmdqv *cmdqv_data;
+    SMMUv3AccelState *s_accel;
+    MemoryRegion mmio_cmdqv;
+    qemu_irq irq;
+} Tegra241CMDQV;
+
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
 
 #endif /* HW_ARM_TEGRA241_CMDQV_H */
-- 
2.43.0




* [PULL 22/61] hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (20 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 21/61] hw/arm/tegra241-cmdqv: Implement CMDQV init Peter Maydell
@ 2026-06-16 19:05 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 23/61] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free Peter Maydell
                   ` (39 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:05 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

SMMUv3 devices with acceleration may enable CMDQV extensions
after device realize. In that case, additional MMIO regions and
IRQ lines may be registered but not yet mapped to the platform bus.

Ensure SMMUv3 device resources are linked to the platform bus
during machine_done().

This is safe to do unconditionally since the platform bus helpers
skip resources that are already mapped.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-13-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/virt.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ac0606fe87..2add7401a1 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2361,6 +2361,25 @@ static void virt_build_smbios(VirtMachineState *vms)
     }
 }
 
+/*
+ * SMMUv3 devices with acceleration may enable CMDQV extensions
+ * after device realize. In that case, additional MMIO regions and
+ * IRQ lines may be registered but not yet mapped to the platform bus.
+ *
+ * Ensure all resources are linked to the platform bus before final
+ * machine setup.
+ */
+
+static void virt_smmuv3_dev_link_cmdqv(VirtMachineState *vms)
+{
+    for (int i = 0; i < vms->smmuv3_devices->len; i++) {
+        DeviceState *dev = g_ptr_array_index(vms->smmuv3_devices, i);
+
+        platform_bus_link_device(PLATFORM_BUS_DEVICE(vms->platform_bus_dev),
+                                 SYS_BUS_DEVICE(dev));
+    }
+}
+
 static
 void virt_machine_done(Notifier *notifier, void *data)
 {
@@ -2377,6 +2396,9 @@ void virt_machine_done(Notifier *notifier, void *data)
     if (vms->cxl_devices_state.is_enabled) {
         cxl_fmws_link_targets(&error_fatal);
     }
+
+    virt_smmuv3_dev_link_cmdqv(vms);
+
     /*
      * If the user provided a dtb, we assume the dynamic sysbus nodes
      * already are integrated there. This corresponds to a use case where
-- 
2.43.0




* [PULL 23/61] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (21 preceding siblings ...)
  2026-06-16 19:05 ` [PULL 22/61] hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 24/61] hw/arm/tegra241-cmdqv: mmap host VINTF Page0 for CMDQV Peter Maydell
                   ` (38 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

Replace the stub implementation with real vIOMMU allocation for
Tegra241 CMDQV.

Allocate a matching vEVENTQ together with the vIOMMU, since it is
specific to the Tegra241 CMDQV vIOMMU and used to receive CMDQV
events.

Free both objects on teardown.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Message-id: 20260609112552.378999-14-skolothumtho@nvidia.com
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 48 ++++++++++++++++++++++++++++++++++++++++-
 hw/arm/tegra241-cmdqv.h |  1 +
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 2875affc8d..c1351c8519 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -10,6 +10,7 @@
 #include "qemu/osdep.h"
 
 #include "hw/arm/smmuv3.h"
+#include "hw/arm/smmuv3-common.h"
 #include "smmuv3-accel.h"
 #include "tegra241-cmdqv.h"
 
@@ -26,13 +27,58 @@ static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
 
 static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
 {
+    SMMUv3AccelState *accel = s->s_accel;
+    IOMMUFDViommu *viommu = accel->viommu;
+    Tegra241CMDQV *cmdqv = accel->cmdqv;
+    IOMMUFDVeventq *veventq = cmdqv->veventq;
+
+    if (!viommu) {
+        return;
+    }
+    if (veventq) {
+        close(veventq->veventq_fd);
+        iommufd_backend_free_id(viommu->iommufd, veventq->veventq_id);
+        g_free(veventq);
+        cmdqv->veventq = NULL;
+    }
+    iommufd_backend_free_id(viommu->iommufd, viommu->viommu_id);
 }
 
 static bool
 tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
                             uint32_t *out_viommu_id, Error **errp)
 {
-    error_setg(errp, "NVIDIA Tegra241 CMDQV is unsupported");
+    Tegra241CMDQV *cmdqv = s->s_accel->cmdqv;
+    uint32_t viommu_id, veventq_id, veventq_fd;
+    IOMMUFDVeventq *veventq;
+
+    if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
+                                      IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV,
+                                      idev->hwpt_id, cmdqv->cmdqv_data,
+                                      sizeof(*cmdqv->cmdqv_data), &viommu_id,
+                                      errp)) {
+        return false;
+    }
+
+    if (!iommufd_backend_alloc_veventq(idev->iommufd, viommu_id,
+                                       IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV,
+                                       1 << SMMU_EVENTQS, &veventq_id,
+                                       &veventq_fd,
+                                       errp)) {
+        error_append_hint(errp, "Tegra241 CMDQV: failed to alloc veventq");
+        goto free_viommu;
+    }
+
+    veventq = g_new(IOMMUFDVeventq, 1);
+    veventq->veventq_id = veventq_id;
+    veventq->veventq_fd = veventq_fd;
+    cmdqv->veventq = veventq;
+
+    *out_viommu_id = viommu_id;
+    return true;
+
+free_viommu:
+    iommufd_backend_free_id(idev->iommufd, viommu_id);
     return false;
 }
 
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index a5f65a8991..9fc720e96c 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -32,6 +32,7 @@ typedef struct Tegra241CMDQV {
     SMMUv3AccelState *s_accel;
     MemoryRegion mmio_cmdqv;
     qemu_irq irq;
+    IOMMUFDVeventq *veventq;
 } Tegra241CMDQV;
 
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
-- 
2.43.0




* [PULL 24/61] hw/arm/tegra241-cmdqv: mmap host VINTF Page0 for CMDQV
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (22 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 23/61] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 25/61] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region Peter Maydell
                   ` (37 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

The kernel currently exposes a single VINTF per emulated SMMUv3
instance. IOMMU_VIOMMU_ALLOC returns an mmap offset for the host
VINTF Page0 allocated for this SMMU. However, VCMDQs only become
bound to that VINTF after IOMMU_HW_QUEUE_ALLOC, so until then the
mapped Page0 does not back any real VCMDQ state.

mmap the host VINTF Page0 right after IOMMU_VIOMMU_ALLOC, as the host
VINTF is already enabled at that point, and unmap it when the vIOMMU is
freed. The mapping shares the vIOMMU's lifetime. This prepares the VINTF
mapping in advance of subsequent patches that add VCMDQ allocation.
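
The lifetime rule (map early, unwind on a later failure, unmap on free) can
be sketched with an anonymous mapping standing in for the host page. This
is an illustration only: Ctx, setup() and teardown() are hypothetical, the
queue_ok flag simulates the later allocation step, and a real
implementation maps through the iommufd file descriptor rather than
MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

#define PAGE0_SIZE 0x10000  /* mirrors VINTF_PAGE_SIZE */

typedef struct {
    void *page0;      /* NULL while unmapped */
    bool have_queue;  /* stands in for the later allocation step */
} Ctx;

/* Map first, then run the step that may fail; unwind in reverse order. */
static bool setup(Ctx *c, bool queue_ok)
{
    c->page0 = mmap(NULL, PAGE0_SIZE, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (c->page0 == MAP_FAILED) {
        c->page0 = NULL;
        return false;
    }
    if (!queue_ok) {
        goto unmap_page0;
    }
    c->have_queue = true;
    return true;

unmap_page0:
    munmap(c->page0, PAGE0_SIZE);
    c->page0 = NULL;
    return false;
}

/* Safe to call whether or not setup() succeeded. */
static void teardown(Ctx *c)
{
    c->have_queue = false;
    if (c->page0) {
        munmap(c->page0, PAGE0_SIZE);
        c->page0 = NULL;
    }
}
```

Resetting the pointer to NULL after every munmap() is what lets teardown
skip a mapping that was never created or was already unwound.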

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-15-skolothumtho@nvidia.com
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 16 +++++++++++++++-
 hw/arm/tegra241-cmdqv.h |  3 +++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index c1351c8519..3eec6073a4 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -41,6 +41,10 @@ static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
         g_free(veventq);
         cmdqv->veventq = NULL;
     }
+    if (cmdqv->vintf_page0) {
+        munmap(cmdqv->vintf_page0, VINTF_PAGE_SIZE);
+        cmdqv->vintf_page0 = NULL;
+    }
     iommufd_backend_free_id(viommu->iommufd, viommu->viommu_id);
 }
 
@@ -60,13 +64,20 @@ tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
         return false;
     }
 
+    if (!iommufd_backend_viommu_mmap(idev->iommufd, viommu_id, VINTF_PAGE_SIZE,
+                                     cmdqv->cmdqv_data->out_vintf_mmap_offset,
+                                     &cmdqv->vintf_page0, errp)) {
+        error_append_hint(errp, "Tegra241 CMDQV: failed to mmap VINTF page0");
+        goto free_viommu;
+    }
+
     if (!iommufd_backend_alloc_veventq(idev->iommufd, viommu_id,
                                        IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV,
                                        1 << SMMU_EVENTQS, &veventq_id,
                                        &veventq_fd,
                                        errp)) {
         error_append_hint(errp, "Tegra241 CMDQV: failed to alloc veventq");
-        goto free_viommu;
+        goto munmap_page0;
     }
 
     veventq = g_new(IOMMUFDVeventq, 1);
@@ -77,6 +88,9 @@ tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
     *out_viommu_id = viommu_id;
     return true;
 
+munmap_page0:
+    munmap(cmdqv->vintf_page0, VINTF_PAGE_SIZE);
+    cmdqv->vintf_page0 = NULL;
 free_viommu:
     iommufd_backend_free_id(idev->iommufd, viommu_id);
     return false;
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 9fc720e96c..00c83b6186 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -25,6 +25,8 @@
  */
 #define TEGRA241_CMDQV_IO_LEN 0x50000
 
+#define VINTF_PAGE_SIZE 0x10000
+
 struct iommu_viommu_tegra241_cmdqv;
 
 typedef struct Tegra241CMDQV {
@@ -33,6 +35,7 @@ typedef struct Tegra241CMDQV {
     MemoryRegion mmio_cmdqv;
     qemu_irq irq;
     IOMMUFDVeventq *veventq;
+    void *vintf_page0;
 } Tegra241CMDQV;
 
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
-- 
2.43.0




* [PULL 25/61] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (23 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 24/61] hw/arm/tegra241-cmdqv: mmap host VINTF Page0 for CMDQV Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 26/61] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads Peter Maydell
                   ` (36 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

Tegra241 CMDQV exposes control and status registers in the CMDQ-V
Config page (offset [0x0, 0x10000)) used to configure virtual command
queue allocation and interrupt behavior.

Add read/write emulation for the CMDQ-V Config region
([CMDQV_BASE, CMDQV_CMDQ_BASE)), backed by a simple register cache.
This includes CONFIG, PARAM, STATUS, VI error and interrupt maps, CMDQ
allocation map and the VINTF0 related registers defined in the CMDQ-V
Config space. Only VINTF0 is supported; VINTF1-63 are not.

Dispatch writes on access size: introduce writel_mmio for 4-byte accesses
and writell_mmio for 8-byte accesses. Reads need no split as the MMIO
framework masks the returned value to the access size.
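
A minimal sketch of the size-based write dispatch described above,
assuming a two-register cache. DemoRegs, demo_writel() and demo_write()
are hypothetical names, and the ignored_writes counter stands in for the
qemu_log_mask() calls in the real handlers.

```c
#include <stdint.h>

#define DEMO_A_CONFIG 0x0
#define DEMO_EN_MASK  0x1u

/* Hypothetical register cache: CONFIG plus the STATUS bit it drives. */
typedef struct DemoRegs {
    uint32_t config;
    uint32_t status;
    unsigned ignored_writes; /* stand-in for logging an ignored access */
} DemoRegs;

/* 4-byte write handler: update the cache and mirror EN into STATUS. */
static void demo_writel(DemoRegs *r, uint64_t offset, uint32_t value)
{
    switch (offset) {
    case DEMO_A_CONFIG:
        r->config = value;
        if (value & DEMO_EN_MASK) {
            r->status |= DEMO_EN_MASK;
        } else {
            r->status &= ~DEMO_EN_MASK;
        }
        break;
    default:
        r->ignored_writes++;
    }
}

/* Top-level write handler: dispatch on access size. */
static void demo_write(DemoRegs *r, uint64_t offset, uint64_t value,
                       unsigned size)
{
    switch (size) {
    case 4:
        demo_writel(r, offset, (uint32_t)value);
        break;
    case 8:
        /* No 64-bit registers in this region: write-ignored. */
        r->ignored_writes++;
        break;
    default:
        /* Any other size is treated as a guest error and dropped. */
        r->ignored_writes++;
    }
}
```

Keeping the 8-byte path as a separate function leaves a slot for 64-bit
registers to be handled later without touching the 4-byte cases.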

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Message-id: 20260609112552.378999-16-skolothumtho@nvidia.com
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 181 +++++++++++++++++++++++++++++++++++++++-
 hw/arm/tegra241-cmdqv.h | 110 ++++++++++++++++++++++++
 hw/arm/trace-events     |   4 +
 3 files changed, 294 insertions(+), 1 deletion(-)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 3eec6073a4..8950d5153b 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -8,21 +8,200 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/log.h"
 
 #include "hw/arm/smmuv3.h"
 #include "hw/arm/smmuv3-common.h"
 #include "smmuv3-accel.h"
 #include "tegra241-cmdqv.h"
+#include "trace.h"
+
+static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
+                                                 hwaddr offset)
+{
+    int i;
+
+    switch (offset) {
+    case A_VINTF0_CONFIG:
+        return cmdqv->vintf_config;
+    case A_VINTF0_STATUS:
+        return cmdqv->vintf_status;
+    case A_VINTF0_SID_MATCH_0 ... A_VINTF0_SID_MATCH_15:
+        i = (offset - A_VINTF0_SID_MATCH_0) / 4;
+        return cmdqv->vintf_sid_match[i];
+    case A_VINTF0_SID_REPLACE_0 ... A_VINTF0_SID_REPLACE_15:
+        i = (offset - A_VINTF0_SID_REPLACE_0) / 4;
+        return cmdqv->vintf_sid_replace[i];
+    case A_VINTF0_LVCMDQ_ERR_MAP_0 ... A_VINTF0_LVCMDQ_ERR_MAP_3:
+        i = (offset - A_VINTF0_LVCMDQ_ERR_MAP_0) / 4;
+        return cmdqv->vintf_cmdq_err_map[i];
+    default:
+        /*
+         * GLB_FILT_CFG_0 (offset 0xC) and GLB_FILT_DATA_0 (offset 0x10) are
+         * filter config and filter data registers. They are not required for
+         * normal VINTF operation and are not emulated.
+         */
+        qemu_log_mask(LOG_UNIMP, "%s unhandled read access at 0x%" PRIx64 "\n",
+                      __func__, offset);
+        return 0;
+    }
+}
+
+static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
+                                              hwaddr offset, uint64_t value)
+{
+    int i;
+
+    switch (offset) {
+    case A_VINTF0_CONFIG:
+        /*
+         * Mask out HYP_OWN on guest writes. This bit selects Hypervisor (1) vs
+         * Guest (0) ownership of the CMDQ. Force it to 0 so the VINTF always
+         * remains guest-owned.
+         */
+        value &= ~R_VINTF0_CONFIG_HYP_OWN_MASK;
+
+        cmdqv->vintf_config = value;
+        if (value & R_VINTF0_CONFIG_ENABLE_MASK) {
+            cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
+        } else {
+            cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
+        }
+        break;
+    case A_VINTF0_SID_MATCH_0 ... A_VINTF0_SID_MATCH_15:
+        i = (offset - A_VINTF0_SID_MATCH_0) / 4;
+        cmdqv->vintf_sid_match[i] = value;
+        break;
+    case A_VINTF0_SID_REPLACE_0 ... A_VINTF0_SID_REPLACE_15:
+        i = (offset - A_VINTF0_SID_REPLACE_0) / 4;
+        cmdqv->vintf_sid_replace[i] = value;
+        break;
+    default:
+        /*
+         * GLB_FILT_CFG_0 (offset 0xC) and GLB_FILT_DATA_0 (offset 0x10) are
+         * filter config and filter data registers. They are not required for
+         * normal VINTF operation and are not emulated.
+         */
+        qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
+                      __func__, offset);
+        return;
+    }
+}
 
 static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
                                          unsigned size)
 {
-    return 0;
+    Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
+    uint64_t val = 0;
+
+    if (offset >= TEGRA241_CMDQV_IO_LEN) {
+        qemu_log_mask(LOG_UNIMP,
+                      "%s offset 0x%" PRIx64 " off limit (0x%x)\n", __func__,
+                      offset, TEGRA241_CMDQV_IO_LEN);
+        goto out;
+    }
+
+    switch (offset) {
+    case A_CONFIG:
+        val = cmdqv->config;
+        break;
+    case A_PARAM:
+        val = cmdqv->param;
+        break;
+    case A_STATUS:
+        val = cmdqv->status;
+        break;
+    case A_VI_ERR_MAP_0 ... A_VI_ERR_MAP_1:
+        val = cmdqv->vi_err_map[(offset - A_VI_ERR_MAP_0) / 4];
+        break;
+    case A_VI_INT_MASK_0 ... A_VI_INT_MASK_1:
+        val = cmdqv->vi_int_mask[(offset - A_VI_INT_MASK_0) / 4];
+        break;
+    case A_CMDQ_ERR_MAP_0 ... A_CMDQ_ERR_MAP_3:
+        val = cmdqv->cmdq_err_map[(offset - A_CMDQ_ERR_MAP_0) / 4];
+        break;
+    case A_CMDQ_ALLOC_MAP_0 ... A_CMDQ_ALLOC_MAP_1:
+        val = cmdqv->cmdq_alloc_map[(offset - A_CMDQ_ALLOC_MAP_0) / 4];
+        break;
+    case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
+        val = tegra241_cmdqv_config_vintf_read(cmdqv, offset);
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP, "%s unhandled read access at 0x%" PRIx64 "\n",
+                      __func__, offset);
+    }
+
+out:
+    trace_tegra241_cmdqv_read_mmio(offset, val, size);
+    return val;
+}
+
+/* 4-byte MMIO write handler. */
+static void tegra241_cmdqv_writel_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
+                                       uint32_t value)
+{
+    switch (offset) {
+    case A_CONFIG:
+        cmdqv->config = value;
+        if (value & R_CONFIG_CMDQV_EN_MASK) {
+            cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
+        } else {
+            cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
+        }
+        break;
+    case A_VI_INT_MASK_0 ... A_VI_INT_MASK_1:
+        cmdqv->vi_int_mask[(offset - A_VI_INT_MASK_0) / 4] = value;
+        break;
+    case A_CMDQ_ALLOC_MAP_0 ... A_CMDQ_ALLOC_MAP_1:
+        cmdqv->cmdq_alloc_map[(offset - A_CMDQ_ALLOC_MAP_0) / 4] = value;
+        break;
+    case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
+        tegra241_cmdqv_config_vintf_write(cmdqv, offset, value);
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
+                      __func__, offset);
+    }
+}
+
+/*
+ * 8-byte MMIO write handler.
+ */
+static void tegra241_cmdqv_writell_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
+                                        uint64_t value)
+{
+    qemu_log_mask(LOG_UNIMP,
+                  "%s unhandled 64-bit write at 0x%" PRIx64 " (WI)\n",
+                  __func__, offset);
 }
 
 static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
                                       uint64_t value, unsigned size)
 {
+    Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
+
+    if (offset >= TEGRA241_CMDQV_IO_LEN) {
+        qemu_log_mask(LOG_UNIMP,
+                      "%s offset 0x%" PRIx64 " off limit (0x%x)\n", __func__,
+                      offset, TEGRA241_CMDQV_IO_LEN);
+        goto out;
+    }
+
+    switch (size) {
+    case 4:
+        tegra241_cmdqv_writel_mmio(cmdqv, offset, value);
+        break;
+    case 8:
+        tegra241_cmdqv_writell_mmio(cmdqv, offset, value);
+        break;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s bad write size %u at 0x%" PRIx64 "\n",
+                      __func__, size, offset);
+    }
+
+out:
+    trace_tegra241_cmdqv_write_mmio(offset, value, size);
 }
 
 static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 00c83b6186..d49fa42cd5 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -10,10 +10,15 @@
 #ifndef HW_ARM_TEGRA241_CMDQV_H
 #define HW_ARM_TEGRA241_CMDQV_H
 
+#include "hw/core/registerfields.h"
+
 #define CMDQV_VER                 1
 #define CMDQV_NUM_CMDQ_LOG2       1
 #define CMDQV_NUM_SID_PER_VI_LOG2 4
 
+#define TEGRA241_CMDQV_MAX_CMDQ      (1U << CMDQV_NUM_CMDQ_LOG2)
+#define TEGRA241_CMDQV_MAX_NUM_SID   (1U << CMDQV_NUM_SID_PER_VI_LOG2)
+
 /*
  * Tegra241 CMDQV MMIO layout (64KB pages)
  *
@@ -36,8 +41,113 @@ typedef struct Tegra241CMDQV {
     qemu_irq irq;
     IOMMUFDVeventq *veventq;
     void *vintf_page0;
+
+    /* CMDQ-V Config page register cache */
+    uint32_t config;
+    uint32_t param;
+    uint32_t status;
+    uint32_t vi_err_map[2];
+    uint32_t vi_int_mask[2];
+    uint32_t cmdq_err_map[4];
+    uint32_t cmdq_alloc_map[TEGRA241_CMDQV_MAX_CMDQ];
+
+    /* VINTF0 register cache (within CMDQ-V Config page) */
+    uint32_t vintf_config;
+    uint32_t vintf_status;
+    uint32_t vintf_sid_match[TEGRA241_CMDQV_MAX_NUM_SID];
+    uint32_t vintf_sid_replace[TEGRA241_CMDQV_MAX_NUM_SID];
+    uint32_t vintf_cmdq_err_map[4];
 } Tegra241CMDQV;
 
+/* CMDQ-V Config page registers (offset 0x00000) */
+REG32(CONFIG, 0x0)
+FIELD(CONFIG, CMDQV_EN, 0, 1)
+FIELD(CONFIG, CMDQV_PER_CMD_OFFSET, 1, 3)
+FIELD(CONFIG, CMDQ_MAX_CLK_BATCH, 4, 8)
+FIELD(CONFIG, CMDQ_MAX_CMD_BATCH, 12, 8)
+FIELD(CONFIG, CONS_DRAM_EN, 20, 1)
+
+REG32(PARAM, 0x4)
+FIELD(PARAM, CMDQV_VER, 0, 4)
+FIELD(PARAM, CMDQV_NUM_CMDQ_LOG2, 4, 4)
+FIELD(PARAM, CMDQV_NUM_VI_LOG2, 8, 4)
+FIELD(PARAM, CMDQV_NUM_SID_PER_VI_LOG2, 12, 4)
+
+REG32(STATUS, 0x8)
+FIELD(STATUS, CMDQV_ENABLED, 0, 1)
+
+/* SMMU_CMDQV_VI_ERR_MAP_0/1 definitions */
+#define A_VI_ERR_MAP_0 0x14
+#define A_VI_ERR_MAP_1 0x18
+#define V_VI_ERR_MAP_NO_ERROR (0)
+#define V_VI_ERR_MAP_ERROR (1)
+
+/* SMMU_CMDQV_VI_INT_MASK_0/1 definitions */
+#define A_VI_INT_MASK_0 0x1c
+#define A_VI_INT_MASK_1 0x20
+#define V_VI_INT_MASK_NOT_MASKED (0)
+#define V_VI_INT_MASK_MASKED (1)
+
+/* SMMU_CMDQV_CMDQ_ERR_MAP_0-3 definitions */
+#define A_CMDQ_ERR_MAP_0 0x24
+#define A_CMDQ_ERR_MAP_1 0x28
+#define A_CMDQ_ERR_MAP_2 0x2c
+#define A_CMDQ_ERR_MAP_3 0x30
+
+/*
+ * CMDQ_ALLOC_MAP: one entry per physical VCMDQ. Hardware supports up to 128
+ * entries (CMDQV_NUM_CMDQ_LOG2=7), but QEMU only exposes
+ * TEGRA241_CMDQV_MAX_CMDQ (=2) VCMDQs per VM so only entries 0 and 1 are
+ * defined here.
+ */
+/* 2 identical register entries */
+#define SMMU_CMDQV_CMDQ_ALLOC_MAP_(i)                       \
+    REG32(CMDQ_ALLOC_MAP_##i, 0x200 + i * 4)                \
+    FIELD(CMDQ_ALLOC_MAP_##i, ALLOC, 0, 1)                  \
+    FIELD(CMDQ_ALLOC_MAP_##i, LVCMDQ, 1, 7)                 \
+    FIELD(CMDQ_ALLOC_MAP_##i, VIRT_INTF_INDX, 15, 6)
+
+SMMU_CMDQV_CMDQ_ALLOC_MAP_(0)
+SMMU_CMDQV_CMDQ_ALLOC_MAP_(1)
+
+/* SMMU_CMDQV_VINTF0 registers (only VINTF0 is exposed to the guest) */
+REG32(VINTF0_CONFIG, 0x1000)
+FIELD(VINTF0_CONFIG, ENABLE, 0, 1)
+FIELD(VINTF0_CONFIG, VMID, 1, 16)
+FIELD(VINTF0_CONFIG, HYP_OWN, 17, 1)
+
+REG32(VINTF0_STATUS, 0x1004)
+FIELD(VINTF0_STATUS, ENABLE_OK, 0, 1)
+FIELD(VINTF0_STATUS, STATUS, 1, 3)
+FIELD(VINTF0_STATUS, VI_NUM_LVCMDQ, 16, 8)
+
+#define V_VINTF_STATUS_NO_ERROR    (0 << 1)
+#define V_VINTF_STATUS_VCMDQ_ERROR (1 << 1)
+
+/*
+ * SMMU_CMDQV_VINTF0_SID_MATCH/_REPLACE: 16 entries per VINTF
+ * (CMDQV_NUM_SID_PER_VI_LOG2=4). Only _0 and _15 are defined,
+ * used as switch case range bounds.
+ */
+REG32(VINTF0_SID_MATCH_0, 0x1040)
+FIELD(VINTF0_SID_MATCH_0, ENABLE, 0, 1)
+FIELD(VINTF0_SID_MATCH_0, VIRT_SID, 1, 20)
+#define A_VINTF0_SID_MATCH_15  (A_VINTF0_SID_MATCH_0 + 15 * 4)
+
+REG32(VINTF0_SID_REPLACE_0, 0x1080)
+FIELD(VINTF0_SID_REPLACE_0, PHYS_SID, 0, 20)
+#define A_VINTF0_SID_REPLACE_15 (A_VINTF0_SID_REPLACE_0 + 15 * 4)
+
+/*
+ * SMMU_CMDQV_VINTF0_LVCMDQ_ERR_MAP: 4 registers per VINTF covering 32 logical
+ * VCMDQs each. With TEGRA241_CMDQV_MAX_CMDQ=2, only MAP_0 bits [1:0] carry
+ * error state. MAP_1..MAP_3 always read as 0. Only _0 and _3 are defined,
+ * used as switch case range bounds.
+ */
+REG32(VINTF0_LVCMDQ_ERR_MAP_0, 0x10c0)
+FIELD(VINTF0_LVCMDQ_ERR_MAP_0, LVCMDQ_ERR_MAP, 0, 32)
+#define A_VINTF0_LVCMDQ_ERR_MAP_3 (A_VINTF0_LVCMDQ_ERR_MAP_0 + 3 * 4)
+
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
 
 #endif /* HW_ARM_TEGRA241_CMDQV_H */
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 3457536fb0..8c61d66a26 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -72,6 +72,10 @@ smmuv3_accel_unset_iommu_device(int devfn, uint32_t devid) "devfn=0x%x (idev dev
 smmuv3_accel_translate_ste(uint32_t vsid, uint32_t hwpt_id, uint64_t ste_1, uint64_t ste_0) "vSID=0x%x hwpt_id=0x%x ste=%"PRIx64":%"PRIx64
 smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vSID=0x%x ste type=%s hwpt_id=0x%x"
 
+# tegra241-cmdqv
+tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
+tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
+
 # strongarm.c
 strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
 strongarm_ssp_read_underrun(void) "SSP rx underrun"
-- 
2.43.0




* [PULL 26/61] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (24 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 25/61] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 27/61] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes Peter Maydell
                   ` (35 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

Tegra241 CMDQV exposes per-VCMDQ register windows through two MMIO
apertures:

  Direct VCMDQ aperture (0x10000/0x20000): VCMDQ Page0/Page1
  VINTF logical aperture (0x30000/0x40000): VINTF0 LVCMDQ Page0/Page1

Both apertures are hardware aliases of the same underlying registers:

  Page 0 (control/status): CONS_INDX, PROD_INDX, CONFIG, STATUS,
                           GERROR, GERRORN
  Page 1 (base/DRAM):      BASE_L/H, CONS_INDX_BASE_DRAM_L/H

The direct aperture Page 0 is programmable at any time so long as
CMDQV_EN is enabled. The VINTF (logical) aperture Page 0 is
programmable only once SW has mapped a VCMDQ to a VINTF; the
"logical" view is local to that VINTF.

Add read emulation for both apertures, backed by a single per-VCMDQ
register cache. VINTF aperture reads are translated to their
equivalent direct-aperture offset and served from the same cached
state.
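
The offset translation can be sketched as follows, using the Page 0
aperture bases and the 0x80 stride from this patch. DemoDecoded and
demo_decode_page0() are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VCMDQ_PAGE0_BASE 0x10000 /* direct aperture, Page 0 */
#define DEMO_VINTF_PAGE0_BASE 0x30000 /* VINTF aperture, Page 0 */
#define DEMO_VCMDQ_STRIDE     0x80    /* bytes per VCMDQ window */

typedef struct DemoDecoded {
    int index;        /* which VCMDQ the access targets */
    uint64_t offset0; /* equivalent VCMDQ0_* offset in the direct aperture */
} DemoDecoded;

/*
 * Normalize a Page 0 access: fold a VINTF-aperture offset onto the
 * direct aperture, then split it into a VCMDQ index and a VCMDQ0_*
 * register offset.
 */
static DemoDecoded demo_decode_page0(uint64_t offset, bool vintf)
{
    DemoDecoded d;

    if (vintf) {
        offset -= DEMO_VINTF_PAGE0_BASE - DEMO_VCMDQ_PAGE0_BASE;
    }
    d.index = (int)((offset - DEMO_VCMDQ_PAGE0_BASE) / DEMO_VCMDQ_STRIDE);
    d.offset0 = offset - (uint64_t)d.index * DEMO_VCMDQ_STRIDE;
    return d;
}
```

For example, a VINTF-aperture access at 0x30084 decodes to index 1 and
offset0 0x10004, which is the VCMDQ0_PROD_INDX offset.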

Per the CMDQV architecture, a VCMDQ must be allocated to a Virtual
Interface before it is used to send commands to the SMMU. Until that
allocation happens, reads return cached register state with no HW
interaction. Subsequent patches wire up IOMMU_HW_QUEUE_ALLOC, mmap
the host VINTF Page 0, and install it into guest MMIO; after that,
Page 0 reads from either aperture are served from the hardware-backed
mmap'd page instead of the cache. Page 1 is also a hardware alias,
but the kernel only exposes mmap for Page 0, so Page 1 reads always
trap to QEMU and are served from cache.
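
As a rough illustration of serving Page 1 from the cache: the base
registers are 64-bit values exposed as 32-bit _L/_H halves. The helpers
below are hypothetical and only show the split; the patch itself returns
the cached value and relies on the MMIO framework to mask it to the
access size.

```c
#include <stdint.h>

/* Low half of a cached 64-bit base value, as a BASE_L-style read. */
static uint32_t demo_base_l(uint64_t base)
{
    return (uint32_t)base;
}

/* High half of a cached 64-bit base value, as a BASE_H-style read. */
static uint32_t demo_base_h(uint64_t base)
{
    return (uint32_t)(base >> 32);
}
```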

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Message-id: 20260609112552.378999-17-skolothumtho@nvidia.com
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 103 +++++++++++++++++++
 hw/arm/tegra241-cmdqv.h | 216 ++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events     |   2 +
 3 files changed, 321 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 8950d5153b..2dfd377ee9 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -16,6 +16,79 @@
 #include "tegra241-cmdqv.h"
 #include "trace.h"
 
+/*
+ * Read a VCMDQ Page 0 register (control/status) using VCMDQ0_* offsets.
+ *
+ * The caller normalizes the MMIO offset such that @offset0 always refers
+ * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
+ */
+static uint64_t tegra241_cmdqv_read_vcmdq_page0(Tegra241CMDQV *cmdqv,
+                                                hwaddr offset0, int index,
+                                                bool direct)
+{
+    uint64_t val = 0;
+
+    switch (offset0) {
+    case A_VCMDQ0_CONS_INDX:
+        val = cmdqv->vcmdq_cons_indx[index];
+        break;
+    case A_VCMDQ0_PROD_INDX:
+        val = cmdqv->vcmdq_prod_indx[index];
+        break;
+    case A_VCMDQ0_CONFIG:
+        val = cmdqv->vcmdq_config[index];
+        break;
+    case A_VCMDQ0_STATUS:
+        val = cmdqv->vcmdq_status[index];
+        break;
+    case A_VCMDQ0_GERROR:
+        val = cmdqv->vcmdq_gerror[index];
+        break;
+    case A_VCMDQ0_GERRORN:
+        val = cmdqv->vcmdq_gerrorn[index];
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s unhandled read access at 0x%" PRIx64 "\n",
+                      __func__, offset0);
+    }
+    trace_tegra241_cmdqv_read_vcmdq_page0(index, direct ? "direct" : "vi",
+                                          offset0, val);
+    return val;
+}
+
+/*
+ * Read a VCMDQ Page 1 register (base / DRAM address) using VCMDQ0_* offsets.
+ */
+static uint64_t tegra241_cmdqv_read_vcmdq_page1(Tegra241CMDQV *cmdqv,
+                                                hwaddr offset0, int index,
+                                                bool direct)
+{
+    uint64_t val = 0;
+
+    switch (offset0) {
+    case A_VCMDQ0_BASE_L:
+        val = cmdqv->vcmdq_base[index];
+        break;
+    case A_VCMDQ0_BASE_H:
+        val = cmdqv->vcmdq_base[index] >> 32;
+        break;
+    case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
+        val = cmdqv->vcmdq_cons_indx_base[index];
+        break;
+    case A_VCMDQ0_CONS_INDX_BASE_DRAM_H:
+        val = cmdqv->vcmdq_cons_indx_base[index] >> 32;
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s unhandled read access at 0x%" PRIx64 "\n",
+                      __func__, offset0);
+    }
+    trace_tegra241_cmdqv_read_vcmdq_page1(index, direct ? "direct" : "vi",
+                                          offset0, val);
+    return val;
+}
+
 static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
                                                  hwaddr offset)
 {
@@ -93,6 +166,7 @@ static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
 {
     Tegra241CMDQV *cmdqv = (Tegra241CMDQV *)opaque;
     uint64_t val = 0;
+    int index;
 
     if (offset >= TEGRA241_CMDQV_IO_LEN) {
         qemu_log_mask(LOG_UNIMP,
@@ -126,6 +200,35 @@ static uint64_t tegra241_cmdqv_read_mmio(void *opaque, hwaddr offset,
     case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
         val = tegra241_cmdqv_config_vintf_read(cmdqv, offset);
         break;
+    case A_VI_VCMDQ0_CONS_INDX ... A_VI_VCMDQ1_GERRORN:
+        /*
+         * VINTF Page0 registers are hardware aliases of VCMDQ Page0 registers.
+         * Translate the VINTF aperture offset to its VCMDQ Page0 equivalent
+         * before dispatching to the Page 0 helper.
+         */
+        offset -= CMDQV_VINTF_PAGE0_BASE - CMDQV_VCMDQ_PAGE0_BASE;
+        index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
+        return tegra241_cmdqv_read_vcmdq_page0(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index, false);
+    case A_VCMDQ0_CONS_INDX ... A_VCMDQ1_GERRORN:
+        /*
+         * Decode a per-VCMDQ Page 0 access. Each VCMDQ occupies a
+         * CMDQV_VCMDQ_STRIDE-byte window; extract the index and normalize
+         * to the VCMDQ0_* offset before calling the Page 0 helper.
+         */
+        index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
+        return tegra241_cmdqv_read_vcmdq_page0(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index, true);
+    case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
+        /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above. */
+        offset -= CMDQV_VINTF_PAGE1_BASE - CMDQV_VCMDQ_PAGE1_BASE;
+        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
+        return tegra241_cmdqv_read_vcmdq_page1(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index, false);
+    case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
+        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
+        return tegra241_cmdqv_read_vcmdq_page1(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index, true);
     default:
         qemu_log_mask(LOG_UNIMP, "%s unhandled read access at 0x%" PRIx64 "\n",
                       __func__, offset);
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index d49fa42cd5..c4d327a9a5 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -30,6 +30,13 @@
  */
 #define TEGRA241_CMDQV_IO_LEN 0x50000
 
+/* CMDQV MMIO aperture bases and VCMDQ stride */
+#define CMDQV_VCMDQ_PAGE0_BASE  0x10000  /* CMDQV_CMDQ_BASE */
+#define CMDQV_VCMDQ_PAGE1_BASE  0x20000
+#define CMDQV_VINTF_PAGE0_BASE  0x30000  /* CMDQV_VI_CMDQ_BASE */
+#define CMDQV_VINTF_PAGE1_BASE  0x40000
+#define CMDQV_VCMDQ_STRIDE      0x80
+
 #define VINTF_PAGE_SIZE 0x10000
 
 struct iommu_viommu_tegra241_cmdqv;
@@ -57,6 +64,19 @@ typedef struct Tegra241CMDQV {
     uint32_t vintf_sid_match[TEGRA241_CMDQV_MAX_NUM_SID];
     uint32_t vintf_sid_replace[TEGRA241_CMDQV_MAX_NUM_SID];
     uint32_t vintf_cmdq_err_map[4];
+    /*
+     * VCMDQ register cache. The direct (VCMDQ aperture) and logical
+     * (VINTF aperture) views are hardware aliases; both are served from
+     * this single cached copy.
+     */
+    uint32_t vcmdq_cons_indx[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vcmdq_prod_indx[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vcmdq_config[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vcmdq_status[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vcmdq_gerror[TEGRA241_CMDQV_MAX_CMDQ];
+    uint32_t vcmdq_gerrorn[TEGRA241_CMDQV_MAX_CMDQ];
+    uint64_t vcmdq_base[TEGRA241_CMDQV_MAX_CMDQ];
+    uint64_t vcmdq_cons_indx_base[TEGRA241_CMDQV_MAX_CMDQ];
 } Tegra241CMDQV;
 
 /* CMDQ-V Config page registers (offset 0x00000) */
@@ -148,6 +168,202 @@ REG32(VINTF0_LVCMDQ_ERR_MAP_0, 0x10c0)
 FIELD(VINTF0_LVCMDQ_ERR_MAP_0, LVCMDQ_ERR_MAP, 0, 32)
 #define A_VINTF0_LVCMDQ_ERR_MAP_3 (A_VINTF0_LVCMDQ_ERR_MAP_0 + 3 * 4)
 
+/*
+ * Direct VCMDQ aperture register windows.
+ *
+ * Page 0 @ CMDQV_VCMDQ_PAGE0_BASE: VCMDQ control and status registers.
+ * Page 1 @ CMDQV_VCMDQ_PAGE1_BASE: VCMDQ base and DRAM address registers.
+ *
+ * Each VCMDQ occupies a CMDQV_VCMDQ_STRIDE-byte slot within its page.
+ */
+
+/* --- Page 0 register macros --- */
+#define SMMU_CMDQV_VCMDQi_CONS_INDX_(i)                     \
+    REG32(VCMDQ##i##_CONS_INDX,                             \
+          CMDQV_VCMDQ_PAGE0_BASE + i * CMDQV_VCMDQ_STRIDE)  \
+    FIELD(VCMDQ##i##_CONS_INDX, RD, 0, 20)                  \
+    FIELD(VCMDQ##i##_CONS_INDX, ERR, 24, 7)
+
+#define V_VCMDQ_CONS_INDX_ERR_CERROR_NONE         0
+#define V_VCMDQ_CONS_INDX_ERR_CERROR_ILL_OPCODE   1
+#define V_VCMDQ_CONS_INDX_ERR_CERROR_ABT          2
+#define V_VCMDQ_CONS_INDX_ERR_CERROR_ATC_INV_SYNC 3
+#define V_VCMDQ_CONS_INDX_ERR_CERROR_ILL_ACCESS   4
+
+#define SMMU_CMDQV_VCMDQi_PROD_INDX_(i)                          \
+    REG32(VCMDQ##i##_PROD_INDX,                                  \
+          CMDQV_VCMDQ_PAGE0_BASE + 0x4 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VCMDQ##i##_PROD_INDX, WR, 0, 20)
+
+#define SMMU_CMDQV_VCMDQi_CONFIG_(i)                             \
+    REG32(VCMDQ##i##_CONFIG,                                     \
+          CMDQV_VCMDQ_PAGE0_BASE + 0x8 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VCMDQ##i##_CONFIG, CMDQ_EN, 0, 1)
+
+#define SMMU_CMDQV_VCMDQi_STATUS_(i)                             \
+    REG32(VCMDQ##i##_STATUS,                                     \
+          CMDQV_VCMDQ_PAGE0_BASE + 0xc + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VCMDQ##i##_STATUS, CMDQ_EN_OK, 0, 1)
+
+#define SMMU_CMDQV_VCMDQi_GERROR_(i)                              \
+    REG32(VCMDQ##i##_GERROR,                                      \
+          CMDQV_VCMDQ_PAGE0_BASE + 0x10 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VCMDQ##i##_GERROR, CMDQ_ERR, 0, 1)                      \
+    FIELD(VCMDQ##i##_GERROR, CONS_DRAM_WR_ABT_ERR, 1, 1)          \
+    FIELD(VCMDQ##i##_GERROR, CMDQ_INIT_ERR, 2, 1)
+
+#define SMMU_CMDQV_VCMDQi_GERRORN_(i)                             \
+    REG32(VCMDQ##i##_GERRORN,                                     \
+          CMDQV_VCMDQ_PAGE0_BASE + 0x14 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VCMDQ##i##_GERRORN, CMDQ_ERR, 0, 1)                     \
+    FIELD(VCMDQ##i##_GERRORN, CONS_DRAM_WR_ABT_ERR, 1, 1)         \
+    FIELD(VCMDQ##i##_GERRORN, CMDQ_INIT_ERR, 2, 1)
+
+/* Page 0 layout: VCMDQ0 */
+SMMU_CMDQV_VCMDQi_CONS_INDX_(0)
+SMMU_CMDQV_VCMDQi_PROD_INDX_(0)
+SMMU_CMDQV_VCMDQi_CONFIG_(0)
+SMMU_CMDQV_VCMDQi_STATUS_(0)
+SMMU_CMDQV_VCMDQi_GERROR_(0)
+SMMU_CMDQV_VCMDQi_GERRORN_(0)
+
+/* Page 0 layout: VCMDQ1 */
+SMMU_CMDQV_VCMDQi_CONS_INDX_(1)
+SMMU_CMDQV_VCMDQi_PROD_INDX_(1)
+SMMU_CMDQV_VCMDQi_CONFIG_(1)
+SMMU_CMDQV_VCMDQi_STATUS_(1)
+SMMU_CMDQV_VCMDQi_GERROR_(1)
+SMMU_CMDQV_VCMDQi_GERRORN_(1)
+
+/* --- Page 1 register macros --- */
+#define SMMU_CMDQV_VCMDQi_BASE_L_(i)                                          \
+    REG32(VCMDQ##i##_BASE_L, CMDQV_VCMDQ_PAGE1_BASE + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VCMDQ##i##_BASE_L, LOG2SIZE, 0, 5)                                  \
+    FIELD(VCMDQ##i##_BASE_L, ADDR, 5, 27)
+
+#define SMMU_CMDQV_VCMDQi_BASE_H_(i)                             \
+    REG32(VCMDQ##i##_BASE_H,                                     \
+          CMDQV_VCMDQ_PAGE1_BASE + 0x4 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VCMDQ##i##_BASE_H, ADDR, 0, 16)
+
+#define SMMU_CMDQV_VCMDQi_CONS_INDX_BASE_DRAM_L_(i)              \
+    REG32(VCMDQ##i##_CONS_INDX_BASE_DRAM_L,                      \
+          CMDQV_VCMDQ_PAGE1_BASE + 0x8 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VCMDQ##i##_CONS_INDX_BASE_DRAM_L, ADDR, 0, 32)
+
+#define SMMU_CMDQV_VCMDQi_CONS_INDX_BASE_DRAM_H_(i)              \
+    REG32(VCMDQ##i##_CONS_INDX_BASE_DRAM_H,                      \
+          CMDQV_VCMDQ_PAGE1_BASE + 0xc + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VCMDQ##i##_CONS_INDX_BASE_DRAM_H, ADDR, 0, 16)
+
+/* Page 1 layout: VCMDQ0 */
+SMMU_CMDQV_VCMDQi_BASE_L_(0)
+SMMU_CMDQV_VCMDQi_BASE_H_(0)
+SMMU_CMDQV_VCMDQi_CONS_INDX_BASE_DRAM_L_(0)
+SMMU_CMDQV_VCMDQi_CONS_INDX_BASE_DRAM_H_(0)
+
+/* Page 1 layout: VCMDQ1 */
+SMMU_CMDQV_VCMDQi_BASE_L_(1)
+SMMU_CMDQV_VCMDQi_BASE_H_(1)
+SMMU_CMDQV_VCMDQi_CONS_INDX_BASE_DRAM_L_(1)
+SMMU_CMDQV_VCMDQi_CONS_INDX_BASE_DRAM_H_(1)
+
+/*
+ * VINTF0 logical VCMDQ aperture register windows.
+ *
+ * Page 0 @ CMDQV_VINTF_PAGE0_BASE: VCMDQ control and status registers.
+ * Page 1 @ CMDQV_VINTF_PAGE1_BASE: VCMDQ base and DRAM address registers.
+ *
+ * VCMDQs mapped via VINTF are accessed through this aperture as
+ * hardware aliases of the direct VCMDQ aperture above.
+ */
+
+/* --- Page 0 register macros --- */
+#define SMMU_CMDQV_VI_VCMDQi_CONS_INDX_(i)                  \
+    REG32(VI_VCMDQ##i##_CONS_INDX,                          \
+          CMDQV_VINTF_PAGE0_BASE + i * CMDQV_VCMDQ_STRIDE)  \
+    FIELD(VI_VCMDQ##i##_CONS_INDX, RD, 0, 20)               \
+    FIELD(VI_VCMDQ##i##_CONS_INDX, ERR, 24, 7)
+
+#define SMMU_CMDQV_VI_VCMDQi_PROD_INDX_(i)                       \
+    REG32(VI_VCMDQ##i##_PROD_INDX,                               \
+          CMDQV_VINTF_PAGE0_BASE + 0x4 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VI_VCMDQ##i##_PROD_INDX, WR, 0, 20)
+
+#define SMMU_CMDQV_VI_VCMDQi_CONFIG_(i)                          \
+    REG32(VI_VCMDQ##i##_CONFIG,                                  \
+          CMDQV_VINTF_PAGE0_BASE + 0x8 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VI_VCMDQ##i##_CONFIG, CMDQ_EN, 0, 1)
+
+#define SMMU_CMDQV_VI_VCMDQi_STATUS_(i)                          \
+    REG32(VI_VCMDQ##i##_STATUS,                                  \
+          CMDQV_VINTF_PAGE0_BASE + 0xc + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VI_VCMDQ##i##_STATUS, CMDQ_EN_OK, 0, 1)
+
+#define SMMU_CMDQV_VI_VCMDQi_GERROR_(i)                           \
+    REG32(VI_VCMDQ##i##_GERROR,                                   \
+          CMDQV_VINTF_PAGE0_BASE + 0x10 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VI_VCMDQ##i##_GERROR, CMDQ_ERR, 0, 1)                   \
+    FIELD(VI_VCMDQ##i##_GERROR, CONS_DRAM_WR_ABT_ERR, 1, 1)       \
+    FIELD(VI_VCMDQ##i##_GERROR, CMDQ_INIT_ERR, 2, 1)
+
+#define SMMU_CMDQV_VI_VCMDQi_GERRORN_(i)                          \
+    REG32(VI_VCMDQ##i##_GERRORN,                                  \
+          CMDQV_VINTF_PAGE0_BASE + 0x14 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VI_VCMDQ##i##_GERRORN, CMDQ_ERR, 0, 1)                  \
+    FIELD(VI_VCMDQ##i##_GERRORN, CONS_DRAM_WR_ABT_ERR, 1, 1)      \
+    FIELD(VI_VCMDQ##i##_GERRORN, CMDQ_INIT_ERR, 2, 1)
+
+/* Page 0 layout: VCMDQ0 */
+SMMU_CMDQV_VI_VCMDQi_CONS_INDX_(0)
+SMMU_CMDQV_VI_VCMDQi_PROD_INDX_(0)
+SMMU_CMDQV_VI_VCMDQi_CONFIG_(0)
+SMMU_CMDQV_VI_VCMDQi_STATUS_(0)
+SMMU_CMDQV_VI_VCMDQi_GERROR_(0)
+SMMU_CMDQV_VI_VCMDQi_GERRORN_(0)
+
+/* Page 0 layout: VCMDQ1 */
+SMMU_CMDQV_VI_VCMDQi_CONS_INDX_(1)
+SMMU_CMDQV_VI_VCMDQi_PROD_INDX_(1)
+SMMU_CMDQV_VI_VCMDQi_CONFIG_(1)
+SMMU_CMDQV_VI_VCMDQi_STATUS_(1)
+SMMU_CMDQV_VI_VCMDQi_GERROR_(1)
+SMMU_CMDQV_VI_VCMDQi_GERRORN_(1)
+
+/* --- Page 1 register macros --- */
+#define SMMU_CMDQV_VI_VCMDQi_BASE_L_(i)                     \
+    REG32(VI_VCMDQ##i##_BASE_L,                             \
+          CMDQV_VINTF_PAGE1_BASE + i * CMDQV_VCMDQ_STRIDE)  \
+    FIELD(VI_VCMDQ##i##_BASE_L, LOG2SIZE, 0, 5)             \
+    FIELD(VI_VCMDQ##i##_BASE_L, ADDR, 5, 27)
+
+#define SMMU_CMDQV_VI_VCMDQi_BASE_H_(i)                          \
+    REG32(VI_VCMDQ##i##_BASE_H,                                  \
+          CMDQV_VINTF_PAGE1_BASE + 0x4 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VI_VCMDQ##i##_BASE_H, ADDR, 0, 16)
+
+#define SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_L_(i)           \
+    REG32(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_L,                   \
+          CMDQV_VINTF_PAGE1_BASE + 0x8 + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_L, ADDR, 0, 32)
+
+#define SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_H_(i)           \
+    REG32(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_H,                   \
+          CMDQV_VINTF_PAGE1_BASE + 0xc + i * CMDQV_VCMDQ_STRIDE) \
+    FIELD(VI_VCMDQ##i##_CONS_INDX_BASE_DRAM_H, ADDR, 0, 16)
+
+/* Page 1 layout: VCMDQ0 */
+SMMU_CMDQV_VI_VCMDQi_BASE_L_(0)
+SMMU_CMDQV_VI_VCMDQi_BASE_H_(0)
+SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_L_(0)
+SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_H_(0)
+
+/* Page 1 layout: VCMDQ1 */
+SMMU_CMDQV_VI_VCMDQi_BASE_L_(1)
+SMMU_CMDQV_VI_VCMDQi_BASE_H_(1)
+SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_L_(1)
+SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_H_(1)
+
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
 
 #endif /* HW_ARM_TEGRA241_CMDQV_H */
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 8c61d66a26..5156604228 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -75,6 +75,8 @@ smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vS
 # tegra241-cmdqv
 tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
 tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
+tegra241_cmdqv_read_vcmdq_page0(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
+tegra241_cmdqv_read_vcmdq_page1(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
 
 # strongarm.c
 strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
-- 
2.43.0




* [PULL 27/61] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (25 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 26/61] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 28/61] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs once configured Peter Maydell
                   ` (34 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

This is the write-side counterpart of the VCMDQ read emulation. Add
write handling for both the direct VCMDQ aperture and the VINTF
logical aperture using the same index decoding and VINTF-to-VCMDQ
translation logic as the read path.
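
As an illustration only (not part of this patch), the decoding can be
sketched as below. The aperture bases and the stride are placeholder
values chosen for the example rather than the real CMDQV_* constants,
and ex_decode_page0() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder layout for illustration; not the real CMDQV_* values. */
#define EX_VCMDQ_PAGE0_BASE 0x10000u
#define EX_VINTF_PAGE0_BASE 0x30000u
#define EX_VCMDQ_STRIDE     0x80u
#define EX_NUM_VCMDQ        2u

typedef struct {
    unsigned index;    /* which VCMDQ instance is addressed */
    uint64_t offset0;  /* offset normalized to the VCMDQ0_* register */
} ExDecoded;

/* Hypothetical helper: decode a Page 0 offset from either aperture. */
static ExDecoded ex_decode_page0(uint64_t offset, int via_vintf)
{
    ExDecoded d;

    if (via_vintf) {
        /* VINTF registers alias the direct ones: rebase first. */
        offset -= EX_VINTF_PAGE0_BASE - EX_VCMDQ_PAGE0_BASE;
    }
    assert(offset >= EX_VCMDQ_PAGE0_BASE);

    /* Each VCMDQ owns one stride-sized window. */
    d.index = (unsigned)((offset - EX_VCMDQ_PAGE0_BASE) / EX_VCMDQ_STRIDE);
    assert(d.index < EX_NUM_VCMDQ);

    /* Strip the per-instance stride so the helper sees VCMDQ0_* offsets. */
    d.offset0 = offset - (uint64_t)d.index * EX_VCMDQ_STRIDE;
    return d;
}
```

With this normalization a single switch on the VCMDQ0_* offsets can
serve every VCMDQ instance from both apertures.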

VINTF aperture writes are translated to their direct-aperture
equivalent and update the same cached state. Page 1 registers
(BASE, CONS_INDX_BASE) always update the cache.
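
A minimal sketch of how two 32-bit Page 1 writes build up one cached
64-bit value. ex_deposit64() is a stand-alone stand-in for QEMU's
deposit64() written for this example, and the ex_write_base_*() names
are hypothetical.

```c
#include <stdint.h>

/* Replace bits [start, start + len) of @val with the low bits of @field. */
static uint64_t ex_deposit64(uint64_t val, int start, int len, uint64_t field)
{
    uint64_t mask = (len == 64 ? ~0ULL : ((1ULL << len) - 1)) << start;

    return (val & ~mask) | ((field << start) & mask);
}

/* A write to the _L register only touches the low half of the cache. */
static uint64_t ex_write_base_l(uint64_t cached, uint32_t value)
{
    return ex_deposit64(cached, 0, 32, value);
}

/* A write to the _H register only touches the high half of the cache. */
static uint64_t ex_write_base_h(uint64_t cached, uint32_t value)
{
    return ex_deposit64(cached, 32, 32, value);
}
```

Because each half is merged independently, the guest may write _L and
_H in either order and the cache ends up with the same 64-bit value.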

Per the CMDQV architecture, a VCMDQ must be allocated to a Virtual
Interface before it is used to send commands to the SMMU. Until
that allocation happens, MMIO writes only update cached register
state - no command consumption, error handling, or interrupt
activity is driven from these writes. Subsequent patches wire up
IOMMU_HW_QUEUE_ALLOC, mmap the host VINTF Page 0, and install it
into guest MMIO; after that, Page 0 writes from either aperture
reach the hardware-backed mmap'd page instead of just the cache.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Message-id: 20260609112552.378999-18-skolothumtho@nvidia.com
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 159 +++++++++++++++++++++++++++++++++++++++-
 hw/arm/trace-events     |   2 +
 2 files changed, 159 insertions(+), 2 deletions(-)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 2dfd377ee9..1f3d883cfe 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -120,6 +120,104 @@ static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
     }
 }
 
+/*
+ * Write a VCMDQ Page 0 register (control/status) using VCMDQ0_* offsets.
+ *
+ * The caller normalizes the MMIO offset such that @offset0 always refers
+ * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
+ *
+ * Page 0 registers are all 32-bit; this helper is only called for 4-byte
+ * writes.
+ */
+static void tegra241_cmdqv_write_vcmdq_page0(Tegra241CMDQV *cmdqv,
+                                             hwaddr offset0, int index,
+                                             uint32_t value, bool direct)
+{
+    switch (offset0) {
+    case A_VCMDQ0_CONS_INDX:
+        cmdqv->vcmdq_cons_indx[index] = value;
+        break;
+    case A_VCMDQ0_PROD_INDX:
+        /* VCMDQ is functional only once allocated to a VINTF; cache only. */
+        cmdqv->vcmdq_prod_indx[index] = value;
+        break;
+    case A_VCMDQ0_CONFIG:
+        if (value & R_VCMDQ0_CONFIG_CMDQ_EN_MASK) {
+            cmdqv->vcmdq_status[index] |= R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
+        } else {
+            cmdqv->vcmdq_status[index] &= ~R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
+        }
+        cmdqv->vcmdq_config[index] = value;
+        break;
+    case A_VCMDQ0_GERRORN:
+        /* VCMDQ is functional only once allocated to a VINTF; cache only. */
+        cmdqv->vcmdq_gerrorn[index] = value;
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s unhandled write access at 0x%" PRIx64 "\n",
+                      __func__, offset0);
+    }
+    trace_tegra241_cmdqv_write_vcmdq_page0(index, direct ? "direct" : "vi",
+                                           offset0, value);
+}
+
+/*
+ * Write a VCMDQ Page 1 register (base / DRAM address) - 4-byte access.
+ */
+static void tegra241_cmdqv_write_vcmdq_page1(Tegra241CMDQV *cmdqv,
+                                             hwaddr offset0, int index,
+                                             uint32_t value, bool direct)
+{
+    switch (offset0) {
+    case A_VCMDQ0_BASE_L:
+        cmdqv->vcmdq_base[index] =
+            deposit64(cmdqv->vcmdq_base[index], 0, 32, value);
+        break;
+    case A_VCMDQ0_BASE_H:
+        cmdqv->vcmdq_base[index] =
+            deposit64(cmdqv->vcmdq_base[index], 32, 32, value);
+        break;
+    case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
+        cmdqv->vcmdq_cons_indx_base[index] =
+            deposit64(cmdqv->vcmdq_cons_indx_base[index], 0, 32, value);
+        break;
+    case A_VCMDQ0_CONS_INDX_BASE_DRAM_H:
+        cmdqv->vcmdq_cons_indx_base[index] =
+            deposit64(cmdqv->vcmdq_cons_indx_base[index], 32, 32, value);
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s unhandled write access at 0x%" PRIx64 "\n",
+                      __func__, offset0);
+    }
+    trace_tegra241_cmdqv_write_vcmdq_page1(index, direct ? "direct" : "vi",
+                                           offset0, value);
+}
+
+/*
+ * Write a VCMDQ Page 1 register - 8-byte access at BASE_L or DRAM_L.
+ */
+static void tegra241_cmdqv_write_vcmdq_page1_64(Tegra241CMDQV *cmdqv,
+                                                hwaddr offset0, int index,
+                                                uint64_t value, bool direct)
+{
+    switch (offset0) {
+    case A_VCMDQ0_BASE_L:
+        cmdqv->vcmdq_base[index] = value;
+        break;
+    case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
+        cmdqv->vcmdq_cons_indx_base[index] = value;
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s unhandled 64-bit write at 0x%" PRIx64 "\n",
+                      __func__, offset0);
+    }
+    trace_tegra241_cmdqv_write_vcmdq_page1(index, direct ? "direct" : "vi",
+                                           offset0, value);
+}
+
 static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
                                               hwaddr offset, uint64_t value)
 {
@@ -243,6 +341,8 @@ out:
 static void tegra241_cmdqv_writel_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
                                        uint32_t value)
 {
+    int index;
+
     switch (offset) {
     case A_CONFIG:
         cmdqv->config = value;
@@ -261,6 +361,39 @@ static void tegra241_cmdqv_writel_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
     case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
         tegra241_cmdqv_config_vintf_write(cmdqv, offset, value);
         break;
+    case A_VI_VCMDQ0_CONS_INDX ... A_VI_VCMDQ1_GERRORN:
+        /*
+         * VINTF Page0 registers are hardware aliases of VCMDQ Page0 registers.
+         * Translate the VINTF aperture offset to its VCMDQ Page0 equivalent
+         * before dispatching to the Page 0 helper.
+         */
+        offset -= CMDQV_VINTF_PAGE0_BASE - CMDQV_VCMDQ_PAGE0_BASE;
+        index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
+        tegra241_cmdqv_write_vcmdq_page0(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index, value, false);
+        break;
+    case A_VCMDQ0_CONS_INDX ... A_VCMDQ1_GERRORN:
+        /*
+         * Decode a per-VCMDQ Page 0 access. Each VCMDQ occupies a
+         * CMDQV_VCMDQ_STRIDE-byte window; extract the index and normalize
+         * to the VCMDQ0_* offset before calling the Page 0 helper.
+         */
+        index = (offset - CMDQV_VCMDQ_PAGE0_BASE) / CMDQV_VCMDQ_STRIDE;
+        tegra241_cmdqv_write_vcmdq_page0(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index, value, true);
+        break;
+    case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
+        /* Same VINTF-to-VCMDQ translation as VINTF Page0 case above. */
+        offset -= CMDQV_VINTF_PAGE1_BASE - CMDQV_VCMDQ_PAGE1_BASE;
+        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
+        tegra241_cmdqv_write_vcmdq_page1(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index, value, false);
+        break;
+    case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
+        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
+        tegra241_cmdqv_write_vcmdq_page1(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index, value, true);
+        break;
     default:
         qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
                       __func__, offset);
@@ -268,14 +401,36 @@ static void tegra241_cmdqv_writel_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
 }
 
 /*
- * 8-byte MMIO write handler.
+ * 8-byte MMIO write handler. Only Page 1 BASE / CONS_INDX_BASE_DRAM accept
+ * full 64-bit writes; other offsets are write-ignored.
  */
 static void tegra241_cmdqv_writell_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
                                         uint64_t value)
 {
-    qemu_log_mask(LOG_UNIMP,
+    int index;
+
+    switch (offset) {
+    case A_VI_VCMDQ0_BASE_L ... A_VI_VCMDQ1_CONS_INDX_BASE_DRAM_H:
+        /*
+         * VINTF Page1 registers are hardware aliases of VCMDQ Page1 registers.
+         * Translate the VINTF aperture offset to its VCMDQ Page1 equivalent
+         * before dispatching to the Page 1 helper.
+         */
+        offset -= CMDQV_VINTF_PAGE1_BASE - CMDQV_VCMDQ_PAGE1_BASE;
+        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
+        tegra241_cmdqv_write_vcmdq_page1_64(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index, value, false);
+        break;
+    case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
+        index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
+        tegra241_cmdqv_write_vcmdq_page1_64(cmdqv,
+                offset - index * CMDQV_VCMDQ_STRIDE, index, value, true);
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP,
                       "%s unhandled 64-bit write at 0x%" PRIx64 " (WI)\n",
                       __func__, offset);
+    }
 }
 
 static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 5156604228..666967dc5e 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -77,6 +77,8 @@ tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset:
 tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
 tegra241_cmdqv_read_vcmdq_page0(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
 tegra241_cmdqv_read_vcmdq_page1(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
+tegra241_cmdqv_write_vcmdq_page0(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
+tegra241_cmdqv_write_vcmdq_page1(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
 
 # strongarm.c
 strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
-- 
2.43.0




* [PULL 28/61] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs once configured
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (26 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 27/61] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 29/61] hw/arm/tegra241-cmdqv: Route allocated VCMDQ Page0 accesses to the mmap'd host VINTF page0 Peter Maydell
                   ` (33 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

Add support for allocating IOMMUFD hardware queues when the guest
programs the VCMDQ BASE registers.

CMDQ_EN lives in VCMDQ_CONFIG, which is on the VINTF Page0 region
that a later patch installs into guest MMIO, so QEMU won't trap writes
to it. Instead, allocate the hardware queue once all of these hold:
BASE is programmed, CMDQ_ALLOC_MAP.ALLOC is set, and both CMDQV and
VINTF are enabled. Each precondition write retries the allocation, so
the guest may program them in any order.
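
The retry-on-every-precondition idea can be sketched as follows. This
is a simplified model, not the patch's code: the ExVcmdq type and all
ex_*() names are hypothetical, and a counter stands in for the real
hw_queue allocation.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t base;        /* cached VCMDQ BASE, 0 until programmed */
    bool mapped;          /* models CMDQ_ALLOC_MAP.ALLOC */
    bool cmdqv_enabled;   /* models the CMDQV global enable */
    bool vintf_enabled;   /* models the VINTF enable */
    int alloc_count;      /* how often the stand-in allocator ran */
} ExVcmdq;

static bool ex_ready(const ExVcmdq *q)
{
    return q->base && q->mapped && q->cmdqv_enabled && q->vintf_enabled;
}

/* Called after every precondition write; a no-op until all are met. */
static void ex_try_setup(ExVcmdq *q)
{
    if (ex_ready(q)) {
        q->alloc_count++;   /* stands in for the hw_queue allocation */
    }
}

static void ex_write_base(ExVcmdq *q, uint64_t base)
{
    q->base = base;
    ex_try_setup(q);
}

static void ex_set_mapped(ExVcmdq *q)
{
    q->mapped = true;
    ex_try_setup(q);
}

static void ex_enable_cmdqv(ExVcmdq *q)
{
    q->cmdqv_enabled = true;
    ex_try_setup(q);
}

static void ex_enable_vintf(ExVcmdq *q)
{
    q->vintf_enabled = true;
    ex_try_setup(q);
}
```

Whichever write happens to satisfy the last precondition is the one
that triggers the allocation.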

iommufd_backend_alloc_hw_queue() needs the guest physical address of
the VCMDQ ring buffer, so allocation is deferred until the guest has
populated BASE.
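
A sketch of how the ring address and size can be recovered from the
cached BASE value, assuming the field layout declared in the header
(LOG2SIZE in bits [4:0], ADDR in bits [31:5] of BASE_L and bits [15:0]
of BASE_H). The EX_* masks and ex_base_*() helpers are illustrative
names, and the 16-byte entry size is my reading of the "+ 4" in the
size computation.

```c
#include <stdint.h>

#define EX_BASE_LOG2SIZE_MASK 0x1fULL        /* BASE_L bits [4:0] */
#define EX_BASE_L_ADDR_MASK   0xffffffe0ULL  /* BASE_L bits [31:5] */
#define EX_BASE_H_ADDR_MASK   0xffffULL      /* BASE_H bits [15:0] */

/* Guest physical address of the ring: drop LOG2SIZE and reserved bits. */
static uint64_t ex_base_addr(uint64_t base)
{
    uint64_t mask = EX_BASE_L_ADDR_MASK | (EX_BASE_H_ADDR_MASK << 32);

    return base & mask;
}

/* Ring size in bytes: 2^LOG2SIZE entries, assumed 16 bytes each. */
static uint64_t ex_base_size(uint64_t base)
{
    uint64_t log2 = base & EX_BASE_LOG2SIZE_MASK;

    return 1ULL << (log2 + 4);
}
```

For example, a BASE with LOG2SIZE = 8 describes 256 entries, that is a
4096-byte ring.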

If a hardware queue was previously allocated for the same VCMDQ,
free it before reallocation. All allocated VCMDQs are freed when
CMDQV or VINTF is disabled, when the ALLOC bit is cleared, or on reset.

On allocation failure, set CMDQ_INIT_ERR and clear CMDQ_EN_OK in the
cache so trapped guest reads see the failure rather than a queue
that looks live. A later successful allocation clears CMDQ_INIT_ERR
and sets CMDQ_EN_OK. A guest CMDQ_EN write then sets CMDQ_EN_OK only
if CMDQ_INIT_ERR is clear.
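
The interaction between the allocation result and a later CMDQ_EN
write can be modelled as below. The bit positions follow the FIELD()
declarations in the header; the ExRegs type and the ex_*() helpers are
hypothetical names for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_CONFIG_CMDQ_EN       (1u << 0)
#define EX_STATUS_CMDQ_EN_OK    (1u << 0)
#define EX_GERROR_CMDQ_INIT_ERR (1u << 2)

typedef struct {
    uint32_t config;
    uint32_t status;
    uint32_t gerror;
} ExRegs;

/* Mirror the outcome of a hw_queue allocation attempt into the cache. */
static void ex_record_alloc_result(ExRegs *r, bool ok)
{
    if (ok) {
        r->gerror &= ~EX_GERROR_CMDQ_INIT_ERR;
        r->status |= EX_STATUS_CMDQ_EN_OK;
    } else {
        r->gerror |= EX_GERROR_CMDQ_INIT_ERR;
        r->status &= ~EX_STATUS_CMDQ_EN_OK;
    }
}

/* Trapped CONFIG write: EN_OK follows CMDQ_EN unless init failed. */
static void ex_write_config(ExRegs *r, uint32_t value)
{
    if (value & EX_CONFIG_CMDQ_EN) {
        if (!(r->gerror & EX_GERROR_CMDQ_INIT_ERR)) {
            r->status |= EX_STATUS_CMDQ_EN_OK;
        }
    } else {
        r->status &= ~EX_STATUS_CMDQ_EN_OK;
    }
    r->config = value;
}
```

A guest that enables the queue after a failed allocation therefore
keeps reading CMDQ_EN_OK as 0 until an allocation succeeds.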

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-19-skolothumtho@nvidia.com
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 171 +++++++++++++++++++++++++++++++++++++---
 hw/arm/tegra241-cmdqv.h |  11 +++
 2 files changed, 171 insertions(+), 11 deletions(-)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 1f3d883cfe..8cb39e87c4 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -16,6 +16,96 @@
 #include "tegra241-cmdqv.h"
 #include "trace.h"
 
+static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
+{
+    IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
+    IOMMUFDHWqueue *vcmdq = cmdqv->vcmdq[index];
+
+    if (!vcmdq) {
+        return;
+    }
+    iommufd_backend_free_id(viommu->iommufd, vcmdq->hw_queue_id);
+    g_free(vcmdq);
+    cmdqv->vcmdq[index] = NULL;
+}
+
+/*
+ * A VCMDQ's HW queue can be allocated once the guest has programmed:
+ *  - VCMDQ_BASE (ring buffer GPA and size). This only checks that BASE is
+ *    non-zero, not that both the _L and _H halves have been written; a
+ *    half-written BASE may pass here, but the write of the second half
+ *    re-runs setup and reallocates with the complete address.
+ *  - the VINTF mapping (CMDQ_ALLOC_MAP.ALLOC).
+ *  - both the CMDQV global enable and the VINTF enable.
+ */
+static bool tegra241_cmdqv_vcmdq_ready_to_alloc(Tegra241CMDQV *cmdqv, int index)
+{
+    return cmdqv->vcmdq_base[index] &&
+           (cmdqv->cmdq_alloc_map[index] & R_CMDQ_ALLOC_MAP_0_ALLOC_MASK) &&
+           tegra241_cmdqv_enabled(cmdqv) && tegra241_vintf_enabled(cmdqv);
+}
+
+/*
+ * Allocate a host HW VCMDQ from the current cached BASE / size for @index.
+ * No-op (returns true) until the VCMDQ is ready to be allocated.
+ */
+static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
+                                       Error **errp)
+{
+    SMMUv3AccelState *accel = cmdqv->s_accel;
+    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
+                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
+    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
+    uint64_t log2 = cmdqv->vcmdq_base[index] & R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
+    uint64_t size = 1ULL << (log2 + 4);
+    IOMMUFDViommu *viommu = accel->viommu;
+    IOMMUFDHWqueue *hw_queue;
+    uint32_t hw_queue_id;
+
+    if (!tegra241_cmdqv_vcmdq_ready_to_alloc(cmdqv, index)) {
+        return true;
+    }
+
+    tegra241_cmdqv_free_vcmdq(cmdqv, index);
+
+    if (!iommufd_backend_alloc_hw_queue(viommu->iommufd, viommu->viommu_id,
+                                        IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV,
+                                        index, addr, size, &hw_queue_id,
+                                        errp)) {
+        /* Record the failure in the cache. */
+        cmdqv->vcmdq_gerror[index] |= R_VCMDQ0_GERROR_CMDQ_INIT_ERR_MASK;
+        cmdqv->vcmdq_status[index] &= ~R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
+        return false;
+    }
+    hw_queue = g_new(IOMMUFDHWqueue, 1);
+    hw_queue->hw_queue_id = hw_queue_id;
+    hw_queue->viommu = viommu;
+    cmdqv->vcmdq[index] = hw_queue;
+
+    cmdqv->vcmdq_gerror[index] &= ~R_VCMDQ0_GERROR_CMDQ_INIT_ERR_MASK;
+    cmdqv->vcmdq_status[index] |= R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
+
+    return true;
+}
+
+static void tegra241_cmdqv_free_all_vcmdq(Tegra241CMDQV *cmdqv)
+{
+    /* uapi/linux/iommufd.h: hw_queue destroy must be in descending @index. */
+    for (int i = (TEGRA241_CMDQV_MAX_CMDQ - 1); i >= 0; i--) {
+        tegra241_cmdqv_free_vcmdq(cmdqv, i);
+    }
+}
+
+static void tegra241_cmdqv_setup_all_vcmdq(Tegra241CMDQV *cmdqv,
+                                           Error **errp)
+{
+    for (int i = 0; i < TEGRA241_CMDQV_MAX_CMDQ; i++) {
+        if (!tegra241_cmdqv_setup_vcmdq(cmdqv, i, errp)) {
+            return;
+        }
+    }
+}
+
 /*
  * Read a VCMDQ Page 0 register (control/status) using VCMDQ0_* offsets.
  *
@@ -143,7 +233,12 @@ static void tegra241_cmdqv_write_vcmdq_page0(Tegra241CMDQV *cmdqv,
         break;
     case A_VCMDQ0_CONFIG:
         if (value & R_VCMDQ0_CONFIG_CMDQ_EN_MASK) {
-            cmdqv->vcmdq_status[index] |= R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
+            /* Report init error if any. */
+            if (!(cmdqv->vcmdq_gerror[index] &
+                  R_VCMDQ0_GERROR_CMDQ_INIT_ERR_MASK)) {
+                cmdqv->vcmdq_status[index] |=
+                    R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
+            }
         } else {
             cmdqv->vcmdq_status[index] &= ~R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
         }
@@ -167,16 +262,19 @@ static void tegra241_cmdqv_write_vcmdq_page0(Tegra241CMDQV *cmdqv,
  */
 static void tegra241_cmdqv_write_vcmdq_page1(Tegra241CMDQV *cmdqv,
                                              hwaddr offset0, int index,
-                                             uint32_t value, bool direct)
+                                             uint32_t value, bool direct,
+                                             Error **errp)
 {
     switch (offset0) {
     case A_VCMDQ0_BASE_L:
         cmdqv->vcmdq_base[index] =
             deposit64(cmdqv->vcmdq_base[index], 0, 32, value);
+        tegra241_cmdqv_setup_vcmdq(cmdqv, index, errp);
         break;
     case A_VCMDQ0_BASE_H:
         cmdqv->vcmdq_base[index] =
             deposit64(cmdqv->vcmdq_base[index], 32, 32, value);
+        tegra241_cmdqv_setup_vcmdq(cmdqv, index, errp);
         break;
     case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
         cmdqv->vcmdq_cons_indx_base[index] =
@@ -200,11 +298,13 @@ static void tegra241_cmdqv_write_vcmdq_page1(Tegra241CMDQV *cmdqv,
  */
 static void tegra241_cmdqv_write_vcmdq_page1_64(Tegra241CMDQV *cmdqv,
                                                 hwaddr offset0, int index,
-                                                uint64_t value, bool direct)
+                                                uint64_t value, bool direct,
+                                                Error **errp)
 {
     switch (offset0) {
     case A_VCMDQ0_BASE_L:
         cmdqv->vcmdq_base[index] = value;
+        tegra241_cmdqv_setup_vcmdq(cmdqv, index, errp);
         break;
     case A_VCMDQ0_CONS_INDX_BASE_DRAM_L:
         cmdqv->vcmdq_cons_indx_base[index] = value;
@@ -219,7 +319,8 @@ static void tegra241_cmdqv_write_vcmdq_page1_64(Tegra241CMDQV *cmdqv,
 }
 
 static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
-                                              hwaddr offset, uint64_t value)
+                                              hwaddr offset, uint64_t value,
+                                              Error **errp)
 {
     int i;
 
@@ -235,7 +336,13 @@ static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
         cmdqv->vintf_config = value;
         if (value & R_VINTF0_CONFIG_ENABLE_MASK) {
             cmdqv->vintf_status |= R_VINTF0_STATUS_ENABLE_OK_MASK;
+            /*
+             * VCMDQs whose BASE was programmed before VINTF was
+             * enabled need their hw_queue allocated now.
+             */
+            tegra241_cmdqv_setup_all_vcmdq(cmdqv, errp);
         } else {
+            tegra241_cmdqv_free_all_vcmdq(cmdqv);
             cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
         }
         break;
@@ -341,6 +448,7 @@ out:
 static void tegra241_cmdqv_writel_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
                                        uint32_t value)
 {
+    Error *local_err = NULL;
     int index;
 
     switch (offset) {
@@ -348,18 +456,39 @@ static void tegra241_cmdqv_writel_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
         cmdqv->config = value;
         if (value & R_CONFIG_CMDQV_EN_MASK) {
             cmdqv->status |= R_STATUS_CMDQV_ENABLED_MASK;
+            /*
+             * VCMDQs whose BASE was programmed before CMDQV was enabled
+             * need their hw_queue allocated now.
+             */
+            tegra241_cmdqv_setup_all_vcmdq(cmdqv, &local_err);
         } else {
+            tegra241_cmdqv_free_all_vcmdq(cmdqv);
             cmdqv->status &= ~R_STATUS_CMDQV_ENABLED_MASK;
         }
         break;
     case A_VI_INT_MASK_0 ... A_VI_INT_MASK_1:
         cmdqv->vi_int_mask[(offset - A_VI_INT_MASK_0) / 4] = value;
         break;
-    case A_CMDQ_ALLOC_MAP_0 ... A_CMDQ_ALLOC_MAP_1:
-        cmdqv->cmdq_alloc_map[(offset - A_CMDQ_ALLOC_MAP_0) / 4] = value;
+    case A_CMDQ_ALLOC_MAP_0 ... A_CMDQ_ALLOC_MAP_1: {
+        int idx = (offset - A_CMDQ_ALLOC_MAP_0) / 4;
+        bool was_alloc = cmdqv->cmdq_alloc_map[idx] &
+                         R_CMDQ_ALLOC_MAP_0_ALLOC_MASK;
+        bool now_alloc = value & R_CMDQ_ALLOC_MAP_0_ALLOC_MASK;
+
+        cmdqv->cmdq_alloc_map[idx] = value;
+        /*
+         * If the VCMDQ was already programmed (BASE) before mapping, fire
+         * setup on the ALLOC 0->1 transition; tear down on 1->0.
+         */
+        if (!was_alloc && now_alloc) {
+            tegra241_cmdqv_setup_vcmdq(cmdqv, idx, &local_err);
+        } else if (was_alloc && !now_alloc) {
+            tegra241_cmdqv_free_vcmdq(cmdqv, idx);
+        }
         break;
+    }
     case A_VINTF0_CONFIG ... A_VINTF0_LVCMDQ_ERR_MAP_3:
-        tegra241_cmdqv_config_vintf_write(cmdqv, offset, value);
+        tegra241_cmdqv_config_vintf_write(cmdqv, offset, value, &local_err);
         break;
     case A_VI_VCMDQ0_CONS_INDX ... A_VI_VCMDQ1_GERRORN:
         /*
@@ -387,17 +516,23 @@ static void tegra241_cmdqv_writel_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
         offset -= CMDQV_VINTF_PAGE1_BASE - CMDQV_VCMDQ_PAGE1_BASE;
         index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
         tegra241_cmdqv_write_vcmdq_page1(cmdqv,
-                offset - index * CMDQV_VCMDQ_STRIDE, index, value, false);
+                offset - index * CMDQV_VCMDQ_STRIDE, index, value, false,
+                &local_err);
         break;
     case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
         index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
         tegra241_cmdqv_write_vcmdq_page1(cmdqv,
-                offset - index * CMDQV_VCMDQ_STRIDE, index, value, true);
+                offset - index * CMDQV_VCMDQ_STRIDE, index, value, true,
+                &local_err);
         break;
     default:
         qemu_log_mask(LOG_UNIMP, "%s unhandled write access at 0x%" PRIx64 "\n",
                       __func__, offset);
     }
+
+    if (local_err) {
+        error_report_err(local_err);
+    }
 }
 
 /*
@@ -407,6 +542,7 @@ static void tegra241_cmdqv_writel_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
 static void tegra241_cmdqv_writell_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
                                         uint64_t value)
 {
+    Error *local_err = NULL;
     int index;
 
     switch (offset) {
@@ -419,18 +555,24 @@ static void tegra241_cmdqv_writell_mmio(Tegra241CMDQV *cmdqv, hwaddr offset,
         offset -= CMDQV_VINTF_PAGE1_BASE - CMDQV_VCMDQ_PAGE1_BASE;
         index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
         tegra241_cmdqv_write_vcmdq_page1_64(cmdqv,
-                offset - index * CMDQV_VCMDQ_STRIDE, index, value, false);
+                offset - index * CMDQV_VCMDQ_STRIDE, index, value, false,
+                &local_err);
         break;
     case A_VCMDQ0_BASE_L ... A_VCMDQ1_CONS_INDX_BASE_DRAM_H:
         index = (offset - CMDQV_VCMDQ_PAGE1_BASE) / CMDQV_VCMDQ_STRIDE;
         tegra241_cmdqv_write_vcmdq_page1_64(cmdqv,
-                offset - index * CMDQV_VCMDQ_STRIDE, index, value, true);
+                offset - index * CMDQV_VCMDQ_STRIDE, index, value, true,
+                &local_err);
         break;
     default:
         qemu_log_mask(LOG_UNIMP,
                       "%s unhandled 64-bit write at 0x%" PRIx64 " (WI)\n",
                       __func__, offset);
     }
+
+    if (local_err) {
+        error_report_err(local_err);
+    }
 }
 
 static void tegra241_cmdqv_write_mmio(void *opaque, hwaddr offset,
@@ -535,6 +677,13 @@ free_viommu:
 
 static void tegra241_cmdqv_reset(SMMUv3State *s)
 {
+    Tegra241CMDQV *cmdqv = s->s_accel->cmdqv;
+
+    if (!cmdqv) {
+        return;
+    }
+
+    tegra241_cmdqv_free_all_vcmdq(cmdqv);
 }
 
 static const MemoryRegionOps mmio_cmdqv_ops = {
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index c4d327a9a5..84499b840d 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -47,6 +47,7 @@ typedef struct Tegra241CMDQV {
     MemoryRegion mmio_cmdqv;
     qemu_irq irq;
     IOMMUFDVeventq *veventq;
+    IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
     void *vintf_page0;
 
     /* CMDQ-V Config page register cache */
@@ -364,6 +365,16 @@ SMMU_CMDQV_VI_VCMDQi_BASE_H_(1)
 SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_L_(1)
 SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_H_(1)
 
+static inline bool tegra241_cmdqv_enabled(Tegra241CMDQV *cmdqv)
+{
+    return cmdqv->status & R_STATUS_CMDQV_ENABLED_MASK;
+}
+
+static inline bool tegra241_vintf_enabled(Tegra241CMDQV *cmdqv)
+{
+    return cmdqv->vintf_status & R_VINTF0_STATUS_ENABLE_OK_MASK;
+}
+
 const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
 
 #endif /* HW_ARM_TEGRA241_CMDQV_H */
-- 
2.43.0




* [PULL 29/61] hw/arm/tegra241-cmdqv: Route allocated VCMDQ Page0 accesses to the mmap'd host VINTF page0
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (27 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 28/61] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs once configured Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 30/61] memory: Allow RAM device regions to skip IOMMU mapping Peter Maydell
                   ` (32 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Introduce tegra241_cmdqv_vintf_lvcmdq_ptr() to route VCMDQ Page 0
register accesses through the mmap'd host VINTF Page 0 backing once a
hardware queue has been allocated for the VCMDQ.

The two QEMU-trapped Page 0 apertures (direct at 0x10000, VINTF at
0x30000) are hardware aliases of the same underlying registers. A
subsequent patch installs the VINTF aperture as a RAM-device into
guest MMIO; in this patch both remain QEMU-trapped.

The direct VCMDQ aperture stays QEMU-trapped (rather than aliased
to the VINTF mmap) so that writes to an unallocated VCMDQ remain
well-defined. The CMDQV architecture allows software to program a
VCMDQ through the direct aperture without first allocating it to a
VINTF; aliasing to the VINTF mmap would route those writes into
unallocated logical slots where the hardware silently drops them.

A VCMDQ Page 0 access is served from one of two sources:

  - Cache-backed: no hw_queue is allocated for the VCMDQ
    (HW_QUEUE_ALLOC has not yet succeeded). Both apertures use
    QEMU's register cache.

  - HW-backed: HW_QUEUE_ALLOC has succeeded. Both apertures access
    the registers directly through the mmap'd host VINTF Page 0.

tegra241_cmdqv_sync_vcmdq() copies any cached writes (CONS_INDX,
PROD_INDX, CONFIG, GERRORN) into the mmap'd page on the cache-to-HW
transition so the guest's earlier register state survives. Freeing a
VCMDQ clears the cached Page0 registers.
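
The two backing modes above can be sketched in isolation as follows.
This is a minimal illustration, not the code in this patch: every
sketch_* name, the slot stride and the two-register layout are
hypothetical, and a plain array stands in for the mmap'd host page.

```c
#include <stddef.h>
#include <stdint.h>

/* All names, sizes and offsets here are hypothetical, for illustration. */
#define SKETCH_NUM_VCMDQ 2
#define SKETCH_STRIDE    0x80u /* assumed bytes per VCMDQ slot in the page */

enum sketch_reg { SKETCH_CONS_INDX = 0, SKETCH_PROD_INDX, SKETCH_NUM_REGS };

struct sketch_cmdqv {
    uint32_t *hw_page0;              /* stand-in for the mmap'd host page */
    int allocated[SKETCH_NUM_VCMDQ]; /* non-zero once a HW queue exists */
    uint32_t cache[SKETCH_NUM_VCMDQ][SKETCH_NUM_REGS];
};

/* Pointer into the HW page, or NULL while the queue is cache-backed. */
static uint32_t *sketch_hw_ptr(struct sketch_cmdqv *c, int index,
                               enum sketch_reg reg)
{
    if (!c->allocated[index] || !c->hw_page0) {
        return NULL;
    }
    return c->hw_page0 + (index * SKETCH_STRIDE) / sizeof(uint32_t) + reg;
}

static void sketch_write(struct sketch_cmdqv *c, int index,
                         enum sketch_reg reg, uint32_t value)
{
    uint32_t *ptr = sketch_hw_ptr(c, index, reg);

    if (ptr) {
        *ptr = value;                 /* HW-backed */
    } else {
        c->cache[index][reg] = value; /* cache-backed */
    }
}

static uint32_t sketch_read(struct sketch_cmdqv *c, int index,
                            enum sketch_reg reg)
{
    uint32_t *ptr = sketch_hw_ptr(c, index, reg);

    return ptr ? *ptr : c->cache[index][reg];
}

/* Cache-to-HW transition: mark allocated, then flush the cached values. */
static void sketch_alloc(struct sketch_cmdqv *c, int index)
{
    c->allocated[index] = 1;
    for (int r = 0; r < SKETCH_NUM_REGS; r++) {
        uint32_t *ptr = sketch_hw_ptr(c, index, (enum sketch_reg)r);

        if (ptr) {
            *ptr = c->cache[index][r];
        }
    }
}

/* Freeing returns the queue to cache-backed mode with a cleared cache. */
static void sketch_free(struct sketch_cmdqv *c, int index)
{
    c->allocated[index] = 0;
    for (int r = 0; r < SKETCH_NUM_REGS; r++) {
        c->cache[index][r] = 0;
    }
}
```

In this sketch a write made before sketch_alloc() lands in the cache
and becomes visible in the backing page only once the queue is
allocated, which is the property the sync step is meant to preserve.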

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Message-id: 20260609112552.378999-20-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 89 +++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events     |  4 +-
 2 files changed, 91 insertions(+), 2 deletions(-)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 8cb39e87c4..63fe0ac681 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -16,6 +16,16 @@
 #include "tegra241-cmdqv.h"
 #include "trace.h"
 
+static void tegra241_cmdqv_reset_vcmdq_cache(Tegra241CMDQV *cmdqv, int index)
+{
+    cmdqv->vcmdq_cons_indx[index] = 0;
+    cmdqv->vcmdq_prod_indx[index] = 0;
+    cmdqv->vcmdq_config[index] = 0;
+    cmdqv->vcmdq_status[index] = 0;
+    cmdqv->vcmdq_gerror[index] = 0;
+    cmdqv->vcmdq_gerrorn[index] = 0;
+}
+
 static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
 {
     IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
@@ -27,6 +37,7 @@ static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
     iommufd_backend_free_id(viommu->iommufd, vcmdq->hw_queue_id);
     g_free(vcmdq);
     cmdqv->vcmdq[index] = NULL;
+    tegra241_cmdqv_reset_vcmdq_cache(cmdqv, index);
 }
 
 /*
@@ -45,6 +56,47 @@ static bool tegra241_cmdqv_vcmdq_ready_to_alloc(Tegra241CMDQV *cmdqv, int index)
            tegra241_cmdqv_enabled(cmdqv) && tegra241_vintf_enabled(cmdqv);
 }
 
+/*
+ * Return a pointer into the mmap'd VINTF page0 for the VCMDQ Page 0
+ * register at @offset0 in VCMDQ slot @index, or NULL when the VCMDQ
+ * has no hw_queue allocated or the host VINTF page0 is not mmap'd.
+ */
+static inline uint32_t *tegra241_cmdqv_vintf_lvcmdq_ptr(Tegra241CMDQV *cmdqv,
+                                                 int index, hwaddr offset0)
+{
+    if (!cmdqv->vcmdq[index] || !cmdqv->vintf_page0) {
+        return NULL;
+    }
+    return (uint32_t *)(cmdqv->vintf_page0 +
+                        (index * CMDQV_VCMDQ_STRIDE) +
+                        (offset0 - CMDQV_VCMDQ_PAGE0_BASE));
+}
+
+/*
+ * Flush cached register writes into the mmap'd host VINTF page0 after a
+ * successful HW_QUEUE_ALLOC, so the guest's earlier writes survive
+ * the cache-to-hardware transition.
+ */
+static void tegra241_cmdqv_sync_vcmdq(Tegra241CMDQV *cmdqv, int index)
+{
+    uint32_t *ptr;
+
+    ptr = tegra241_cmdqv_vintf_lvcmdq_ptr(cmdqv, index, A_VCMDQ0_CONS_INDX);
+    if (!ptr) {
+        return;
+    }
+    *ptr = cmdqv->vcmdq_cons_indx[index];
+
+    ptr = tegra241_cmdqv_vintf_lvcmdq_ptr(cmdqv, index, A_VCMDQ0_PROD_INDX);
+    *ptr = cmdqv->vcmdq_prod_indx[index];
+
+    ptr = tegra241_cmdqv_vintf_lvcmdq_ptr(cmdqv, index, A_VCMDQ0_CONFIG);
+    *ptr = cmdqv->vcmdq_config[index];
+
+    ptr = tegra241_cmdqv_vintf_lvcmdq_ptr(cmdqv, index, A_VCMDQ0_GERRORN);
+    *ptr = cmdqv->vcmdq_gerrorn[index];
+}
+
 /*
  * Allocate a host HW VCMDQ from the current cached BASE / size for @index.
  * No-op (returns true) until the VCMDQ is ready to be allocated.
@@ -85,6 +137,9 @@ static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int index,
     cmdqv->vcmdq_gerror[index] &= ~R_VCMDQ0_GERROR_CMDQ_INIT_ERR_MASK;
     cmdqv->vcmdq_status[index] |= R_VCMDQ0_STATUS_CMDQ_EN_OK_MASK;
 
+    /* Push cached writes to HW; freeing resets the cache. */
+    tegra241_cmdqv_sync_vcmdq(cmdqv, index);
+
     return true;
 }
 
@@ -111,13 +166,23 @@ static void tegra241_cmdqv_setup_all_vcmdq(Tegra241CMDQV *cmdqv,
  *
  * The caller normalizes the MMIO offset such that @offset0 always refers
  * to a VCMDQ0_* register, while @index selects the VCMDQ instance.
+ *
+ * If the VCMDQ is allocated and the host VINTF page0 is mmap'd, read
+ * directly from the host VINTF page0 backing. Otherwise, fall back to
+ * the cache.
  */
 static uint64_t tegra241_cmdqv_read_vcmdq_page0(Tegra241CMDQV *cmdqv,
                                                 hwaddr offset0, int index,
                                                 bool direct)
 {
+    uint32_t *ptr = tegra241_cmdqv_vintf_lvcmdq_ptr(cmdqv, index, offset0);
     uint64_t val = 0;
 
+    if (ptr) {
+        val = *ptr;
+        goto out;
+    }
+
     switch (offset0) {
     case A_VCMDQ0_CONS_INDX:
         val = cmdqv->vcmdq_cons_indx[index];
@@ -142,7 +207,9 @@ static uint64_t tegra241_cmdqv_read_vcmdq_page0(Tegra241CMDQV *cmdqv,
                       "%s unhandled read access at 0x%" PRIx64 "\n",
                       __func__, offset0);
     }
+out:
     trace_tegra241_cmdqv_read_vcmdq_page0(index, direct ? "direct" : "vi",
+                                          ptr ? "hw" : "cache",
                                           offset0, val);
     return val;
 }
@@ -218,11 +285,31 @@ static uint64_t tegra241_cmdqv_config_vintf_read(Tegra241CMDQV *cmdqv,
  *
  * Page 0 registers are all 32-bit; this helper is only called for 4-byte
  * writes.
+ *
+ * If the VCMDQ is allocated and the host VINTF page0 is mmap'd, write
+ * directly to the VINTF page0 backing. Otherwise, update the cache.
  */
 static void tegra241_cmdqv_write_vcmdq_page0(Tegra241CMDQV *cmdqv,
                                              hwaddr offset0, int index,
                                              uint32_t value, bool direct)
 {
+    uint32_t *ptr = tegra241_cmdqv_vintf_lvcmdq_ptr(cmdqv, index, offset0);
+    bool hw = false;
+
+    if (ptr) {
+        switch (offset0) {
+        case A_VCMDQ0_CONS_INDX:
+        case A_VCMDQ0_PROD_INDX:
+        case A_VCMDQ0_CONFIG:
+        case A_VCMDQ0_GERRORN:
+            *ptr = value;
+            hw = true;
+            goto out;
+        default:
+            break;
+        }
+    }
+
     switch (offset0) {
     case A_VCMDQ0_CONS_INDX:
         cmdqv->vcmdq_cons_indx[index] = value;
@@ -253,7 +340,9 @@ static void tegra241_cmdqv_write_vcmdq_page0(Tegra241CMDQV *cmdqv,
                       "%s unhandled write access at 0x%" PRIx64 "\n",
                       __func__, offset0);
     }
+out:
     trace_tegra241_cmdqv_write_vcmdq_page0(index, direct ? "direct" : "vi",
+                                           hw ? "hw" : "cache",
                                            offset0, value);
 }
 
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 666967dc5e..a8dcbc82db 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -75,9 +75,9 @@ smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vS
 # tegra241-cmdqv
 tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
 tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
-tegra241_cmdqv_read_vcmdq_page0(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
+tegra241_cmdqv_read_vcmdq_page0(int index, const char *aperture, const char *backing, uint64_t offset0, uint64_t val) "vcmdq[%d] %s (%s) offset0: 0x%"PRIx64" val: 0x%"PRIx64
 tegra241_cmdqv_read_vcmdq_page1(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
-tegra241_cmdqv_write_vcmdq_page0(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
+tegra241_cmdqv_write_vcmdq_page0(int index, const char *aperture, const char *backing, uint64_t offset0, uint64_t val) "vcmdq[%d] %s (%s) offset0: 0x%"PRIx64" val: 0x%"PRIx64
 tegra241_cmdqv_write_vcmdq_page1(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
 
 # strongarm.c
-- 
2.43.0




* [PULL 30/61] memory: Allow RAM device regions to skip IOMMU mapping
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (28 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 29/61] hw/arm/tegra241-cmdqv: Route allocated VCMDQ Page0 accesses to the mmap'd host VINTF page0 Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 31/61] hw/arm/tegra241-cmdqv: Use mmap'd host VINTF page0 for virtual VINTF page0 Peter Maydell
                   ` (31 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Some RAM device regions created with memory_region_init_ram_device_ptr()
are not intended to be P2P DMA targets.

The VFIO listener currently treats all RAM device regions as
DMA-capable and attempts to map them into the IOMMU. For regions without
dma-buf backing, this fails and prints warnings such as:

  IOMMU_IOAS_MAP failed: Bad address, PCI BAR?

Introduce a MemoryRegion flag (ram_device_skip_iommu_map) to mark RAM
device regions that should not be IOMMU mapped, paired with
memory_region_skip_iommu_map() / memory_region_set_skip_iommu_map()
accessors. When the flag is set, the VFIO listener skips DMA mapping
for that region.
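
As a rough illustration of the intended behaviour (a standalone
sketch, not the QEMU code; the sketch_* types and functions are
hypothetical), the flag only takes effect for RAM device regions, and
the listener simply passes over any region for which the accessor
returns true:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two MemoryRegion properties involved. */
struct sketch_region {
    bool ram_device;
    bool skip_iommu_map;
};

/* The flag is only honoured for RAM device regions. */
static bool sketch_region_skip_iommu_map(const struct sketch_region *mr)
{
    return mr->ram_device && mr->skip_iommu_map;
}

/* Returns how many regions a listener would attempt to DMA-map. */
static size_t sketch_listener_map_all(const struct sketch_region *regions,
                                      size_t n)
{
    size_t mapped = 0;

    for (size_t i = 0; i < n; i++) {
        if (sketch_region_skip_iommu_map(&regions[i])) {
            continue; /* marked: no mapping attempt, no warning */
        }
        mapped++;     /* a real listener would issue the DMA map here */
    }
    return mapped;
}
```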

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Message-id: 20260609112552.378999-21-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/vfio/listener.c      |  6 ++++++
 hw/vfio/trace-events    |  1 +
 include/system/memory.h | 21 +++++++++++++++++++++
 system/memory.c         | 10 ++++++++++
 4 files changed, 38 insertions(+)

diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
index 0b72a2cf5e..14cca678ae 100644
--- a/hw/vfio/listener.c
+++ b/hw/vfio/listener.c
@@ -610,6 +610,12 @@ void vfio_container_region_add(VFIOContainer *bcontainer,
         }
     }
 
+    if (memory_region_skip_iommu_map(section->mr)) {
+        trace_vfio_listener_region_skip_dma_map(memory_region_name(section->mr),
+                                                iova, int128_get64(llsize));
+        return;
+    }
+
     ret = vfio_container_dma_map(bcontainer, iova, int128_get64(llsize),
                                  vaddr, section->readonly, section->mr);
     if (ret) {
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 2049159015..70c5aa1bcc 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -100,6 +100,7 @@ vfio_listener_region_del_iommu(const char *name) "region_del [iommu] %s"
 vfio_listener_region_add_ram(uint64_t iova_start, uint64_t iova_end, void *vaddr) "region_add [ram] 0x%"PRIx64" - 0x%"PRIx64" [%p]"
 vfio_known_safe_misalignment(const char *name, uint64_t iova, uint64_t offset_within_region, uintptr_t page_size) "Region \"%s\" iova=0x%"PRIx64" offset_within_region=0x%"PRIx64" qemu_real_host_page_size=0x%"PRIxPTR
 vfio_listener_region_add_no_dma_map(const char *name, uint64_t iova, uint64_t size, uint64_t page_size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" is not aligned to 0x%"PRIx64" and cannot be mapped for DMA"
+vfio_listener_region_skip_dma_map(const char *name, uint64_t iova, uint64_t size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" marked to skip IOMMU mapping"
 vfio_listener_region_del(uint64_t start, uint64_t end) "region_del 0x%"PRIx64" - 0x%"PRIx64
 vfio_device_dirty_tracking_update(uint64_t start, uint64_t end, uint64_t min, uint64_t max) "section 0x%"PRIx64" - 0x%"PRIx64" -> update [0x%"PRIx64" - 0x%"PRIx64"]"
 vfio_device_dirty_tracking_start(int nr_ranges, uint64_t min32, uint64_t max32, uint64_t min64, uint64_t max64, uint64_t minpci, uint64_t maxpci) "nr_ranges %d 32:[0x%"PRIx64" - 0x%"PRIx64"], 64:[0x%"PRIx64" - 0x%"PRIx64"], pci64:[0x%"PRIx64" - 0x%"PRIx64"]"
diff --git a/include/system/memory.h b/include/system/memory.h
index 1417132f6d..4560809013 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -864,6 +864,8 @@ struct MemoryRegion {
 
     /* For devices designed to perform re-entrant IO into their own IO MRs */
     bool disable_reentrancy_guard;
+    /* RAM device region that does not require IOMMU mapping for P2P */
+    bool ram_device_skip_iommu_map;
 };
 
 struct IOMMUMemoryRegion {
@@ -1743,6 +1745,25 @@ static inline bool memory_region_is_romd(const MemoryRegion *mr)
  */
 bool memory_region_is_protected(const MemoryRegion *mr);
 
+/**
+ * memory_region_skip_iommu_map: check whether a memory region is excluded
+ *                               from IOMMU mapping
+ *
+ * Returns %true if @mr is a RAM device region marked to skip IOMMU mapping.
+ *
+ * @mr: the memory region being queried
+ */
+bool memory_region_skip_iommu_map(const MemoryRegion *mr);
+
+/**
+ * memory_region_set_skip_iommu_map: mark a RAM device region to skip IOMMU
+ *                                   mapping
+ *
+ * @mr: the memory region being modified
+ * @skip: %true to skip IOMMU mapping, %false to allow it
+ */
+void memory_region_set_skip_iommu_map(MemoryRegion *mr, bool skip);
+
 /**
  * memory_region_has_guest_memfd: check whether a memory region has guest_memfd
  *     associated
diff --git a/system/memory.c b/system/memory.c
index 739ba11da6..48245fd01b 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1814,6 +1814,16 @@ bool memory_region_is_protected(const MemoryRegion *mr)
     return mr->ram && (mr->ram_block->flags & RAM_PROTECTED);
 }
 
+bool memory_region_skip_iommu_map(const MemoryRegion *mr)
+{
+    return memory_region_is_ram_device(mr) && mr->ram_device_skip_iommu_map;
+}
+
+void memory_region_set_skip_iommu_map(MemoryRegion *mr, bool skip)
+{
+    mr->ram_device_skip_iommu_map = skip;
+}
+
 bool memory_region_has_guest_memfd(const MemoryRegion *mr)
 {
     return mr->ram_block && mr->ram_block->guest_memfd >= 0;
-- 
2.43.0




* [PULL 31/61] hw/arm/tegra241-cmdqv: Use mmap'd host VINTF page0 for virtual VINTF page0
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (29 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 30/61] memory: Allow RAM device regions to skip IOMMU mapping Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 32/61] hw/arm/smmuv3-accel: Introduce common helper for veventq read Peter Maydell
                   ` (30 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

Install the mmap'd host VINTF page0 as a RAM-device MemoryRegion
backing the guest's virtual VINTF Page 0 aperture (guest MMIO offset
0x30000) when VINTF is enabled, and remove it on VINTF disable or
reset. This eliminates QEMU trapping for hot-path CONS/PROD index
updates via that aperture.

After this patch, the two VCMDQ Page 0 apertures use different
access paths: the direct aperture (0x10000) remains QEMU-trapped,
while the VINTF aperture (0x30000) is a guest-direct RAM mapping.

The direct aperture is intentionally kept trapped (not aliased to
the host VINTF mmap) so that writes to an unallocated VCMDQ remain
well-defined. The CMDQV architecture allows software to program a
VCMDQ through the direct aperture without first allocating it to a
VINTF; aliasing would route those writes to unallocated logical
slots in the VINTF page, where the hardware silently drops them.
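
The resulting split between the two apertures can be pictured with the
small sketch below. It is illustrative only: the sketch_* names are
hypothetical, the two base offsets are the ones quoted above, and the
64 KiB aperture size is an assumption rather than something this
message states.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_DIRECT_PAGE0_BASE 0x10000u /* from the text above */
#define SKETCH_VINTF_PAGE0_BASE  0x30000u /* from the text above */
#define SKETCH_PAGE_SIZE         0x10000u /* assumption: 64 KiB aperture */

enum sketch_path { SKETCH_TRAPPED, SKETCH_GUEST_DIRECT };

/*
 * Decide how a guest access at @offset is served. Only the VINTF
 * aperture can become guest-direct, and only while the RAM-device
 * region is installed (i.e. while VINTF is enabled).
 */
static enum sketch_path sketch_resolve(uint64_t offset, bool vintf_mapped)
{
    bool in_vintf = offset >= SKETCH_VINTF_PAGE0_BASE &&
                    offset < SKETCH_VINTF_PAGE0_BASE + SKETCH_PAGE_SIZE;

    return (in_vintf && vintf_mapped) ? SKETCH_GUEST_DIRECT : SKETCH_TRAPPED;
}
```

In the patch itself this choice is not coded explicitly; it falls out
of adding the RAM-device region as an overlapping subregion with a
higher priority than the trapped MMIO region.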

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-22-skolothumtho@nvidia.com
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 37 +++++++++++++++++++++++++++++++++++++
 hw/arm/tegra241-cmdqv.h |  1 +
 2 files changed, 38 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 63fe0ac681..7264b4bfa9 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -26,6 +26,40 @@ static void tegra241_cmdqv_reset_vcmdq_cache(Tegra241CMDQV *cmdqv, int index)
     cmdqv->vcmdq_gerrorn[index] = 0;
 }
 
+static void tegra241_cmdqv_guest_unmap_vintf_page0(Tegra241CMDQV *cmdqv)
+{
+    if (!cmdqv->mr_vintf_page0) {
+        return;
+    }
+
+    memory_region_del_subregion(&cmdqv->mmio_cmdqv, cmdqv->mr_vintf_page0);
+    object_unparent(OBJECT(cmdqv->mr_vintf_page0));
+    g_free(cmdqv->mr_vintf_page0);
+    cmdqv->mr_vintf_page0 = NULL;
+}
+
+static void tegra241_cmdqv_guest_map_vintf_page0(Tegra241CMDQV *cmdqv)
+{
+    char *name;
+
+    if (cmdqv->mr_vintf_page0) {
+        return;
+    }
+
+    name = g_strdup_printf("%s vintf-page0",
+                           memory_region_name(&cmdqv->mmio_cmdqv));
+    cmdqv->mr_vintf_page0 = g_malloc0(sizeof(*cmdqv->mr_vintf_page0));
+    memory_region_init_ram_device_ptr(cmdqv->mr_vintf_page0,
+                                      memory_region_owner(&cmdqv->mmio_cmdqv),
+                                      name, VINTF_PAGE_SIZE,
+                                      cmdqv->vintf_page0);
+    memory_region_set_skip_iommu_map(cmdqv->mr_vintf_page0, true);
+    memory_region_add_subregion_overlap(&cmdqv->mmio_cmdqv,
+                                        CMDQV_VINTF_PAGE0_BASE,
+                                        cmdqv->mr_vintf_page0, 1);
+    g_free(name);
+}
+
 static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int index)
 {
     IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
@@ -430,7 +464,9 @@ static void tegra241_cmdqv_config_vintf_write(Tegra241CMDQV *cmdqv,
              * enabled need their hw_queue allocated now.
              */
             tegra241_cmdqv_setup_all_vcmdq(cmdqv, errp);
+            tegra241_cmdqv_guest_map_vintf_page0(cmdqv);
         } else {
+            tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
             tegra241_cmdqv_free_all_vcmdq(cmdqv);
             cmdqv->vintf_status &= ~R_VINTF0_STATUS_ENABLE_OK_MASK;
         }
@@ -772,6 +808,7 @@ static void tegra241_cmdqv_reset(SMMUv3State *s)
         return;
     }
 
+    tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
     tegra241_cmdqv_free_all_vcmdq(cmdqv);
 }
 
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 84499b840d..01cd6af97d 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -49,6 +49,7 @@ typedef struct Tegra241CMDQV {
     IOMMUFDVeventq *veventq;
     IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
     void *vintf_page0;
+    MemoryRegion *mr_vintf_page0;
 
     /* CMDQ-V Config page register cache */
     uint32_t config;
-- 
2.43.0




* [PULL 32/61] hw/arm/smmuv3-accel: Introduce common helper for veventq read
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (30 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 31/61] hw/arm/tegra241-cmdqv: Use mmap'd host VINTF page0 for virtual VINTF page0 Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 33/61] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors Peter Maydell
                   ` (29 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Move the vEVENTQ read and validation logic into a common helper
smmuv3_accel_event_read_validate(). The helper performs the read(),
checks for overflow and short reads, validates the sequence number,
and updates the sequence state.

This helper can be reused for Tegra241 CMDQV vEVENTQ support in a
subsequent patch.

Error handling is slightly adjusted: instead of reporting errors
directly in the read handler, the helper now returns errors via
Error **. Sequence gaps are reported as warnings.
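
The validation order can be sketched without any I/O as below. This is
a simplified model under stated assumptions: the sketch_* names, the
header layout and the flag value are hypothetical, and the byte count
is passed in rather than obtained from read().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FLAG_LOST_EVENTS (1u << 0) /* hypothetical flag value */

struct sketch_hdr {
    uint32_t flags;
    uint32_t sequence;
};

struct sketch_queue {
    uint32_t last_seq;
    bool started;
};

enum sketch_result { SKETCH_OK, SKETCH_OVERFLOW, SKETCH_SHORT_READ };

/*
 * Validate one event of @bytes bytes against the @expected size.
 * On SKETCH_OK, *@lost holds the number of events skipped since the
 * previous one (0 when the sequence is contiguous).
 */
static enum sketch_result sketch_validate(struct sketch_queue *q,
                                          const struct sketch_hdr *hdr,
                                          size_t bytes, size_t expected,
                                          uint32_t *lost)
{
    *lost = 0;

    /* A bare header with the lost-events flag signals an overflow. */
    if (bytes == sizeof(*hdr) && (hdr->flags & SKETCH_FLAG_LOST_EVENTS)) {
        q->started = false;
        return SKETCH_OVERFLOW;
    }
    if (bytes < expected) {
        return SKETCH_SHORT_READ;
    }
    /* Unsigned subtraction keeps the gap check correct across wraparound. */
    if (q->started && hdr->sequence - q->last_seq != 1) {
        *lost = hdr->sequence - q->last_seq - 1;
    }
    q->last_seq = hdr->sequence;
    q->started = true;
    return SKETCH_OK;
}
```

A gap is reported but does not fail the call, matching the split
described above between returned errors and warnings.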

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-23-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel-stubs.c |  7 ++++
 hw/arm/smmuv3-accel.c       | 73 ++++++++++++++++++++++---------------
 hw/arm/smmuv3-accel.h       |  2 +
 3 files changed, 52 insertions(+), 30 deletions(-)

diff --git a/hw/arm/smmuv3-accel-stubs.c b/hw/arm/smmuv3-accel-stubs.c
index 70cef66966..9e6c44a282 100644
--- a/hw/arm/smmuv3-accel-stubs.c
+++ b/hw/arm/smmuv3-accel-stubs.c
@@ -47,6 +47,13 @@ bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp)
     return true;
 }
 
+bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
+                                      void *buf, size_t size, Error **errp)
+{
+    return true;
+}
+
+
 void smmuv3_accel_reset(SMMUv3State *s)
 {
 }
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 3ceca56a67..c1c2a67c97 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -440,6 +440,44 @@ bool smmuv3_accel_issue_inv_cmd(SMMUv3State *bs, void *cmd, SMMUDevice *sdev,
                    sizeof(Cmd), &entry_num, cmd, errp);
 }
 
+bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
+                                      void *buf, size_t size, Error **errp)
+{
+    uint32_t last_seq = veventq->last_event_seq;
+    uint32_t id = veventq->veventq_id;
+    struct iommufd_vevent_header *hdr;
+    ssize_t bytes;
+
+    bytes = read(veventq->veventq_fd, buf, size);
+    if (bytes <= 0) {
+        if (errno == EAGAIN || errno == EINTR) {
+            return true;
+        }
+        error_setg(errp, "vEVENTQ(type %u id %u): read failed (%m)", type, id);
+        return false;
+    }
+    hdr = (struct iommufd_vevent_header *)buf;
+    if (bytes == sizeof(*hdr) &&
+        (hdr->flags & IOMMU_VEVENTQ_FLAG_LOST_EVENTS)) {
+        error_setg(errp, "vEVENTQ(type %u id %u): overflowed", type, id);
+        veventq->event_start = false;
+        return false;
+    }
+    if (bytes < size) {
+        error_setg(errp, "vEVENTQ(type %u id %u): short read(%zd/%zu bytes)",
+                   type, id, bytes, size);
+        return false;
+    }
+    /* Check sequence in hdr for lost events if any */
+    if (veventq->event_start && (hdr->sequence - last_seq != 1)) {
+        warn_report("vEVENTQ(type %u id %u): lost %u event(s)",
+                    type, id, hdr->sequence - last_seq - 1);
+    }
+    veventq->last_event_seq = hdr->sequence;
+    veventq->event_start = true;
+    return true;
+}
+
 static void smmuv3_accel_event_read(void *opaque)
 {
     SMMUv3State *s = opaque;
@@ -448,39 +486,14 @@ static void smmuv3_accel_event_read(void *opaque)
         struct iommufd_vevent_header hdr;
         struct iommu_vevent_arm_smmuv3 vevent;
     } buf;
-    enum iommu_veventq_type type = IOMMU_VEVENTQ_TYPE_ARM_SMMUV3;
-    uint32_t id = veventq->veventq_id;
-    uint32_t last_seq = veventq->last_event_seq;
-    ssize_t bytes;
+    Error *local_err = NULL;
 
-    bytes = read(veventq->veventq_fd, &buf, sizeof(buf));
-    if (bytes <= 0) {
-        if (errno == EAGAIN || errno == EINTR) {
-            return;
-        }
-        error_report_once("vEVENTQ(type %u id %u): read failed (%m)", type, id);
+    if (!smmuv3_accel_event_read_validate(veventq,
+                                          IOMMU_VEVENTQ_TYPE_ARM_SMMUV3, &buf,
+                                          sizeof(buf), &local_err)) {
+        warn_report_err_once(local_err);
         return;
     }
-
-    if (bytes == sizeof(buf.hdr) &&
-        (buf.hdr.flags & IOMMU_VEVENTQ_FLAG_LOST_EVENTS)) {
-        error_report_once("vEVENTQ(type %u id %u): overflowed", type, id);
-        veventq->event_start = false;
-        return;
-    }
-    if (bytes < sizeof(buf)) {
-        error_report_once("vEVENTQ(type %u id %u): short read(%zd/%zd bytes)",
-                          type, id, bytes, sizeof(buf));
-        return;
-    }
-
-    /* Check sequence in hdr for lost events if any */
-    if (veventq->event_start && (buf.hdr.sequence - last_seq != 1)) {
-        error_report_once("vEVENTQ(type %u id %u): lost %u event(s)",
-                          type, id, buf.hdr.sequence - last_seq - 1);
-    }
-    veventq->last_event_seq = buf.hdr.sequence;
-    veventq->event_start = true;
     smmuv3_propagate_event(s, (Evt *)&buf.vevent);
 }
 
diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index e0bbec8581..8d0f636338 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -87,6 +87,8 @@ bool smmuv3_accel_issue_inv_cmd(SMMUv3State *s, void *cmd, SMMUDevice *sdev,
                                 Error **errp);
 void smmuv3_accel_idr_override(SMMUv3State *s);
 bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp);
+bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
+                                      void *buf, size_t size, Error **errp);
 void smmuv3_accel_reset(SMMUv3State *s);
 
 #endif /* HW_ARM_SMMUV3_ACCEL_H */
-- 
2.43.0




* [PULL 33/61] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (31 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 32/61] hw/arm/smmuv3-accel: Introduce common helper for veventq read Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 34/61] hw/arm/tegra241-cmdqv: Initialize register state on reset Peter Maydell
                   ` (28 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Install an event handler on the CMDQV vEVENTQ fd to read CMDQV errors
received from the host and propagate them to the guest.

The handler runs in QEMU's main loop, using a non-blocking fd registered
via qemu_set_fd_handler().
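
For readers unfamiliar with the pattern, the non-blocking setup can be
sketched with plain POSIX calls as below. The sketch_* helpers are
hypothetical and stand apart from QEMU; a main-loop handler built this
way returns quietly when the fd has nothing to read instead of
stalling the loop.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Add O_NONBLOCK to the fd's existing status flags. Returns 0 or -1. */
static int sketch_set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Try to read exactly @size bytes.
 * Returns 1 on a full read, 0 if nothing is available yet (try again
 * on the next readiness notification), -1 on error or a partial read.
 */
static int sketch_try_read(int fd, void *buf, size_t size)
{
    ssize_t n = read(fd, buf, size);

    if (n < 0) {
        return (errno == EAGAIN || errno == EINTR) ? 0 : -1;
    }
    return (size_t)n == size ? 1 : -1;
}
```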

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-24-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 63 +++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events     |  1 +
 2 files changed, 64 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 7264b4bfa9..ff0fcd1e66 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -12,6 +12,7 @@
 
 #include "hw/arm/smmuv3.h"
 #include "hw/arm/smmuv3-common.h"
+#include "hw/core/irq.h"
 #include "smmuv3-accel.h"
 #include "tegra241-cmdqv.h"
 #include "trace.h"
@@ -729,6 +730,51 @@ out:
     trace_tegra241_cmdqv_write_mmio(offset, value, size);
 }
 
+static void tegra241_cmdqv_event_read(void *opaque)
+{
+    Tegra241CMDQV *cmdqv = opaque;
+    IOMMUFDVeventq *veventq = cmdqv->veventq;
+    struct {
+        struct iommufd_vevent_header hdr;
+        struct iommu_vevent_tegra241_cmdqv vevent;
+    } buf;
+    Error *local_err = NULL;
+
+    if (!smmuv3_accel_event_read_validate(veventq,
+                                          IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV,
+                                          &buf, sizeof(buf), &local_err)) {
+        warn_report_err_once(local_err);
+        return;
+    }
+
+    if (buf.vevent.lvcmdq_err_map[0] || buf.vevent.lvcmdq_err_map[1]) {
+        cmdqv->vintf_cmdq_err_map[0] =
+            extract64(buf.vevent.lvcmdq_err_map[0], 0, 32);
+        cmdqv->vintf_cmdq_err_map[1] =
+            extract64(buf.vevent.lvcmdq_err_map[0], 32, 32);
+        cmdqv->vintf_cmdq_err_map[2] =
+            extract64(buf.vevent.lvcmdq_err_map[1], 0, 32);
+        cmdqv->vintf_cmdq_err_map[3] =
+            extract64(buf.vevent.lvcmdq_err_map[1], 32, 32);
+        /*
+         * CMDQV_CMDQ_ERR_MAP and VINTF0_LVCMDQ_ERR_MAP are distinct
+         * registers (different MMIO offsets). With only VINTF0 exposed
+         * they carry the same data, so mirror.
+         */
+        for (int i = 0; i < 4; i++) {
+            cmdqv->cmdq_err_map[i] = cmdqv->vintf_cmdq_err_map[i];
+        }
+        /* Set the VINTF0 bit in VI_ERR_MAP_0 (only VINTF0 is exposed). */
+        cmdqv->vi_err_map[0] |= BIT(0);
+        if (!(cmdqv->vi_int_mask[0] & BIT(0))) {
+            qemu_irq_pulse(cmdqv->irq);
+        }
+        trace_tegra241_cmdqv_err_map(
+            cmdqv->vintf_cmdq_err_map[3], cmdqv->vintf_cmdq_err_map[2],
+            cmdqv->vintf_cmdq_err_map[1], cmdqv->vintf_cmdq_err_map[0]);
+    }
+}
+
 static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
 {
     SMMUv3AccelState *accel = s->s_accel;
@@ -740,6 +786,7 @@ static void tegra241_cmdqv_free_viommu(SMMUv3State *s)
         return;
     }
     if (veventq) {
+        qemu_set_fd_handler(veventq->veventq_fd, NULL, NULL, NULL);
         close(veventq->veventq_fd);
         iommufd_backend_free_id(viommu->iommufd, veventq->veventq_id);
         g_free(veventq);
@@ -759,6 +806,7 @@ tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
     Tegra241CMDQV *cmdqv = s->s_accel->cmdqv;
     uint32_t viommu_id, veventq_id, veventq_fd;
     IOMMUFDVeventq *veventq;
+    int flags;
 
     if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
                                       IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV,
@@ -784,14 +832,29 @@ tegra241_cmdqv_alloc_viommu(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
         goto munmap_page0;
     }
 
+    flags = fcntl(veventq_fd, F_GETFL);
+    if (flags < 0) {
+        error_setg(errp, "Failed to get flags for vEVENTQ fd");
+        goto free_veventq;
+    }
+    if (fcntl(veventq_fd, F_SETFL, O_NONBLOCK | flags) < 0) {
+        error_setg(errp, "Failed to set O_NONBLOCK on vEVENTQ fd");
+        goto free_veventq;
+    }
+
     veventq = g_new(IOMMUFDVeventq, 1);
     veventq->veventq_id = veventq_id;
     veventq->veventq_fd = veventq_fd;
     cmdqv->veventq = veventq;
 
+    /* Set up event handler for veventq fd */
+    qemu_set_fd_handler(veventq_fd, tegra241_cmdqv_event_read, NULL, cmdqv);
     *out_viommu_id = viommu_id;
     return true;
 
+free_veventq:
+    close(veventq_fd);
+    iommufd_backend_free_id(idev->iommufd, veventq_id);
 munmap_page0:
     munmap(cmdqv->vintf_page0, VINTF_PAGE_SIZE);
     cmdqv->vintf_page0 = NULL;
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index a8dcbc82db..dbed00afbb 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -75,6 +75,7 @@ smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vS
 # tegra241-cmdqv
 tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
 tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
+tegra241_cmdqv_err_map(uint32_t map3, uint32_t map2, uint32_t map1, uint32_t map0) "hw irq received. error (hex) maps: %04X:%04X:%04X:%04X"
 tegra241_cmdqv_read_vcmdq_page0(int index, const char *aperture, const char *backing, uint64_t offset0, uint64_t val) "vcmdq[%d] %s (%s) offset0: 0x%"PRIx64" val: 0x%"PRIx64
 tegra241_cmdqv_read_vcmdq_page1(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
 tegra241_cmdqv_write_vcmdq_page0(int index, const char *aperture, const char *backing, uint64_t offset0, uint64_t val) "vcmdq[%d] %s (%s) offset0: 0x%"PRIx64" val: 0x%"PRIx64
-- 
2.43.0




* [PULL 34/61] hw/arm/tegra241-cmdqv: Initialize register state on reset
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (32 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 33/61] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 35/61] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size Peter Maydell
                   ` (27 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

Initialize the Tegra241 CMDQV register state in the reset handler.
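
As a rough illustration of the CONFIG reset value, the sketch below decodes
V_CONFIG_RESET (0x00020083) using the field positions visible in the
tegra241-cmdqv.h hunk of this patch (CMDQ_MAX_CLK_BATCH at bits 4..11,
CMDQ_MAX_CMD_BATCH at bits 12..19, CONS_DRAM_EN at bit 20). field_ex32() and
the config_*() accessors are hypothetical stand-ins for QEMU's FIELD_EX32
macro, not the real implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Reset value quoted from the patch. */
#define V_CONFIG_RESET 0x00020083u

/* Hypothetical stand-in for FIELD_EX32(): extract 'len' bits at 'shift'. */
static uint32_t field_ex32(uint32_t reg, unsigned shift, unsigned len)
{
    assert(len > 0 && len < 32 && shift + len <= 32);
    return (reg >> shift) & ((1u << len) - 1);
}

/* Field positions copied from the FIELD(CONFIG, ...) lines in the header. */
static uint32_t config_max_clk_batch(uint32_t cfg)
{
    return field_ex32(cfg, 4, 8);
}

static uint32_t config_max_cmd_batch(uint32_t cfg)
{
    return field_ex32(cfg, 12, 8);
}

static uint32_t config_cons_dram_en(uint32_t cfg)
{
    return field_ex32(cfg, 20, 1);
}
```

With these positions the CMD_BATCH field decodes to 32, matching the comment
next to V_CONFIG_RESET; the raw CLK_BATCH field is 8 (the 256 in the comment
presumably reflects a hardware encoding not shown here) and CONS_DRAM_EN is 0.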

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-25-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 38 ++++++++++++++++++++++++++++++++++++++
 hw/arm/tegra241-cmdqv.h |  3 +++
 hw/arm/trace-events     |  1 +
 3 files changed, 42 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index ff0fcd1e66..0ed2a03612 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -863,6 +863,42 @@ free_viommu:
     return false;
 }
 
+static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
+{
+    int i;
+
+    cmdqv->config = V_CONFIG_RESET;
+    cmdqv->param = FIELD_DP32(0, PARAM, CMDQV_VER, CMDQV_VER);
+    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_CMDQ_LOG2,
+                              CMDQV_NUM_CMDQ_LOG2);
+    cmdqv->param = FIELD_DP32(cmdqv->param, PARAM, CMDQV_NUM_SID_PER_VI_LOG2,
+                              CMDQV_NUM_SID_PER_VI_LOG2);
+    trace_tegra241_cmdqv_init_regs(cmdqv->param);
+    cmdqv->status = R_STATUS_CMDQV_ENABLED_MASK;
+
+    for (i = 0; i < 2; i++) {
+        cmdqv->vi_err_map[i] = 0;
+        cmdqv->vi_int_mask[i] = 0;
+    }
+    for (i = 0; i < 4; i++) {
+        cmdqv->cmdq_err_map[i] = 0;
+        cmdqv->vintf_cmdq_err_map[i] = 0;
+    }
+    cmdqv->vintf_config = 0;
+    cmdqv->vintf_status = 0;
+    for (i = 0; i < TEGRA241_CMDQV_MAX_CMDQ; i++) {
+        cmdqv->cmdq_alloc_map[i] = 0;
+        cmdqv->vcmdq_cons_indx[i] = 0;
+        cmdqv->vcmdq_prod_indx[i] = 0;
+        cmdqv->vcmdq_config[i] = 0;
+        cmdqv->vcmdq_status[i] = 0;
+        cmdqv->vcmdq_gerror[i] = 0;
+        cmdqv->vcmdq_gerrorn[i] = 0;
+        cmdqv->vcmdq_base[i] = 0;
+        cmdqv->vcmdq_cons_indx_base[i] = 0;
+    }
+}
+
 static void tegra241_cmdqv_reset(SMMUv3State *s)
 {
     Tegra241CMDQV *cmdqv = s->s_accel->cmdqv;
@@ -873,6 +909,8 @@ static void tegra241_cmdqv_reset(SMMUv3State *s)
 
     tegra241_cmdqv_guest_unmap_vintf_page0(cmdqv);
     tegra241_cmdqv_free_all_vcmdq(cmdqv);
+
+    tegra241_cmdqv_init_regs(s, cmdqv);
 }
 
 static const MemoryRegionOps mmio_cmdqv_ops = {
diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
index 01cd6af97d..de4c1e5335 100644
--- a/hw/arm/tegra241-cmdqv.h
+++ b/hw/arm/tegra241-cmdqv.h
@@ -89,6 +89,9 @@ FIELD(CONFIG, CMDQ_MAX_CLK_BATCH, 4, 8)
 FIELD(CONFIG, CMDQ_MAX_CMD_BATCH, 12, 8)
 FIELD(CONFIG, CONS_DRAM_EN, 20, 1)
 
+/* CMDQV_EN=1, PER_CMD_OFFSET=16B, CLK_BATCH=256, CMD_BATCH=32. */
+#define V_CONFIG_RESET 0x00020083
+
 REG32(PARAM, 0x4)
 FIELD(PARAM, CMDQV_VER, 0, 4)
 FIELD(PARAM, CMDQV_NUM_CMDQ_LOG2, 4, 4)
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index dbed00afbb..0bd718b1ab 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -76,6 +76,7 @@ smmuv3_accel_install_ste(uint32_t vsid, const char * type, uint32_t hwpt_id) "vS
 tegra241_cmdqv_read_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
 tegra241_cmdqv_write_mmio(uint64_t offset, uint64_t val, unsigned size) "offset: 0x%"PRIx64" val: 0x%"PRIx64" size: 0x%x"
 tegra241_cmdqv_err_map(uint32_t map3, uint32_t map2, uint32_t map1, uint32_t map0) "hw irq received. error (hex) maps: %04X:%04X:%04X:%04X"
+tegra241_cmdqv_init_regs(uint32_t param) "register init, param=0x%08X"
 tegra241_cmdqv_read_vcmdq_page0(int index, const char *aperture, const char *backing, uint64_t offset0, uint64_t val) "vcmdq[%d] %s (%s) offset0: 0x%"PRIx64" val: 0x%"PRIx64
 tegra241_cmdqv_read_vcmdq_page1(int index, const char *aperture, uint64_t offset0, uint64_t val) "vcmdq[%d] %s offset0: 0x%"PRIx64" val: 0x%"PRIx64
 tegra241_cmdqv_write_vcmdq_page0(int index, const char *aperture, const char *backing, uint64_t offset0, uint64_t val) "vcmdq[%d] %s (%s) offset0: 0x%"PRIx64" val: 0x%"PRIx64
-- 
2.43.0




* [PULL 35/61] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (33 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 34/61] hw/arm/tegra241-cmdqv: Initialize register state on reset Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 36/61] hw/arm/smmuv3: Add per-device identifier property Peter Maydell
                   ` (26 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

CMDQV HW performs DMA accesses to guest queue memory by its host
physical address set up via IOMMUFD. This requires the guest queue
to be contiguous in both guest PA and host PA space. With Tegra241
CMDQV enabled, we must only advertise a command queue size (CMDQS)
that the host can safely back with physically contiguous memory.
Allowing a queue size larger than the host page size could cause
the hardware to DMA across page boundaries, leading to faults.

Use qemu_minrampagesize() to find the smallest memory-backend page
size in use, then cap IDR1.CMDQS so the guest cannot configure a
command queue that exceeds that contiguous backing.

Note that this is done at SMMUv3 init, before any guest queue GPA is
known, so the cap is conservative. The maximum queue size is 8MiB, so
backing the VM with hugepages of at least that size keeps CMDQS at
the HW maximum. Smaller backing pages reduce CMDQS accordingly.
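
The cap can be worked through with a small standalone sketch. It mirrors the
MIN(ctz64(pgsize) - 4, val) expression from the patch, but log2_pow2(),
cap_cmdqs() and cmdq_bytes() are hypothetical helper names, and the page
size is assumed to be a power of two of at least 16 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define CMDQ_ENTRY_SHIFT 4  /* each CMDQ entry is 16 bytes = 2^4 */

/* log2 of a power-of-two value (plays the role of ctz64() in the patch). */
static unsigned log2_pow2(uint64_t v)
{
    unsigned n = 0;

    assert(v != 0 && (v & (v - 1)) == 0);
    while ((v & 1) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Cap the advertised CMDQS so the whole queue fits in one backend page. */
static uint32_t cap_cmdqs(uint64_t pgsize, uint32_t cur_cmdqs)
{
    uint32_t limit;

    assert(pgsize >= (1u << CMDQ_ENTRY_SHIFT));
    limit = log2_pow2(pgsize) - CMDQ_ENTRY_SHIFT;
    return limit < cur_cmdqs ? limit : cur_cmdqs;
}

/* Queue size in bytes for a given CMDQS value. */
static uint64_t cmdq_bytes(uint32_t cmdqs)
{
    return (uint64_t)1 << (cmdqs + CMDQ_ENTRY_SHIFT);
}
```

Assuming an initial CMDQS of 19 (which corresponds to the 8MiB maximum),
4KiB backing pages give CMDQS = 8 (a 4KiB queue), 2MiB hugepages give 17
(a 2MiB queue), and 1GiB hugepages leave CMDQS at 19.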

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-26-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 0ed2a03612..4b8f8b15c7 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -14,6 +14,8 @@
 #include "hw/arm/smmuv3-common.h"
 #include "hw/core/irq.h"
 #include "smmuv3-accel.h"
+#include "smmuv3-internal.h"
+#include "system/hostmem.h"
 #include "tegra241-cmdqv.h"
 #include "trace.h"
 
@@ -866,6 +868,8 @@ free_viommu:
 static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
 {
     int i;
+    long pgsize;
+    uint32_t val;
 
     cmdqv->config = V_CONFIG_RESET;
     cmdqv->param = FIELD_DP32(0, PARAM, CMDQV_VER, CMDQV_VER);
@@ -897,6 +901,22 @@ static void tegra241_cmdqv_init_regs(SMMUv3State *s, Tegra241CMDQV *cmdqv)
         cmdqv->vcmdq_base[i] = 0;
         cmdqv->vcmdq_cons_indx_base[i] = 0;
     }
+
+    /*
+     * CMDQ must not cross a physical RAM backend page. Adjust CMDQS so the
+     * queue fits entirely within the smallest backend page size, ensuring
+     * the command queue is physically contiguous in host memory.
+     *
+     *   IDR1.CMDQS = log2(max_qsz) - entry_shift
+     *
+     * where entry_shift = 4 (each CMDQ entry is 16 bytes = 2^4).
+     */
+    pgsize = qemu_minrampagesize();
+    if (pgsize == LONG_MAX) {
+        pgsize = qemu_real_host_page_size();
+    }
+    val = FIELD_EX32(s->idr[1], IDR1, CMDQS);
+    s->idr[1] = FIELD_DP32(s->idr[1], IDR1, CMDQS, MIN(ctz64(pgsize) - 4, val));
 }
 
 static void tegra241_cmdqv_reset(SMMUv3State *s)
-- 
2.43.0




* [PULL 36/61] hw/arm/smmuv3: Add per-device identifier property
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (34 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 35/61] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 37/61] hw/arm/smmuv3-accel: Introduce helper to query CMDQV type Peter Maydell
                   ` (25 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Add an "identifier" property to the SMMUv3 device and use it when
building the ACPI IORT SMMUv3 node Identifier field.

This avoids relying on device enumeration order and provides a stable
per-device identifier. A subsequent patch will use the same identifier
when generating the DSDT description for Tegra241 CMDQV, ensuring that
the IORT and DSDT entries refer to the same SMMUv3 instance.

The identifier is assigned at pre-plug time, accounting for the ITS Group
node that build_iort() places before SMMUv3 nodes in the IORT table, so
that identifiers are globally unique across all IORT nodes.
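
The assignment rule can be sketched in isolation as follows. This is only a
simplified model of the pre-plug logic: next_smmuv3_identifier() and its
counter argument are hypothetical names, and the range check is an addition
of this sketch that the patch itself does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the identifier for the next SMMUv3 device and advance the counter.
 * When an ITS Group node is present it takes id 0, so SMMUv3 ids start at 1.
 */
static uint8_t next_smmuv3_identifier(uint8_t *counter, bool its_present)
{
    uint8_t its_offset = its_present ? 1 : 0;

    /* Keep the sum within uint8_t range (check added for this sketch). */
    assert(*counter <= UINT8_MAX - its_offset);
    return (uint8_t)(its_offset + (*counter)++);
}
```

With an ITS present, successive devices get identifiers 1, 2, 3, ...; without
one they get 0, 1, 2, ..., which matches what the old id++ enumeration in
build_iort() would have produced.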

No functional change: IORT blob content for bios-tables qtest is identical
to before.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-27-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3.c          |  2 ++
 hw/arm/virt-acpi-build.c |  5 ++++-
 hw/arm/virt.c            | 12 ++++++++++++
 include/hw/arm/smmuv3.h  |  1 +
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 83fa6468fd..a8fb10e531 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -2126,6 +2126,8 @@ static const Property smmuv3_properties[] = {
      * Defaults to stage 1
      */
     DEFINE_PROP_STRING("stage", SMMUv3State, stage),
+    /* Identifier used for ACPI IORT SMMUv3 (and DSDT for CMDQV) generation */
+    DEFINE_PROP_UINT8("identifier", SMMUv3State, identifier, 0),
     DEFINE_PROP_BOOL("accel", SMMUv3State, accel, false),
     /* GPA of MSI doorbell, for SMMUv3 accel use. */
     DEFINE_PROP_UINT64("msi-gpa", SMMUv3State, msi_gpa, 0),
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index b00f3477ca..9d05982137 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -349,6 +349,7 @@ static int iort_idmap_compare(gconstpointer a, gconstpointer b)
 typedef struct AcpiIortSMMUv3Dev {
     int irq;
     hwaddr base;
+    uint8_t id;
     GArray *rc_smmu_idmaps;
     /* Offset of the SMMUv3 IORT Node relative to the start of the IORT */
     size_t offset;
@@ -411,6 +412,7 @@ static int populate_smmuv3_dev(VirtMachineState *vms, GArray *sdev_blob)
                                                &error_abort));
         sdev.accel = object_property_get_bool(obj, "accel", &error_abort);
         sdev.ats = smmuv3_ats_enabled(ARM_SMMUV3(obj));
+        sdev.id = object_property_get_uint(obj, "identifier", &error_abort);
         pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
         sbdev = SYS_BUS_DEVICE(obj);
         sdev.base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
@@ -637,7 +639,8 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
                      (ID_MAPPING_ENTRY_SIZE * smmu_mapping_count);
         build_append_int_noprefix(table_data, node_size, 2); /* Length */
         build_append_int_noprefix(table_data, 4, 1); /* Revision */
-        build_append_int_noprefix(table_data, id++, 4); /* Identifier */
+        build_append_int_noprefix(table_data, sdev->id, 4); /* Identifier */
+        id++;  /* advance shared counter for RC/RMR node uniqueness */
         /* Number of ID mappings */
         build_append_int_noprefix(table_data, smmu_mapping_count, 4);
         /* Reference to ID Array */
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 2add7401a1..d8d27f2ef6 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -254,6 +254,9 @@ static MemMapEntry extended_memmap[] = {
     /* Any CXL Fixed memory windows come here */
 };
 
+/* Counts SMMUv3 devices plugged; used to assign stable IORT identifiers */
+static uint8_t smmuv3_dev_id;
+
 static const int a15irqmap[] = {
     [VIRT_UART0] = 1,
     [VIRT_RTC] = 2,
@@ -3830,6 +3833,15 @@ static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
                                      OBJECT(vms->sysmem), NULL);
             object_property_set_link(OBJECT(dev), "secure-memory",
                                      OBJECT(vms->secure_sysmem), NULL);
+            /*
+             * In build_iort(), the ITS node (id=0) precedes SMMUv3 nodes
+             * when present. Account for it so this SMMUv3's identifier
+             * is globally unique across all IORT nodes.
+             */
+            uint8_t its_offset = (vms->msi_controller == VIRT_MSI_CTRL_ITS)
+                                  ? 1 : 0;
+            object_property_set_uint(OBJECT(dev), "identifier",
+                                     its_offset + smmuv3_dev_id++, NULL);
         }
         if (object_property_get_bool(OBJECT(dev), "accel", &error_abort)) {
             hwaddr db_start = 0;
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index 34d0f65eaa..d39fe8850b 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -65,6 +65,7 @@ struct SMMUv3State {
     qemu_irq     irq[4];
     QemuMutex mutex;
     char *stage;
+    uint8_t identifier;
 
     /* SMMU has HW accelerator support for nested S1 + s2 */
     bool accel;
-- 
2.43.0




* [PULL 37/61] hw/arm/smmuv3-accel: Introduce helper to query CMDQV type
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (35 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 36/61] hw/arm/smmuv3: Add per-device identifier property Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 38/61] hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT Peter Maydell
                   ` (24 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Introduce a SMMUv3AccelCmdqvType enum and a helper to query the
CMDQV implementation type associated with an accelerated SMMUv3
instance.

A subsequent patch will use this helper when generating the
Tegra241 CMDQV DSDT.
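
The helper follows the usual optional-callback pattern: any missing link in
the chain yields the NONE type. A minimal standalone sketch, using simplified
hypothetical types (CmdqvType, CmdqvOps, AccelState) rather than the real
SMMUv3 structures:

```c
#include <stddef.h>

typedef enum CmdqvType {
    CMDQV_NONE = 0,
    CMDQV_TEGRA241,
} CmdqvType;

typedef struct CmdqvOps {
    CmdqvType (*get_type)(void);    /* optional callback, may be NULL */
} CmdqvOps;

typedef struct AccelState {
    const CmdqvOps *cmdqv_ops;      /* NULL when no CMDQV is attached */
} AccelState;

static CmdqvType tegra241_get_type(void)
{
    return CMDQV_TEGRA241;
}

static const CmdqvOps tegra241_ops = {
    .get_type = tegra241_get_type,
};

/* Fall back to CMDQV_NONE unless every pointer in the chain is set. */
static CmdqvType query_cmdqv_type(const AccelState *accel)
{
    if (!accel || !accel->cmdqv_ops || !accel->cmdqv_ops->get_type) {
        return CMDQV_NONE;
    }
    return accel->cmdqv_ops->get_type();
}
```

Callers such as the DSDT builder can then compare the result against a single
enum value without knowing whether acceleration is compiled in or enabled.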

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-28-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel-stubs.c |  5 +++++
 hw/arm/smmuv3-accel.c       | 12 ++++++++++++
 hw/arm/smmuv3-accel.h       | 10 ++++++++++
 hw/arm/tegra241-cmdqv.c     |  6 ++++++
 4 files changed, 33 insertions(+)

diff --git a/hw/arm/smmuv3-accel-stubs.c b/hw/arm/smmuv3-accel-stubs.c
index 9e6c44a282..147ae06163 100644
--- a/hw/arm/smmuv3-accel-stubs.c
+++ b/hw/arm/smmuv3-accel-stubs.c
@@ -57,3 +57,8 @@ bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
 void smmuv3_accel_reset(SMMUv3State *s)
 {
 }
+
+SMMUv3AccelCmdqvType smmuv3_accel_cmdqv_type(Object *obj)
+{
+    return SMMUV3_CMDQV_NONE;
+}
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index c1c2a67c97..9c3bd4413d 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -1060,6 +1060,18 @@ static void smmuv3_accel_as_init(SMMUv3State *s)
     address_space_init(shared_as_sysmem, &root, "smmuv3-accel-as-sysmem");
 }
 
+SMMUv3AccelCmdqvType smmuv3_accel_cmdqv_type(Object *obj)
+{
+    SMMUv3State *s = ARM_SMMUV3(obj);
+    SMMUv3AccelState *accel = s->s_accel;
+
+    if (!accel || !accel->cmdqv_ops || !accel->cmdqv_ops->get_type) {
+        return SMMUV3_CMDQV_NONE;
+    }
+
+    return accel->cmdqv_ops->get_type();
+}
+
 static void smmuv3_accel_machine_done(Notifier *notifier, void *data)
 {
     SMMUv3State *s = container_of(notifier, SMMUv3State, machine_done);
diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index 8d0f636338..5fc85fb89d 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -16,6 +16,11 @@
 #include <linux/iommufd.h>
 #endif
 
+typedef enum SMMUv3AccelCmdqvType {
+    SMMUV3_CMDQV_NONE = 0,
+    SMMUV3_CMDQV_TEGRA241,
+} SMMUv3AccelCmdqvType;
+
 /*
  * CMDQ-Virtualization (CMDQV) hardware support, extends the SMMUv3 to
  * support multiple VCMDQs with virtualization capabilities.
@@ -42,6 +47,10 @@ typedef struct SMMUv3AccelCmdqvOps {
      * If NULL, the viommu_id is freed directly via iommufd_backend_free_id().
      */
     void (*free_viommu)(SMMUv3State *s);
+    /**
+     * @get_type: Optional callback. Return the CMDQV implementation type.
+     */
+    SMMUv3AccelCmdqvType (*get_type)(void);
     /**
      * @reset: Optional callback. Reset CMDQV state.
      */
@@ -90,5 +99,6 @@ bool smmuv3_accel_alloc_veventq(SMMUv3State *s, Error **errp);
 bool smmuv3_accel_event_read_validate(IOMMUFDVeventq *veventq, uint32_t type,
                                       void *buf, size_t size, Error **errp);
 void smmuv3_accel_reset(SMMUv3State *s);
+SMMUv3AccelCmdqvType smmuv3_accel_cmdqv_type(Object *obj);
 
 #endif /* HW_ARM_SMMUV3_ACCEL_H */
diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index 4b8f8b15c7..d821eafca1 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -964,6 +964,11 @@ static bool tegra241_cmdqv_init(SMMUv3State *s, Error **errp)
     return true;
 }
 
+static SMMUv3AccelCmdqvType tegra241_cmdqv_get_type(void)
+{
+    return SMMUV3_CMDQV_TEGRA241;
+}
+
 static bool tegra241_cmdqv_probe(SMMUv3State *s, HostIOMMUDeviceIOMMUFD *idev,
                                  Error **errp)
 {
@@ -1004,6 +1009,7 @@ static const SMMUv3AccelCmdqvOps tegra241_cmdqv_ops = {
     .init = tegra241_cmdqv_init,
     .alloc_viommu = tegra241_cmdqv_alloc_viommu,
     .free_viommu = tegra241_cmdqv_free_viommu,
+    .get_type = tegra241_cmdqv_get_type,
     .reset = tegra241_cmdqv_reset,
 };
 
-- 
2.43.0




* [PULL 38/61] hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (36 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 37/61] hw/arm/smmuv3-accel: Introduce helper to query CMDQV type Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 39/61] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active Peter Maydell
                   ` (23 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicolin Chen <nicolinc@nvidia.com>

Add ACPI DSDT support for Tegra241 CMDQV when the SMMUv3 instance is
created with tegra241-cmdqv.

For each accelerated SMMUv3 instance, add a Tegra241 CMDQV device
object under the DSDT \_SB namespace, with HID "NVDA200C" and a UID
that matches the Identifier of the corresponding SMMUv3 IORT node, so
the guest OS can associate the DSDT device with the right SMMU. The
_CRS covers the CMDQV MMIO aperture plus its interrupt, and _CCA
declares I/O cache coherency.

See ACPI Specification 6.5, Section 6 (Device Configuration) for
_HID/_UID/_CCA/_CRS.

Generated DSDT entry for a CMDQV instance paired with SMMUv3 Identifier=1:
  ...
  Device (CV01)
  {
      Name (_HID, "NVDA200C")  // _HID: Hardware ID
      Name (_UID, One)  // _UID: Unique ID
      Name (_CCA, One)  // _CCA: Cache Coherency Attribute
      Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
      {
          QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
              0x0000000000000000, // Granularity
              0x000000000C080000, // Range Minimum
              0x000000000C0CFFFF, // Range Maximum
              0x0000000000000000, // Translation Offset
              0x0000000000050000, // Length
              ,, , AddressRangeMemory, TypeStatic)
          Interrupt (ResourceConsumer, Edge, ActiveHigh, Exclusive, ,, )
          {
              0x00000094,
          }
      })
  }
  ...

Generated IORT SMMUv3 node (Identifier = 1):

  ...
  [048h 0072 001h]                        Type : 04
  [049h 0073 002h]                      Length : 0058
  [04Bh 0075 001h]                    Revision : 04
  [04Ch 0076 004h]                  Identifier : 00000001
  [050h 0080 004h]               Mapping Count : 00000001
  [054h 0084 004h]              Mapping Offset : 00000044
  ...
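
The device name and the _CRS memory range in the example above can be
reproduced with a short sketch. cmdqv_dev_name() and cmdqv_range_max() are
hypothetical helpers, CMDQV_IO_LEN is taken from the Length shown in the
example, and "CV%02u" is used here as an equivalent of the patch's "CV%.02u"
format string.

```c
#include <stdint.h>
#include <stdio.h>

#define CMDQV_IO_LEN 0x50000u   /* Length from the example _CRS */

/* Format the DSDT device name: "CV" followed by a two-digit identifier. */
static void cmdqv_dev_name(char *buf, size_t len, unsigned id)
{
    snprintf(buf, len, "CV%02u", id);
}

/* Last byte covered by the _CRS memory range (Range Maximum). */
static uint64_t cmdqv_range_max(uint64_t base)
{
    return base + CMDQV_IO_LEN - 1;
}
```

For Identifier 1 and a base of 0x0C080000 this gives the name CV01 and a
Range Maximum of 0x0C0CFFFF, as in the generated DSDT entry.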

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Message-id: 20260609112552.378999-29-skolothumtho@nvidia.com
Co-developed-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/trace-events      |  1 +
 hw/arm/virt-acpi-build.c | 52 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+)

diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 0bd718b1ab..1b16f710fe 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -9,6 +9,7 @@ omap1_lpg_led(const char *onoff) "omap1 LPG: LED is %s"
 
 # virt-acpi-build.c
 virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
+virt_acpi_dsdt_tegra241_cmdqv(int smmu_id, uint64_t base, uint32_t irq) "DSDT: add cmdqv node for (id=%d), base=0x%" PRIx64 ", irq=%d"
 
 # smmu-common.c
 smmu_add_mr(const char *name) "%s"
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 9d05982137..99490aa7b1 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -65,6 +65,9 @@
 #include "target/arm/cpu.h"
 #include "target/arm/multiprocessing.h"
 
+#include "smmuv3-accel.h"
+#include "tegra241-cmdqv.h"
+
 #define ARM_SPI_BASE 32
 
 #define ACPI_BUILD_TABLE_SIZE             0x20000
@@ -1121,6 +1124,51 @@ static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
     build_fadt(table_data, linker, &fadt, vms->oem_id, vms->oem_table_id);
 }
 
+static void acpi_dsdt_add_tegra241_cmdqv(Aml *scope, VirtMachineState *vms)
+{
+    for (int i = 0; i < vms->smmuv3_devices->len; i++) {
+        Object *obj = OBJECT(g_ptr_array_index(vms->smmuv3_devices, i));
+        PlatformBusDevice *pbus;
+        Aml *dev, *crs, *addr;
+        SysBusDevice *sbdev;
+        hwaddr base;
+        uint32_t id;
+        int irq;
+
+        if (smmuv3_accel_cmdqv_type(obj) != SMMUV3_CMDQV_TEGRA241) {
+            continue;
+        }
+        id = object_property_get_uint(obj, "identifier", &error_abort);
+        pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev);
+        sbdev = SYS_BUS_DEVICE(obj);
+        base = platform_bus_get_mmio_addr(pbus, sbdev, 1);
+        base += vms->memmap[VIRT_PLATFORM_BUS].base;
+        irq = platform_bus_get_irqn(pbus, sbdev, NUM_SMMU_IRQS);
+        irq += vms->irqmap[VIRT_PLATFORM_BUS];
+        irq += ARM_SPI_BASE;
+
+        dev = aml_device("CV%.02u", id);
+        aml_append(dev, aml_name_decl("_HID", aml_string("NVDA200C")));
+        aml_append(dev, aml_name_decl("_UID", aml_int(id)));
+        aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
+
+        crs = aml_resource_template();
+        addr = aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED, AML_MAX_FIXED,
+                                AML_CACHEABLE, AML_READ_WRITE, 0x0, base,
+                                base + TEGRA241_CMDQV_IO_LEN - 0x1, 0x0,
+                                TEGRA241_CMDQV_IO_LEN);
+        aml_append(crs, addr);
+        aml_append(crs, aml_interrupt(AML_CONSUMER, AML_EDGE,
+                                      AML_ACTIVE_HIGH, AML_EXCLUSIVE,
+                                      (uint32_t *)&irq, 1));
+        aml_append(dev, aml_name_decl("_CRS", crs));
+
+        aml_append(scope, dev);
+
+        trace_virt_acpi_dsdt_tegra241_cmdqv(id, base, irq);
+    }
+}
+
 /* DSDT */
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
@@ -1185,6 +1233,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     acpi_dsdt_add_tpm(scope, vms);
 #endif
 
+    if (!vms->legacy_smmuv3_present) {
+        acpi_dsdt_add_tegra241_cmdqv(scope, vms);
+    }
+
     aml_append(dsdt, scope);
 
     pci0_scope = aml_scope("\\_SB.PCI0");
-- 
2.43.0




* [PULL 39/61] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (37 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 38/61] hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 40/61] hw/arm/tegra241-cmdqv: Document the CMDQV design and lifecycle Peter Maydell
                   ` (22 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

When CMDQV is active, the first cold-plugged VFIO device establishes
the viommu to host SMMUv3 association, and the guest's boot-time CMDQV
configuration (VINTFs, VCMDQs) is built on top of that association.

Hot-unplugging that device would release the viommu and tear down all
CMDQV state. Hot-plugging another device behind a different host
SMMUv3+CMDQV would then re-bind the same vSMMUv3 to new host hardware,
while the guest keeps using its boot-time configuration and ends up
issuing commands to the wrong host. Block hot-unplug of the
establishing device to avoid this; retaining the binding across unplug
is non-trivial and not required by any current use case.

Also exit with an error at machine_done if cmdqv=on is requested but no
cold-plugged VFIO device was present to initialize it.
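
The machine_done condition reduces to a simple predicate, sketched below with
hypothetical names (CmdqvOpt as a simplified mirror of QEMU's OnOffAuto, and
cmdqv_config_ok()); the real code reports an error and exits instead of
returning a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical mirror of an on/off/auto property. */
typedef enum CmdqvOpt {
    CMDQV_OPT_AUTO,
    CMDQV_OPT_ON,
    CMDQV_OPT_OFF,
} CmdqvOpt;

/*
 * The configuration is rejected only when CMDQV was explicitly requested
 * but no instance was initialized by a cold-plugged device.
 */
static bool cmdqv_config_ok(CmdqvOpt requested, const void *cmdqv_instance)
{
    return !(requested == CMDQV_OPT_ON && cmdqv_instance == NULL);
}
```

Under this model, auto and off never fail the check on their own; only an
explicit "on" turns a missing CMDQV instance into a startup error.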

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-30-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3-accel.c | 18 ++++++++++++++++++
 hw/arm/smmuv3-accel.h |  1 +
 2 files changed, 19 insertions(+)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 9c3bd4413d..80900c2521 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -759,6 +759,18 @@ static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
         return false;
     }
 
+    /*
+     * CMDQV is active: block hot-unplug of the device that established the
+     * viommu association. Removing it would allow the vIOMMU to host SMMUv3
+     * association to be changed via a later device hot-plug.
+     */
+    if (s->s_accel->cmdqv_ops) {
+        PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
+        error_setg(&accel_dev->unplug_blocker,
+                   "CMDQV is active: removing the device that established the "
+                   "viommu association would break the guest CMDQV");
+        qdev_add_unplug_blocker(DEVICE(pdev), accel_dev->unplug_blocker);
+    }
 done:
     accel_dev->hiodi = hiodi;
     accel_dev->s_accel = s->s_accel;
@@ -1082,6 +1094,12 @@ static void smmuv3_accel_machine_done(Notifier *notifier, void *data)
                      "at least one cold-plugged VFIO device");
         exit(1);
     }
+
+    if (s->cmdqv == ON_OFF_AUTO_ON && !accel->cmdqv) {
+        error_report("arm-smmuv3 cmdqv=on requires at least one cold-plugged "
+                     "VFIO device");
+        exit(1);
+    }
 }
 
 bool smmuv3_accel_init(SMMUv3State *s, Error **errp)
diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index 5fc85fb89d..dd755c394d 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -84,6 +84,7 @@ typedef struct SMMUv3AccelDevice {
     IOMMUFDVdev *vdev;
     QLIST_ENTRY(SMMUv3AccelDevice) next;
     SMMUv3AccelState *s_accel;
+    Error *unplug_blocker; /* set when CMDQV is active to block hot-unplug */
 } SMMUv3AccelDevice;
 
 bool smmuv3_accel_init(SMMUv3State *s, Error **errp);
-- 
2.43.0




* [PULL 40/61] hw/arm/tegra241-cmdqv: Document the CMDQV design and lifecycle
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (38 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 39/61] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 41/61] hw/arm/smmuv3: Add cmdqv property for SMMUv3 device Peter Maydell
                   ` (21 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Add an overview describing the Tegra241 CMDQV passthrough model, MMIO
layout, guest-driven lifecycle, and per-VM isolation.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Message-id: 20260609112552.378999-31-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/tegra241-cmdqv.c | 100 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
index d821eafca1..29c488e0e4 100644
--- a/hw/arm/tegra241-cmdqv.c
+++ b/hw/arm/tegra241-cmdqv.c
@@ -7,6 +7,106 @@
  * SPDX-License-Identifier: GPL-2.0-or-later
  */
 
+/*
+ * Tegra241 CMDQV - overview
+ * =========================
+ *
+ * NVIDIA Tegra241 extends SMMUv3 with a Command Queue Virtualization (CMDQ-V)
+ * block. It lets a guest issue SMMU invalidation commands directly to
+ * dedicated hardware queues (vCMDQs) without trapping into the hypervisor on
+ * the fast path. vCMDQs are exclusively allocated to Virtual Interfaces
+ * (VINTFs); the host kernel allocates one VINTF per emulated SMMUv3 instance
+ * via iommufd. QEMU emulates the CMDQV MMIO region and drives the host kernel
+ * calls (VIOMMU_ALLOC, HW_QUEUE_ALLOC, mmap); the actual command processing
+ * happens on real hardware.
+ *
+ * A vCMDQ becomes functional only once allocated to the host VINTF; until then
+ * no command processing happens, and trapped register accesses fall back to a
+ * QEMU-side cache. After allocation, the cached register state is migrated to
+ * the hardware and command processing runs on the host; guest accesses to the
+ * live control/status registers then bypass QEMU and reach the host directly.
+ *
+ * MMIO layout (64KB pages, total TEGRA241_CMDQV_IO_LEN)
+ * -----------------------------------------------------
+ *   0x00000  CMDQV Config page: QEMU-trapped.
+ *   0x10000  Direct vCMDQ Page 0 (control/status): QEMU-trapped and routed
+ *            to either the mmap'd host VINTF Page 0 (if the vCMDQ has been
+ *            allocated to a VINTF) or a per-vCMDQ register cache (otherwise).
+ *   0x20000  Direct vCMDQ Page 1 (BASE / DRAM addresses): QEMU-trapped.
+ *   0x30000  VINTF Page 0 (per-VINTF control/status): the guest's virtual
+ *            VINTF Page 0 aperture, backed by the host VINTF Page 0 (mmap'd
+ *            via iommufd) and installed into guest MMIO as a RAM-device
+ *            subregion when VINTF is enabled; subsequent accesses bypass QEMU.
+ *   0x40000  VINTF Page 1 (per-VINTF BASE): QEMU-trapped. Although this is
+ *            a HW alias of the direct Page 1, the kernel only exposes mmap
+ *            for the host VINTF Page 0; the host VINTF Page 1 is not mmap'd
+ *            and stays trapped.
+ *
+ * The direct vCMDQ apertures (0x10000/0x20000) are HW aliases of the VINTF
+ * apertures (0x30000/0x40000); they expose the same per-vCMDQ register slots
+ * under different addressing.
+ *
+ * The direct vCMDQ Page 0 stays trapped rather than aliased to the host VINTF
+ * Page 0 mmap. The CMDQV architecture allows software to program a vCMDQ
+ * through the direct aperture before allocating it to a VINTF; aliasing to
+ * the host VINTF Page 0 mmap would route those accesses into unallocated
+ * logical slots where the hardware silently drops them, so trapping keeps
+ * accesses well-defined for an unallocated vCMDQ.
+ *
+ * Lifecycle (driven by guest events)
+ * ----------------------------------
+ * 1. First vfio-pci device attach (.set_iommu_device) triggers:
+ *    - tegra241_cmdqv_probe(): IOMMU_GET_HW_INFO confirms host CMDQV support.
+ *    - IOMMU_VIOMMU_ALLOC: the kernel allocates and enables a VINTF for this
+ *      VM, configures the VM's VMID (from its stage-2 HWPT) in VINTF_CONFIG,
+ *      forces HYP_OWN=0, and returns the mmap offset/length for the host
+ *      VINTF Page 0, which QEMU then mmap()s.
+ *
+ * 2. Guest writes VINTF_CONFIG.ENABLE = 1:
+ *    QEMU installs the mmap'd host VINTF Page 0 into guest MMIO as the guest's
+ *    virtual VINTF Page 0 aperture (a RAM-device subregion) and reports
+ *    STATUS.ENABLE_OK = 1. The aperture is now a direct window onto the host
+ *    page, so accesses no longer trap into QEMU; a vCMDQ within it operates as
+ *    a real command queue only once it has been allocated (step 3).
+ *
+ * 3. Guest completes vCMDQ setup (BASE, CMDQ_ALLOC_MAP.ALLOC, CMDQV_EN,
+ *    VINTF.ENABLE, in any order; each precondition write retries the HW queue
+ *    allocation):
+ *    IOMMU_HW_QUEUE_ALLOC grants the guest a new host vCMDQ in this VM's
+ *    VINTF, binding the guest BASE GPA (translated through stage-2 and pinned
+ *    by the kernel) to it.
+ *
+ * 4. Guest SMMU driver programs a Stream Table Entry for a passthrough
+ *    device: IOMMU_VDEVICE_ALLOC programs SID_MATCH/SID_REPLACE in this VM's
+ *    VINTF so that the HW translates the device's guest vSID into its host
+ *    pSID. Commands referencing unmapped SIDs are rejected by HW.
+ *
+ *    This reflects the current accel SMMUv3 design, which allocates the
+ *    vDEVICE when the guest programs the STE.
+ *
+ * Per-VM isolation
+ * ----------------
+ * - Each VM has its own iommufd FD; all iommufd objects (VINTF, vdevices,
+ *   hw_queues, mmap regions) belong to that FD. Cross-FD lookups fail, so
+ *   one VM cannot reach another VM's IDs.
+ * - IOMMU_VIOMMU_ALLOC configures the VM's VMID in VINTF_CONFIG; the CMDQV
+ *   hardware substitutes / checks VMID on every command the guest issues.
+ * - The kernel allocates the VINTF with HYP_OWN = 0, which restricts the
+ *   guest to a safe subset of commands.
+ * - IOMMU_VDEVICE_ALLOC populates SID_MATCH/SID_REPLACE so invalidations
+ *   only reach the host StreamIDs assigned to this VM (see step 4).
+ * - IOMMU_HW_QUEUE_ALLOC binds each vCMDQ to a single VINTF, so a guest
+ *   cannot reach a vCMDQ that belongs to another VM.
+ *
+ * Limits exposed to the guest
+ * ---------------------------
+ * One VINTF per emulated SMMUv3 and two vCMDQs per VINTF. The HW maximum
+ * vCMDQ size is 8MiB, but the size QEMU exposes to the guest may be smaller.
+ * The queue must be physically contiguous in host memory, so QEMU caps the
+ * exposed size to the host memory-backend page size. Use hugepage backing to
+ * reach the 8MiB maximum.
+ */
+
 #include "qemu/osdep.h"
 #include "qemu/log.h"
 
-- 
2.43.0




* [PULL 41/61] hw/arm/smmuv3: Add cmdqv property for SMMUv3 device
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (39 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 40/61] hw/arm/tegra241-cmdqv: Document the CMDQV design and lifecycle Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 42/61] target/arm: honour CCR.BFHFNMIGN for probed data BusFaults Peter Maydell
                   ` (20 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Shameer Kolothum <skolothumtho@nvidia.com>

Introduce a "cmdqv" property to enable Tegra241 CMDQV support.
This is only enabled for accelerated SMMUv3 devices.
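
The interaction between "accel" and the tri-state "cmdqv" property can
be summarised by the standalone sketch below. It is an illustration, not
the code in this patch: the sketch_* names are hypothetical, and
treating cmdqv=on on a host without CMDQV as an error is an assumption.
The accel=off cases follow the validation check and the qemu-options.hx
text in the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for QEMU's OnOffAuto values. */
typedef enum { SKETCH_AUTO, SKETCH_ON, SKETCH_OFF } SketchOnOffAuto;

typedef enum {
    SKETCH_CMDQV_DISABLED,
    SKETCH_CMDQV_ENABLED,
    SKETCH_CMDQV_ERROR,
} SketchCmdqvResult;

/* Resolve the effective CMDQV state from the user-visible properties. */
static inline SketchCmdqvResult
sketch_resolve_cmdqv(bool accel, SketchOnOffAuto cmdqv, bool host_has_cmdqv)
{
    if (!accel) {
        /* cmdqv=on is rejected without accel=on; auto resolves to off. */
        return cmdqv == SKETCH_ON ? SKETCH_CMDQV_ERROR
                                  : SKETCH_CMDQV_DISABLED;
    }

    switch (cmdqv) {
    case SKETCH_OFF:
        return SKETCH_CMDQV_DISABLED;
    case SKETCH_ON:
        /* Assumption: an explicit "on" fails if the host lacks CMDQV. */
        return host_has_cmdqv ? SKETCH_CMDQV_ENABLED : SKETCH_CMDQV_ERROR;
    case SKETCH_AUTO:
    default:
        /* With accel=on, auto is derived from the host SMMU. */
        return host_has_cmdqv ? SKETCH_CMDQV_ENABLED
                              : SKETCH_CMDQV_DISABLED;
    }
}
```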

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20260609112552.378999-32-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/smmuv3.c | 8 ++++++++
 qemu-options.hx | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index a8fb10e531..5e5a6a960c 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1994,6 +1994,10 @@ static bool smmu_validate_property(SMMUv3State *s, Error **errp)
                        "bits if accel=on");
             return false;
         }
+        if (s->cmdqv == ON_OFF_AUTO_ON) {
+            error_setg(errp, "cmdqv can only be enabled if accel=on");
+            return false;
+        }
         return true;
     }
 
@@ -2144,6 +2148,7 @@ static const Property smmuv3_properties[] = {
     DEFINE_PROP_OAS_MODE("oas", SMMUv3State, oas, OAS_MODE_AUTO),
     DEFINE_PROP_SSIDSIZE_MODE("ssidsize", SMMUv3State, ssidsize,
                               SSID_SIZE_MODE_AUTO),
+    DEFINE_PROP_ON_OFF_AUTO("cmdqv", SMMUv3State, cmdqv, ON_OFF_AUTO_AUTO),
 };
 
 static void smmuv3_instance_init(Object *obj)
@@ -2194,6 +2199,9 @@ static void smmuv3_class_init(ObjectClass *klass, const void *data)
         "than 0 is required to enable PASID support."
         "Please ensure the value does not exceed the maximum "
         "SubstreamID size supported by the host platform.");
+    object_class_property_set_description(klass, "cmdqv",
+        "Enable/disable CMDQV support (for accel=on). "
+        "Valid values are on, off, and auto. Defaults to auto.");
 }
 
 static int smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
diff --git a/qemu-options.hx b/qemu-options.hx
index a5979d0a5b..c799286153 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1329,6 +1329,14 @@ SRST
 
         - With accel=off, auto is resolved to 0.
 
+    ``cmdqv=on|off|auto`` (default: auto)
+        Enable hardware Command Queue Virtualization (CMDQV) for the
+        SMMUv3 command queue. Currently only the NVIDIA Tegra241 CMDQV
+        implementation is supported.
+
+        - With accel=on, auto means the value is automatically derived from the host SMMU.
+        - With accel=off, auto is resolved to 'off'.
+
 ``-device amd-iommu[,option=...]``
     Enables emulation of an AMD-Vi I/O Memory Management Unit (IOMMU).
     Only available with ``-machine q35``, it supports the following options:
-- 
2.43.0




* [PULL 42/61] target/arm: honour CCR.BFHFNMIGN for probed data BusFaults
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (40 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 41/61] hw/arm/smmuv3: Add cmdqv property for SMMUv3 device Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 43/61] hw/arm/bcm2838: Route I2C interrupts to GIC Peter Maydell
                   ` (19 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Kyle Fox <kylefoxaustin.github@gmail.com>

M-profile CCR.BFHFNMIGN lets software executing at a negative execution
priority (in HardFault/NMI, or with FAULTMASK set) suppress precise data
BusFaults caused by load/store instructions: the access completes
returning UNKNOWN data, the fault status is recorded in BFSR/BFAR, but no
BusFault exception is taken. Software uses this to probe for the presence
of a device.

QEMU stored CCR.BFHFNMIGN but never consumed it:
arm_cpu_do_transaction_failed() always raised the external abort, which
arm_v7m_cpu_do_interrupt() pended as a BusFault and then escalated to a
HardFault it could not take at priority -1, aborting the VM with
"Lockup: can't escalate 3 to HardFault".

Honour the bit in arm_cpu_do_transaction_failed(): when the access is a
data access from M-profile code at negative priority with BFHFNMIGN set,
record PRECISERR/BFARVALID and BFAR and return without raising, so the
faulting instruction completes instead of re-faulting forever. Instruction
fetches are unaffected, since BFHFNMIGN applies only to data accesses.

The SG instruction's stack-word load is also an AccType_NORMAL data access
that must honour BFHFNMIGN, but QEMU performs it manually in
v7m_read_sg_stack_word() (outside the TCG TLB, so it never reaches
arm_cpu_do_transaction_failed()). Apply the same suppression there: on a
BusFault, record the status and, when BFHFNMIGN is set at negative
priority, return the UNKNOWN data instead of pending ARMV7M_EXCP_BUS. The
remaining manual EXCP_BUS sites (vector-table loads, stacking, unstacking)
are AccType_VECTABLE/STACK/UNSTACK and are not required to honour the bit,
so they are left unchanged.
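
The suppression rule described above can be modelled outside QEMU with
the standalone sketch below. It is an illustration only: the SketchM
struct and sketch_* names are hypothetical, and the model ignores
security state banking. The bit positions follow the architectural CCR
and CFSR layouts (BFHFNMIGN is CCR bit 8; PRECISERR and BFARVALID are
CFSR bits 9 and 15).

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CCR_BFHFNMIGN  (1u << 8)
#define SKETCH_CFSR_PRECISERR (1u << 9)
#define SKETCH_CFSR_BFARVALID (1u << 15)

/* Hypothetical, heavily reduced model of the M-profile fault state. */
typedef struct {
    uint32_t ccr;
    uint32_t cfsr;
    uint32_t bfar;
    bool neg_prio; /* in HardFault/NMI, or FAULTMASK set */
} SketchM;

/*
 * Returns true if the BusFault is suppressed: the fault status is
 * recorded and the caller lets the access complete without pending an
 * exception. Returns false if the normal BusFault path must run.
 */
static inline bool sketch_try_suppress_busfault(SketchM *m, uint32_t addr,
                                                bool is_insn_fetch)
{
    if (is_insn_fetch) {
        return false; /* BFHFNMIGN applies only to data accesses */
    }
    if (!(m->ccr & SKETCH_CCR_BFHFNMIGN) || !m->neg_prio) {
        return false;
    }
    m->cfsr |= SKETCH_CFSR_PRECISERR | SKETCH_CFSR_BFARVALID;
    m->bfar = addr;
    return true;
}
```

A firmware probe then reduces to setting BFHFNMIGN and FAULTMASK,
performing the access, and testing BFARVALID afterwards, which is the
sequence the commit message attributes to SystemMemoryProbe().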

This surfaced running the real NXP i.MX 95 System Manager firmware on the
emulated Cortex-M33: its SystemMemoryProbe() (set BFHFNMIGN + FAULTMASK,
do the access, test CFSR.BFARVALID) locked up the VM. With this change the
SM's debug-monitor memory-probe commands run and recover correctly.

Signed-off-by: Kyle Fox <kylefoxaustin.github@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: minor tweak to v7m_read_sg_stack_word() code]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/m_helper.c   | 16 ++++++++++++++--
 target/arm/tcg/tlb_helper.c | 24 ++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/target/arm/tcg/m_helper.c b/target/arm/tcg/m_helper.c
index c5a553a5d4..f4ba93b291 100644
--- a/target/arm/tcg/m_helper.c
+++ b/target/arm/tcg/m_helper.c
@@ -2086,8 +2086,20 @@ static bool v7m_read_sg_stack_word(ARMCPU *cpu, ARMMMUIdx mmu_idx,
         env->v7m.cfsr[M_REG_NS] |=
             (R_V7M_CFSR_PRECISERR_MASK | R_V7M_CFSR_BFARVALID_MASK);
         env->v7m.bfar = addr;
-        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_BUS, false);
-        return false;
+        /*
+         * The SG instruction's stack-word load is an AccType_NORMAL data
+         * access, so CCR.BFHFNMIGN applies: at negative execution priority
+         * with BFHFNMIGN set, the BusFault is suppressed -- the access
+         * completes returning UNKNOWN data (status recorded above), with no
+         * BusFault exception pended.
+         */
+        if (!((env->v7m.ccr[M_REG_NS] & R_V7M_CCR_BFHFNMIGN_MASK) &&
+            armv7m_nvic_neg_prio_requested(env->nvic, env->v7m.secure))) {
+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_BUS, false);
+            return false;
+        }
+        /* BusFault suppressed; data value is UNKNOWN, we choose 0 */
+        value = 0;
     }
 
     *spdata = value;
diff --git a/target/arm/tcg/tlb_helper.c b/target/arm/tcg/tlb_helper.c
index f90765cb59..cbef9cb03e 100644
--- a/target/arm/tcg/tlb_helper.c
+++ b/target/arm/tcg/tlb_helper.c
@@ -10,6 +10,7 @@
 #include "helper.h"
 #include "internals.h"
 #include "cpu-features.h"
+#include "hw/intc/armv7m_nvic.h"
 
 /*
  * Returns true if the stage 1 translation regime is using LPAE format page
@@ -318,8 +319,31 @@ void arm_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
                                    MemTxResult response, uintptr_t retaddr)
 {
     ARMCPU *cpu = ARM_CPU(cs);
+    CPUARMState *env = &cpu->env;
     ARMMMUFaultInfo fi = {};
 
+    /*
+     * For M-profile, CCR.BFHFNMIGN lets software executing at a negative
+     * priority (in HardFault/NMI, or with FAULTMASK set) suppress precise
+     * data BusFaults from load/store instructions: the access completes
+     * returning UNKNOWN data (the store is dropped), the fault status is
+     * recorded in BFSR/BFAR, but no BusFault exception is taken. This is
+     * the mechanism software uses to probe for the presence of a device
+     * (e.g. the NXP System Manager's SystemMemoryProbe). Honour it by
+     * recording the status and returning without raising, so the faulting
+     * instruction completes rather than re-faulting forever. BFHFNMIGN
+     * applies only to data accesses, so instruction fetches are unaffected.
+     */
+    if (arm_feature(env, ARM_FEATURE_M) &&
+        access_type != MMU_INST_FETCH &&
+        (env->v7m.ccr[M_REG_NS] & R_V7M_CCR_BFHFNMIGN_MASK) &&
+        armv7m_nvic_neg_prio_requested(env->nvic, env->v7m.secure)) {
+        env->v7m.cfsr[M_REG_NS] |=
+            (R_V7M_CFSR_PRECISERR_MASK | R_V7M_CFSR_BFARVALID_MASK);
+        env->v7m.bfar = addr;
+        return;
+    }
+
     /* now we have a real cpu fault */
     cpu_restore_state(cs, retaddr);
 
-- 
2.43.0




* [PULL 43/61] hw/arm/bcm2838: Route I2C interrupts to GIC
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (41 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 42/61] target/arm: honour CCR.BFHFNMIGN for probed data BusFaults Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 44/61] target/arm: Add feature predicates for SVE2.2 and SME2.2 Peter Maydell
                   ` (18 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Nicholas Righi <nicholasrighi@gmail.com>

The I2C interrupts are only routed to the legacy interrupt controller,
so they do not work with modern device trees that use the GIC. This
patch adds a splitter that routes the I2C interrupt to both the legacy
interrupt controller and the GIC.
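
The resulting wiring (I2C controllers into an OR gate, OR gate into a
two-way splitter) can be modelled with the standalone sketch below. It
is not QEMU code: the SketchI2cIrq type and sketch_* names are
hypothetical, and the count of three input lines is assumed from the
i2c[3] array in the peripheral state.

```c
#include <stdbool.h>

#define SKETCH_I2C_COUNT     3 /* assumption: one line per i2c[] entry */
#define SKETCH_SPLIT_OUTPUTS 2 /* output 0: legacy IC, output 1: GIC */

typedef struct {
    bool i2c_level[SKETCH_I2C_COUNT]; /* inputs to the OR gate */
    bool out[SKETCH_SPLIT_OUTPUTS];   /* outputs of the splitter */
} SketchI2cIrq;

/* Update one I2C line and propagate through the OR gate and splitter. */
static inline void sketch_i2c_set_irq(SketchI2cIrq *s, int n, bool level)
{
    bool ored = false;

    s->i2c_level[n] = level;
    for (int i = 0; i < SKETCH_I2C_COUNT; i++) {
        ored = ored || s->i2c_level[i];
    }
    /* The splitter copies the OR-ed level to every output. */
    for (int o = 0; o < SKETCH_SPLIT_OUTPUTS; o++) {
        s->out[o] = ored;
    }
}
```

In this model both interrupt controllers always see the same level, so a
guest using either the legacy controller or the GIC observes the I2C
interrupt.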

Testing

Add these lines to the QEMU invocation:

-drive if=none,id=i2c_storage,format=raw,file=eeprom.bin \
-device at24c-eeprom,bus=i2c-bus.1,address=0x50,drive=i2c_storage,rom-size=4096 \

Note: eeprom.bin is all zeros.

Before this change, running i2cget to read from the EEPROM fails:

i2cget -y 1 0x50
Error: Read failed

After this change, running i2cget to read from the EEPROM succeeds:

i2cget -y 1 0x50
0x00

The EEPROM can now also be enabled in the device tree. Previously the
EEPROM driver failed to load because the read failed.

ls -l /sys/bus/i2c/devices/i2c-1/1-0050/ | grep -i eeprom
-rw------- 1 root root 4096 May 17 16:57 eeprom

Signed-off-by: Nicholas Righi <nicholasrighi@gmail.com>
Message-id: 20260609024027.22140-1-nicholasrighi@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@oss.qualcomm.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/bcm2835_peripherals.c         | 9 +++++++++
 hw/arm/bcm2838.c                     | 4 ++++
 include/hw/arm/bcm2835_peripherals.h | 2 ++
 include/hw/arm/bcm2838_peripherals.h | 1 +
 4 files changed, 16 insertions(+)

diff --git a/hw/arm/bcm2835_peripherals.c b/hw/arm/bcm2835_peripherals.c
index 8a1e72dfab..558c180df9 100644
--- a/hw/arm/bcm2835_peripherals.c
+++ b/hw/arm/bcm2835_peripherals.c
@@ -179,6 +179,8 @@ static void raspi_peripherals_base_init(Object *obj)
                             &s->orgated_i2c_irq, TYPE_OR_IRQ);
     object_property_set_int(OBJECT(&s->orgated_i2c_irq), "num-lines",
                             ORGATED_I2C_IRQ_COUNT, &error_abort);
+    object_initialize_child(obj, "orgated-i2c-irq-splitter",
+                            &s->orgated_i2c_irq_splitter, TYPE_SPLIT_IRQ);
 }
 
 static void bcm2835_peripherals_realize(DeviceState *dev, Error **errp)
@@ -504,7 +506,14 @@ void bcm_soc_peripherals_common_realize(DeviceState *dev, Error **errp)
         sysbus_connect_irq(SYS_BUS_DEVICE(&s->i2c[n]), 0,
                            qdev_get_gpio_in(DEVICE(&s->orgated_i2c_irq), n));
     }
+
+    qdev_prop_set_uint32(DEVICE(&s->orgated_i2c_irq_splitter), "num-lines", 2);
+    if (!qdev_realize(DEVICE(&s->orgated_i2c_irq_splitter), NULL, errp)) {
+        return;
+    }
     qdev_connect_gpio_out(DEVICE(&s->orgated_i2c_irq), 0,
+                          qdev_get_gpio_in(DEVICE(&s->orgated_i2c_irq_splitter), 0));
+    qdev_connect_gpio_out(DEVICE(&s->orgated_i2c_irq_splitter), 0,
                           qdev_get_gpio_in_named(DEVICE(&s->ic),
                                                  BCM2835_IC_GPU_IRQ,
                                                  INTERRUPT_I2C));
diff --git a/hw/arm/bcm2838.c b/hw/arm/bcm2838.c
index c14a854046..089af412a3 100644
--- a/hw/arm/bcm2838.c
+++ b/hw/arm/bcm2838.c
@@ -184,6 +184,10 @@ static void bcm2838_realize(DeviceState *dev, Error **errp)
     sysbus_connect_irq(SYS_BUS_DEVICE(&ps_base->aux), 0,
                        qdev_get_gpio_in(gicdev, GIC_SPI_INTERRUPT_AUX_UART1));
 
+    /* Connect the I2C interrupt to the interrupt controller */
+    qdev_connect_gpio_out(DEVICE(&ps_base->orgated_i2c_irq_splitter), 1,
+                          qdev_get_gpio_in(gicdev, GIC_SPI_INTERRUPT_I2C));
+
     /* Connect VC mailbox to the interrupt controller */
     sysbus_connect_irq(SYS_BUS_DEVICE(&ps_base->mboxes), 0,
                        qdev_get_gpio_in(gicdev, GIC_SPI_INTERRUPT_MBOX));
diff --git a/include/hw/arm/bcm2835_peripherals.h b/include/hw/arm/bcm2835_peripherals.h
index bf35bb18e5..4f356f4643 100644
--- a/include/hw/arm/bcm2835_peripherals.h
+++ b/include/hw/arm/bcm2835_peripherals.h
@@ -33,6 +33,7 @@
 #include "hw/usb/hcd-dwc2.h"
 #include "hw/ssi/bcm2835_spi.h"
 #include "hw/i2c/bcm2835_i2c.h"
+#include "hw/core/split-irq.h"
 #include "hw/nvram/bcm2835_otp.h"
 #include "hw/misc/unimp.h"
 #include "qom/object.h"
@@ -72,6 +73,7 @@ struct BCMSocPeripheralBaseState {
     BCM2835SPIState spi[1];
     BCM2835I2CState i2c[3];
     OrIRQState orgated_i2c_irq;
+    SplitIRQ orgated_i2c_irq_splitter;
     BCM2835OTPState otp;
     UnimplementedDeviceState dbus;
     UnimplementedDeviceState ave0;
diff --git a/include/hw/arm/bcm2838_peripherals.h b/include/hw/arm/bcm2838_peripherals.h
index 7ee1bd066f..0be97e67c7 100644
--- a/include/hw/arm/bcm2838_peripherals.h
+++ b/include/hw/arm/bcm2838_peripherals.h
@@ -22,6 +22,7 @@
 #define GIC_SPI_INTERRUPT_DMA_7_8      87
 #define GIC_SPI_INTERRUPT_DMA_9_10     88
 #define GIC_SPI_INTERRUPT_AUX_UART1    93
+#define GIC_SPI_INTERRUPT_I2C          117
 #define GIC_SPI_INTERRUPT_SDHOST       120
 #define GIC_SPI_INTERRUPT_UART0        121
 #define GIC_SPI_INTERRUPT_RNG200       125
-- 
2.43.0




* [PULL 44/61] target/arm: Add feature predicates for SVE2.2 and SME2.2
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (42 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 43/61] hw/arm/bcm2838: Route I2C interrupts to GIC Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 45/61] target/arm: Rename sve unary predicated patterns Peter Maydell
                   ` (17 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu-features.h | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu-features.h b/target/arm/cpu-features.h
index 9e70d30964..a80b251589 100644
--- a/target/arm/cpu-features.h
+++ b/target/arm/cpu-features.h
@@ -1516,7 +1516,12 @@ static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id)
 
 static inline bool isar_feature_aa64_sve2p1(const ARMISARegisters *id)
 {
-    return FIELD_EX64_IDREG(id, ID_AA64ZFR0, SVEVER) >=2;
+    return FIELD_EX64_IDREG(id, ID_AA64ZFR0, SVEVER) >= 2;
+}
+
+static inline bool isar_feature_aa64_sve2p2(const ARMISARegisters *id)
+{
+    return FIELD_EX64_IDREG(id, ID_AA64ZFR0, SVEVER) >= 3;
 }
 
 static inline bool isar_feature_aa64_sve2_aes(const ARMISARegisters *id)
@@ -1625,6 +1630,11 @@ static inline bool isar_feature_aa64_sme2p1(const ARMISARegisters *id)
     return FIELD_EX64_IDREG(id, ID_AA64SMFR0, SMEVER) >= 2;
 }
 
+static inline bool isar_feature_aa64_sme2p2(const ARMISARegisters *id)
+{
+    return FIELD_EX64_IDREG(id, ID_AA64SMFR0, SMEVER) >= 3;
+}
+
 static inline bool isar_feature_aa64_f8cvt(const ARMISARegisters *id)
 {
     return FIELD_EX64_IDREG(id, ID_AA64FPFR0, F8CVT);
@@ -1688,6 +1698,11 @@ static inline bool isar_feature_aa64_sme2p1_or_sve2p1(const ARMISARegisters *id)
     return isar_feature_aa64_sme2p1(id) || isar_feature_aa64_sve2p1(id);
 }
 
+static inline bool isar_feature_aa64_sme2p2_or_sve2p2(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_sme2p2(id) || isar_feature_aa64_sve2p2(id);
+}
+
 static inline bool isar_feature_aa64_sme2_i16i64(const ARMISARegisters *id)
 {
     return isar_feature_aa64_sme2(id) && isar_feature_aa64_sme_i16i64(id);
-- 
2.43.0




* [PULL 45/61] target/arm: Rename sve unary predicated patterns
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (43 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 44/61] target/arm: Add feature predicates for SVE2.2 and SME2.2 Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 46/61] target/arm: Enable zeroing in DO_ZPZ macros in sve_helper.c Peter Maydell
                   ` (16 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

Add an "_m" suffix to indicate merging, in preparation for
adding new predicated zeroing instructions.
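
For context, the difference between merging and zeroing predication can
be shown with the standalone sketch below, using a bitwise NOT as the
unary operation. It is an illustration, not QEMU's helper code: the
sketch_* functions are hypothetical, the "_z" name for the zeroing form
is an assumption, and a plain bool array stands in for the governing
predicate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Merging: inactive elements keep the previous destination value. */
static inline void sketch_not_m(uint32_t *d, const uint32_t *n,
                                const bool *pg, size_t elems)
{
    for (size_t i = 0; i < elems; i++) {
        if (pg[i]) {
            d[i] = ~n[i];
        }
    }
}

/* Zeroing: inactive elements are written as zero. */
static inline void sketch_not_z(uint32_t *d, const uint32_t *n,
                                const bool *pg, size_t elems)
{
    for (size_t i = 0; i < elems; i++) {
        d[i] = pg[i] ? ~n[i] : 0;
    }
}
```

The two forms differ only in what happens to inactive elements, which is
why the existing patterns need a suffix once a second form shares the
same mnemonic.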

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      | 152 ++++++++++++++---------------
 target/arm/tcg/translate-sve.c | 169 +++++++++++++++++----------------
 2 files changed, 165 insertions(+), 156 deletions(-)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index b53fe6a58f..c7e633ec4f 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -392,24 +392,24 @@ LSL_zpzw        00000100 .. 011 011 100 ... ..... .....         @rdn_pg_rm
 
 # SVE unary bit operations (predicated)
 # Note esz != 0 for FABS and FNEG.
-CLS             00000100 .. 011 000 101 ... ..... .....         @rd_pg_rn
-CLZ             00000100 .. 011 001 101 ... ..... .....         @rd_pg_rn
-CNT_zpz         00000100 .. 011 010 101 ... ..... .....         @rd_pg_rn
-CNOT            00000100 .. 011 011 101 ... ..... .....         @rd_pg_rn
-NOT_zpz         00000100 .. 011 110 101 ... ..... .....         @rd_pg_rn
-FABS            00000100 .. 011 100 101 ... ..... .....         @rd_pg_rn
-FNEG            00000100 .. 011 101 101 ... ..... .....         @rd_pg_rn
+CLS_m           00000100 .. 011 000 101 ... ..... .....         @rd_pg_rn
+CLZ_m           00000100 .. 011 001 101 ... ..... .....         @rd_pg_rn
+CNT_zpz_m       00000100 .. 011 010 101 ... ..... .....         @rd_pg_rn
+CNOT_m          00000100 .. 011 011 101 ... ..... .....         @rd_pg_rn
+NOT_zpz_m       00000100 .. 011 110 101 ... ..... .....         @rd_pg_rn
+FABS_m          00000100 .. 011 100 101 ... ..... .....         @rd_pg_rn
+FNEG_m          00000100 .. 011 101 101 ... ..... .....         @rd_pg_rn
 
 # SVE integer unary operations (predicated)
 # Note esz > original size for extensions.
-ABS             00000100 .. 010 110 101 ... ..... .....         @rd_pg_rn
-NEG             00000100 .. 010 111 101 ... ..... .....         @rd_pg_rn
-SXTB            00000100 .. 010 000 101 ... ..... .....         @rd_pg_rn
-UXTB            00000100 .. 010 001 101 ... ..... .....         @rd_pg_rn
-SXTH            00000100 .. 010 010 101 ... ..... .....         @rd_pg_rn
-UXTH            00000100 .. 010 011 101 ... ..... .....         @rd_pg_rn
-SXTW            00000100 .. 010 100 101 ... ..... .....         @rd_pg_rn
-UXTW            00000100 .. 010 101 101 ... ..... .....         @rd_pg_rn
+ABS_m           00000100 .. 010 110 101 ... ..... .....         @rd_pg_rn
+NEG_m           00000100 .. 010 111 101 ... ..... .....         @rd_pg_rn
+SXTB_m          00000100 .. 010 000 101 ... ..... .....         @rd_pg_rn
+UXTB_m          00000100 .. 010 001 101 ... ..... .....         @rd_pg_rn
+SXTH_m          00000100 .. 010 010 101 ... ..... .....         @rd_pg_rn
+UXTH_m          00000100 .. 010 011 101 ... ..... .....         @rd_pg_rn
+SXTW_m          00000100 .. 010 100 101 ... ..... .....         @rd_pg_rn
+UXTW_m          00000100 .. 010 101 101 ... ..... .....         @rd_pg_rn
 
 ### SVE Floating Point Compare - Vectors Group
 
@@ -707,11 +707,11 @@ CPY_m_r         00000101 .. 101000 101 ... ..... .....          @rd_pg_rn
 
 # SVE reverse within elements
 # Note esz >= operation size
-REVB            00000101 .. 1001 00 100 ... ..... .....         @rd_pg_rn
-REVH            00000101 .. 1001 01 100 ... ..... .....         @rd_pg_rn
-REVW            00000101 .. 1001 10 100 ... ..... .....         @rd_pg_rn
-RBIT            00000101 .. 1001 11 100 ... ..... .....         @rd_pg_rn
-REVD            00000101 00 1011 10 100 ... ..... .....         @rd_pg_rn_e0
+REVB_m          00000101 .. 1001 00 100 ... ..... .....         @rd_pg_rn
+REVH_m          00000101 .. 1001 01 100 ... ..... .....         @rd_pg_rn
+REVW_m          00000101 .. 1001 10 100 ... ..... .....         @rd_pg_rn
+RBIT_m          00000101 .. 1001 11 100 ... ..... .....         @rd_pg_rn
+REVD_m          00000101 00 1011 10 100 ... ..... .....         @rd_pg_rn_e0
 
 # SVE vector splice (predicated, destructive)
 SPLICE          00000101 .. 101 100 100 ... ..... .....         @rdn_pg_rm
@@ -1184,59 +1184,59 @@ FNMLS_zpzzz     01100101 .. 1 ..... 111 ... ..... .....         @rdn_pg_rm_ra
 ### SVE FP Unary Operations Predicated Group
 
 # SVE floating-point convert precision
-FCVT_sh         01100101 10 0010 00 101 ... ..... .....         @rd_pg_rn_e0
-FCVT_hs         01100101 10 0010 01 101 ... ..... .....         @rd_pg_rn_e0
-BFCVT           01100101 10 0010 10 101 ... ..... .....         @rd_pg_rn_e0
-FCVT_dh         01100101 11 0010 00 101 ... ..... .....         @rd_pg_rn_e0
-FCVT_hd         01100101 11 0010 01 101 ... ..... .....         @rd_pg_rn_e0
-FCVT_ds         01100101 11 0010 10 101 ... ..... .....         @rd_pg_rn_e0
-FCVT_sd         01100101 11 0010 11 101 ... ..... .....         @rd_pg_rn_e0
+FCVT_sh_m       01100101 10 0010 00 101 ... ..... .....         @rd_pg_rn_e0
+FCVT_hs_m       01100101 10 0010 01 101 ... ..... .....         @rd_pg_rn_e0
+BFCVT_m         01100101 10 0010 10 101 ... ..... .....         @rd_pg_rn_e0
+FCVT_dh_m       01100101 11 0010 00 101 ... ..... .....         @rd_pg_rn_e0
+FCVT_hd_m       01100101 11 0010 01 101 ... ..... .....         @rd_pg_rn_e0
+FCVT_ds_m       01100101 11 0010 10 101 ... ..... .....         @rd_pg_rn_e0
+FCVT_sd_m       01100101 11 0010 11 101 ... ..... .....         @rd_pg_rn_e0
 
 # SVE floating-point convert to integer
-FCVTZS_hh       01100101 01 011 01 0 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZU_hh       01100101 01 011 01 1 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZS_hs       01100101 01 011 10 0 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZU_hs       01100101 01 011 10 1 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZS_hd       01100101 01 011 11 0 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZU_hd       01100101 01 011 11 1 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZS_ss       01100101 10 011 10 0 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZU_ss       01100101 10 011 10 1 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZS_ds       01100101 11 011 00 0 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZU_ds       01100101 11 011 00 1 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZS_sd       01100101 11 011 10 0 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZU_sd       01100101 11 011 10 1 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZS_dd       01100101 11 011 11 0 101 ... ..... .....        @rd_pg_rn_e0
-FCVTZU_dd       01100101 11 011 11 1 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_hh_m     01100101 01 011 01 0 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_hh_m     01100101 01 011 01 1 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_hs_m     01100101 01 011 10 0 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_hs_m     01100101 01 011 10 1 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_hd_m     01100101 01 011 11 0 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_hd_m     01100101 01 011 11 1 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_ss_m     01100101 10 011 10 0 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_ss_m     01100101 10 011 10 1 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_ds_m     01100101 11 011 00 0 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_ds_m     01100101 11 011 00 1 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_sd_m     01100101 11 011 10 0 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_sd_m     01100101 11 011 10 1 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_dd_m     01100101 11 011 11 0 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_dd_m     01100101 11 011 11 1 101 ... ..... .....        @rd_pg_rn_e0
 
 # SVE floating-point round to integral value
-FRINTN          01100101 .. 000 000 101 ... ..... .....         @rd_pg_rn
-FRINTP          01100101 .. 000 001 101 ... ..... .....         @rd_pg_rn
-FRINTM          01100101 .. 000 010 101 ... ..... .....         @rd_pg_rn
-FRINTZ          01100101 .. 000 011 101 ... ..... .....         @rd_pg_rn
-FRINTA          01100101 .. 000 100 101 ... ..... .....         @rd_pg_rn
-FRINTX          01100101 .. 000 110 101 ... ..... .....         @rd_pg_rn
-FRINTI          01100101 .. 000 111 101 ... ..... .....         @rd_pg_rn
+FRINTN_m        01100101 .. 000 000 101 ... ..... .....         @rd_pg_rn
+FRINTP_m        01100101 .. 000 001 101 ... ..... .....         @rd_pg_rn
+FRINTM_m        01100101 .. 000 010 101 ... ..... .....         @rd_pg_rn
+FRINTZ_m        01100101 .. 000 011 101 ... ..... .....         @rd_pg_rn
+FRINTA_m        01100101 .. 000 100 101 ... ..... .....         @rd_pg_rn
+FRINTX_m        01100101 .. 000 110 101 ... ..... .....         @rd_pg_rn
+FRINTI_m        01100101 .. 000 111 101 ... ..... .....         @rd_pg_rn
 
 # SVE floating-point unary operations
-FRECPX          01100101 .. 001 100 101 ... ..... .....         @rd_pg_rn
-FSQRT           01100101 .. 001 101 101 ... ..... .....         @rd_pg_rn
+FRECPX_m        01100101 .. 001 100 101 ... ..... .....         @rd_pg_rn
+FSQRT_m         01100101 .. 001 101 101 ... ..... .....         @rd_pg_rn
 
 # SVE integer convert to floating-point
-SCVTF_hh        01100101 01 010 01 0 101 ... ..... .....        @rd_pg_rn_e0
-SCVTF_sh        01100101 01 010 10 0 101 ... ..... .....        @rd_pg_rn_e0
-SCVTF_dh        01100101 01 010 11 0 101 ... ..... .....        @rd_pg_rn_e0
-SCVTF_ss        01100101 10 010 10 0 101 ... ..... .....        @rd_pg_rn_e0
-SCVTF_sd        01100101 11 010 00 0 101 ... ..... .....        @rd_pg_rn_e0
-SCVTF_ds        01100101 11 010 10 0 101 ... ..... .....        @rd_pg_rn_e0
-SCVTF_dd        01100101 11 010 11 0 101 ... ..... .....        @rd_pg_rn_e0
+SCVTF_hh_m      01100101 01 010 01 0 101 ... ..... .....        @rd_pg_rn_e0
+SCVTF_sh_m      01100101 01 010 10 0 101 ... ..... .....        @rd_pg_rn_e0
+SCVTF_dh_m      01100101 01 010 11 0 101 ... ..... .....        @rd_pg_rn_e0
+SCVTF_ss_m      01100101 10 010 10 0 101 ... ..... .....        @rd_pg_rn_e0
+SCVTF_sd_m      01100101 11 010 00 0 101 ... ..... .....        @rd_pg_rn_e0
+SCVTF_ds_m      01100101 11 010 10 0 101 ... ..... .....        @rd_pg_rn_e0
+SCVTF_dd_m      01100101 11 010 11 0 101 ... ..... .....        @rd_pg_rn_e0
 
-UCVTF_hh        01100101 01 010 01 1 101 ... ..... .....        @rd_pg_rn_e0
-UCVTF_sh        01100101 01 010 10 1 101 ... ..... .....        @rd_pg_rn_e0
-UCVTF_dh        01100101 01 010 11 1 101 ... ..... .....        @rd_pg_rn_e0
-UCVTF_ss        01100101 10 010 10 1 101 ... ..... .....        @rd_pg_rn_e0
-UCVTF_sd        01100101 11 010 00 1 101 ... ..... .....        @rd_pg_rn_e0
-UCVTF_ds        01100101 11 010 10 1 101 ... ..... .....        @rd_pg_rn_e0
-UCVTF_dd        01100101 11 010 11 1 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_hh_m      01100101 01 010 01 1 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_sh_m      01100101 01 010 10 1 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_dh_m      01100101 01 010 11 1 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_ss_m      01100101 10 010 10 1 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_sd_m      01100101 11 010 00 1 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_ds_m      01100101 11 010 10 1 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_dd_m      01100101 11 010 11 1 101 ... ..... .....        @rd_pg_rn_e0
 
 ### SVE Memory - 32-bit Gather and Unsized Contiguous Group
 
@@ -1517,10 +1517,10 @@ UADALP_zpzz     01000100 .. 000 101 101 ... ..... .....  @rdm_pg_rn
 
 ### SVE2 integer unary operations (predicated)
 
-URECPE          01000100 .. 000 000 101 ... ..... .....  @rd_pg_rn
-URSQRTE         01000100 .. 000 001 101 ... ..... .....  @rd_pg_rn
-SQABS           01000100 .. 001 000 101 ... ..... .....  @rd_pg_rn
-SQNEG           01000100 .. 001 001 101 ... ..... .....  @rd_pg_rn
+URECPE_m        01000100 .. 000 000 101 ... ..... .....  @rd_pg_rn
+URSQRTE_m       01000100 .. 000 001 101 ... ..... .....  @rd_pg_rn
+SQABS_m         01000100 .. 001 000 101 ... ..... .....  @rd_pg_rn
+SQNEG_m         01000100 .. 001 001 101 ... ..... .....  @rd_pg_rn
 
 ### SVE2 saturating/rounding bitwise shift left (predicated)
 
@@ -1847,16 +1847,16 @@ SM4EKEY         01000101 00 1 ..... 11110 0 ..... .....  @rd_rn_rm_e0
 RAX1            01000101 00 1 ..... 11110 1 ..... .....  @rd_rn_rm_e0
 
 ### SVE2 floating-point convert precision odd elements
-FCVTXNT_ds      01100100 00 0010 10 101 ... ..... .....  @rd_pg_rn_e0
-FCVTX_ds        01100101 00 0010 10 101 ... ..... .....  @rd_pg_rn_e0
-FCVTNT_sh       01100100 10 0010 00 101 ... ..... .....  @rd_pg_rn_e0
-BFCVTNT         01100100 10 0010 10 101 ... ..... .....  @rd_pg_rn_e0
-FCVTLT_hs       01100100 10 0010 01 101 ... ..... .....  @rd_pg_rn_e0
-FCVTNT_ds       01100100 11 0010 10 101 ... ..... .....  @rd_pg_rn_e0
-FCVTLT_sd       01100100 11 0010 11 101 ... ..... .....  @rd_pg_rn_e0
+FCVTXNT_ds_m    01100100 00 0010 10 101 ... ..... .....  @rd_pg_rn_e0
+FCVTX_ds_m      01100101 00 0010 10 101 ... ..... .....  @rd_pg_rn_e0
+FCVTNT_sh_m     01100100 10 0010 00 101 ... ..... .....  @rd_pg_rn_e0
+BFCVTNT_m       01100100 10 0010 10 101 ... ..... .....  @rd_pg_rn_e0
+FCVTLT_hs_m     01100100 10 0010 01 101 ... ..... .....  @rd_pg_rn_e0
+FCVTNT_ds_m     01100100 11 0010 10 101 ... ..... .....  @rd_pg_rn_e0
+FCVTLT_sd_m     01100100 11 0010 11 101 ... ..... .....  @rd_pg_rn_e0
 
 ### SVE2 floating-point convert to integer
-FLOGB           01100101 00 011 esz:2 0101 pg:3 rn:5 rd:5  &rpr_esz
+FLOGB_m         01100101 00 011 esz:2 0101 pg:3 rn:5 rd:5  &rpr_esz
 
 ### SVE2 floating-point multiply-add long (vectors)
 FMLALB_zzzw     01100100 10 1 ..... 10 0 00 0 ..... .....  @rda_rn_rm_ex esz=2
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index a85558bdaa..4fd17905d5 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -783,14 +783,14 @@ TRANS_FEAT(SEL_zpzz, aa64_sme_or_sve, do_sel_z, a->rd, a->rn, a->rm, a->pg, a->e
     };                                                              \
     TRANS_FEAT(NAME, FEAT, gen_gvec_ool_arg_zpz, name##_fns[a->esz], a, 0)
 
-DO_ZPZ(CLS, aa64_sme_or_sve, sve_cls)
-DO_ZPZ(CLZ, aa64_sme_or_sve, sve_clz)
-DO_ZPZ(CNT_zpz, aa64_sme_or_sve, sve_cnt_zpz)
-DO_ZPZ(CNOT, aa64_sme_or_sve, sve_cnot)
-DO_ZPZ(NOT_zpz, aa64_sme_or_sve, sve_not_zpz)
-DO_ZPZ(ABS, aa64_sme_or_sve, sve_abs)
-DO_ZPZ(NEG, aa64_sme_or_sve, sve_neg)
-DO_ZPZ(RBIT, aa64_sme_or_sve, sve_rbit)
+DO_ZPZ(CLS_m, aa64_sme_or_sve, sve_cls)
+DO_ZPZ(CLZ_m, aa64_sme_or_sve, sve_clz)
+DO_ZPZ(CNT_zpz_m, aa64_sme_or_sve, sve_cnt_zpz)
+DO_ZPZ(CNOT_m, aa64_sme_or_sve, sve_cnot)
+DO_ZPZ(NOT_zpz_m, aa64_sme_or_sve, sve_not_zpz)
+DO_ZPZ(ABS_m, aa64_sme_or_sve, sve_abs)
+DO_ZPZ(NEG_m, aa64_sme_or_sve, sve_neg)
+DO_ZPZ(RBIT_m, aa64_sme_or_sve, sve_rbit)
 DO_ZPZ(ORQV, aa64_sme2p1_or_sve2p1, sve2p1_orqv)
 DO_ZPZ(EORQV, aa64_sme2p1_or_sve2p1, sve2p1_eorqv)
 DO_ZPZ(ANDQV, aa64_sme2p1_or_sve2p1, sve2p1_andqv)
@@ -803,7 +803,7 @@ static gen_helper_gvec_3 * const fabs_ah_fns[4] = {
     NULL,                  gen_helper_sve_ah_fabs_h,
     gen_helper_sve_ah_fabs_s, gen_helper_sve_ah_fabs_d,
 };
-TRANS_FEAT(FABS, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+TRANS_FEAT(FABS_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            s->fpcr_ah ? fabs_ah_fns[a->esz] : fabs_fns[a->esz], a, 0)
 
 static gen_helper_gvec_3 * const fneg_fns[4] = {
@@ -814,34 +814,38 @@ static gen_helper_gvec_3 * const fneg_ah_fns[4] = {
     NULL,                  gen_helper_sve_ah_fneg_h,
     gen_helper_sve_ah_fneg_s, gen_helper_sve_ah_fneg_d,
 };
-TRANS_FEAT(FNEG, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+TRANS_FEAT(FNEG_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            s->fpcr_ah ? fneg_ah_fns[a->esz] : fneg_fns[a->esz], a, 0)
 
 static gen_helper_gvec_3 * const sxtb_fns[4] = {
     NULL,                  gen_helper_sve_sxtb_h,
     gen_helper_sve_sxtb_s, gen_helper_sve_sxtb_d,
 };
-TRANS_FEAT(SXTB, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sxtb_fns[a->esz], a, 0)
+TRANS_FEAT(SXTB_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+           sxtb_fns[a->esz], a, 0)
 
 static gen_helper_gvec_3 * const uxtb_fns[4] = {
     NULL,                  gen_helper_sve_uxtb_h,
     gen_helper_sve_uxtb_s, gen_helper_sve_uxtb_d,
 };
-TRANS_FEAT(UXTB, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, uxtb_fns[a->esz], a, 0)
+TRANS_FEAT(UXTB_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+           uxtb_fns[a->esz], a, 0)
 
 static gen_helper_gvec_3 * const sxth_fns[4] = {
     NULL, NULL, gen_helper_sve_sxth_s, gen_helper_sve_sxth_d
 };
-TRANS_FEAT(SXTH, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sxth_fns[a->esz], a, 0)
+TRANS_FEAT(SXTH_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+           sxth_fns[a->esz], a, 0)
 
 static gen_helper_gvec_3 * const uxth_fns[4] = {
     NULL, NULL, gen_helper_sve_uxth_s, gen_helper_sve_uxth_d
 };
-TRANS_FEAT(UXTH, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, uxth_fns[a->esz], a, 0)
+TRANS_FEAT(UXTH_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+           uxth_fns[a->esz], a, 0)
 
-TRANS_FEAT(SXTW, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+TRANS_FEAT(SXTW_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            a->esz == 3 ? gen_helper_sve_sxtw_d : NULL, a, 0)
-TRANS_FEAT(UXTW, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+TRANS_FEAT(UXTW_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            a->esz == 3 ? gen_helper_sve_uxtw_d : NULL, a, 0)
 
 static gen_helper_gvec_3 * const addqv_fns[4] = {
@@ -2984,17 +2988,19 @@ static gen_helper_gvec_3 * const revb_fns[4] = {
     NULL,                  gen_helper_sve_revb_h,
     gen_helper_sve_revb_s, gen_helper_sve_revb_d,
 };
-TRANS_FEAT(REVB, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, revb_fns[a->esz], a, 0)
+TRANS_FEAT(REVB_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+           revb_fns[a->esz], a, 0)
 
 static gen_helper_gvec_3 * const revh_fns[4] = {
     NULL, NULL, gen_helper_sve_revh_s, gen_helper_sve_revh_d,
 };
-TRANS_FEAT(REVH, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, revh_fns[a->esz], a, 0)
+TRANS_FEAT(REVH_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+           revh_fns[a->esz], a, 0)
 
-TRANS_FEAT(REVW, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+TRANS_FEAT(REVW_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            a->esz == 3 ? gen_helper_sve_revw_d : NULL, a, 0)
 
-TRANS_FEAT(REVD, aa64_sme_or_sve2p1, gen_gvec_ool_arg_zpz,
+TRANS_FEAT(REVD_m, aa64_sme_or_sve2p1, gen_gvec_ool_arg_zpz,
            gen_helper_sme_revd_q, a, 0)
 
 TRANS_FEAT(SPLICE, aa64_sme_or_sve, gen_gvec_ool_arg_zpzz,
@@ -4491,53 +4497,53 @@ TRANS_FEAT(FCMLA_zzxz, aa64_sme_or_sve, gen_gvec_fpst_zzzz, fcmla_idx_fns[a->esz
  *** SVE Floating Point Unary Operations Predicated Group
  */
 
-TRANS_FEAT(FCVT_sh, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVT_sh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_sh, a, 0, FPST_A64)
-TRANS_FEAT(FCVT_hs, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVT_hs_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_hs, a, 0, FPST_A64_F16)
 
-TRANS_FEAT(BFCVT, aa64_sme_sve_bf16, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(BFCVT_m, aa64_sme_sve_bf16, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_bfcvt, a, 0,
            s->fpcr_ah ? FPST_AH : FPST_A64)
 
-TRANS_FEAT(FCVT_dh, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVT_dh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_dh, a, 0, FPST_A64)
-TRANS_FEAT(FCVT_hd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVT_hd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_hd, a, 0, FPST_A64_F16)
-TRANS_FEAT(FCVT_ds, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVT_ds_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_ds, a, 0, FPST_A64)
-TRANS_FEAT(FCVT_sd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVT_sd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_sd, a, 0, FPST_A64)
 
-TRANS_FEAT(FCVTZS_hh, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZS_hh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzs_hh, a, 0, FPST_A64_F16)
-TRANS_FEAT(FCVTZU_hh, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZU_hh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzu_hh, a, 0, FPST_A64_F16)
-TRANS_FEAT(FCVTZS_hs, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZS_hs_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzs_hs, a, 0, FPST_A64_F16)
-TRANS_FEAT(FCVTZU_hs, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZU_hs_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzu_hs, a, 0, FPST_A64_F16)
-TRANS_FEAT(FCVTZS_hd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZS_hd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzs_hd, a, 0, FPST_A64_F16)
-TRANS_FEAT(FCVTZU_hd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZU_hd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzu_hd, a, 0, FPST_A64_F16)
 
-TRANS_FEAT(FCVTZS_ss, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZS_ss_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzs_ss, a, 0, FPST_A64)
-TRANS_FEAT(FCVTZU_ss, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZU_ss_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzu_ss, a, 0, FPST_A64)
-TRANS_FEAT(FCVTZS_sd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZS_sd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzs_sd, a, 0, FPST_A64)
-TRANS_FEAT(FCVTZU_sd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZU_sd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzu_sd, a, 0, FPST_A64)
-TRANS_FEAT(FCVTZS_ds, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZS_ds_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzs_ds, a, 0, FPST_A64)
-TRANS_FEAT(FCVTZU_ds, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZU_ds_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzu_ds, a, 0, FPST_A64)
 
-TRANS_FEAT(FCVTZS_dd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZS_dd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzs_dd, a, 0, FPST_A64)
-TRANS_FEAT(FCVTZU_dd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTZU_dd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzu_dd, a, 0, FPST_A64)
 
 static gen_helper_gvec_3_ptr * const frint_fns[] = {
@@ -4546,8 +4552,9 @@ static gen_helper_gvec_3_ptr * const frint_fns[] = {
     gen_helper_sve_frint_s,
     gen_helper_sve_frint_d
 };
-TRANS_FEAT(FRINTI, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz, frint_fns[a->esz],
-           a, 0, a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
+TRANS_FEAT(FRINTI_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+           frint_fns[a->esz], a, 0,
+           a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
 
 static gen_helper_gvec_3_ptr * const frintx_fns[] = {
     NULL,
@@ -4555,8 +4562,9 @@ static gen_helper_gvec_3_ptr * const frintx_fns[] = {
     gen_helper_sve_frintx_s,
     gen_helper_sve_frintx_d
 };
-TRANS_FEAT(FRINTX, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz, frintx_fns[a->esz],
-           a, 0, a->esz == MO_16 ? FPST_A64_F16 : FPST_A64);
+TRANS_FEAT(FRINTX_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+           frintx_fns[a->esz], a, 0,
+           a->esz == MO_16 ? FPST_A64_F16 : FPST_A64);
 
 static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a,
                           ARMFPRounding mode, gen_helper_gvec_3_ptr *fn)
@@ -4585,63 +4593,64 @@ static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a,
     return true;
 }
 
-TRANS_FEAT(FRINTN, aa64_sme_or_sve, do_frint_mode, a,
+TRANS_FEAT(FRINTN_m, aa64_sme_or_sve, do_frint_mode, a,
            FPROUNDING_TIEEVEN, frint_fns[a->esz])
-TRANS_FEAT(FRINTP, aa64_sme_or_sve, do_frint_mode, a,
+TRANS_FEAT(FRINTP_m, aa64_sme_or_sve, do_frint_mode, a,
            FPROUNDING_POSINF, frint_fns[a->esz])
-TRANS_FEAT(FRINTM, aa64_sme_or_sve, do_frint_mode, a,
+TRANS_FEAT(FRINTM_m, aa64_sme_or_sve, do_frint_mode, a,
            FPROUNDING_NEGINF, frint_fns[a->esz])
-TRANS_FEAT(FRINTZ, aa64_sme_or_sve, do_frint_mode, a,
+TRANS_FEAT(FRINTZ_m, aa64_sme_or_sve, do_frint_mode, a,
            FPROUNDING_ZERO, frint_fns[a->esz])
-TRANS_FEAT(FRINTA, aa64_sme_or_sve, do_frint_mode, a,
+TRANS_FEAT(FRINTA_m, aa64_sme_or_sve, do_frint_mode, a,
            FPROUNDING_TIEAWAY, frint_fns[a->esz])
 
 static gen_helper_gvec_3_ptr * const frecpx_fns[] = {
     NULL,                    gen_helper_sve_frecpx_h,
     gen_helper_sve_frecpx_s, gen_helper_sve_frecpx_d,
 };
-TRANS_FEAT(FRECPX, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz, frecpx_fns[a->esz],
-           a, 0, select_ah_fpst(s, a->esz))
+TRANS_FEAT(FRECPX_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+           frecpx_fns[a->esz], a, 0, select_ah_fpst(s, a->esz))
 
 static gen_helper_gvec_3_ptr * const fsqrt_fns[] = {
     NULL,                   gen_helper_sve_fsqrt_h,
     gen_helper_sve_fsqrt_s, gen_helper_sve_fsqrt_d,
 };
-TRANS_FEAT(FSQRT, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz, fsqrt_fns[a->esz],
-           a, 0, a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
+TRANS_FEAT(FSQRT_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+           fsqrt_fns[a->esz], a, 0,
+           a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
 
-TRANS_FEAT(SCVTF_hh, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(SCVTF_hh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_scvt_hh, a, 0, FPST_A64_F16)
-TRANS_FEAT(SCVTF_sh, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(SCVTF_sh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_scvt_sh, a, 0, FPST_A64_F16)
-TRANS_FEAT(SCVTF_dh, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(SCVTF_dh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_scvt_dh, a, 0, FPST_A64_F16)
 
-TRANS_FEAT(SCVTF_ss, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(SCVTF_ss_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_scvt_ss, a, 0, FPST_A64)
-TRANS_FEAT(SCVTF_ds, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(SCVTF_ds_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_scvt_ds, a, 0, FPST_A64)
 
-TRANS_FEAT(SCVTF_sd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(SCVTF_sd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_scvt_sd, a, 0, FPST_A64)
-TRANS_FEAT(SCVTF_dd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(SCVTF_dd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_scvt_dd, a, 0, FPST_A64)
 
-TRANS_FEAT(UCVTF_hh, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(UCVTF_hh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_ucvt_hh, a, 0, FPST_A64_F16)
-TRANS_FEAT(UCVTF_sh, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(UCVTF_sh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_ucvt_sh, a, 0, FPST_A64_F16)
-TRANS_FEAT(UCVTF_dh, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(UCVTF_dh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_ucvt_dh, a, 0, FPST_A64_F16)
 
-TRANS_FEAT(UCVTF_ss, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(UCVTF_ss_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_ucvt_ss, a, 0, FPST_A64)
-TRANS_FEAT(UCVTF_ds, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(UCVTF_ds_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_ucvt_ds, a, 0, FPST_A64)
-TRANS_FEAT(UCVTF_sd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(UCVTF_sd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_ucvt_sd, a, 0, FPST_A64)
 
-TRANS_FEAT(UCVTF_dd, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(UCVTF_dd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_ucvt_dd, a, 0, FPST_A64)
 
 /*
@@ -6691,23 +6700,23 @@ TRANS_FEAT(UADALP_zpzz, aa64_sme_or_sve2, gen_gvec_ool_arg_zpzz,
  * SVE2 integer unary operations (predicated)
  */
 
-TRANS_FEAT(URECPE, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz,
+TRANS_FEAT(URECPE_m, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz,
            a->esz == 2 ? gen_helper_sve2_urecpe_s : NULL, a, 0)
 
-TRANS_FEAT(URSQRTE, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz,
+TRANS_FEAT(URSQRTE_m, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz,
            a->esz == 2 ? gen_helper_sve2_ursqrte_s : NULL, a, 0)
 
 static gen_helper_gvec_3 * const sqabs_fns[4] = {
     gen_helper_sve2_sqabs_b, gen_helper_sve2_sqabs_h,
     gen_helper_sve2_sqabs_s, gen_helper_sve2_sqabs_d,
 };
-TRANS_FEAT(SQABS, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz, sqabs_fns[a->esz], a, 0)
+TRANS_FEAT(SQABS_m, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz, sqabs_fns[a->esz], a, 0)
 
 static gen_helper_gvec_3 * const sqneg_fns[4] = {
     gen_helper_sve2_sqneg_b, gen_helper_sve2_sqneg_h,
     gen_helper_sve2_sqneg_s, gen_helper_sve2_sqneg_d,
 };
-TRANS_FEAT(SQNEG, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz, sqneg_fns[a->esz], a, 0)
+TRANS_FEAT(SQNEG_m, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz, sqneg_fns[a->esz], a, 0)
 
 DO_ZPZZ(SQSHL, aa64_sme_or_sve2, sve2_sqshl)
 DO_ZPZZ(SQRSHL, aa64_sme_or_sve2, sve2_sqrshl)
@@ -7879,30 +7888,30 @@ static bool trans_RAX1(DisasContext *s, arg_RAX1 *a)
     return gen_gvec_fn_arg_zzz(s, gen_gvec_rax1, a);
 }
 
-TRANS_FEAT(FCVTNT_sh, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTNT_sh_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
            gen_helper_sve2_fcvtnt_sh, a, 0, FPST_A64)
-TRANS_FEAT(FCVTNT_ds, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTNT_ds_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
            gen_helper_sve2_fcvtnt_ds, a, 0, FPST_A64)
 
-TRANS_FEAT(BFCVTNT, aa64_sme_sve_bf16, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(BFCVTNT_m, aa64_sme_sve_bf16, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_bfcvtnt, a, 0,
            s->fpcr_ah ? FPST_AH : FPST_A64)
 
-TRANS_FEAT(FCVTLT_hs, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTLT_hs_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
            gen_helper_sve2_fcvtlt_hs, a, 0, FPST_A64_F16)
-TRANS_FEAT(FCVTLT_sd, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
+TRANS_FEAT(FCVTLT_sd_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
            gen_helper_sve2_fcvtlt_sd, a, 0, FPST_A64)
 
-TRANS_FEAT(FCVTX_ds, aa64_sme_or_sve2, do_frint_mode, a,
+TRANS_FEAT(FCVTX_ds_m, aa64_sme_or_sve2, do_frint_mode, a,
            FPROUNDING_ODD, gen_helper_sve_fcvt_ds)
-TRANS_FEAT(FCVTXNT_ds, aa64_sme_or_sve2, do_frint_mode, a,
+TRANS_FEAT(FCVTXNT_ds_m, aa64_sme_or_sve2, do_frint_mode, a,
            FPROUNDING_ODD, gen_helper_sve2_fcvtnt_ds)
 
 static gen_helper_gvec_3_ptr * const flogb_fns[] = {
     NULL,               gen_helper_flogb_h,
     gen_helper_flogb_s, gen_helper_flogb_d
 };
-TRANS_FEAT(FLOGB, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz, flogb_fns[a->esz],
+TRANS_FEAT(FLOGB_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz, flogb_fns[a->esz],
            a, 0, a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
 
 static bool do_FMLAL_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sub, bool sel)
-- 
2.43.0




* [PULL 46/61] target/arm: Enable zeroing in DO_ZPZ macros in sve_helper.c
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (44 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 45/61] target/arm: Rename sve unary predicated patterns Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 47/61] target/arm: Expand DO_ZPZ in translate-sve.c Peter Maydell
                   ` (15 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

Use the low bit of simd_data to hold a 'zeroing' bit.
The simd_data field is currently unused and always 0.
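
For readers following along, the merging-versus-zeroing behaviour that the
flag selects can be sketched in standalone C. This is only an illustration,
not the helper from the patch: the names unary_pred_b and op_not are
hypothetical, the predicate is simplified to one bit per byte element held
in a single uint64_t, and only the "bit 0 of the data word means zeroing"
convention is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-element operation, standing in for OP in the macros. */
static uint8_t op_not(uint8_t x)
{
    return (uint8_t)~x;
}

/*
 * Apply op_not to n byte elements (n <= 64).
 * Bit i of pg governs element i (simplified predicate layout).
 * Bit 0 of data selects zeroing; when clear, inactive elements of d
 * keep their previous contents (merging).
 */
static void unary_pred_b(uint8_t *d, const uint8_t *s,
                         uint64_t pg, size_t n, uint32_t data)
{
    bool zeroing = data & 1;

    assert(n <= 64);
    for (size_t i = 0; i < n; i++) {
        if ((pg >> i) & 1) {
            d[i] = op_not(s[i]);      /* active element: compute result */
        } else if (zeroing) {
            d[i] = 0;                 /* inactive element, zeroing form */
        }
        /* inactive element, merging form: d[i] is left untouched */
    }
}
```

Because existing callers pass 0 in the data field, a helper written this way
behaves exactly as before until a caller opts in by setting bit 0.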

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve_helper.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index d05d8addfd..9634517925 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -823,18 +823,20 @@ DO_ZPZW(sve_lsl_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSL)
 
 #undef DO_ZPZW
 
-/* Fully general two-operand expander, controlled by a predicate.
- */
+/* Fully general two-operand expander, controlled by a predicate.  */
 #define DO_ZPZ(NAME, TYPE, H, OP)                               \
 void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc)  \
 {                                                               \
     intptr_t i, opr_sz = simd_oprsz(desc);                      \
+    bool zeroing = simd_data(desc) & 1;                         \
     for (i = 0; i < opr_sz; ) {                                 \
         uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));         \
         do {                                                    \
             if (pg & 1) {                                       \
                 TYPE nn = *(TYPE *)(vn + H(i));                 \
                 *(TYPE *)(vd + H(i)) = OP(nn);                  \
+            } else if (zeroing) {                               \
+                *(TYPE *)(vd + H(i)) = 0;                       \
             }                                                   \
             i += sizeof(TYPE), pg >>= sizeof(TYPE);             \
         } while (i & 15);                                       \
@@ -846,12 +848,15 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc)  \
 void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc)  \
 {                                                               \
     intptr_t i, opr_sz = simd_oprsz(desc) / 8;                  \
+    bool zeroing = simd_data(desc) & 1;                         \
     TYPE *d = vd, *n = vn;                                      \
     uint8_t *pg = vg;                                           \
     for (i = 0; i < opr_sz; i += 1) {                           \
         if (pg[H1(i)] & 1) {                                    \
             TYPE nn = n[i];                                     \
             d[i] = OP(nn);                                      \
+        } else if (zeroing) {                                   \
+            d[i] = 0;                                           \
         }                                                       \
     }                                                           \
 }
@@ -4831,7 +4836,8 @@ DO_ZPZS_FP(sve_ah_fmins_h, float16, H1_2, helper_vfp_ah_minh)
 DO_ZPZS_FP(sve_ah_fmins_s, float32, H1_4, helper_vfp_ah_mins)
 DO_ZPZS_FP(sve_ah_fmins_d, float64, H1_8, helper_vfp_ah_mind)
 
-/* Fully general two-operand expander, controlled by a predicate,
+/*
+ * Fully general two-operand expander, controlled by a predicate,
  * With the extra float_status parameter.
  */
 #define DO_ZPZ_FP(NAME, TYPE, H, OP)                                  \
@@ -4839,6 +4845,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vg,                       \
                   float_status *status, uint32_t desc)                \
 {                                                                     \
     intptr_t i = simd_oprsz(desc);                                    \
+    bool zeroing = simd_data(desc) & 1;                               \
     uint64_t *g = vg;                                                 \
     do {                                                              \
         uint64_t pg = g[(i - 1) >> 6];                                \
@@ -4847,6 +4854,8 @@ void HELPER(NAME)(void *vd, void *vn, void *vg,                       \
             if (likely((pg >> (i & 63)) & 1)) {                       \
                 TYPE nn = *(TYPE *)(vn + H(i));                       \
                 *(TYPE *)(vd + H(i)) = OP(nn, status);                \
+            } else if (zeroing) {                                     \
+                *(TYPE *)(vd + H(i)) = 0;                             \
             }                                                         \
         } while (i & 63);                                             \
     } while (i != 0);                                                 \
-- 
2.43.0




* [PULL 47/61] target/arm: Expand DO_ZPZ in translate-sve.c
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (45 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 46/61] target/arm: Enable zeroing in DO_ZPZ macros in sve_helper.c Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 48/61] target/arm: Implement SVE integer unary operations (predicated, zeroing) Peter Maydell
                   ` (14 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

Prepare for adding zeroing instructions for some of these.
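
As a toy illustration of what expanding such a macro means (not the QEMU
code itself: unary_fn, TOY_FNS, toy_dispatch and the toy_neg_* functions are
hypothetical names), the sketch below builds a per-element-size function
table once via token pasting and once written out by hand. Presumably the
hand-written form is what makes it possible to vary individual entries
later, which the macro form does not allow.

```c
#include <stdint.h>

typedef uint64_t unary_fn(uint64_t);

/* Toy negate helpers for 8-, 16-, 32- and 64-bit elements. */
static uint64_t toy_neg_b(uint64_t x) { return (uint8_t)-x; }
static uint64_t toy_neg_h(uint64_t x) { return (uint16_t)-x; }
static uint64_t toy_neg_s(uint64_t x) { return (uint32_t)-x; }
static uint64_t toy_neg_d(uint64_t x) { return -x; }

/* Macro form: the table is generated by pasting the size suffixes. */
#define TOY_FNS(name)                              \
    static unary_fn * const name##_fns[4] = {      \
        name##_b, name##_h, name##_s, name##_d,    \
    }

TOY_FNS(toy_neg);

/* Expanded form: the same table, spelled out explicitly. */
static unary_fn * const toy_neg_expanded_fns[4] = {
    toy_neg_b, toy_neg_h,
    toy_neg_s, toy_neg_d,
};

/* Index the table by element size (0 = byte ... 3 = doubleword). */
static uint64_t toy_dispatch(unary_fn * const fns[4], unsigned esz,
                             uint64_t x)
{
    return fns[esz](x);
}
```

Both tables hold the same function pointers, so the expansion by itself
changes no behaviour.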

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-sve.c | 84 +++++++++++++++++++++++++++-------
 1 file changed, 67 insertions(+), 17 deletions(-)

diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 4fd17905d5..8ab4526468 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -776,24 +776,74 @@ TRANS_FEAT(SEL_zpzz, aa64_sme_or_sve, do_sel_z, a->rd, a->rn, a->rm, a->pg, a->e
  *** SVE Integer Arithmetic - Unary Predicated Group
  */
 
-#define DO_ZPZ(NAME, FEAT, name) \
-    static gen_helper_gvec_3 * const name##_fns[4] = {              \
-        gen_helper_##name##_b, gen_helper_##name##_h,               \
-        gen_helper_##name##_s, gen_helper_##name##_d,               \
-    };                                                              \
-    TRANS_FEAT(NAME, FEAT, gen_gvec_ool_arg_zpz, name##_fns[a->esz], a, 0)
+static gen_helper_gvec_3 * const sve_cls_fns[4] = {
+    gen_helper_sve_cls_b, gen_helper_sve_cls_h,
+    gen_helper_sve_cls_s, gen_helper_sve_cls_d,
+};
+TRANS_FEAT(CLS_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_cls_fns[a->esz], a, 0)
 
-DO_ZPZ(CLS_m, aa64_sme_or_sve, sve_cls)
-DO_ZPZ(CLZ_m, aa64_sme_or_sve, sve_clz)
-DO_ZPZ(CNT_zpz_m, aa64_sme_or_sve, sve_cnt_zpz)
-DO_ZPZ(CNOT_m, aa64_sme_or_sve, sve_cnot)
-DO_ZPZ(NOT_zpz_m, aa64_sme_or_sve, sve_not_zpz)
-DO_ZPZ(ABS_m, aa64_sme_or_sve, sve_abs)
-DO_ZPZ(NEG_m, aa64_sme_or_sve, sve_neg)
-DO_ZPZ(RBIT_m, aa64_sme_or_sve, sve_rbit)
-DO_ZPZ(ORQV, aa64_sme2p1_or_sve2p1, sve2p1_orqv)
-DO_ZPZ(EORQV, aa64_sme2p1_or_sve2p1, sve2p1_eorqv)
-DO_ZPZ(ANDQV, aa64_sme2p1_or_sve2p1, sve2p1_andqv)
+static gen_helper_gvec_3 * const sve_clz_fns[4] = {
+    gen_helper_sve_clz_b, gen_helper_sve_clz_h,
+    gen_helper_sve_clz_s, gen_helper_sve_clz_d,
+};
+TRANS_FEAT(CLZ_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_clz_fns[a->esz], a, 0)
+
+static gen_helper_gvec_3 * const sve_cnt_zpz_fns[4] = {
+    gen_helper_sve_cnt_zpz_b, gen_helper_sve_cnt_zpz_h,
+    gen_helper_sve_cnt_zpz_s, gen_helper_sve_cnt_zpz_d,
+};
+TRANS_FEAT(CNT_zpz_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_cnt_zpz_fns[a->esz], a, 0)
+
+static gen_helper_gvec_3 * const sve_cnot_fns[4] = {
+    gen_helper_sve_cnot_b, gen_helper_sve_cnot_h,
+    gen_helper_sve_cnot_s, gen_helper_sve_cnot_d,
+};
+TRANS_FEAT(CNOT_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_cnot_fns[a->esz], a, 0)
+
+static gen_helper_gvec_3 * const sve_not_zpz_fns[4] = {
+    gen_helper_sve_not_zpz_b, gen_helper_sve_not_zpz_h,
+    gen_helper_sve_not_zpz_s, gen_helper_sve_not_zpz_d,
+};
+TRANS_FEAT(NOT_zpz_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_not_zpz_fns[a->esz], a, 0)
+
+static gen_helper_gvec_3 * const sve_abs_fns[4] = {
+    gen_helper_sve_abs_b, gen_helper_sve_abs_h,
+    gen_helper_sve_abs_s, gen_helper_sve_abs_d,
+};
+TRANS_FEAT(ABS_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_abs_fns[a->esz], a, 0)
+
+static gen_helper_gvec_3 * const sve_neg_fns[4] = {
+    gen_helper_sve_neg_b, gen_helper_sve_neg_h,
+    gen_helper_sve_neg_s, gen_helper_sve_neg_d,
+};
+TRANS_FEAT(NEG_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_neg_fns[a->esz], a, 0)
+
+static gen_helper_gvec_3 * const sve_rbit_fns[4] = {
+    gen_helper_sve_rbit_b, gen_helper_sve_rbit_h,
+    gen_helper_sve_rbit_s, gen_helper_sve_rbit_d,
+};
+TRANS_FEAT(RBIT_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_rbit_fns[a->esz], a, 0)
+
+static gen_helper_gvec_3 * const sve2p1_orqv_fns[4] = {
+    gen_helper_sve2p1_orqv_b, gen_helper_sve2p1_orqv_h,
+    gen_helper_sve2p1_orqv_s, gen_helper_sve2p1_orqv_d,
+};
+TRANS_FEAT(ORQV, aa64_sme2p1_or_sve2p1, gen_gvec_ool_arg_zpz,
+           sve2p1_orqv_fns[a->esz], a, 0)
+
+static gen_helper_gvec_3 * const sve2p1_eorqv_fns[4] = {
+    gen_helper_sve2p1_eorqv_b, gen_helper_sve2p1_eorqv_h,
+    gen_helper_sve2p1_eorqv_s, gen_helper_sve2p1_eorqv_d,
+};
+TRANS_FEAT(EORQV, aa64_sme2p1_or_sve2p1, gen_gvec_ool_arg_zpz,
+           sve2p1_eorqv_fns[a->esz], a, 0)
+
+static gen_helper_gvec_3 * const sve2p1_andqv_fns[4] = {
+    gen_helper_sve2p1_andqv_b, gen_helper_sve2p1_andqv_h,
+    gen_helper_sve2p1_andqv_s, gen_helper_sve2p1_andqv_d,
+};
+TRANS_FEAT(ANDQV, aa64_sme2p1_or_sve2p1, gen_gvec_ool_arg_zpz,
+           sve2p1_andqv_fns[a->esz], a, 0)
 
 static gen_helper_gvec_3 * const fabs_fns[4] = {
     NULL,                  gen_helper_sve_fabs_h,
-- 
2.43.0
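
A note on the refactor, as a minimal sketch in plain C with toy names. It shows why a macro that emits a table together with exactly one entry point has to be written out before a second entry point can share the table. `toy_helper`, `TOY_ZPZ`, `toy_run`, and the `trans_*` functions are hypothetical stand-ins and are not QEMU's `gen_helper_gvec_3`, `DO_ZPZ`, or `TRANS_FEAT`.

```c
#include <assert.h>

/* Hypothetical helper type; each toy helper just reports its element width. */
typedef int toy_helper(void);

static int toy_cls_b(void) { return 8; }
static int toy_cls_h(void) { return 16; }
static int toy_cls_s(void) { return 32; }
static int toy_cls_d(void) { return 64; }

/*
 * Macro form: the table and exactly one entry point are emitted together,
 * so a second entry point cannot be attached to the same table.
 */
#define TOY_ZPZ(NAME, name)                                     \
    static toy_helper * const name##_fns[4] = {                 \
        name##_b, name##_h, name##_s, name##_d,                 \
    };                                                          \
    static int trans_##NAME(unsigned esz)                       \
    {                                                           \
        assert(esz < 4);                                        \
        return name##_fns[esz]();                               \
    }

TOY_ZPZ(CLS_macro, toy_cls)

/* Expanded form: the table stands alone and can be indexed by many callers. */
static toy_helper * const toy_cls_table[4] = {
    toy_cls_b, toy_cls_h, toy_cls_s, toy_cls_d,
};

typedef struct {
    int width;  /* value returned by the selected helper */
    int data;   /* 0 = merging form, 1 = zeroing form */
} ToyResult;

static ToyResult toy_run(toy_helper * const *fns, unsigned esz, int data)
{
    assert(esz < 4);
    ToyResult r = { fns[esz](), data };
    return r;
}

/* Two entry points, one table: only the trailing data argument differs. */
static ToyResult trans_CLS_m(unsigned esz) { return toy_run(toy_cls_table, esz, 0); }
static ToyResult trans_CLS_z(unsigned esz) { return toy_run(toy_cls_table, esz, 1); }
```

In this toy, `trans_CLS_m(3)` and `trans_CLS_z(3)` select the same helper and differ only in the `data` value handed along, which mirrors the trailing `0` versus `1` argument used later in the series.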




* [PULL 48/61] target/arm: Implement SVE integer unary operations (predicated, zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (46 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 47/61] target/arm: Expand DO_ZPZ in translate-sve.c Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 49/61] target/arm: Implement SVE bitwise " Peter Maydell
                   ` (13 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

This includes ABS, NEG, SXT{B,H,W}, and UXT{B,H,W}.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      |  9 +++++++++
 target/arm/tcg/translate-sve.c | 15 +++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index c7e633ec4f..31b65fab1b 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -411,6 +411,15 @@ UXTH_m          00000100 .. 010 011 101 ... ..... .....         @rd_pg_rn
 SXTW_m          00000100 .. 010 100 101 ... ..... .....         @rd_pg_rn
 UXTW_m          00000100 .. 010 101 101 ... ..... .....         @rd_pg_rn
 
+ABS_z           00000100 .. 000 110 101 ... ..... .....         @rd_pg_rn
+NEG_z           00000100 .. 000 111 101 ... ..... .....         @rd_pg_rn
+SXTB_z          00000100 .. 000 000 101 ... ..... .....         @rd_pg_rn
+UXTB_z          00000100 .. 000 001 101 ... ..... .....         @rd_pg_rn
+SXTH_z          00000100 .. 000 010 101 ... ..... .....         @rd_pg_rn
+UXTH_z          00000100 .. 000 011 101 ... ..... .....         @rd_pg_rn
+SXTW_z          00000100 .. 000 100 101 ... ..... .....         @rd_pg_rn
+UXTW_z          00000100 .. 000 101 101 ... ..... .....         @rd_pg_rn
+
 ### SVE Floating Point Compare - Vectors Group
 
 # SVE floating-point compare vectors
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 8ab4526468..afbed77ec3 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -811,12 +811,14 @@ static gen_helper_gvec_3 * const sve_abs_fns[4] = {
     gen_helper_sve_abs_s, gen_helper_sve_abs_d,
 };
 TRANS_FEAT(ABS_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_abs_fns[a->esz], a, 0)
+TRANS_FEAT(ABS_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz, sve_abs_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sve_neg_fns[4] = {
     gen_helper_sve_neg_b, gen_helper_sve_neg_h,
     gen_helper_sve_neg_s, gen_helper_sve_neg_d,
 };
 TRANS_FEAT(NEG_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_neg_fns[a->esz], a, 0)
+TRANS_FEAT(NEG_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz, sve_neg_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sve_rbit_fns[4] = {
     gen_helper_sve_rbit_b, gen_helper_sve_rbit_h,
@@ -873,6 +875,8 @@ static gen_helper_gvec_3 * const sxtb_fns[4] = {
 };
 TRANS_FEAT(SXTB_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            sxtb_fns[a->esz], a, 0)
+TRANS_FEAT(SXTB_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           sxtb_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const uxtb_fns[4] = {
     NULL,                  gen_helper_sve_uxtb_h,
@@ -880,23 +884,34 @@ static gen_helper_gvec_3 * const uxtb_fns[4] = {
 };
 TRANS_FEAT(UXTB_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            uxtb_fns[a->esz], a, 0)
+TRANS_FEAT(UXTB_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           uxtb_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sxth_fns[4] = {
     NULL, NULL, gen_helper_sve_sxth_s, gen_helper_sve_sxth_d
 };
 TRANS_FEAT(SXTH_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            sxth_fns[a->esz], a, 0)
+TRANS_FEAT(SXTH_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           sxth_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const uxth_fns[4] = {
     NULL, NULL, gen_helper_sve_uxth_s, gen_helper_sve_uxth_d
 };
 TRANS_FEAT(UXTH_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            uxth_fns[a->esz], a, 0)
+TRANS_FEAT(UXTH_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           uxth_fns[a->esz], a, 1)
 
 TRANS_FEAT(SXTW_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            a->esz == 3 ? gen_helper_sve_sxtw_d : NULL, a, 0)
+TRANS_FEAT(SXTW_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           a->esz == 3 ? gen_helper_sve_sxtw_d : NULL, a, 1)
+
 TRANS_FEAT(UXTW_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            a->esz == 3 ? gen_helper_sve_uxtw_d : NULL, a, 0)
+TRANS_FEAT(UXTW_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           a->esz == 3 ? gen_helper_sve_uxtw_d : NULL, a, 1)
 
 static gen_helper_gvec_3 * const addqv_fns[4] = {
     gen_helper_sve2p1_addqv_b, gen_helper_sve2p1_addqv_h,
-- 
2.43.0
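
For readers less familiar with the decode file, a minimal sketch of how a pattern line such as `ABS_z 00000100 .. 000 110 101 ... ..... .....` can be read as a mask/value pair over a 32-bit instruction word. This is not the code decodetree generates. `ToyPattern`, `ToyArgs`, `toy_parse_pattern`, and `toy_decode` are hypothetical names, and the field positions (esz in bits [23:22], pg in [12:10], rn in [9:5], rd in [4:0]) are an assumption about `@rd_pg_rn`, whose definition is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t mask;   /* 1 where the pattern fixes a bit */
    uint32_t value;  /* required bit values under the mask */
} ToyPattern;

/* Parse a 32-position pattern of '0', '1', '.', ignoring spaces. */
static ToyPattern toy_parse_pattern(const char *s)
{
    ToyPattern p = { 0, 0 };
    int nbits = 0;

    for (; *s; s++) {
        if (*s == ' ') {
            continue;
        }
        p.mask <<= 1;
        p.value <<= 1;
        if (*s == '0' || *s == '1') {
            p.mask |= 1;
            p.value |= (uint32_t)(*s - '0');
        } else {
            assert(*s == '.');  /* '.' marks a don't-care / field bit */
        }
        nbits++;
    }
    assert(nbits == 32);
    return p;
}

typedef struct {
    unsigned esz, pg, rn, rd;
} ToyArgs;

/* Match insn against the pattern and, on success, extract the fields. */
static bool toy_decode(ToyPattern p, uint32_t insn, ToyArgs *a)
{
    if ((insn & p.mask) != p.value) {
        return false;
    }
    a->esz = (insn >> 22) & 3;   /* assumed layout, see lead-in */
    a->pg  = (insn >> 10) & 7;
    a->rn  = (insn >> 5) & 31;
    a->rd  = insn & 31;
    return true;
}
```

Under these assumptions the `ABS_z` and `ABS_m` patterns share the mask 0xff3fe000 and differ only in the three-bit group at [21:19] (`000` versus `010`), so a given instruction word can match at most one of them.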




* [PULL 49/61] target/arm: Implement SVE bitwise unary operations (predicated, zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (47 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 48/61] target/arm: Implement SVE integer unary operations (predicated, zeroing) Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 50/61] target/arm: Implement SVE reverse within elements (zeroing) Peter Maydell
                   ` (12 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

This includes CLS, CLZ, CNT, CNOT, NOT, FABS, FNEG.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      |  8 ++++++++
 target/arm/tcg/translate-sve.c | 21 ++++++++++++++++++---
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index 31b65fab1b..e79b5e84c1 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -400,6 +400,14 @@ NOT_zpz_m       00000100 .. 011 110 101 ... ..... .....         @rd_pg_rn
 FABS_m          00000100 .. 011 100 101 ... ..... .....         @rd_pg_rn
 FNEG_m          00000100 .. 011 101 101 ... ..... .....         @rd_pg_rn
 
+CLS_z           00000100 .. 001 000 101 ... ..... .....         @rd_pg_rn
+CLZ_z           00000100 .. 001 001 101 ... ..... .....         @rd_pg_rn
+CNT_zpz_z       00000100 .. 001 010 101 ... ..... .....         @rd_pg_rn
+CNOT_z          00000100 .. 001 011 101 ... ..... .....         @rd_pg_rn
+NOT_zpz_z       00000100 .. 001 110 101 ... ..... .....         @rd_pg_rn
+FABS_z          00000100 .. 001 100 101 ... ..... .....         @rd_pg_rn
+FNEG_z          00000100 .. 001 101 101 ... ..... .....         @rd_pg_rn
+
 # SVE integer unary operations (predicated)
 # Note esz > original size for extensions.
 ABS_m           00000100 .. 010 110 101 ... ..... .....         @rd_pg_rn
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index afbed77ec3..d6a1683719 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -781,30 +781,41 @@ static gen_helper_gvec_3 * const sve_cls_fns[4] = {
     gen_helper_sve_cls_s, gen_helper_sve_cls_d,
 };
 TRANS_FEAT(CLS_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_cls_fns[a->esz], a, 0)
+TRANS_FEAT(CLS_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz, sve_cls_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sve_clz_fns[4] = {
     gen_helper_sve_clz_b, gen_helper_sve_clz_h,
     gen_helper_sve_clz_s, gen_helper_sve_clz_d,
 };
 TRANS_FEAT(CLZ_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_clz_fns[a->esz], a, 0)
+TRANS_FEAT(CLZ_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz, sve_clz_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sve_cnt_zpz_fns[4] = {
     gen_helper_sve_cnt_zpz_b, gen_helper_sve_cnt_zpz_h,
     gen_helper_sve_cnt_zpz_s, gen_helper_sve_cnt_zpz_d,
 };
-TRANS_FEAT(CNT_zpz_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_cnt_zpz_fns[a->esz], a, 0)
+TRANS_FEAT(CNT_zpz_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+           sve_cnt_zpz_fns[a->esz], a, 0)
+TRANS_FEAT(CNT_zpz_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           sve_cnt_zpz_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sve_cnot_fns[4] = {
     gen_helper_sve_cnot_b, gen_helper_sve_cnot_h,
     gen_helper_sve_cnot_s, gen_helper_sve_cnot_d,
 };
-TRANS_FEAT(CNOT_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_cnot_fns[a->esz], a, 0)
+TRANS_FEAT(CNOT_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+           sve_cnot_fns[a->esz], a, 0)
+TRANS_FEAT(CNOT_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           sve_cnot_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sve_not_zpz_fns[4] = {
     gen_helper_sve_not_zpz_b, gen_helper_sve_not_zpz_h,
     gen_helper_sve_not_zpz_s, gen_helper_sve_not_zpz_d,
 };
-TRANS_FEAT(NOT_zpz_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_not_zpz_fns[a->esz], a, 0)
+TRANS_FEAT(NOT_zpz_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
+           sve_not_zpz_fns[a->esz], a, 0)
+TRANS_FEAT(NOT_zpz_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           sve_not_zpz_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sve_abs_fns[4] = {
     gen_helper_sve_abs_b, gen_helper_sve_abs_h,
@@ -857,6 +868,8 @@ static gen_helper_gvec_3 * const fabs_ah_fns[4] = {
 };
 TRANS_FEAT(FABS_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            s->fpcr_ah ? fabs_ah_fns[a->esz] : fabs_fns[a->esz], a, 0)
+TRANS_FEAT(FABS_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           s->fpcr_ah ? fabs_ah_fns[a->esz] : fabs_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const fneg_fns[4] = {
     NULL,                  gen_helper_sve_fneg_h,
@@ -868,6 +881,8 @@ static gen_helper_gvec_3 * const fneg_ah_fns[4] = {
 };
 TRANS_FEAT(FNEG_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            s->fpcr_ah ? fneg_ah_fns[a->esz] : fneg_fns[a->esz], a, 0)
+TRANS_FEAT(FNEG_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           s->fpcr_ah ? fneg_ah_fns[a->esz] : fneg_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sxtb_fns[4] = {
     NULL,                  gen_helper_sve_sxtb_h,
-- 
2.43.0




* [PULL 50/61] target/arm: Implement SVE reverse within elements (zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (48 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 49/61] target/arm: Implement SVE bitwise " Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 51/61] target/arm: Implement SVE reverse doublewords (zeroing) Peter Maydell
                   ` (11 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

This includes REVB, REVH, REVW, RBIT.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      | 5 +++++
 target/arm/tcg/translate-sve.c | 7 +++++++
 2 files changed, 12 insertions(+)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index e79b5e84c1..867ae2916e 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -730,6 +730,11 @@ REVW_m          00000101 .. 1001 10 100 ... ..... .....         @rd_pg_rn
 RBIT_m          00000101 .. 1001 11 100 ... ..... .....         @rd_pg_rn
 REVD_m          00000101 00 1011 10 100 ... ..... .....         @rd_pg_rn_e0
 
+REVB_z          00000101 .. 1001 00 101 ... ..... .....         @rd_pg_rn
+REVH_z          00000101 .. 1001 01 101 ... ..... .....         @rd_pg_rn
+REVW_z          00000101 .. 1001 10 101 ... ..... .....         @rd_pg_rn
+RBIT_z          00000101 .. 1001 11 101 ... ..... .....         @rd_pg_rn
+
 # SVE vector splice (predicated, destructive)
 SPLICE          00000101 .. 101 100 100 ... ..... .....         @rdn_pg_rm
 
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index d6a1683719..cd1e0e744d 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -836,6 +836,7 @@ static gen_helper_gvec_3 * const sve_rbit_fns[4] = {
     gen_helper_sve_rbit_s, gen_helper_sve_rbit_d,
 };
 TRANS_FEAT(RBIT_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz, sve_rbit_fns[a->esz], a, 0)
+TRANS_FEAT(RBIT_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz, sve_rbit_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sve2p1_orqv_fns[4] = {
     gen_helper_sve2p1_orqv_b, gen_helper_sve2p1_orqv_h,
@@ -3070,15 +3071,21 @@ static gen_helper_gvec_3 * const revb_fns[4] = {
 };
 TRANS_FEAT(REVB_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            revb_fns[a->esz], a, 0)
+TRANS_FEAT(REVB_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           revb_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const revh_fns[4] = {
     NULL, NULL, gen_helper_sve_revh_s, gen_helper_sve_revh_d,
 };
 TRANS_FEAT(REVH_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            revh_fns[a->esz], a, 0)
+TRANS_FEAT(REVH_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           revh_fns[a->esz], a, 1)
 
 TRANS_FEAT(REVW_m, aa64_sme_or_sve, gen_gvec_ool_arg_zpz,
            a->esz == 3 ? gen_helper_sve_revw_d : NULL, a, 0)
+TRANS_FEAT(REVW_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           a->esz == 3 ? gen_helper_sve_revw_d : NULL, a, 1)
 
 TRANS_FEAT(REVD_m, aa64_sme_or_sve2p1, gen_gvec_ool_arg_zpz,
            gen_helper_sme_revd_q, a, 0)
-- 
2.43.0




* [PULL 51/61] target/arm: Implement SVE reverse doublewords (zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (49 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 50/61] target/arm: Implement SVE reverse within elements (zeroing) Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 52/61] target/arm: Implement SVE2 integer unary operations (predicated, zeroing) Peter Maydell
                   ` (10 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      | 1 +
 target/arm/tcg/sve_helper.c    | 4 ++++
 target/arm/tcg/translate-sve.c | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index 867ae2916e..45f8633fd3 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -734,6 +734,7 @@ REVB_z          00000101 .. 1001 00 101 ... ..... .....         @rd_pg_rn
 REVH_z          00000101 .. 1001 01 101 ... ..... .....         @rd_pg_rn
 REVW_z          00000101 .. 1001 10 101 ... ..... .....         @rd_pg_rn
 RBIT_z          00000101 .. 1001 11 101 ... ..... .....         @rd_pg_rn
+REVD_z          00000101 00 1011 10 101 ... ..... .....         @rd_pg_rn_e0
 
 # SVE vector splice (predicated, destructive)
 SPLICE          00000101 .. 101 100 100 ... ..... .....         @rdn_pg_rm
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index 9634517925..1711444800 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -971,6 +971,7 @@ DO_ZPZ_D(sve_revw_d, uint64_t, wswap64)
 void HELPER(sme_revd_q)(void *vd, void *vn, void *vg, uint32_t desc)
 {
     intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    bool zeroing = simd_data(desc) & 1;
     uint64_t *d = vd, *n = vn;
     uint8_t *pg = vg;
 
@@ -980,6 +981,9 @@ void HELPER(sme_revd_q)(void *vd, void *vn, void *vg, uint32_t desc)
             uint64_t n1 = n[i + 1];
             d[i + 0] = n1;
             d[i + 1] = n0;
+        } else if (zeroing) {
+            d[i + 0] = 0;
+            d[i + 1] = 0;
         }
     }
 }
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index cd1e0e744d..6a29c1e43e 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -3089,6 +3089,8 @@ TRANS_FEAT(REVW_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
 
 TRANS_FEAT(REVD_m, aa64_sme_or_sve2p1, gen_gvec_ool_arg_zpz,
            gen_helper_sme_revd_q, a, 0)
+TRANS_FEAT(REVD_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           gen_helper_sme_revd_q, a, 1)
 
 TRANS_FEAT(SPLICE, aa64_sme_or_sve, gen_gvec_ool_arg_zpzz,
            gen_helper_sve_splice, a, a->esz)
-- 
2.43.0
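
To make the merging/zeroing distinction in the helper change concrete, here is a self-contained toy model of the same control flow. It is a sketch under simplifying assumptions: the predicate is one `bool` per 128-bit quadword rather than a byte-granular predicate register, and the `zeroing` parameter plays the role of `simd_data(desc) & 1`. The name `toy_revd` and its signature are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Swap the two 64-bit halves of each active 128-bit quadword.
 * Inactive quadwords are left untouched (merging) or cleared (zeroing).
 */
static void toy_revd(uint64_t *d, const uint64_t *n, const bool *pg,
                     size_t nquads, bool zeroing)
{
    for (size_t q = 0; q < nquads; q++) {
        size_t i = q * 2;

        if (pg[q]) {
            /* Read both halves first so that d may alias n. */
            uint64_t n0 = n[i + 0];
            uint64_t n1 = n[i + 1];
            d[i + 0] = n1;
            d[i + 1] = n0;
        } else if (zeroing) {
            d[i + 0] = 0;
            d[i + 1] = 0;
        }
        /* Otherwise: merging form, the destination keeps its old value. */
    }
}
```

With input `{1, 2, 3, 4}`, a predicate that enables only the first quadword, and a destination pre-filled with 9, the merging call leaves `{2, 1, 9, 9}` and the zeroing call leaves `{2, 1, 0, 0}`.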




* [PULL 52/61] target/arm: Implement SVE2 integer unary operations (predicated, zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (50 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 51/61] target/arm: Implement SVE reverse doublewords (zeroing) Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 53/61] target/arm: Add data argument to do_frint_mode Peter Maydell
                   ` (9 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

This includes URECPE, URSQRTE, SQABS, SQNEG.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      | 5 +++++
 target/arm/tcg/translate-sve.c | 6 ++++++
 2 files changed, 11 insertions(+)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index 45f8633fd3..f1cf7a628d 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -1545,6 +1545,11 @@ URSQRTE_m       01000100 .. 000 001 101 ... ..... .....  @rd_pg_rn
 SQABS_m         01000100 .. 001 000 101 ... ..... .....  @rd_pg_rn
 SQNEG_m         01000100 .. 001 001 101 ... ..... .....  @rd_pg_rn
 
+URECPE_z        01000100 .. 000 010 101 ... ..... .....  @rd_pg_rn
+URSQRTE_z       01000100 .. 000 011 101 ... ..... .....  @rd_pg_rn
+SQABS_z         01000100 .. 001 010 101 ... ..... .....  @rd_pg_rn
+SQNEG_z         01000100 .. 001 011 101 ... ..... .....  @rd_pg_rn
+
 ### SVE2 saturating/rounding bitwise shift left (predicated)
 
 SRSHL           01000100 .. 000 010 100 ... ..... .....  @rdn_pg_rm
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 6a29c1e43e..fe78e4dda1 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -6791,21 +6791,27 @@ TRANS_FEAT(UADALP_zpzz, aa64_sme_or_sve2, gen_gvec_ool_arg_zpzz,
 
 TRANS_FEAT(URECPE_m, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz,
            a->esz == 2 ? gen_helper_sve2_urecpe_s : NULL, a, 0)
+TRANS_FEAT(URECPE_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           a->esz == 2 ? gen_helper_sve2_urecpe_s : NULL, a, 1)
 
 TRANS_FEAT(URSQRTE_m, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz,
            a->esz == 2 ? gen_helper_sve2_ursqrte_s : NULL, a, 0)
+TRANS_FEAT(URSQRTE_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz,
+           a->esz == 2 ? gen_helper_sve2_ursqrte_s : NULL, a, 1)
 
 static gen_helper_gvec_3 * const sqabs_fns[4] = {
     gen_helper_sve2_sqabs_b, gen_helper_sve2_sqabs_h,
     gen_helper_sve2_sqabs_s, gen_helper_sve2_sqabs_d,
 };
 TRANS_FEAT(SQABS_m, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz, sqabs_fns[a->esz], a, 0)
+TRANS_FEAT(SQABS_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz, sqabs_fns[a->esz], a, 1)
 
 static gen_helper_gvec_3 * const sqneg_fns[4] = {
     gen_helper_sve2_sqneg_b, gen_helper_sve2_sqneg_h,
     gen_helper_sve2_sqneg_s, gen_helper_sve2_sqneg_d,
 };
 TRANS_FEAT(SQNEG_m, aa64_sme_or_sve2, gen_gvec_ool_arg_zpz, sqneg_fns[a->esz], a, 0)
+TRANS_FEAT(SQNEG_z, aa64_sme2p2_or_sve2p2, gen_gvec_ool_arg_zpz, sqneg_fns[a->esz], a, 1)
 
 DO_ZPZZ(SQSHL, aa64_sme_or_sve2, sve2_sqshl)
 DO_ZPZZ(SQRSHL, aa64_sme_or_sve2, sve2_sqrshl)
-- 
2.43.0




* [PULL 53/61] target/arm: Add data argument to do_frint_mode
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (51 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 52/61] target/arm: Implement SVE2 integer unary operations (predicated, zeroing) Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 54/61] target/arm: Implement Floating-point round to integral value (predicated, zeroing) Peter Maydell
                   ` (8 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

Prepare for callers that need to pass a non-zero data value;
all existing callers pass 0.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-sve.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index fe78e4dda1..ea936669ef 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4656,7 +4656,8 @@ TRANS_FEAT(FRINTX_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            a->esz == MO_16 ? FPST_A64_F16 : FPST_A64);
 
 static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a,
-                          ARMFPRounding mode, gen_helper_gvec_3_ptr *fn)
+                          ARMFPRounding mode, int data,
+                          gen_helper_gvec_3_ptr *fn)
 {
     unsigned vsz;
     TCGv_i32 tmode;
@@ -4676,22 +4677,22 @@ static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a,
     tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
                        vec_full_reg_offset(s, a->rn),
                        pred_full_reg_offset(s, a->pg),
-                       status, vsz, vsz, 0, fn);
+                       status, vsz, vsz, data, fn);
 
     gen_restore_rmode(tmode, status);
     return true;
 }
 
 TRANS_FEAT(FRINTN_m, aa64_sme_or_sve, do_frint_mode, a,
-           FPROUNDING_TIEEVEN, frint_fns[a->esz])
+           FPROUNDING_TIEEVEN, 0, frint_fns[a->esz])
 TRANS_FEAT(FRINTP_m, aa64_sme_or_sve, do_frint_mode, a,
-           FPROUNDING_POSINF, frint_fns[a->esz])
+           FPROUNDING_POSINF, 0, frint_fns[a->esz])
 TRANS_FEAT(FRINTM_m, aa64_sme_or_sve, do_frint_mode, a,
-           FPROUNDING_NEGINF, frint_fns[a->esz])
+           FPROUNDING_NEGINF, 0, frint_fns[a->esz])
 TRANS_FEAT(FRINTZ_m, aa64_sme_or_sve, do_frint_mode, a,
-           FPROUNDING_ZERO, frint_fns[a->esz])
+           FPROUNDING_ZERO, 0, frint_fns[a->esz])
 TRANS_FEAT(FRINTA_m, aa64_sme_or_sve, do_frint_mode, a,
-           FPROUNDING_TIEAWAY, frint_fns[a->esz])
+           FPROUNDING_TIEAWAY, 0, frint_fns[a->esz])
 
 static gen_helper_gvec_3_ptr * const frecpx_fns[] = {
     NULL,                    gen_helper_sve_frecpx_h,
@@ -7998,9 +7999,9 @@ TRANS_FEAT(FCVTLT_sd_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
            gen_helper_sve2_fcvtlt_sd, a, 0, FPST_A64)
 
 TRANS_FEAT(FCVTX_ds_m, aa64_sme_or_sve2, do_frint_mode, a,
-           FPROUNDING_ODD, gen_helper_sve_fcvt_ds)
+           FPROUNDING_ODD, 0, gen_helper_sve_fcvt_ds)
 TRANS_FEAT(FCVTXNT_ds_m, aa64_sme_or_sve2, do_frint_mode, a,
-           FPROUNDING_ODD, gen_helper_sve2_fcvtnt_ds)
+           FPROUNDING_ODD, 0, gen_helper_sve2_fcvtnt_ds)
 
 static gen_helper_gvec_3_ptr * const flogb_fns[] = {
     NULL,               gen_helper_flogb_h,
-- 
2.43.0
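
The shape of do_frint_mode (select a rounding mode, run the helper, restore the mode) with the new data argument threaded through can be modelled in a few lines of host C. This is a toy sketch, not the TCG code: `ToyFpStatus`, `toy_round`, `toy_frint_helper`, and `toy_frint_mode` are hypothetical, the rounding is done by hand for small doubles only, and it assumes the helper treats bit 0 of `data` as a zeroing flag in the way sme_revd_q does elsewhere in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TOY_TIEEVEN, TOY_POSINF, TOY_NEGINF, TOY_ZERO } ToyRounding;
typedef struct { ToyRounding mode; } ToyFpStatus;

/* Round to integral under st->mode; only for |x| well inside long long range. */
static double toy_round(double x, const ToyFpStatus *st)
{
    double t = (double)(long long)x;         /* conversion truncates toward zero */
    double frac = x - t;                     /* same sign as x, magnitude < 1 */
    double away = t + (x < 0 ? -1.0 : 1.0);  /* next integer away from zero */
    double mag = frac < 0 ? -frac : frac;

    switch (st->mode) {
    case TOY_ZERO:
        return t;
    case TOY_POSINF:
        return frac > 0 ? away : t;
    case TOY_NEGINF:
        return frac < 0 ? away : t;
    case TOY_TIEEVEN:
    default:
        if (mag != 0.5) {
            return mag > 0.5 ? away : t;
        }
        return ((long long)t % 2 == 0) ? t : away;
    }
}

/* Predicated helper: bit 0 of data selects zeroing of inactive elements. */
static void toy_frint_helper(double *d, const double *n, const bool *pg,
                             size_t len, const ToyFpStatus *st, int data)
{
    bool zeroing = data & 1;

    for (size_t i = 0; i < len; i++) {
        if (pg[i]) {
            d[i] = toy_round(n[i], st);
        } else if (zeroing) {
            d[i] = 0.0;
        }
    }
}

/* Set the mode, run the helper with the caller's data, restore the mode. */
static void toy_frint_mode(double *d, const double *n, const bool *pg,
                           size_t len, ToyFpStatus *st,
                           ToyRounding mode, int data)
{
    ToyRounding saved = st->mode;

    st->mode = mode;
    toy_frint_helper(d, n, pg, len, st, data);
    st->mode = saved;
}
```

Passing `data` as a plain integer keeps the mode handling identical for both forms; only the value forwarded to the helper changes between a merging caller (0) and a zeroing caller (1).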




* [PULL 54/61] target/arm: Implement Floating-point round to integral value (predicated, zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (52 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 53/61] target/arm: Add data argument to do_frint_mode Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 55/61] target/arm: Implement Floating-point convert " Peter Maydell
                   ` (7 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC (permalink / raw)
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

This covers the various FRINT rounding modes.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      |  8 ++++++++
 target/arm/tcg/translate-sve.c | 17 +++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index f1cf7a628d..099cd4e93d 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -1240,6 +1240,14 @@ FRINTA_m        01100101 .. 000 100 101 ... ..... .....         @rd_pg_rn
 FRINTX_m        01100101 .. 000 110 101 ... ..... .....         @rd_pg_rn
 FRINTI_m        01100101 .. 000 111 101 ... ..... .....         @rd_pg_rn
 
+FRINTN_z        01100100 .. 011 000 100 ... ..... .....         @rd_pg_rn
+FRINTP_z        01100100 .. 011 000 101 ... ..... .....         @rd_pg_rn
+FRINTM_z        01100100 .. 011 000 110 ... ..... .....         @rd_pg_rn
+FRINTZ_z        01100100 .. 011 000 111 ... ..... .....         @rd_pg_rn
+FRINTA_z        01100100 .. 011 001 100 ... ..... .....         @rd_pg_rn
+FRINTX_z        01100100 .. 011 001 110 ... ..... .....         @rd_pg_rn
+FRINTI_z        01100100 .. 011 001 111 ... ..... .....         @rd_pg_rn
+
 # SVE floating-point unary operations
 FRECPX_m        01100101 .. 001 100 101 ... ..... .....         @rd_pg_rn
 FSQRT_m         01100101 .. 001 101 101 ... ..... .....         @rd_pg_rn
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index ea936669ef..64ee1f7752 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4644,6 +4644,9 @@ static gen_helper_gvec_3_ptr * const frint_fns[] = {
 TRANS_FEAT(FRINTI_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            frint_fns[a->esz], a, 0,
            a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
+TRANS_FEAT(FRINTI_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           frint_fns[a->esz], a, 1,
+           a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
 
 static gen_helper_gvec_3_ptr * const frintx_fns[] = {
     NULL,
@@ -4654,6 +4657,9 @@ static gen_helper_gvec_3_ptr * const frintx_fns[] = {
 TRANS_FEAT(FRINTX_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            frintx_fns[a->esz], a, 0,
            a->esz == MO_16 ? FPST_A64_F16 : FPST_A64);
+TRANS_FEAT(FRINTX_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           frintx_fns[a->esz], a, 1,
+           a->esz == MO_16 ? FPST_A64_F16 : FPST_A64);
 
 static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a,
                           ARMFPRounding mode, int data,
@@ -4694,6 +4700,17 @@ TRANS_FEAT(FRINTZ_m, aa64_sme_or_sve, do_frint_mode, a,
 TRANS_FEAT(FRINTA_m, aa64_sme_or_sve, do_frint_mode, a,
            FPROUNDING_TIEAWAY, 0, frint_fns[a->esz])
 
+TRANS_FEAT(FRINTN_z, aa64_sme2p2_or_sve2p2, do_frint_mode, a,
+           FPROUNDING_TIEEVEN, 1, frint_fns[a->esz])
+TRANS_FEAT(FRINTP_z, aa64_sme2p2_or_sve2p2, do_frint_mode, a,
+           FPROUNDING_POSINF, 1, frint_fns[a->esz])
+TRANS_FEAT(FRINTM_z, aa64_sme2p2_or_sve2p2, do_frint_mode, a,
+           FPROUNDING_NEGINF, 1, frint_fns[a->esz])
+TRANS_FEAT(FRINTZ_z, aa64_sme2p2_or_sve2p2, do_frint_mode, a,
+           FPROUNDING_ZERO, 1, frint_fns[a->esz])
+TRANS_FEAT(FRINTA_z, aa64_sme2p2_or_sve2p2, do_frint_mode, a,
+           FPROUNDING_TIEAWAY, 1, frint_fns[a->esz])
+
 static gen_helper_gvec_3_ptr * const frecpx_fns[] = {
     NULL,                    gen_helper_sve_frecpx_h,
     gen_helper_sve_frecpx_s, gen_helper_sve_frecpx_d,
-- 
2.43.0




* [PULL 55/61] target/arm: Implement Floating-point convert (predicated, zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (53 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 54/61] target/arm: Implement Floating-point round to integral value (predicated, zeroing) Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 56/61] target/arm: Implement Floating-point square root " Peter Maydell
                   ` (6 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

This is FCVTX, FCVT and BFCVT.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      | 11 +++++++++++
 target/arm/tcg/translate-sve.c | 22 ++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index 099cd4e93d..889ff85f72 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -1215,6 +1215,17 @@ FCVT_hd_m       01100101 11 0010 01 101 ... ..... .....         @rd_pg_rn_e0
 FCVT_ds_m       01100101 11 0010 10 101 ... ..... .....         @rd_pg_rn_e0
 FCVT_sd_m       01100101 11 0010 11 101 ... ..... .....         @rd_pg_rn_e0
 
+FCVTX_ds_z      01100100 00 0110 101 10 ... ..... .....         @rd_pg_rn_e0
+
+FCVT_sh_z       01100100 10 0110 101 00 ... ..... .....         @rd_pg_rn_e0
+FCVT_hs_z       01100100 10 0110 101 01 ... ..... .....         @rd_pg_rn_e0
+BFCVT_z         01100100 10 0110 101 10 ... ..... .....         @rd_pg_rn_e0
+
+FCVT_dh_z       01100100 11 0110 101 00 ... ..... .....         @rd_pg_rn_e0
+FCVT_hd_z       01100100 11 0110 101 01 ... ..... .....         @rd_pg_rn_e0
+FCVT_ds_z       01100100 11 0110 101 10 ... ..... .....         @rd_pg_rn_e0
+FCVT_sd_z       01100100 11 0110 101 11 ... ..... .....         @rd_pg_rn_e0
+
 # SVE floating-point convert to integer
 FCVTZS_hh_m     01100101 01 011 01 0 101 ... ..... .....        @rd_pg_rn_e0
 FCVTZU_hh_m     01100101 01 011 01 1 101 ... ..... .....        @rd_pg_rn_e0
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 64ee1f7752..d610ea561d 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4588,21 +4588,40 @@ TRANS_FEAT(FCMLA_zzxz, aa64_sme_or_sve, gen_gvec_fpst_zzzz, fcmla_idx_fns[a->esz
 
 TRANS_FEAT(FCVT_sh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_sh, a, 0, FPST_A64)
+TRANS_FEAT(FCVT_sh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvt_sh, a, 1, FPST_A64)
+
 TRANS_FEAT(FCVT_hs_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_hs, a, 0, FPST_A64_F16)
+TRANS_FEAT(FCVT_hs_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvt_hs, a, 1, FPST_A64_F16)
 
 TRANS_FEAT(BFCVT_m, aa64_sme_sve_bf16, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_bfcvt, a, 0,
            s->fpcr_ah ? FPST_AH : FPST_A64)
+TRANS_FEAT(BFCVT_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_bfcvt, a, 1,
+           s->fpcr_ah ? FPST_AH : FPST_A64)
 
 TRANS_FEAT(FCVT_dh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_dh, a, 0, FPST_A64)
+TRANS_FEAT(FCVT_dh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvt_dh, a, 1, FPST_A64)
+
 TRANS_FEAT(FCVT_hd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_hd, a, 0, FPST_A64_F16)
+TRANS_FEAT(FCVT_hd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvt_hd, a, 1, FPST_A64_F16)
+
 TRANS_FEAT(FCVT_ds_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_ds, a, 0, FPST_A64)
+TRANS_FEAT(FCVT_ds_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvt_ds, a, 1, FPST_A64)
+
 TRANS_FEAT(FCVT_sd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvt_sd, a, 0, FPST_A64)
+TRANS_FEAT(FCVT_sd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvt_sd, a, 1, FPST_A64)
 
 TRANS_FEAT(FCVTZS_hh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzs_hh, a, 0, FPST_A64_F16)
@@ -8017,6 +8036,9 @@ TRANS_FEAT(FCVTLT_sd_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
 
 TRANS_FEAT(FCVTX_ds_m, aa64_sme_or_sve2, do_frint_mode, a,
            FPROUNDING_ODD, 0, gen_helper_sve_fcvt_ds)
+TRANS_FEAT(FCVTX_ds_z, aa64_sme2p2_or_sve2p2, do_frint_mode, a,
+           FPROUNDING_ODD, 1, gen_helper_sve_fcvt_ds)
+
 TRANS_FEAT(FCVTXNT_ds_m, aa64_sme_or_sve2, do_frint_mode, a,
            FPROUNDING_ODD, 0, gen_helper_sve2_fcvtnt_ds)
 
-- 
2.43.0




* [PULL 56/61] target/arm: Implement Floating-point square root (predicated, zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (54 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 55/61] target/arm: Implement Floating-point convert " Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 57/61] target/arm: Implement SCVTF, UCVTF " Peter Maydell
                   ` (5 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

This is FRECPX and FSQRT.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      | 3 +++
 target/arm/tcg/translate-sve.c | 5 +++++
 2 files changed, 8 insertions(+)
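
Reading note on the sve.decode patterns in this patch: each line is a 32-bit pattern in which '0' and '1' are fixed bits, '.' is a bit left to the format (here @rd_pg_rn), and spaces are only visual grouping. The following is a minimal sketch of that matching rule, not the decodetree implementation. The names FixedBits, parse_pattern and insn_matches are hypothetical, and named fields such as "esz:2" are deliberately not handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fixed part of a pattern. */
typedef struct {
    uint32_t mask;   /* 1 where the pattern has a fixed '0' or '1' */
    uint32_t value;  /* the required bit values under that mask */
} FixedBits;

/*
 * Turn a pattern string made of '0', '1', '.' and spaces into a
 * mask/value pair, most significant bit first.  Returns false if the
 * string contains any other character or is not exactly 32 bits wide.
 */
static bool parse_pattern(const char *pat, FixedBits *out)
{
    uint32_t mask = 0, value = 0;
    int nbits = 0;

    for (; *pat != '\0'; pat++) {
        if (*pat == ' ') {
            continue;               /* grouping only */
        }
        if (nbits == 32) {
            return false;           /* too many bits */
        }
        mask <<= 1;
        value <<= 1;
        if (*pat == '0' || *pat == '1') {
            mask |= 1;
            value |= (uint32_t)(*pat - '0');
        } else if (*pat != '.') {
            return false;           /* fields like "esz:2" not modelled */
        }
        nbits++;
    }
    if (nbits != 32) {
        return false;
    }
    out->mask = mask;
    out->value = value;
    return true;
}

/* An instruction word matches when all fixed bits agree. */
static bool insn_matches(uint32_t insn, const FixedBits *fb)
{
    return (insn & fb->mask) == fb->value;
}
```

With the two patterns added here, FRECPX_z and FSQRT_z share the same mask and differ only in bit 13 of the value, so a word can match at most one of them.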

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index 889ff85f72..c39d80360f 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -1263,6 +1263,9 @@ FRINTI_z        01100100 .. 011 001 111 ... ..... .....         @rd_pg_rn
 FRECPX_m        01100101 .. 001 100 101 ... ..... .....         @rd_pg_rn
 FSQRT_m         01100101 .. 001 101 101 ... ..... .....         @rd_pg_rn
 
+FRECPX_z        01100100 .. 011 011 100 ... ..... .....         @rd_pg_rn
+FSQRT_z         01100100 .. 011 011 101 ... ..... .....         @rd_pg_rn
+
 # SVE integer convert to floating-point
 SCVTF_hh_m      01100101 01 010 01 0 101 ... ..... .....        @rd_pg_rn_e0
 SCVTF_sh_m      01100101 01 010 10 0 101 ... ..... .....        @rd_pg_rn_e0
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index d610ea561d..1ea9454267 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4736,6 +4736,8 @@ static gen_helper_gvec_3_ptr * const frecpx_fns[] = {
 };
 TRANS_FEAT(FRECPX_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            frecpx_fns[a->esz], a, 0, select_ah_fpst(s, a->esz))
+TRANS_FEAT(FRECPX_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           frecpx_fns[a->esz], a, 1, select_ah_fpst(s, a->esz))
 
 static gen_helper_gvec_3_ptr * const fsqrt_fns[] = {
     NULL,                   gen_helper_sve_fsqrt_h,
@@ -4744,6 +4746,9 @@ static gen_helper_gvec_3_ptr * const fsqrt_fns[] = {
 TRANS_FEAT(FSQRT_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            fsqrt_fns[a->esz], a, 0,
            a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
+TRANS_FEAT(FSQRT_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           fsqrt_fns[a->esz], a, 1,
+           a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
 
 TRANS_FEAT(SCVTF_hh_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_scvt_hh, a, 0, FPST_A64_F16)
-- 
2.43.0




* [PULL 57/61] target/arm: Implement SCVTF, UCVTF (predicated, zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (55 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 56/61] target/arm: Implement Floating-point square root " Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 58/61] target/arm: Implement FRINT{32,64}{X,Z} Peter Maydell
                   ` (4 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-15-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      | 16 ++++++++++++++++
 target/arm/tcg/translate-sve.c | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index c39d80360f..7460eee4a9 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -1283,6 +1283,22 @@ UCVTF_sd_m      01100101 11 010 00 1 101 ... ..... .....        @rd_pg_rn_e0
 UCVTF_ds_m      01100101 11 010 10 1 101 ... ..... .....        @rd_pg_rn_e0
 UCVTF_dd_m      01100101 11 010 11 1 101 ... ..... .....        @rd_pg_rn_e0
 
+SCVTF_hh_z      01100100 01 011 10 0 110 ... ..... .....        @rd_pg_rn_e0
+SCVTF_sh_z      01100100 01 011 10 1 100 ... ..... .....        @rd_pg_rn_e0
+SCVTF_ss_z      01100100 10 011 10 1 100 ... ..... .....        @rd_pg_rn_e0
+SCVTF_sd_z      01100100 11 011 10 0 100 ... ..... .....        @rd_pg_rn_e0
+SCVTF_dh_z      01100100 01 011 10 1 110 ... ..... .....        @rd_pg_rn_e0
+SCVTF_ds_z      01100100 11 011 10 1 100 ... ..... .....        @rd_pg_rn_e0
+SCVTF_dd_z      01100100 11 011 10 1 110 ... ..... .....        @rd_pg_rn_e0
+
+UCVTF_hh_z      01100100 01 011 10 0 111 ... ..... .....        @rd_pg_rn_e0
+UCVTF_sh_z      01100100 01 011 10 1 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_ss_z      01100100 10 011 10 1 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_sd_z      01100100 11 011 10 0 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_dh_z      01100100 01 011 10 1 111 ... ..... .....        @rd_pg_rn_e0
+UCVTF_ds_z      01100100 11 011 10 1 101 ... ..... .....        @rd_pg_rn_e0
+UCVTF_dd_z      01100100 11 011 10 1 111 ... ..... .....        @rd_pg_rn_e0
+
 ### SVE Memory - 32-bit Gather and Unsized Contiguous Group
 
 # SVE load predicate register
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 1ea9454267..4ac59664ae 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4784,6 +4784,40 @@ TRANS_FEAT(UCVTF_sd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
 TRANS_FEAT(UCVTF_dd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_ucvt_dd, a, 0, FPST_A64)
 
+TRANS_FEAT(SCVTF_hh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_scvt_hh, a, 1, FPST_A64_F16)
+TRANS_FEAT(SCVTF_sh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_scvt_sh, a, 1, FPST_A64_F16)
+TRANS_FEAT(SCVTF_dh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_scvt_dh, a, 1, FPST_A64_F16)
+
+TRANS_FEAT(SCVTF_ss_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_scvt_ss, a, 1, FPST_A64)
+TRANS_FEAT(SCVTF_ds_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_scvt_ds, a, 1, FPST_A64)
+
+TRANS_FEAT(SCVTF_sd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_scvt_sd, a, 1, FPST_A64)
+TRANS_FEAT(SCVTF_dd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_scvt_dd, a, 1, FPST_A64)
+
+TRANS_FEAT(UCVTF_hh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_ucvt_hh, a, 1, FPST_A64_F16)
+TRANS_FEAT(UCVTF_sh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_ucvt_sh, a, 1, FPST_A64_F16)
+TRANS_FEAT(UCVTF_dh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_ucvt_dh, a, 1, FPST_A64_F16)
+
+TRANS_FEAT(UCVTF_ss_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_ucvt_ss, a, 1, FPST_A64)
+TRANS_FEAT(UCVTF_ds_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_ucvt_ds, a, 1, FPST_A64)
+TRANS_FEAT(UCVTF_sd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_ucvt_sd, a, 1, FPST_A64)
+
+TRANS_FEAT(UCVTF_dd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_ucvt_dd, a, 1, FPST_A64)
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
-- 
2.43.0




* [PULL 58/61] target/arm: Implement FRINT{32,64}{X,Z}
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (56 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 57/61] target/arm: Implement SCVTF, UCVTF " Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 59/61] target/arm: Enable zeroing in DO_FCVT{N, L}T macros in sve_helper.c Peter Maydell
                   ` (3 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/helper-sve-defs.h |  9 ++++++++
 target/arm/tcg/sve.decode        | 20 ++++++++++++++++++
 target/arm/tcg/sve_helper.c      |  5 +++++
 target/arm/tcg/translate-sve.c   | 36 ++++++++++++++++++++++++++++++++
 4 files changed, 70 insertions(+)

diff --git a/target/arm/tcg/helper-sve-defs.h b/target/arm/tcg/helper-sve-defs.h
index f97c31763f..de2254bb19 100644
--- a/target/arm/tcg/helper-sve-defs.h
+++ b/target/arm/tcg/helper-sve-defs.h
@@ -1441,6 +1441,15 @@ DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, fpst, i32)
 
+DEF_HELPER_FLAGS_5(sve2p2_frint32_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, fpst, i32)
+DEF_HELPER_FLAGS_5(sve2p2_frint64_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, fpst, i32)
+DEF_HELPER_FLAGS_5(sve2p2_frint32_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, fpst, i32)
+DEF_HELPER_FLAGS_5(sve2p2_frint64_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, fpst, i32)
+
 DEF_HELPER_FLAGS_5(sve_frecpx_h, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, fpst, i32)
 DEF_HELPER_FLAGS_5(sve_frecpx_s, TCG_CALL_NO_RWG,
diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index 7460eee4a9..5c814c7769 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -1259,6 +1259,26 @@ FRINTA_z        01100100 .. 011 001 100 ... ..... .....         @rd_pg_rn
 FRINTX_z        01100100 .. 011 001 110 ... ..... .....         @rd_pg_rn
 FRINTI_z        01100100 .. 011 001 111 ... ..... .....         @rd_pg_rn
 
+FRINT32X_s_m    01100101 00 010 001 101 ... ..... .....         @rd_pg_rn_e0
+FRINT32X_d_m    01100101 00 010 011 101 ... ..... .....         @rd_pg_rn_e0
+FRINT64X_s_m    01100101 00 010 101 101 ... ..... .....         @rd_pg_rn_e0
+FRINT64X_d_m    01100101 00 010 111 101 ... ..... .....         @rd_pg_rn_e0
+
+FRINT32X_s_z    01100100 00 011 100 101 ... ..... .....         @rd_pg_rn_e0
+FRINT32X_d_z    01100100 00 011 100 111 ... ..... .....         @rd_pg_rn_e0
+FRINT64X_s_z    01100100 00 011 101 101 ... ..... .....         @rd_pg_rn_e0
+FRINT64X_d_z    01100100 00 011 101 111 ... ..... .....         @rd_pg_rn_e0
+
+FRINT32Z_s_m    01100101 00 010 000 101 ... ..... .....         @rd_pg_rn_e0
+FRINT32Z_d_m    01100101 00 010 010 101 ... ..... .....         @rd_pg_rn_e0
+FRINT64Z_s_m    01100101 00 010 100 101 ... ..... .....         @rd_pg_rn_e0
+FRINT64Z_d_m    01100101 00 010 110 101 ... ..... .....         @rd_pg_rn_e0
+
+FRINT32Z_s_z    01100100 00 011 100 100 ... ..... .....         @rd_pg_rn_e0
+FRINT32Z_d_z    01100100 00 011 100 110 ... ..... .....         @rd_pg_rn_e0
+FRINT64Z_s_z    01100100 00 011 101 100 ... ..... .....         @rd_pg_rn_e0
+FRINT64Z_d_z    01100100 00 011 101 110 ... ..... .....         @rd_pg_rn_e0
+
 # SVE floating-point unary operations
 FRECPX_m        01100101 .. 001 100 101 ... ..... .....         @rd_pg_rn
 FSQRT_m         01100101 .. 001 101 101 ... ..... .....         @rd_pg_rn
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index 1711444800..f8e62cb11b 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -5017,6 +5017,11 @@ DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int)
 DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int)
 DO_ZPZ_FP(sve_frintx_d, uint64_t, H1_8, float64_round_to_int)
 
+DO_ZPZ_FP(sve2p2_frint32_s, uint32_t, H1_4, helper_frint32_s)
+DO_ZPZ_FP(sve2p2_frint64_s, uint32_t, H1_4, helper_frint64_s)
+DO_ZPZ_FP(sve2p2_frint32_d, uint64_t, H1_8, helper_frint32_d)
+DO_ZPZ_FP(sve2p2_frint64_d, uint64_t, H1_8, helper_frint64_d)
+
 DO_ZPZ_FP(sve_frecpx_h, uint16_t, H1_2, helper_frecpx_f16)
 DO_ZPZ_FP(sve_frecpx_s, uint32_t, H1_4, helper_frecpx_f32)
 DO_ZPZ_FP(sve_frecpx_d, uint64_t, H1_8, helper_frecpx_f64)
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 4ac59664ae..a8d21dedca 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4730,6 +4730,42 @@ TRANS_FEAT(FRINTZ_z, aa64_sme2p2_or_sve2p2, do_frint_mode, a,
 TRANS_FEAT(FRINTA_z, aa64_sme2p2_or_sve2p2, do_frint_mode, a,
            FPROUNDING_TIEAWAY, 1, frint_fns[a->esz])
 
+TRANS_FEAT(FRINT32X_s_m, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2p2_frint32_s, a, 0, FPST_A64)
+TRANS_FEAT(FRINT32X_d_m, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2p2_frint32_d, a, 0, FPST_A64)
+TRANS_FEAT(FRINT64X_s_m, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2p2_frint64_s, a, 0, FPST_A64)
+TRANS_FEAT(FRINT64X_d_m, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2p2_frint64_d, a, 0, FPST_A64)
+
+TRANS_FEAT(FRINT32X_s_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2p2_frint32_s, a, 1, FPST_A64)
+TRANS_FEAT(FRINT32X_d_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2p2_frint32_d, a, 1, FPST_A64)
+TRANS_FEAT(FRINT64X_s_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2p2_frint64_s, a, 1, FPST_A64)
+TRANS_FEAT(FRINT64X_d_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2p2_frint64_d, a, 1, FPST_A64)
+
+TRANS_FEAT(FRINT32Z_s_m, aa64_sme2p2_or_sve2p2, do_frint_mode,
+           a, FPROUNDING_ZERO, 0, gen_helper_sve2p2_frint32_s)
+TRANS_FEAT(FRINT32Z_d_m, aa64_sme2p2_or_sve2p2, do_frint_mode,
+           a, FPROUNDING_ZERO, 0, gen_helper_sve2p2_frint32_d)
+TRANS_FEAT(FRINT64Z_s_m, aa64_sme2p2_or_sve2p2, do_frint_mode,
+           a, FPROUNDING_ZERO, 0, gen_helper_sve2p2_frint64_s)
+TRANS_FEAT(FRINT64Z_d_m, aa64_sme2p2_or_sve2p2, do_frint_mode,
+           a, FPROUNDING_ZERO, 0, gen_helper_sve2p2_frint64_d)
+
+TRANS_FEAT(FRINT32Z_s_z, aa64_sme2p2_or_sve2p2, do_frint_mode,
+           a, FPROUNDING_ZERO, 1, gen_helper_sve2p2_frint32_s)
+TRANS_FEAT(FRINT32Z_d_z, aa64_sme2p2_or_sve2p2, do_frint_mode,
+           a, FPROUNDING_ZERO, 1, gen_helper_sve2p2_frint32_d)
+TRANS_FEAT(FRINT64Z_s_z, aa64_sme2p2_or_sve2p2, do_frint_mode,
+           a, FPROUNDING_ZERO, 1, gen_helper_sve2p2_frint64_s)
+TRANS_FEAT(FRINT64Z_d_z, aa64_sme2p2_or_sve2p2, do_frint_mode,
+           a, FPROUNDING_ZERO, 1, gen_helper_sve2p2_frint64_d)
+
 static gen_helper_gvec_3_ptr * const frecpx_fns[] = {
     NULL,                    gen_helper_sve_frecpx_h,
     gen_helper_sve_frecpx_s, gen_helper_sve_frecpx_d,
-- 
2.43.0




* [PULL 59/61] target/arm: Enable zeroing in DO_FCVT{N, L}T macros in sve_helper.c
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (57 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 58/61] target/arm: Implement FRINT{32,64}{X,Z} Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 60/61] target/arm: Implement SVE floating-point convert (top, predicated, zeroing) Peter Maydell
                   ` (2 subsequent siblings)
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

Use the low bit of simd_data to hold a 'zeroing' bit.
The simd_data field is currently unused and always 0.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve_helper.c | 6 ++++++
 1 file changed, 6 insertions(+)
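
Reading note: the effect of the new 'zeroing' bit can be sketched with a toy model. This is not the QEMU helper. It assumes a vector of four 32-bit containers, one bool per container in place of the packed predicate bits, and placeholder toy_narrow/toy_widen functions in place of the real floating-point conversions. All names below are hypothetical. What it keeps from the patch is the control flow: an active element is converted, an inactive element is cleared only when the low bit of the data argument is set, and is otherwise left as it was.

```c
#include <stdbool.h>
#include <stdint.h>

#define NELEM 4  /* number of 32-bit containers in this toy vector */

/* Placeholder conversions; these are NOT real FP conversions. */
static uint16_t toy_narrow(uint32_t w) { return (uint16_t)(w >> 16); }
static uint32_t toy_widen(uint16_t n)  { return (uint32_t)n << 16; }

/*
 * FCVTNT-like: convert each wide source element and store the narrow
 * result in the top half of the destination container.  The bottom
 * half of the destination is never written, zeroing or not.
 */
static void narrow_top(uint32_t *vd, const uint32_t *vn,
                       const bool *pred, uint32_t data)
{
    bool zeroing = data & 1;            /* low bit of the data field */

    for (int i = 0; i < NELEM; i++) {
        uint32_t bottom = vd[i] & 0xffffu;

        if (pred[i]) {
            vd[i] = ((uint32_t)toy_narrow(vn[i]) << 16) | bottom;
        } else if (zeroing) {
            vd[i] = bottom;             /* clear only the top half */
        }
        /* otherwise merging: destination element is left unchanged */
    }
}

/*
 * FCVTLT-like: take the narrow element in the top half of each source
 * container and store the widened result across the whole destination
 * container.
 */
static void widen_top(uint32_t *vd, const uint32_t *vn,
                      const bool *pred, uint32_t data)
{
    bool zeroing = data & 1;

    for (int i = 0; i < NELEM; i++) {
        if (pred[i]) {
            vd[i] = toy_widen((uint16_t)(vn[i] >> 16));
        } else if (zeroing) {
            vd[i] = 0;                  /* clear the whole wide element */
        }
    }
}
```

The asymmetry mirrors the two macro hunks below: the narrowing form zeroes a TYPEN-sized slot at the top of the container, while the widening form zeroes the full TYPEW-sized element.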

diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index f8e62cb11b..a0a189eb1e 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -8595,6 +8595,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vg,                               \
                   float_status *status, uint32_t desc)                        \
 {                                                                             \
     intptr_t i = simd_oprsz(desc);                                            \
+    bool zeroing = simd_data(desc) & 1;                                       \
     uint64_t *g = vg;                                                         \
     do {                                                                      \
         uint64_t pg = g[(i - 1) >> 6];                                        \
@@ -8603,6 +8604,8 @@ void HELPER(NAME)(void *vd, void *vn, void *vg,                               \
             if (likely((pg >> (i & 63)) & 1)) {                               \
                 TYPEW nn = *(TYPEW *)(vn + HW(i));                            \
                 *(TYPEN *)(vd + HN(i + sizeof(TYPEN))) = OP(nn, status);      \
+            } else if (zeroing) {                                             \
+                *(TYPEN *)(vd + HN(i + sizeof(TYPEN))) = 0;                   \
             }                                                                 \
         } while (i & 63);                                                     \
     } while (i != 0);                                                         \
@@ -8617,6 +8620,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vg,                               \
                   float_status *status, uint32_t desc)                        \
 {                                                                             \
     intptr_t i = simd_oprsz(desc);                                            \
+    bool zeroing = simd_data(desc) & 1;                                       \
     uint64_t *g = vg;                                                         \
     do {                                                                      \
         uint64_t pg = g[(i - 1) >> 6];                                        \
@@ -8625,6 +8629,8 @@ void HELPER(NAME)(void *vd, void *vn, void *vg,                               \
             if (likely((pg >> (i & 63)) & 1)) {                               \
                 TYPEN nn = *(TYPEN *)(vn + HN(i + sizeof(TYPEN)));            \
                 *(TYPEW *)(vd + HW(i)) = OP(nn, status);                      \
+            } else if (zeroing) {                                             \
+                *(TYPEW *)(vd + HW(i)) = 0;                                   \
             }                                                                 \
         } while (i & 63);                                                     \
     } while (i != 0);                                                         \
-- 
2.43.0




* [PULL 60/61] target/arm: Implement SVE floating-point convert (top, predicated, zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (58 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 59/61] target/arm: Enable zeroing in DO_FCVT{N, L}T macros in sve_helper.c Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-16 19:06 ` [PULL 61/61] target/arm: Implement floating-point log and convert to integer (zeroing) Peter Maydell
  2026-06-17 19:30 ` [PULL 00/61] target-arm queue Stefan Hajnoczi
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

This includes FCVTXNT, BFCVTNT, FCVTNT, FCVTLT.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-18-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      |  7 +++++++
 target/arm/tcg/translate-sve.c | 13 +++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index 5c814c7769..673cbaae57 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -1941,6 +1941,13 @@ FCVTLT_hs_m     01100100 10 0010 01 101 ... ..... .....  @rd_pg_rn_e0
 FCVTNT_ds_m     01100100 11 0010 10 101 ... ..... .....  @rd_pg_rn_e0
 FCVTLT_sd_m     01100100 11 0010 11 101 ... ..... .....  @rd_pg_rn_e0
 
+FCVTXNT_ds_z    01100100 00 0000 10 101 ... ..... .....  @rd_pg_rn_e0
+FCVTNT_sh_z     01100100 10 0000 00 101 ... ..... .....  @rd_pg_rn_e0
+FCVTNT_ds_z     01100100 11 0000 10 101 ... ..... .....  @rd_pg_rn_e0
+BFCVTNT_z       01100100 10 0000 10 101 ... ..... .....  @rd_pg_rn_e0
+FCVTLT_hs_z     01100100 10 0000 01 101 ... ..... .....  @rd_pg_rn_e0
+FCVTLT_sd_z     01100100 11 0000 11 101 ... ..... .....  @rd_pg_rn_e0
+
 ### SVE2 floating-point convert to integer
 FLOGB_m         01100101 00 011 esz:2 0101 pg:3 rn:5 rd:5  &rpr_esz
 
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index a8d21dedca..21ee10ee5c 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -8097,17 +8097,28 @@ static bool trans_RAX1(DisasContext *s, arg_RAX1 *a)
 
 TRANS_FEAT(FCVTNT_sh_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
            gen_helper_sve2_fcvtnt_sh, a, 0, FPST_A64)
+TRANS_FEAT(FCVTNT_sh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2_fcvtnt_sh, a, 1, FPST_A64)
 TRANS_FEAT(FCVTNT_ds_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
            gen_helper_sve2_fcvtnt_ds, a, 0, FPST_A64)
+TRANS_FEAT(FCVTNT_ds_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2_fcvtnt_ds, a, 1, FPST_A64)
 
 TRANS_FEAT(BFCVTNT_m, aa64_sme_sve_bf16, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_bfcvtnt, a, 0,
            s->fpcr_ah ? FPST_AH : FPST_A64)
+TRANS_FEAT(BFCVTNT_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_bfcvtnt, a, 1,
+           s->fpcr_ah ? FPST_AH : FPST_A64)
 
 TRANS_FEAT(FCVTLT_hs_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
            gen_helper_sve2_fcvtlt_hs, a, 0, FPST_A64_F16)
+TRANS_FEAT(FCVTLT_hs_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2_fcvtlt_hs, a, 1, FPST_A64_F16)
 TRANS_FEAT(FCVTLT_sd_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
            gen_helper_sve2_fcvtlt_sd, a, 0, FPST_A64)
+TRANS_FEAT(FCVTLT_sd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve2_fcvtlt_sd, a, 1, FPST_A64)
 
 TRANS_FEAT(FCVTX_ds_m, aa64_sme_or_sve2, do_frint_mode, a,
            FPROUNDING_ODD, 0, gen_helper_sve_fcvt_ds)
@@ -8116,6 +8127,8 @@ TRANS_FEAT(FCVTX_ds_z, aa64_sme2p2_or_sve2p2, do_frint_mode, a,
 
 TRANS_FEAT(FCVTXNT_ds_m, aa64_sme_or_sve2, do_frint_mode, a,
            FPROUNDING_ODD, 0, gen_helper_sve2_fcvtnt_ds)
+TRANS_FEAT(FCVTXNT_ds_z, aa64_sme2p2_or_sve2p2, do_frint_mode, a,
+           FPROUNDING_ODD, 1, gen_helper_sve2_fcvtnt_ds)
 
 static gen_helper_gvec_3_ptr * const flogb_fns[] = {
     NULL,               gen_helper_flogb_h,
-- 
2.43.0




* [PULL 61/61] target/arm: Implement floating-point log and convert to integer (zeroing)
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (59 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 60/61] target/arm: Implement SVE floating-point convert (top, predicated, zeroing) Peter Maydell
@ 2026-06-16 19:06 ` Peter Maydell
  2026-06-17 19:30 ` [PULL 00/61] target-arm queue Stefan Hajnoczi
  61 siblings, 0 replies; 66+ messages in thread
From: Peter Maydell @ 2026-06-16 19:06 UTC
  To: qemu-devel

From: Richard Henderson <richard.henderson@linaro.org>

This is FLOGB, FCVTZS, FCVTZU.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20260604234852.573178-19-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/sve.decode      | 16 +++++++++++++++
 target/arm/tcg/translate-sve.c | 37 ++++++++++++++++++++++++++++++++--
 2 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index 673cbaae57..2795c2ec7f 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -1242,6 +1242,21 @@ FCVTZU_sd_m     01100101 11 011 10 1 101 ... ..... .....        @rd_pg_rn_e0
 FCVTZS_dd_m     01100101 11 011 11 0 101 ... ..... .....        @rd_pg_rn_e0
 FCVTZU_dd_m     01100101 11 011 11 1 101 ... ..... .....        @rd_pg_rn_e0
 
+FCVTZS_hh_z     01100100 01 011 11 0 110 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_hh_z     01100100 01 011 11 0 111 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_hs_z     01100100 01 011 11 1 100 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_hs_z     01100100 01 011 11 1 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_hd_z     01100100 01 011 11 1 110 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_hd_z     01100100 01 011 11 1 111 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_ss_z     01100100 10 011 11 1 100 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_ss_z     01100100 10 011 11 1 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_sd_z     01100100 11 011 11 1 100 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_sd_z     01100100 11 011 11 1 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_ds_z     01100100 11 011 11 0 100 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_ds_z     01100100 11 011 11 0 101 ... ..... .....        @rd_pg_rn_e0
+FCVTZS_dd_z     01100100 11 011 11 1 110 ... ..... .....        @rd_pg_rn_e0
+FCVTZU_dd_z     01100100 11 011 11 1 111 ... ..... .....        @rd_pg_rn_e0
+
 # SVE floating-point round to integral value
 FRINTN_m        01100101 .. 000 000 101 ... ..... .....         @rd_pg_rn
 FRINTP_m        01100101 .. 000 001 101 ... ..... .....         @rd_pg_rn
@@ -1950,6 +1965,7 @@ FCVTLT_sd_z     01100100 11 0000 11 101 ... ..... .....  @rd_pg_rn_e0
 
 ### SVE2 floating-point convert to integer
 FLOGB_m         01100101 00 011 esz:2 0101 pg:3 rn:5 rd:5  &rpr_esz
+FLOGB_z         01100100 00 011 1101 esz:2 pg:3 rn:5 rd:5  &rpr_esz
 
 ### SVE2 floating-point multiply-add long (vectors)
 FMLALB_zzzw     01100100 10 1 ..... 10 0 00 0 ..... .....  @rda_rn_rm_ex esz=2
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 21ee10ee5c..9a72e03513 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4654,6 +4654,37 @@ TRANS_FEAT(FCVTZS_dd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
 TRANS_FEAT(FCVTZU_dd_m, aa64_sme_or_sve, gen_gvec_fpst_arg_zpz,
            gen_helper_sve_fcvtzu_dd, a, 0, FPST_A64)
 
+TRANS_FEAT(FCVTZS_hh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzs_hh, a, 1, FPST_A64_F16)
+TRANS_FEAT(FCVTZU_hh_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzu_hh, a, 1, FPST_A64_F16)
+TRANS_FEAT(FCVTZS_hs_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzs_hs, a, 1, FPST_A64_F16)
+TRANS_FEAT(FCVTZU_hs_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzu_hs, a, 1, FPST_A64_F16)
+TRANS_FEAT(FCVTZS_hd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzs_hd, a, 1, FPST_A64_F16)
+TRANS_FEAT(FCVTZU_hd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzu_hd, a, 1, FPST_A64_F16)
+
+TRANS_FEAT(FCVTZS_ss_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzs_ss, a, 1, FPST_A64)
+TRANS_FEAT(FCVTZU_ss_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzu_ss, a, 1, FPST_A64)
+TRANS_FEAT(FCVTZS_sd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzs_sd, a, 1, FPST_A64)
+TRANS_FEAT(FCVTZU_sd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzu_sd, a, 1, FPST_A64)
+TRANS_FEAT(FCVTZS_ds_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzs_ds, a, 1, FPST_A64)
+TRANS_FEAT(FCVTZU_ds_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzu_ds, a, 1, FPST_A64)
+
+TRANS_FEAT(FCVTZS_dd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzs_dd, a, 1, FPST_A64)
+TRANS_FEAT(FCVTZU_dd_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           gen_helper_sve_fcvtzu_dd, a, 1, FPST_A64)
+
 static gen_helper_gvec_3_ptr * const frint_fns[] = {
     NULL,
     gen_helper_sve_frint_h,
@@ -8134,8 +8165,10 @@ static gen_helper_gvec_3_ptr * const flogb_fns[] = {
     NULL,               gen_helper_flogb_h,
     gen_helper_flogb_s, gen_helper_flogb_d
 };
-TRANS_FEAT(FLOGB_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz, flogb_fns[a->esz],
-           a, 0, a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
+TRANS_FEAT(FLOGB_m, aa64_sme_or_sve2, gen_gvec_fpst_arg_zpz,
+           flogb_fns[a->esz], a, 0, a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
+TRANS_FEAT(FLOGB_z, aa64_sme2p2_or_sve2p2, gen_gvec_fpst_arg_zpz,
+           flogb_fns[a->esz], a, 1, a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
 
 static bool do_FMLAL_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sub, bool sel)
 {
-- 
2.43.0




* Re: [PULL 00/61] target-arm queue
  2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
                   ` (60 preceding siblings ...)
  2026-06-16 19:06 ` [PULL 61/61] target/arm: Implement floating-point log and convert to integer (zeroing) Peter Maydell
@ 2026-06-17 19:30 ` Stefan Hajnoczi
  61 siblings, 0 replies; 66+ messages in thread
From: Stefan Hajnoczi @ 2026-06-17 19:30 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/11.1 for any user-visible changes.


Thread overview: 66+ messages
2026-06-16 19:05 [PULL 00/61] target-arm queue Peter Maydell
2026-06-16 19:05 ` [PULL 01/61] hw/arm/smmuv3: Update ATC invalidation check Peter Maydell
2026-06-16 19:05 ` [PULL 02/61] hw/arm/smmuv3: Improve accel SMMUv3 usage documentation Peter Maydell
2026-06-16 19:05 ` [PULL 03/61] hw/arm/smmuv3-accel: Add helper for resolving auto parameters Peter Maydell
2026-06-16 19:05 ` [PULL 04/61] hw/arm/smmuv3-accel: Implement "auto" value for "ats" Peter Maydell
2026-06-16 19:05 ` [PULL 05/61] hw/arm/smmuv3-accel: Implement "auto" value for "ril" Peter Maydell
2026-06-16 19:05 ` [PULL 06/61] hw/arm/smmuv3-accel: Implement "auto" value for "ssidsize" Peter Maydell
2026-06-16 19:05 ` [PULL 07/61] hw/arm/smmuv3-accel: Implement "auto" value for "oas" Peter Maydell
2026-06-16 19:05 ` [PULL 08/61] hw/arm/smmuv3: Set default ats, ril, ssidsize, oas to auto Peter Maydell
2026-06-16 19:05 ` [PULL 09/61] qemu-options.hx: Support "auto" for accel SMMUv3 properties Peter Maydell
2026-06-16 19:05 ` [PULL 10/61] hw/pci/pci: Enforce pci_setup_iommu_per_bus() is called only once per bus Peter Maydell
2026-06-16 19:05 ` [PULL 11/61] backends/iommufd: Update iommufd_backend_get_device_info Peter Maydell
2026-06-16 19:05 ` [PULL 12/61] backends/iommufd: Update iommufd_backend_alloc_viommu to allow user ptr Peter Maydell
2026-06-16 19:05 ` [PULL 13/61] backends/iommufd: Introduce iommufd_backend_alloc_hw_queue Peter Maydell
2026-06-16 19:05 ` [PULL 14/61] backends/iommufd: Introduce iommufd_backend_viommu_mmap Peter Maydell
2026-06-16 19:05 ` [PULL 15/61] system/iommufd: Remove unused viommu pointer from IOMMUFDVeventq Peter Maydell
2026-06-16 19:05 ` [PULL 16/61] hw/arm/smmuv3-accel: Introduce CMDQV ops interface Peter Maydell
2026-06-16 19:05 ` [PULL 17/61] hw/arm/tegra241-cmdqv: Add Tegra241 CMDQV ops backend stub Peter Maydell
2026-06-16 19:05 ` [PULL 18/61] hw/arm/smmuv3-accel: Wire CMDQV ops into accel lifecycle Peter Maydell
2026-06-16 19:05 ` [PULL 19/61] hw/arm/virt: Use stored SMMUv3 device list for IORT build Peter Maydell
2026-06-16 19:05 ` [PULL 20/61] hw/arm/tegra241-cmdqv: Probe host Tegra241 CMDQV support Peter Maydell
2026-06-16 19:05 ` [PULL 21/61] hw/arm/tegra241-cmdqv: Implement CMDQV init Peter Maydell
2026-06-16 19:05 ` [PULL 22/61] hw/arm/virt: Link SMMUv3 CMDQV resources to platform bus Peter Maydell
2026-06-16 19:06 ` [PULL 23/61] hw/arm/tegra241-cmdqv: Implement CMDQV vIOMMU alloc/free Peter Maydell
2026-06-16 19:06 ` [PULL 24/61] hw/arm/tegra241-cmdqv: mmap host VINTF Page0 for CMDQV Peter Maydell
2026-06-16 19:06 ` [PULL 25/61] hw/arm/tegra241-cmdqv: Emulate CMDQ-V Config region Peter Maydell
2026-06-16 19:06 ` [PULL 26/61] hw/arm/tegra241-cmdqv: Emulate VCMDQ register reads Peter Maydell
2026-06-16 19:06 ` [PULL 27/61] hw/arm/tegra241-cmdqv: Emulate VCMDQ register writes Peter Maydell
2026-06-16 19:06 ` [PULL 28/61] hw/arm/tegra241-cmdqv: Allocate HW VCMDQs once configured Peter Maydell
2026-06-16 19:06 ` [PULL 29/61] hw/arm/tegra241-cmdqv: Route allocated VCMDQ Page0 accesses to the mmap'd host VINTF page0 Peter Maydell
2026-06-16 19:06 ` [PULL 30/61] memory: Allow RAM device regions to skip IOMMU mapping Peter Maydell
2026-06-16 19:06 ` [PULL 31/61] hw/arm/tegra241-cmdqv: Use mmap'd host VINTF page0 for virtual VINTF page0 Peter Maydell
2026-06-16 19:06 ` [PULL 32/61] hw/arm/smmuv3-accel: Introduce common helper for veventq read Peter Maydell
2026-06-16 19:06 ` [PULL 33/61] hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors Peter Maydell
2026-06-16 19:06 ` [PULL 34/61] hw/arm/tegra241-cmdqv: Initialize register state on reset Peter Maydell
2026-06-16 19:06 ` [PULL 35/61] hw/arm/tegra241-cmdqv: Limit queue size based on backend page size Peter Maydell
2026-06-16 19:06 ` [PULL 36/61] hw/arm/smmuv3: Add per-device identifier property Peter Maydell
2026-06-16 19:06 ` [PULL 37/61] hw/arm/smmuv3-accel: Introduce helper to query CMDQV type Peter Maydell
2026-06-16 19:06 ` [PULL 38/61] hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT Peter Maydell
2026-06-16 19:06 ` [PULL 39/61] hw/arm/smmuv3-accel: Enforce viommu association when CMDQV is active Peter Maydell
2026-06-16 19:06 ` [PULL 40/61] hw/arm/tegra241-cmdqv: Document the CMDQV design and lifecycle Peter Maydell
2026-06-16 19:06 ` [PULL 41/61] hw/arm/smmuv3: Add cmdqv property for SMMUv3 device Peter Maydell
2026-06-16 19:06 ` [PULL 42/61] target/arm: honour CCR.BFHFNMIGN for probed data BusFaults Peter Maydell
2026-06-16 19:06 ` [PULL 43/61] hw/arm/bcm2838: Route I2C interrupts to GIC Peter Maydell
2026-06-16 19:06 ` [PULL 44/61] target/arm: Add feature predicates for SVE2.2 and SME2.2 Peter Maydell
2026-06-16 19:06 ` [PULL 45/61] target/arm: Rename sve unary predicated patterns Peter Maydell
2026-06-16 19:06 ` [PULL 46/61] target/arm: Enable zeroing in DO_ZPZ macros in sve_helper.c Peter Maydell
2026-06-16 19:06 ` [PULL 47/61] target/arm: Expand DO_ZPZ in translate-sve.c Peter Maydell
2026-06-16 19:06 ` [PULL 48/61] target/arm: Implement SVE integer unary operations (predicated, zeroing) Peter Maydell
2026-06-16 19:06 ` [PULL 49/61] target/arm: Implement SVE bitwise " Peter Maydell
2026-06-16 19:06 ` [PULL 50/61] target/arm: Implement SVE reverse within elements (zeroing) Peter Maydell
2026-06-16 19:06 ` [PULL 51/61] target/arm: Implement SVE reverse doublewords (zeroing) Peter Maydell
2026-06-16 19:06 ` [PULL 52/61] target/arm: Implement SVE2 integer unary operations (predicated, zeroing) Peter Maydell
2026-06-16 19:06 ` [PULL 53/61] target/arm: Add data argument to do_frint_mode Peter Maydell
2026-06-16 19:06 ` [PULL 54/61] target/arm: Implement Floating-point round to integral value (predicated, zeroing) Peter Maydell
2026-06-16 19:06 ` [PULL 55/61] target/arm: Implement Floating-point convert " Peter Maydell
2026-06-16 19:06 ` [PULL 56/61] target/arm: Implement Floating-point square root " Peter Maydell
2026-06-16 19:06 ` [PULL 57/61] target/arm: Implement SCVTF, UCVTF " Peter Maydell
2026-06-16 19:06 ` [PULL 58/61] target/arm: Implement FRINT{32,64}{X,Z} Peter Maydell
2026-06-16 19:06 ` [PULL 59/61] target/arm: Enable zeroing in DO_FCVT{N, L}T macros in sve_helper.c Peter Maydell
2026-06-16 19:06 ` [PULL 60/61] target/arm: Implement SVE floating-point convert (top, predicated, zeroing) Peter Maydell
2026-06-16 19:06 ` [PULL 61/61] target/arm: Implement floating-point log and convert to integer (zeroing) Peter Maydell
2026-06-17 19:30 ` [PULL 00/61] target-arm queue Stefan Hajnoczi
  -- strict thread matches above, loose matches on Subject: below --
2022-04-22 10:03 Peter Maydell
2022-04-22 11:41 ` Richard Henderson
2022-04-22 13:48   ` Peter Maydell
