From: Sairaj Kodilkar <sarunkod@amd.com>
To: <qemu-devel@nongnu.org>, <kvm@vger.kernel.org>,
<alejandro.j.jimenez@oracle.com>, <vasant.hegde@amd.com>,
<suravee.suthikulpanit@amd.com>
Cc: <mst@redhat.com>, <imammedo@redhat.com>, <anisinha@redhat.com>,
<marcel.apfelbaum@gmail.com>, <pbonzini@redhat.com>,
<richard.henderson@linaro.org>, <eduardo@habkost.net>,
<yi.l.liu@intel.com>, <eric.auger@redhat.com>,
<zhenzhong.duan@intel.com>, <cohuck@redhat.com>,
<seanjc@google.com>, <iommu@lists.linux.dev>,
<kevin.tian@intel.com>, <joro@8bytes.org>
Subject: Re: [RFC PATCH RESEND 0/5] amd_iommu: support up to 2048 MSI vectors per IRT
Date: Wed, 7 Jan 2026 11:39:14 +0530
Message-ID: <2440cf13-e4d4-4894-b41a-fbdf7cd9b3b5@amd.com>
In-Reply-To: <20251118101532.4315-1-sarunkod@amd.com>
Hello all,

Gentle ping on this series.

On 11/18/2025 3:45 PM, Sairaj Kodilkar wrote:
> Resending this series with KVM and IOMMU maintainers in CC.
>
> The AMD IOMMU can route up to 2048 MSI vectors through a single
> Interrupt Remapping Table (IRT). This series brings the same
> capability to the emulated AMD IOMMU in QEMU.
>
> Highlights
> ----------
> * Sets bits [9:8] in Extended-Feature-Register-2 to advertise 2K MSI
> support to the guest.
> * Uses bits [10:0] of the MSI data to select the IRTE when the guest
> programs MSIs in logical-destination mode.
> * Introduces a new IOMMU device property:
> -device amd-iommu,...,numint2k=on
>
> The feature is **opt-in**; guests keep the 512-MSI behaviour unless
> `numint2k=on` is supplied.
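
To make the bit positions in the highlights concrete, here is a minimal C
sketch. It is not the code from the series: the macro and function names are
hypothetical, and the field encoding (value 1 in EFR2[9:8] meaning 2K) is an
assumption. Only the bit ranges themselves come from the cover letter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the bit ranges are taken from the cover letter. */
#define EFR2_NUM_INT_SUP_SHIFT  8
#define EFR2_NUM_INT_SUP_MASK   (UINT64_C(0x3) << EFR2_NUM_INT_SUP_SHIFT)
#define EFR2_NUM_INT_SUP_2K     UINT64_C(0x1)    /* assumed encoding */
#define MSI_DATA_IRTE_MASK      UINT32_C(0x7ff)  /* MSI data bits [10:0] */

/* Return EFR2 with bits [9:8] advertising 2K support only when requested. */
static uint64_t efr2_advertise_2k(uint64_t efr2, bool numint2k)
{
    efr2 &= ~EFR2_NUM_INT_SUP_MASK;
    if (numint2k) {
        efr2 |= EFR2_NUM_INT_SUP_2K << EFR2_NUM_INT_SUP_SHIFT;
    }
    return efr2;
}

/* Select the IRTE index from bits [10:0] of the MSI data. */
static uint32_t msi_data_to_irte_index(uint32_t msi_data)
{
    return msi_data & MSI_DATA_IRTE_MASK;
}
```

An 11-bit index covers 2^11 = 2048 entries, which is where the 2K figure
comes from.
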
>
> Passthrough devices
> -------------------
> When a PCI function is passed through via iommufd, the code checks the
> host’s vendor capabilities. If the host IOMMU has not enabled
> 2K-MSI support (indicated by bits [44:43] of the control register), the
> guest feature is disabled even if `numint2k=on` was requested.
>
> The detection logic relies on the iommufd interface; with the legacy
> VFIO container the guest always falls back to 512 MSIs.
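
The fallback rules in this section can be summarized as a small decision
function. This is only a sketch of my reading of the text above, not the
series' code: all names are hypothetical, the encoding of control register
bits [44:43] (value 1 meaning 2K mode) is an assumption, and so is the case
where no device is passed through at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the bit range [44:43] comes from the text. */
#define CTRL_NUM_INT_MODE_SHIFT 43
#define CTRL_NUM_INT_MODE_MASK  UINT64_C(0x3)
#define CTRL_NUM_INT_MODE_2K    UINT64_C(0x1)  /* assumed encoding */

/* True when the host control register reports 2K-MSI mode as enabled. */
static bool host_2k_enabled(uint64_t host_ctrl)
{
    uint64_t mode = (host_ctrl >> CTRL_NUM_INT_MODE_SHIFT) &
                    CTRL_NUM_INT_MODE_MASK;
    return mode == CTRL_NUM_INT_MODE_2K;
}

/* Decide whether the guest may keep the 2K-MSI feature. */
static bool guest_2k_allowed(bool numint2k, bool has_passthrough,
                             bool uses_iommufd, uint64_t host_ctrl)
{
    if (!numint2k) {
        return false;   /* opt-in: default stays at 512 MSIs */
    }
    if (!has_passthrough) {
        return true;    /* assumed: purely emulated devices need no host check */
    }
    if (!uses_iommufd) {
        return false;   /* legacy VFIO container: always fall back to 512 */
    }
    return host_2k_enabled(host_ctrl);
}
```
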
>
> Example
> -------
> qemu-system-x86_64 \
> -enable-kvm -m 10G -smp cpus=8 \
> -kernel /boot/vmlinuz \
> -initrd /boot/initrd.img \
> -append "console=ttyS0 earlyprintk=serial root=<DEVICE>" \
> -device amd-iommu,dma-remap=on,numint2k=on \
> -object iommufd,id=iommufd0 \
> -device vfio-pci,host=<DEVID>,iommufd=iommufd0 \
> -global kvm-pit.lost_tick_policy=discard \
> -cpu host \
> -machine q35,kernel_irqchip=split \
> -nographic \
> -smbios type=0,version=2.8 \
> -blockdev node-name=drive0,driver=qcow2,file.driver=file,file.filename=<IMAGE> \
> -device virtio-blk-pci,drive=drive0
>
> Limitations
> -----------
> This approach works well for features queried after IOMMUFD
> initialization but cannot handle features needed during early QEMU
> setup, before IOMMUFD is available.
>
> A key example is EFR2[HTRangeIgnore]. When this bit is set, the physical
> IOMMU treats HyperTransport (HT) address ranges as regular memory
> accesses rather than reserved regions. This has important implications
> for memory layout:
>
> * Without HTRangeIgnore: QEMU must relocate RAM above 4G to above 1T on
> AMD platforms to avoid HT conflicts
> * With HTRangeIgnore: QEMU can safely place RAM immediately above 4G,
> improving memory utilization
>
> Since RAM layout must be determined before IOMMUFD initialization, QEMU
> cannot use IOMMU_GET_HW_INFO to query the EFR2[HTRangeIgnore] feature bit.
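
The two layout cases described above reduce to the following sketch. The
function name is hypothetical, and the sketch deliberately ignores the
conditions under which QEMU decides whether relocation is needed at all; it
only shows why the bit has to be known at RAM-layout time.

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)
#define TiB (UINT64_C(1) << 40)

/*
 * Base address for guest RAM that does not fit below 4G, following the
 * two cases in the cover letter (hypothetical helper, simplified).
 */
static uint64_t above_4g_ram_base(bool ht_range_ignore)
{
    /* With HTRangeIgnore the HT range is ordinary memory: start at 4G. */
    /* Without it, relocate above 1T to stay clear of the HT range.     */
    return ht_range_ignore ? 4 * GiB : 1 * TiB;
}
```
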
>
> Another limitation of relying on the control register is that if the BIOS
> enables a particular feature (e.g. ControlRegister[GCR3TRPMode]) without
> kernel support, QEMU incorrectly assumes that the host kernel supports that
> feature, potentially causing guest failures.
>
> Alternative considered
> ----------------------
> We also explored an alternative approach that uses a KVM capability,
> "KVM_CAP_AMD_NUM_INT_2K_SUP", which userspace can query to learn whether
> the host kernel supports 2K MSIs. Similarly, this approach enables QEMU to
> detect the presence of EFR2[HTRangeIgnore] during RAM initialization.
>
> Although the current implementation allows 2K MSI support only with
> iommufd, it keeps the logic inside vfio/iommufd and avoids
> modifying the KVM ABI. I am happy to discuss the advantages and drawbacks
> of both approaches.
>
> ------------------------------------------------------------------------
>
> The patches are based on top of bc831f37398b (qemu master). Additionally,
> they require a Linux kernel with patches [1] that expose the control
> register via the IOMMU_GET_HW_INFO ioctl.
>
> [1] https://lore.kernel.org/linux-iommu/20251029095846.4486-1-sarunkod@amd.com/
>
> ------------------------------------------------------------------------
>
> Sairaj Kodilkar (3):
> vfio/iommufd: Add amd specific hardware info struct to vendor
> capability
> amd_iommu: Add support for extended feature register 2
> amd_iommu: Add support for upto 2048 interrupts per IRT
>
> Suravee Suthikulpanit (2):
> [DO NOT MERGE] linux-headers: Introduce struct iommu_hw_info_amd
> amd-iommu: Add support for set/unset IOMMU for VFIO PCI devices
>
> hw/i386/acpi-build.c | 4 +-
> hw/i386/amd_iommu-stub.c | 5 +
> hw/i386/amd_iommu.c | 163 +++++++++++++++++++++++++++--
> hw/i386/amd_iommu.h | 24 +++++
> include/system/host_iommu_device.h | 1 +
> linux-headers/linux/iommufd.h | 20 ++++
> 6 files changed, 207 insertions(+), 10 deletions(-)
>
Thread overview: 13+ messages
2025-11-18 10:15 [RFC PATCH RESEND 0/5] amd_iommu: support up to 2048 MSI vectors per IRT Sairaj Kodilkar
2025-11-18 10:15 ` [RFC PATCH RESEND 1/5] [DO NOT MERGE] linux-headers: Introduce struct iommu_hw_info_amd Sairaj Kodilkar
2025-11-18 10:15 ` [RFC PATCH RESEND 2/5] vfio/iommufd: Add amd specific hardware info struct to vendor capability Sairaj Kodilkar
2026-01-28 1:25 ` Alejandro Jimenez
2025-11-18 10:15 ` [RFC PATCH RESEND 3/5] amd-iommu: Add support for set/unset IOMMU for VFIO PCI devices Sairaj Kodilkar
2026-01-28 1:40 ` Alejandro Jimenez
2026-01-28 11:19 ` Sairaj Kodilkar
2025-11-18 10:15 ` [RFC PATCH RESEND 4/5] amd_iommu: Add support for extended feature register 2 Sairaj Kodilkar
2025-11-18 10:15 ` [RFC PATCH RESEND 5/5] amd_iommu: Add support for upto 2048 interrupts per IRT Sairaj Kodilkar
2026-01-28 1:59 ` Alejandro Jimenez
2026-01-28 11:23 ` Sairaj Kodilkar
2026-01-07 6:09 ` Sairaj Kodilkar [this message]
2026-01-28 1:23 ` [RFC PATCH RESEND 0/5] amd_iommu: support up to 2048 MSI vectors " Alejandro Jimenez