* [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support
@ 2026-05-28  5:17 Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 01/26] iommu/amd: Make amd_iommu_completion_wait() non-static Suravee Suthikulpanit
                   ` (26 more replies)
  0 siblings, 27 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

AMD IOMMU introduces the vIOMMU feature, which provides partial hardware
acceleration when implementing Guest IOMMUs. This feature provides
acceleration for guest Command Buffer, Event Log, and PPR Log. This
eliminates the CPU overhead needed for the supporting HV intercepts and
reduces the latency of these operations.

When a guest attempts to access guest IOMMU MMIO registers with offsets
between 8KB and 12KB (i.e. the 3rd 4K region) such as the Command Buffer,
Event Log and PPR Log head and tail pointer registers, this is serviced
directly by the IOMMU. When the IOMMU accesses a Command Buffer, PPR Log
or a COMPLETION_WAIT store location in memory, it directly accesses guest
physical memory. The HV/VMM continues to trap and emulate the IOMMU
configuration MMIO registers between 0KB and 4KB (i.e. the 1st 4K
region), which are primarily used during initialization.
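The split described above can be sketched as a small classifier over the guest MMIO offset. This is only an illustration of the cover letter text, not code from the series: the enum, the function name, and the handling of offsets outside the two regions mentioned are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SZ_4K 0x1000u

/* Hypothetical names, used for illustration only. */
enum demo_mmio_handler {
	DEMO_MMIO_TRAP_AND_EMULATE,	/* 1st 4K region: handled by the HV/VMM */
	DEMO_MMIO_HW_SERVICED,		/* 3rd 4K region: serviced by the IOMMU */
	DEMO_MMIO_OTHER,		/* not described in the cover letter */
};

/* Decide who handles a guest access, based only on the register offset. */
static enum demo_mmio_handler demo_classify_guest_mmio(uint64_t offset)
{
	if (offset < 1 * DEMO_SZ_4K)
		return DEMO_MMIO_TRAP_AND_EMULATE;
	if (offset >= 2 * DEMO_SZ_4K && offset < 3 * DEMO_SZ_4K)
		return DEMO_MMIO_HW_SERVICED;
	return DEMO_MMIO_OTHER;
}
```

For instance, an offset of 0x2008 falls in the 8KB-12KB window and is classified as hardware-serviced, while 0x18 falls in the first 4K and is trapped.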

Additionally, the HV must initialize the vIOMMU feature, map MMIO resources
between the VMs and the IOMMU, manage additional supporting data structures
in memory (e.g. GPA->SPA translation DTE, Device ID and Domain ID mapping
tables), and allocate/map vIOMMU Private Address region used as backing
storage memory for the IOMMU. Support for new IOMMU commands and events
specific to vIOMMU is also added.

Guest IOMMUs are IOMMUs exposed to VMs with additional support from VMM
(QEMU) to generate guest ACPI IVRS table and define guest PCI topology for
IOMMU and pass-through VFIO devices, which are not covered by this series.

For more detail, please see the vIOMMU section of the AMD IOMMU
Specification[1].

This is version 2 of the AMD HW-vIOMMU series; v1 is linked under
"Changes since v1" below. It is implemented on top of the IOMMUFD vIOMMU,
vDevice, and hw_queue framework in Linux v7.1.0-rc4, and is also available
as a git tree[2].

The series is organized into the following subsets:

  Patch 1-3   : Preparatory patches
  Patch 4-5   : Introduce IOMMUFD vIOMMU support for AMD
  Patch 6-8   : Introduce AMD vIOMMU VF MMIO and VFCtrl MMIO
  Patch 9-12  : Introduce AMD vIOMMU Private Address support
  Patch 13-16 : Introduce IOMMUFD vDevice support for AMD
  Patch 17-21 : Translation DTE, KVM FD, and translate-device-ID pool
  Patch 22-26 : IOMMUFD hw_queue, VIOMMU_COMMAND ioctl, enable vIOMMU

Changes since v1:
(https://lore.kernel.org/linux-iommu/20260330084206.9251-1-suravee.suthikulpanit@amd.com/)

Rebase and scope:
  * Rebased on Linux v7.1.0-rc4.

Guest ID (GID) - design change (patches 5-6):
  * v1: A single global IDA; each GID is unique across all AMD IOMMUs in
    the system.
  * v2: A per-amd_iommu IDA (gid_ida), initialized when vIOMMU MMIO is
    set up on that IOMMU. GIDs are allocated in amd_iommufd_viommu_init()
    and freed on destroy - one GID per IOMMUFD vIOMMU object, unique within
    that IOMMU only. A VM with guest IOMMUs behind multiple IOMMUs may
    therefore hold multiple GIDs. This is separate from TransDevID, which
    remains one per VM (kvmfd) and is shared across vIOMMUs for that VM.
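The effect of moving from a global allocator to a per-IOMMU one can be sketched in user space as follows. The series uses a kernel IDA for this; the bitmap allocator, the 64-entry limit, the choice to start at 0, and all demo_* names below are assumptions made only to keep the sketch self-contained.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_GID 64	/* illustrative limit; the real GID is 16-bit */

/* Stand-in for the per-IOMMU state that owns its own GID space. */
struct demo_iommu {
	uint64_t gid_bitmap;	/* bit N set means GID N is in use */
};

/* Return the lowest free GID on this IOMMU, or -1 if none is left. */
static int demo_gid_alloc(struct demo_iommu *iommu)
{
	for (int gid = 0; gid < DEMO_MAX_GID; gid++) {
		if (!(iommu->gid_bitmap & (1ULL << gid))) {
			iommu->gid_bitmap |= 1ULL << gid;
			return gid;
		}
	}
	return -1;
}

/* Release a GID so that it can be handed out again on this IOMMU. */
static void demo_gid_free(struct demo_iommu *iommu, int gid)
{
	if (gid >= 0 && gid < DEMO_MAX_GID)
		iommu->gid_bitmap &= ~(1ULL << gid);
}
```

Because each demo_iommu carries its own bitmap, two IOMMUs can hand out the same numeric GID independently, which is why a VM spanning several IOMMUs may hold several GIDs.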

Translate device ID (TransDevID) - design change (patches 17-21):
  * v1: VMM supplies trans_devid via struct iommu_viommu_amd; driver
    programs translation DTE and VFctrl guest-misc register.
  * v2: Driver allocates trans_devid per VM from a per-PCI-segment pool
    (trans_devid.c), reserves IDs used by real PCI functions on attach,
    and keys allocation to kvmfd so multiple vIOMMU instances for one VM
    share a single GPA->SPA translation DTE.
  * struct iommu_viommu_amd gains kvmfd; trans_devid removed from UAPI.
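A rough user-space sketch of the v2 allocation policy: IDs used by real PCI functions are reserved, and every vIOMMU instance that presents the same VM key gets the same ID. The integer vm_key stands in for the kvmfd, and the pool size, reference counting, and demo_* names are assumptions, not the contents of trans_devid.c.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_POOL_SIZE 16	/* illustrative; real device IDs are 16-bit */

struct demo_slot {
	bool reserved;		/* ID is used by a real PCI function */
	int owner_key;		/* VM key (stand-in for kvmfd), -1 if unowned */
	unsigned int refs;	/* number of vIOMMU instances sharing the ID */
};

struct demo_pool {
	struct demo_slot slot[DEMO_POOL_SIZE];
};

static void demo_pool_init(struct demo_pool *p)
{
	for (int i = 0; i < DEMO_POOL_SIZE; i++)
		p->slot[i] = (struct demo_slot){ .owner_key = -1 };
}

/* Mark an ID as belonging to a real device so that it is never handed out. */
static void demo_pool_reserve(struct demo_pool *p, int devid)
{
	if (devid >= 0 && devid < DEMO_POOL_SIZE)
		p->slot[devid].reserved = true;
}

/* Return the ID already owned by vm_key, else claim the lowest free one. */
static int demo_pool_get(struct demo_pool *p, int vm_key)
{
	int free_id = -1;

	for (int i = 0; i < DEMO_POOL_SIZE; i++) {
		struct demo_slot *s = &p->slot[i];

		if (s->refs && s->owner_key == vm_key) {
			s->refs++;
			return i;
		}
		if (free_id < 0 && !s->reserved && !s->refs)
			free_id = i;
	}
	if (free_id < 0)
		return -1;
	p->slot[free_id].owner_key = vm_key;
	p->slot[free_id].refs = 1;
	return free_id;
}

/* Drop one reference; the ID returns to the pool when the last user leaves. */
static void demo_pool_put(struct demo_pool *p, int devid)
{
	if (devid < 0 || devid >= DEMO_POOL_SIZE)
		return;
	if (p->slot[devid].refs && --p->slot[devid].refs == 0)
		p->slot[devid].owner_key = -1;
}
```

The sketch ignores the case where a real device appears on an ID that a VM already owns; a real implementation would have to handle that conflict.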

Userspace vIOMMU control path - design change (patches 24-25):
  * v1: Extended IOMMU_OPTION with IOMMU_OPTION_VIOMMU and
    set_option/get_option ops.
  * v2: New IOMMU_VIOMMU_COMMAND ioctl with set_command/get_command ops
    for indexed read/write of vIOMMU register arrays (e.g. guest MMIO
    shadow). AMD backend moves guest MMIO accessors to vfctrl_mmio.c.

Single-patch / implementation notes:
  * Patch 4: Gate amd_iommufd_get_viommu_size() on HW vIOMMU capability.
  * Patch 8: Reset-vMMIO helper declaration in amd_iommu.h (rebase).
  * Patch 11-12: Per-VM IPA map/unmap via iommu_map/iommu_unmap (was
    pt_iommu_amdv1_map_pages in v1 per-VM helpers).
  * Patch 26: iommu_feature_enable_and_check(); improved error unwind in
    amd_viommu_init(); enable vIOMMU by default when supported.

Upcoming series (in subsequent submission):
  * Extended Interrupt Remapping (guest Event / PPR log interrupts)
  * KVM/AVIC integration and guest event injection support

Testing done:
  * Single/Multiple vIOMMU instances
  * Single/Multiple VFIO devices per vIOMMU instance.

[1] IOMMU Specification: https://docs.amd.com/v/u/en-US/48882_3.10_PUB
[2] Linux git tree: https://github.com/AMDESE/linux-iommu/tree/linux-7.1.0-rc4-amd-viommu_upstream_v2

Thank you,
Suravee

Suravee Suthikulpanit (26):
  iommu/amd: Make amd_iommu_completion_wait() non-static
  iommu/amd: Introduce vIOMMU-specific events and event
  iommu/amd: Detect and initialize AMD vIOMMU feature
  iommu/amd: Introduce IOMMUFD vIOMMU support for AMD
  iommu/amd: Allocate Guest IDs for IOMMUFD vIOMMU instances
  iommu/amd: Map vIOMMU VF and VF Control MMIO BARs
  iommu/amd: Add support for AMD vIOMMU VF MMIO region
  iommu/amd: Introduce Reset vMMIO Command
  iommu/amd: Introduce domain for IOMMU Private Address (IPA) region
  iommu/amd: Assign IOMMU Private Address domain to IOMMU
  iommu/amd: Allocate and map vIOMMU private regions
  iommu/amd: Add per-VM private IPA alloc/map helpers
  iommu/amd: Add helper functions to manage DevID / DomID mapping tables
  iommu/amd: Introduce IOMMUFD vDevice support for AMD
  iommu/amd: Introduce helper function for updating domain ID mapping
    table
  iommu/amd: Introduce helper function for updating device ID mapping
    table
  iommu/amd: Pass KVM FD from userspace when initializing vIOMMU
  iommu/amd: Add translation DTE and VFctrl TransDevID helpers
  iommu/amd: Add per-segment translate device ID pool
  iommu/amd: Reserve translate-device-id for PCI requestor aliases
  iommu/amd: Map kvmfd to shared translate device ID for vIOMMU
  iommufd: Add hw_queue_init and split queue alloc paths
  iommu/amd: Add support for vIOMMU HW queues initialization
  iommufd: Introduce vIOMMU command via VIOMMU_COMMAND ioctl
  iommu/amd: Handle set/get command for AMD vIOMMU
  iommu/amd: Introduce logic to check and enable vIOMMU feature

 drivers/iommu/amd/Makefile              |   2 +-
 drivers/iommu/amd/amd_iommu.h           |  46 +++
 drivers/iommu/amd/amd_iommu_types.h     | 129 ++++++
 drivers/iommu/amd/amd_viommu.h          |  73 ++++
 drivers/iommu/amd/init.c                |  33 +-
 drivers/iommu/amd/iommu.c               | 225 ++++++++--
 drivers/iommu/amd/iommufd.c             | 245 +++++++++++
 drivers/iommu/amd/nested.c              |  18 +
 drivers/iommu/amd/trans_devid.c         | 317 ++++++++++++++
 drivers/iommu/amd/vfctrl_mmio.c         | 146 +++++++
 drivers/iommu/amd/viommu.c              | 529 ++++++++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h |   1 +
 drivers/iommu/iommufd/main.c            |   3 +
 drivers/iommu/iommufd/viommu.c          | 150 +++++--
 include/linux/iommufd.h                 |  10 +-
 include/uapi/linux/iommufd.h            |  51 +++
 16 files changed, 1909 insertions(+), 69 deletions(-)
 create mode 100644 drivers/iommu/amd/amd_viommu.h
 create mode 100644 drivers/iommu/amd/trans_devid.c
 create mode 100644 drivers/iommu/amd/vfctrl_mmio.c
 create mode 100644 drivers/iommu/amd/viommu.c

-- 
2.34.1


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 01/26] iommu/amd: Make amd_iommu_completion_wait() non-static
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 02/26] iommu/amd: Introduce vIOMMU-specific events and event Suravee Suthikulpanit
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Rename iommu_completion_wait() to amd_iommu_completion_wait() and make it
non-static so that it can be reused in iommufd.c for nested translation.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/iommu.c     | 26 ++++++++++++--------------
 2 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index af720bf14914..9f961ccbe3b4 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -199,6 +199,7 @@ void amd_iommu_set_dte_v1(struct iommu_dev_data *dev_data,
 void amd_iommu_update_dte(struct amd_iommu *iommu,
 			  struct iommu_dev_data *dev_data,
 			  struct dev_table_entry *new);
+int amd_iommu_completion_wait(struct amd_iommu *iommu);
 
 static inline void
 amd_iommu_make_clear_dte(struct iommu_dev_data *dev_data, struct dev_table_entry *new)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 84cad43dc188..4b4dd20ebec6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -90,8 +90,6 @@ static int amd_iommu_set_dirty_tracking(struct iommu_domain *domain,
 
 static void clone_aliases(struct amd_iommu *iommu, struct device *dev);
 
-static int iommu_completion_wait(struct amd_iommu *iommu);
-
 /****************************************************************************
  *
  * Helper functions
@@ -216,7 +214,7 @@ void amd_iommu_update_dte(struct amd_iommu *iommu,
 	update_dte256(iommu, dev_data, new);
 	clone_aliases(iommu, dev_data->dev);
 	device_flush_dte(dev_data);
-	iommu_completion_wait(iommu);
+	amd_iommu_completion_wait(iommu);
 }
 
 static void get_dte256(struct amd_iommu *iommu, struct iommu_dev_data *dev_data,
@@ -1449,7 +1447,7 @@ static u64 get_cmdsem_val(struct amd_iommu *iommu)
  * This function queues a completion wait command into the command
  * buffer of an IOMMU
  */
-static int iommu_completion_wait(struct amd_iommu *iommu)
+int amd_iommu_completion_wait(struct amd_iommu *iommu)
 {
 	struct iommu_cmd cmd;
 	unsigned long flags;
@@ -1487,7 +1485,7 @@ static void domain_flush_complete(struct protection_domain *domain)
 	 * We need to wait for completion of all commands.
 	 */
 	 xa_for_each(&domain->iommu_array, i, pdom_iommu_info)
-		iommu_completion_wait(pdom_iommu_info->iommu);
+		amd_iommu_completion_wait(pdom_iommu_info->iommu);
 }
 
 static int iommu_flush_dte(struct amd_iommu *iommu, u16 devid)
@@ -1505,7 +1503,7 @@ static void iommu_flush_dte_sync(struct amd_iommu *iommu, u16 devid)
 
 	ret = iommu_flush_dte(iommu, devid);
 	if (!ret)
-		iommu_completion_wait(iommu);
+		amd_iommu_completion_wait(iommu);
 }
 
 static void amd_iommu_flush_dte_all(struct amd_iommu *iommu)
@@ -1516,7 +1514,7 @@ static void amd_iommu_flush_dte_all(struct amd_iommu *iommu)
 	for (devid = 0; devid <= last_bdf; ++devid)
 		iommu_flush_dte(iommu, devid);
 
-	iommu_completion_wait(iommu);
+	amd_iommu_completion_wait(iommu);
 }
 
 /*
@@ -1535,7 +1533,7 @@ static void amd_iommu_flush_tlb_all(struct amd_iommu *iommu)
 		iommu_queue_command(iommu, &cmd);
 	}
 
-	iommu_completion_wait(iommu);
+	amd_iommu_completion_wait(iommu);
 }
 
 static void amd_iommu_flush_tlb_domid(struct amd_iommu *iommu, u32 dom_id)
@@ -1546,7 +1544,7 @@ static void amd_iommu_flush_tlb_domid(struct amd_iommu *iommu, u32 dom_id)
 			      dom_id, IOMMU_NO_PASID, false);
 	iommu_queue_command(iommu, &cmd);
 
-	iommu_completion_wait(iommu);
+	amd_iommu_completion_wait(iommu);
 }
 
 static int iommu_flush_pages_v1_hdom_ids(struct protection_domain *pdom, u64 address, size_t size)
@@ -1582,7 +1580,7 @@ static void amd_iommu_flush_all(struct amd_iommu *iommu)
 	build_inv_all(&cmd);
 
 	iommu_queue_command(iommu, &cmd);
-	iommu_completion_wait(iommu);
+	amd_iommu_completion_wait(iommu);
 }
 
 static void iommu_flush_irt(struct amd_iommu *iommu, u16 devid)
@@ -1605,7 +1603,7 @@ static void amd_iommu_flush_irt_all(struct amd_iommu *iommu)
 	for (devid = 0; devid <= last_bdf; devid++)
 		iommu_flush_irt(iommu, devid);
 
-	iommu_completion_wait(iommu);
+	amd_iommu_completion_wait(iommu);
 }
 
 void amd_iommu_flush_all_caches(struct amd_iommu *iommu)
@@ -1841,7 +1839,7 @@ void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
 	if (dev_data->ats_enabled)
 		device_flush_iotlb(dev_data, address, size, pasid, true);
 
-	iommu_completion_wait(iommu);
+	amd_iommu_completion_wait(iommu);
 }
 
 static void dev_flush_pasid_all(struct iommu_dev_data *dev_data,
@@ -2495,7 +2493,7 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
 		goto out_err;
 	}
 
-	iommu_completion_wait(iommu);
+	amd_iommu_completion_wait(iommu);
 
 	if (FEATURE_NUM_INT_REMAP_SUP_2K(amd_iommu_efr2))
 		dev_data->max_irqs = MAX_IRQS_PER_TABLE_2K;
@@ -3392,7 +3390,7 @@ static struct irq_remap_table *alloc_irq_table(struct amd_iommu *iommu,
 		set_remap_table_entry(iommu, alias, table);
 
 out_wait:
-	iommu_completion_wait(iommu);
+	amd_iommu_completion_wait(iommu);
 
 out_unlock:
 	spin_unlock_irqrestore(&iommu_table_lock, flags);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 02/26] iommu/amd: Introduce vIOMMU-specific events and event
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 01/26] iommu/amd: Make amd_iommu_completion_wait() non-static Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 03/26] iommu/amd: Detect and initialize AMD vIOMMU feature Suravee Suthikulpanit
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Add support for new vIOMMU events:
  * Guest Event Fault event
  * vIOMMU Hardware Error event

Also add support for the additional vIOMMU-related flags in existing
events.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
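The field extraction done in iommu_print_event() can be tried outside the kernel with the sketch below. DEMO_GENMASK() and demo_field_get() are user-space stand-ins for the kernel's GENMASK() and FIELD_GET(), written here as an assumption about equivalent behaviour for 32-bit words; the bit positions (27:23 for vflags, 15:0 for gid, 17:16 for src) are taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Contiguous 32-bit mask covering bits h..l, inclusive (0 <= l <= h <= 31). */
#define DEMO_GENMASK(h, l)	((~0u >> (31 - (h))) & (~0u << (l)))

#define DEMO_VFLAGS_MASK	DEMO_GENMASK(27, 23)	/* as defined in this patch */
#define DEMO_GID_MASK		DEMO_GENMASK(15, 0)
#define DEMO_SRC_MASK		DEMO_GENMASK(17, 16)

/* Extract the field selected by a non-zero contiguous mask. */
static uint32_t demo_field_get(uint32_t mask, uint32_t word)
{
	/* Dividing by the lowest set bit of the mask shifts the field to bit 0. */
	return (word & mask) / (mask & -mask);
}
```

Using a single mask for both selection and shifting avoids keeping a separate shift constant in sync with the mask.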
 drivers/iommu/amd/amd_iommu_types.h |  7 ++++
 drivers/iommu/amd/iommu.c           | 58 ++++++++++++++++++++++-------
 2 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index d2c64e2e9f05..4df6a50128de 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -149,6 +149,9 @@
 #define EVENT_TYPE_IOTLB_INV_TO	0x7
 #define EVENT_TYPE_INV_DEV_REQ	0x8
 #define EVENT_TYPE_INV_PPR_REQ	0x9
+#define EVENT_TYPE_GUEST_EVENT_FAULT	0xb
+#define EVENT_TYPE_VIOMMU_HW_ERR	0xc
+
 #define EVENT_TYPE_RMP_FAULT	0xd
 #define EVENT_TYPE_RMP_HW_ERR	0xe
 #define EVENT_DEVID_MASK	0xffff
@@ -261,6 +264,10 @@
 #define EVTLOG_SIZE_MAX		SZ_512K /* 32K entries */
 #define EVTLOG_LEN_MASK_MAX	(0xFULL << EVTLOG_SIZE_SHIFT)
 
+/* Constants for IO_PAGE_FAULT event */
+#define IO_PAGE_FAULT_VFLAGS_SHIFT	27
+#define IO_PAGE_FAULT_VFLAGS_MASK	GENMASK_ULL(27, 23)
+
 /* Constants for PPR Log handling */
 #define PPRLOG_ENTRY_SIZE	0x10
 #define PPRLOG_SIZE_SHIFT	56
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4b4dd20ebec6..50f26c8123f3 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -854,7 +854,7 @@ static void amd_iommu_report_rmp_fault(struct amd_iommu *iommu, volatile u32 *ev
 
 static void amd_iommu_report_page_fault(struct amd_iommu *iommu,
 					u16 devid, u16 domain_id,
-					u64 address, int flags)
+					u64 address, int flags, u8 vflags)
 {
 	struct iommu_dev_data *dev_data = NULL;
 	struct pci_dev *pdev;
@@ -889,13 +889,13 @@ static void amd_iommu_report_page_fault(struct amd_iommu *iommu,
 		}
 
 		if (__ratelimit(&dev_data->rs)) {
-			pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n",
-				domain_id, address, flags);
+			pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x vflags=%#x]\n",
+				domain_id, address, flags, vflags);
 		}
 	} else {
-		pr_err_ratelimited("Event logged [IO_PAGE_FAULT device=%04x:%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
+		pr_err_ratelimited("Event logged [IO_PAGE_FAULT device=%04x:%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x vflags=%#x]\n",
 			iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
-			domain_id, address, flags);
+			domain_id, address, flags, vflags);
 	}
 
 out:
@@ -932,29 +932,42 @@ static void iommu_print_event(struct amd_iommu *iommu, void *__evt)
 	}
 
 	if (type == EVENT_TYPE_IO_FAULT) {
-		amd_iommu_report_page_fault(iommu, devid, pasid, address, flags);
+		u8 vflags = FIELD_GET(IO_PAGE_FAULT_VFLAGS_MASK, event[0]);
+
+		amd_iommu_report_page_fault(iommu, devid, pasid, address, flags, vflags);
 		return;
 	}
 
 	switch (type) {
 	case EVENT_TYPE_ILL_DEV:
-		dev_err(dev, "Event logged [ILLEGAL_DEV_TABLE_ENTRY device=%04x:%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n",
+	{
+		u8 vflags = FIELD_GET(IO_PAGE_FAULT_VFLAGS_MASK, event[0]);
+
+		dev_err(dev, "Event logged [ILLEGAL_DEV_TABLE_ENTRY device=%04x:%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x vflags=%#x]\n",
 			iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
-			pasid, address, flags);
+			pasid, address, flags, vflags);
 		dev_err(dev, "Control Reg : 0x%llx\n", ctrl);
 		dump_dte_entry(iommu, devid);
 		break;
+	}
 	case EVENT_TYPE_DEV_TAB_ERR:
-		dev_err(dev, "Event logged [DEV_TAB_HARDWARE_ERROR device=%04x:%02x:%02x.%x "
-			"address=0x%llx flags=0x%04x]\n",
+	{
+		u8 vflags = FIELD_GET(IO_PAGE_FAULT_VFLAGS_MASK, event[0]);
+
+		dev_err(dev, "Event logged [DEV_TAB_HARDWARE_ERROR device=%04x:%02x:%02x.%x address=%#llx flags=%#04x vflags=%#x]\n",
 			iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
-			address, flags);
+			address, flags, vflags);
 		break;
+	}
 	case EVENT_TYPE_PAGE_TAB_ERR:
-		dev_err(dev, "Event logged [PAGE_TAB_HARDWARE_ERROR device=%04x:%02x:%02x.%x pasid=0x%04x address=0x%llx flags=0x%04x]\n",
+	{
+		u8 vflags = FIELD_GET(IO_PAGE_FAULT_VFLAGS_MASK, event[0]);
+
+		dev_err(dev, "Event logged [PAGE_TAB_HARDWARE_ERROR device=%04x:%02x:%02x.%x pasid=0x%04x address=0x%llx flags=0x%04x vflags=%#x]\n",
 			iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
-			pasid, address, flags);
+			pasid, address, flags, vflags);
 		break;
+	}
 	case EVENT_TYPE_ILL_CMD:
 		dev_err(dev, "Event logged [ILLEGAL_COMMAND_ERROR address=0x%llx]\n", address);
 		dump_command(address);
@@ -986,6 +999,25 @@ static void iommu_print_event(struct amd_iommu *iommu, void *__evt)
 			iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
 			pasid, address, flags, tag);
 		break;
+	case EVENT_TYPE_GUEST_EVENT_FAULT:
+	{
+		u16 gid = event[1] & 0xFFFF;
+		u8 vflags = FIELD_GET(IO_PAGE_FAULT_VFLAGS_MASK, event[0]);
+
+		dev_err(dev, "Event logged [GUEST_EVENT_FAULT gid=%#x flags=0x%04x vflags=%#x]\n",
+			gid, flags, vflags);
+		break;
+	}
+	case EVENT_TYPE_VIOMMU_HW_ERR:
+	{
+		u16 gid = event[0] & 0xFFFF;
+		u8 src = (event[0] >> 16) & 0x3;
+		u8 vflags = FIELD_GET(IO_PAGE_FAULT_VFLAGS_MASK, event[0]);
+
+		dev_err(dev, "Event logged [VIOMMU_HW_ERR gid=%#x address=%#llx src=%#x flags=%#x vflags=%#x]\n",
+			gid, address, src, flags, vflags);
+		break;
+	}
 	default:
 		dev_err(dev, "Event logged [UNKNOWN event[0]=0x%08x event[1]=0x%08x event[2]=0x%08x event[3]=0x%08x\n",
 			event[0], event[1], event[2], event[3]);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 03/26] iommu/amd: Detect and initialize AMD vIOMMU feature
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 01/26] iommu/amd: Make amd_iommu_completion_wait() non-static Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 02/26] iommu/amd: Introduce vIOMMU-specific events and event Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-06-01 12:43   ` Jason Gunthorpe
  2026-05-28  5:17 ` [PATCH v2 04/26] iommu/amd: Introduce IOMMUFD vIOMMU support for AMD Suravee Suthikulpanit
                   ` (23 subsequent siblings)
  26 siblings, 1 reply; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

The feature is advertised via EFR[VIOMMUSup]. Please see the AMD IOMMU
specification[1] for more detail.

Introduce a new global variable, amd_iommu_viommu, which controls
enablement of the feature in the driver. Currently, the feature is
disabled by default. Once the feature is fully supported, it will be
enabled by default, along with a command-line option to disable it if
needed.

[1] https://docs.amd.com/v/u/en-US/48882_3.10_PUB

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
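The gating condition in amd_viommu_init() can be summarised by the sketch below: the feature is usable only when the driver flag is set and the EFR advertises bit 55 (FEATURE_VIOMMU in this patch). The function name and the idea of passing the EFR value as a plain argument are assumptions made for illustration; the driver itself uses check_feature().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FEATURE_VIOMMU	(1ULL << 55)	/* mirrors FEATURE_VIOMMU */

/* True only if the driver flag is on and the hardware advertises vIOMMU. */
static bool demo_viommu_usable(bool driver_flag, uint64_t efr)
{
	return driver_flag && (efr & DEMO_FEATURE_VIOMMU) != 0;
}
```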
 drivers/iommu/amd/Makefile          |  2 +-
 drivers/iommu/amd/amd_iommu.h       |  2 ++
 drivers/iommu/amd/amd_iommu_types.h |  1 +
 drivers/iommu/amd/amd_viommu.h      | 22 ++++++++++++++++++++++
 drivers/iommu/amd/init.c            | 15 +++++++++++++++
 drivers/iommu/amd/viommu.c          | 29 +++++++++++++++++++++++++++++
 6 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100644 drivers/iommu/amd/amd_viommu.h
 create mode 100644 drivers/iommu/amd/viommu.c

diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index 94b8ef2acb18..e1e824b9c7b0 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-y += iommu.o init.o quirks.o ppr.o pasid.o
-obj-$(CONFIG_AMD_IOMMU_IOMMUFD) += iommufd.o nested.o
+obj-$(CONFIG_AMD_IOMMU_IOMMUFD) += iommufd.o nested.o viommu.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 9f961ccbe3b4..17fc0b5b3fa8 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -35,6 +35,8 @@ void amd_iommu_debugfs_setup(void);
 static inline void amd_iommu_debugfs_setup(void) {}
 #endif
 
+extern bool amd_iommu_viommu;
+
 /* Needed for interrupt remapping */
 int amd_iommu_prepare(void);
 int amd_iommu_enable(void);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 4df6a50128de..b5327bf6814b 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -103,6 +103,7 @@
 #define FEATURE_HASUP		BIT_ULL(49)
 #define FEATURE_EPHSUP		BIT_ULL(50)
 #define FEATURE_HDSUP		BIT_ULL(52)
+#define FEATURE_VIOMMU		BIT_ULL(55)
 #define FEATURE_SNP		BIT_ULL(63)
 
 
diff --git a/drivers/iommu/amd/amd_viommu.h b/drivers/iommu/amd/amd_viommu.h
new file mode 100644
index 000000000000..f08ab9ef23a9
--- /dev/null
+++ b/drivers/iommu/amd/amd_viommu.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2026 Advanced Micro Devices, Inc.
+ */
+
+#ifndef AMD_VIOMMU_H
+#define AMD_VIOMMU_H
+
+#if IS_ENABLED(CONFIG_AMD_IOMMU_IOMMUFD)
+
+int amd_viommu_init(struct amd_iommu *iommu);
+
+#else
+
+static inline int amd_viommu_init(struct amd_iommu *iommu)
+{
+	return -EOPNOTSUPP;
+}
+
+#endif /* CONFIG_AMD_IOMMU_IOMMUFD */
+
+#endif /* AMD_VIOMMU_H */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index d4dc9b2a50f3..5ac883429ced 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -34,6 +34,7 @@
 #include <linux/crash_dump.h>
 
 #include "amd_iommu.h"
+#include "amd_viommu.h"
 #include "../irq_remapping.h"
 #include "../iommu-pages.h"
 
@@ -196,6 +197,9 @@ bool amdr_ivrs_remap_support __read_mostly;
 
 bool amd_iommu_force_isolation __read_mostly;
 
+/* VIOMMU enabling flag */
+bool amd_iommu_viommu;
+
 unsigned long amd_iommu_pgsize_bitmap __ro_after_init = AMD_IOMMU_PGSIZES;
 
 enum iommu_init_state {
@@ -2188,6 +2192,12 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
 	if (check_feature(FEATURE_PPR) && amd_iommu_alloc_ppr_log(iommu))
 		return -ENOMEM;
 
+	ret = amd_viommu_init(iommu);
+	if (ret) {
+		pr_err("Failed to initialize vIOMMU.\n");
+		amd_iommu_viommu = false;
+	}
+
 	if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE)) {
 		pr_info("Using strict mode due to virtualization\n");
 		iommu_set_dma_strict();
@@ -2281,6 +2291,9 @@ static void print_iommu_info(void)
 		if (check_feature2(FEATURE_SEVSNPIO_SUP))
 			pr_cont(" SEV-TIO");
 
+		if (check_feature(FEATURE_VIOMMU))
+			pr_cont(" vIOMMU");
+
 		pr_cont("\n");
 	}
 
@@ -2293,6 +2306,8 @@ static void print_iommu_info(void)
 		pr_info("V2 page table enabled (Paging mode : %d level)\n",
 			amd_iommu_gpt_level);
 	}
+	if (amd_iommu_viommu)
+		pr_info("vIOMMU enabled\n");
 }
 
 static int __init amd_iommu_init_pci(void)
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
new file mode 100644
index 000000000000..f4b5f96d4785
--- /dev/null
+++ b/drivers/iommu/amd/viommu.c
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 Advanced Micro Devices, Inc.
+ */
+
+#define pr_fmt(fmt)     "AMD-Vi: " fmt
+#define dev_fmt(fmt)    pr_fmt(fmt)
+
+#include <linux/iommu.h>
+#include <linux/iommufd.h>
+#include <linux/amd-iommu.h>
+#include <uapi/linux/iommufd.h>
+
+#include <asm/iommu.h>
+#include <asm/set_memory.h>
+
+#include "iommufd.h"
+#include "amd_iommu.h"
+#include "amd_iommu_types.h"
+#include "amd_viommu.h"
+
+int __init amd_viommu_init(struct amd_iommu *iommu)
+{
+	if (!amd_iommu_viommu ||
+	    !check_feature(FEATURE_VIOMMU))
+		return 0;
+
+	return 0;
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 04/26] iommu/amd: Introduce IOMMUFD vIOMMU support for AMD
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (2 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 03/26] iommu/amd: Detect and initialize AMD vIOMMU feature Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-06-01 12:44   ` Jason Gunthorpe
  2026-05-28  5:17 ` [PATCH v2 05/26] iommu/amd: Allocate Guest IDs for IOMMUFD vIOMMU instances Suravee Suthikulpanit
                   ` (22 subsequent siblings)
  26 siblings, 1 reply; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Introduce a new enum iommu_viommu_type (IOMMU_VIOMMU_TYPE_AMD) for AMD
vIOMMU along with struct iommu_viommu_amd, which is used to initialize
an IOMMUFD vIOMMU instance when calling struct iommu_ops.viommu_init().

Also, hook up struct iommufd_viommu_ops.alloc_domain_nested to connect
nested domain allocation with the AMD vIOMMU implementation.

Additional initialization will be added in subsequent patches.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/iommufd.c  |  4 ++++
 include/uapi/linux/iommufd.h | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index 52300b867c1f..eee29c26169a 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -34,6 +34,9 @@ void *amd_iommufd_hw_info(struct device *dev, u32 *length, enum iommu_hw_info_ty
 
 size_t amd_iommufd_get_viommu_size(struct device *dev, enum iommu_viommu_type viommu_type)
 {
+	if (!amd_iommu_viommu || (viommu_type != IOMMU_VIOMMU_TYPE_AMD))
+		return 0;
+
 	return VIOMMU_STRUCT_SIZE(struct amd_iommu_viommu, core);
 }
 
@@ -73,5 +76,6 @@ static void amd_iommufd_viommu_destroy(struct iommufd_viommu *viommu)
  * struct iommufd_viommu_ops - vIOMMU specific operations
  */
 static const struct iommufd_viommu_ops amd_viommu_ops = {
+	.alloc_domain_nested = amd_iommu_alloc_domain_nested,
 	.destroy = amd_iommufd_viommu_destroy,
 };
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index e998dfbd6960..83ed10610957 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -1052,6 +1052,7 @@ struct iommu_fault_alloc {
  * @IOMMU_VIOMMU_TYPE_ARM_SMMUV3: ARM SMMUv3 driver specific type
  * @IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV: NVIDIA Tegra241 CMDQV (extension for ARM
  *                                    SMMUv3) enabled ARM SMMUv3 type
+ * @IOMMU_VIOMMU_TYPE_AMD: AMD HW-vIOMMU type
  */
 enum iommu_viommu_type {
 	IOMMU_VIOMMU_TYPE_DEFAULT = 0,
@@ -1062,6 +1063,7 @@ enum iommu_viommu_type {
 	 *   VMM must wire the HYP_OWN bit to 0 in guest VINTF_CONFIG register
 	 */
 	IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV = 2,
+	IOMMU_VIOMMU_TYPE_AMD = 3,
 };
 
 /**
@@ -1080,6 +1082,14 @@ struct iommu_viommu_tegra241_cmdqv {
 	__aligned_u64 out_vintf_mmap_length;
 };
 
+/**
+ * struct iommu_viommu_amd - AMD vIOMMU Interface (IOMMU_VIOMMU_TYPE_AMD)
+ * @reserved: Must be zero
+ */
+struct iommu_viommu_amd {
+	__u32 reserved; /* must be last */
+};
+
 /**
  * struct iommu_viommu_alloc - ioctl(IOMMU_VIOMMU_ALLOC)
  * @size: sizeof(struct iommu_viommu_alloc)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 05/26] iommu/amd: Allocate Guest IDs for IOMMUFD vIOMMU instances
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (3 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 04/26] iommu/amd: Introduce IOMMUFD vIOMMU support for AMD Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 06/26] iommu/amd: Map vIOMMU VF and VF Control MMIO BARs Suravee Suthikulpanit
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Hardware vIOMMU uses a 16-bit Guest ID (GID) per guest IOMMU to
index driver and hardware state. Allocate one GID per IOMMUFD vIOMMU
from a per-amd_iommu IDA (unique within that IOMMU; a VM behind
multiple IOMMUs may hold more than one GID).

Add amd_iommu_gid_alloc() and amd_iommu_gid_free(), and store the ID in
amd_iommu_viommu::gid. Allocate the GID in amd_iommufd_viommu_init() after
copying struct iommu_viommu_amd from userspace, and free it in
amd_iommufd_viommu_destroy() or on the init error path.

ida_init() for gid_ida is done when vIOMMU MMIO is mapped (next patch).
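
As a rough illustration of the allocation scheme described above, the
following userspace sketch models a per-IOMMU pool of guest IDs in the range
1..0xFFFF. It is not the kernel IDA: struct gid_pool and the gid_pool_*()
helpers are hypothetical names, and returning the lowest free ID is a choice
made by this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GID_MIN 1u
#define GID_MAX 0xFFFFu	/* mirrors VIOMMU_MAX_GID in the patch */

/* Hypothetical stand-in for the per-IOMMU IDA: one bit per guest ID. */
struct gid_pool {
	uint8_t used[(GID_MAX + 1) / 8];
};

static void gid_pool_init(struct gid_pool *p)
{
	memset(p->used, 0, sizeof(p->used));
}

/* Return the lowest free ID in [GID_MIN, GID_MAX], or -1 if exhausted. */
static int gid_pool_alloc(struct gid_pool *p)
{
	for (unsigned int id = GID_MIN; id <= GID_MAX; id++) {
		uint8_t bit = (uint8_t)(1u << (id % 8));

		if (!(p->used[id / 8] & bit)) {
			p->used[id / 8] |= bit;
			return (int)id;
		}
	}
	return -1;
}

/* Release an ID previously returned by gid_pool_alloc(). */
static void gid_pool_free(struct gid_pool *p, int id)
{
	assert(id >= (int)GID_MIN && id <= (int)GID_MAX);
	p->used[id / 8] &= (uint8_t)~(1u << (id % 8));
}
```

Because each pool is independent, two IOMMUs can hand out the same numeric
GID, which matches the "unique within that IOMMU" wording above.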

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       |  4 ++++
 drivers/iommu/amd/amd_iommu_types.h |  8 ++++++++
 drivers/iommu/amd/iommu.c           | 19 +++++++++++++++++
 drivers/iommu/amd/iommufd.c         | 32 +++++++++++++++++++++++++++++
 4 files changed, 63 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 17fc0b5b3fa8..9f2a1a8a6d3c 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -228,4 +228,8 @@ amd_iommu_make_clear_dte(struct iommu_dev_data *dev_data, struct dev_table_entry
 struct iommu_domain *
 amd_iommu_alloc_domain_nested(struct iommufd_viommu *viommu, u32 flags,
 			      const struct iommu_user_data *user_data);
+
+/* Guest ID for vIOMMU */
+int amd_iommu_gid_alloc(struct amd_iommu *iommu);
+void amd_iommu_gid_free(struct amd_iommu *iommu, int gid);
 #endif /* AMD_IOMMU_H */
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index b5327bf6814b..00f964d5b149 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -21,6 +21,7 @@
 #include <linux/iommufd.h>
 #include <linux/irqreturn.h>
 #include <linux/generic_pt/iommu.h>
+#include <linux/idr.h>
 
 #include <uapi/linux/iommufd.h>
 
@@ -413,6 +414,9 @@
 
 #define MAX_DOMAIN_ID 65536
 
+/* For vIOMMU, the GID is 16-bit. */
+#define VIOMMU_MAX_GID		0xFFFF
+
 /* Timeout stuff */
 #define LOOP_TIMEOUT		100000
 #define MMIO_STATUS_TIMEOUT	2000000
@@ -509,6 +513,7 @@ struct amd_iommu_viommu {
 	struct iommufd_viommu core;
 	struct protection_domain *parent; /* nest parent domain for this viommu */
 	struct list_head pdom_list;	  /* For protection_domain->viommu_list */
+	u16 gid;			  /* Guest ID for the vIOMMU */
 
 	/*
 	 * Per-vIOMMU guest domain ID to host domain ID mapping.
@@ -768,6 +773,9 @@ struct amd_iommu {
 	/* IOPF support */
 	struct iopf_queue *iopf_queue;
 	unsigned char iopfq_name[32];
+
+	struct ida gid_ida;		 /* guest IDs for this IOMMU */
+	bool gid_ida_inited;
 };
 
 static inline struct amd_iommu *dev_to_amd_iommu(struct device *dev)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 50f26c8123f3..73fba8be40d1 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -252,6 +252,25 @@ static inline bool pdom_is_sva_capable(struct protection_domain *pdom)
 	return pdom_is_v2_pgtbl_mode(pdom) || pdom_is_in_pt_mode(pdom);
 }
 
+int amd_iommu_gid_alloc(struct amd_iommu *iommu)
+{
+	int ret = ida_alloc_range(&iommu->gid_ida, 1, VIOMMU_MAX_GID, GFP_KERNEL);
+
+	if (ret < 0)
+		pr_err("%s: Failed to allocate guest ID (devid=%#x)\n",
+		       __func__, iommu->devid);
+	else
+		pr_debug("%s: iommu devid=%#x, gid=%u\n", __func__, iommu->devid, ret);
+
+	return ret;
+}
+
+void amd_iommu_gid_free(struct amd_iommu *iommu, int gid)
+{
+	pr_debug("%s: iommu devid=%#x, gid=%u\n", __func__, iommu->devid, gid);
+	ida_free(&iommu->gid_ida, gid);
+}
+
 static inline int get_acpihid_device_id(struct device *dev,
 					struct acpihid_map_entry **entry)
 {
diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index eee29c26169a..785fa2d575f2 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -43,13 +43,37 @@ size_t amd_iommufd_get_viommu_size(struct device *dev, enum iommu_viommu_type vi
 int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *parent,
 			    const struct iommu_user_data *user_data)
 {
+	int ret;
 	unsigned long flags;
+	struct iommu_viommu_amd data;
 	struct protection_domain *pdom = to_pdomain(parent);
 	struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
+	struct amd_iommu *iommu = container_of(viommu->iommu_dev, struct amd_iommu, iommu);
 
 	xa_init_flags(&aviommu->gdomid_array, XA_FLAGS_ALLOC1);
 	aviommu->parent = pdom;
 
+	if (!user_data)
+		return -EINVAL;
+
+	ret = iommu_copy_struct_from_user(&data, user_data,
+					  IOMMU_VIOMMU_TYPE_AMD,
+					  reserved);
+	if (ret)
+		return ret;
+
+	ret = amd_iommu_gid_alloc(iommu);
+	if (ret < 0)
+		goto err_gid;
+	aviommu->gid = ret;
+	pr_debug("%s: gid=%#x\n", __func__, aviommu->gid);
+
+	ret = iommu_copy_struct_to_user(user_data, &data,
+					IOMMU_VIOMMU_TYPE_AMD,
+					reserved);
+	if (ret)
+		goto err_init;
+
 	viommu->ops = &amd_viommu_ops;
 
 	spin_lock_irqsave(&pdom->lock, flags);
@@ -57,6 +81,10 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 	spin_unlock_irqrestore(&pdom->lock, flags);
 
 	return 0;
+err_init:
+	amd_iommu_gid_free(iommu, aviommu->gid);
+err_gid:
+	return ret;
 }
 
 static void amd_iommufd_viommu_destroy(struct iommufd_viommu *viommu)
@@ -64,11 +92,15 @@ static void amd_iommufd_viommu_destroy(struct iommufd_viommu *viommu)
 	unsigned long flags;
 	struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
 	struct protection_domain *pdom = aviommu->parent;
+	struct amd_iommu *iommu = container_of(viommu->iommu_dev, struct amd_iommu, iommu);
+
+	pr_debug("%s: gid=%#x, iommu devid=%#x\n", __func__, aviommu->gid, iommu->devid);
 
 	spin_lock_irqsave(&pdom->lock, flags);
 	list_del(&aviommu->pdom_list);
 	spin_unlock_irqrestore(&pdom->lock, flags);
 	xa_destroy(&aviommu->gdomid_array);
+	amd_iommu_gid_free(iommu, aviommu->gid);
 }
 
 /*
-- 
2.34.1



* [PATCH v2 06/26] iommu/amd: Map vIOMMU VF and VF Control MMIO BARs
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (4 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 05/26] iommu/amd: Allocate Guest IDs for IOMMUFD vIOMMU instances Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 07/26] iommu/amd: Add support for AMD vIOMMU VF MMIO region Suravee Suthikulpanit
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Enable hardware vIOMMU on an IOMMU by locating its PCI vendor-specific
capability (VSC), reading the VF and VF Control BAR addresses, and
mapping them for host access (256MB VF, 4MB VF Control).
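
The BAR decoding step can be sketched in plain C as follows. This is a
userspace model only: vsc_bar_decode() is a hypothetical helper, and treating
bit 0 of the combined value as the enable bit follows what the diff below
does rather than a statement taken from the specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Combine the low and high config dwords of a VSC BAR into a 64-bit value.
 * Returns false when bit 0 (treated as the enable bit) is clear; otherwise
 * stores the base address with the enable bit stripped and returns true.
 */
static bool vsc_bar_decode(uint32_t lo, uint32_t hi, uint64_t *base)
{
	uint64_t raw = ((uint64_t)hi << 32) | lo;

	if (!(raw & 1))
		return false;

	*base = raw & ~1ULL;
	return true;
}
```

For example, lo = 0xF0000001 and hi = 0x1 decode to a base of 0x1F0000000,
while lo = 0xF0000000 is rejected because the enable bit is clear.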

VF Control covers the first 4K of guest IOMMU MMIO (control registers,
trapped by QEMU). VF MMIO covers the third 4K (virtualized by the
IOMMU). Per-guest bases use the Guest ID from the previous patch.
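
The per-guest layout can be checked with a little arithmetic. The sketch
below uses the entry and map sizes from the diff (4096-byte VF entries in a
256MB window, 64-byte VF Control entries in a 4MB window); the helper names
vf_mmio_offset() and vfctrl_mmio_offset() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VF_ENTRY_SIZE		4096u		/* VIOMMU_VF_MMIO_ENTRY_SIZE */
#define VFCTRL_ENTRY_SIZE	64u		/* VIOMMU_VFCTRL_MMIO_ENTRY_SIZE */
#define VF_MAP_SIZE		0x10000000ULL	/* 256MB VF BAR mapping */
#define VFCTRL_MAP_SIZE		0x400000ULL	/* 4MB VF Control BAR mapping */

/* Byte offset of a guest's 4K VF MMIO page inside the VF BAR. */
static uint64_t vf_mmio_offset(uint16_t gid)
{
	return (uint64_t)gid * VF_ENTRY_SIZE;
}

/* Byte offset of a guest's 64-byte VF Control entry inside the VF_CNTL BAR. */
static uint64_t vfctrl_mmio_offset(uint16_t gid)
{
	return (uint64_t)gid * VFCTRL_ENTRY_SIZE;
}
```

With a 16-bit Guest ID, the last entry (GID 0xFFFF) ends exactly at the end
of each mapping: 65536 * 4096 = 256MB and 65536 * 64 = 4MB.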

Initialize the per-amd_iommu gid_ida here so amd_iommu_gid_alloc() can
run when IOMMUFD creates a vIOMMU instance. Export MMIO map helpers and
call amd_viommu_uninit() from IOMMU teardown.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       |   2 +
 drivers/iommu/amd/amd_iommu_types.h |  34 ++++++++
 drivers/iommu/amd/amd_viommu.h      |   6 ++
 drivers/iommu/amd/init.c            |   5 +-
 drivers/iommu/amd/viommu.c          | 124 ++++++++++++++++++++++++++++
 5 files changed, 169 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 9f2a1a8a6d3c..044bc9a634a1 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -28,6 +28,8 @@ void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid);
 void iommu_feature_enable(struct amd_iommu *iommu, u8 bit);
 void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
 				  gfp_t gfp, size_t size);
+u8 __iomem * __init iommu_map_mmio_space(u64 address, u64 end);
+void __init iommu_unmap_mmio_space(struct amd_iommu *iommu);
 
 #ifdef CONFIG_AMD_IOMMU_DEBUGFS
 void amd_iommu_debugfs_setup(void);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 00f964d5b149..e88e0bacd1a9 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -40,6 +40,12 @@
 #define MMIO_RANGE_OFFSET	0x0c
 #define MMIO_MISC_OFFSET	0x10
 
+/* vIOMMU Capability offsets (from IOMMU Capability Header) */
+#define MMIO_VSC_VF_BAR_LO_OFFSET	0x08
+#define MMIO_VSC_VF_BAR_HI_OFFSET	0x0c
+#define MMIO_VSC_VF_CNTL_BAR_LO_OFFSET	0x10
+#define MMIO_VSC_VF_CNTL_BAR_HI_OFFSET	0x14
+
 /* Masks, shifts and macros to parse the device range capability */
 #define MMIO_RANGE_LD_MASK	0xff000000
 #define MMIO_RANGE_FD_MASK	0x00ff0000
@@ -473,6 +479,20 @@ extern bool amdr_ivrs_remap_support;
 #define for_each_ivhd_dte_flags(entry) \
 	list_for_each_entry((entry), &amd_ivhd_dev_flags_list, list)
 
+/* VIOMMU stuff */
+#define VIOMMU_VF_MMIO_ENTRY_SIZE		4096
+#define VIOMMU_VFCTRL_MMIO_ENTRY_SIZE		64
+
+/* Host ioremap/request_mem_region sizes for VF / VF_CNTL BARs */
+#define VIOMMU_VF_MMIO_MAP_SIZE		0x10000000UL
+#define VIOMMU_VF_CNTL_MMIO_MAP_SIZE	0x400000UL
+
+#define VIOMMU_VF_MMIO_BASE(iommu, guestId) \
+	(iommu->vf_base + (guestId * VIOMMU_VF_MMIO_ENTRY_SIZE))
+
+#define VIOMMU_VFCTRL_MMIO_BASE(iommu, guestId) \
+	(iommu->vfctrl_base + (guestId * VIOMMU_VFCTRL_MMIO_ENTRY_SIZE))
+
 struct amd_iommu;
 struct iommu_domain;
 struct irq_domain;
@@ -686,6 +706,20 @@ struct amd_iommu {
 	 */
 	u16 cap_ptr;
 
+	/* Vendor-Specific Capability (VSC) pointer. */
+	u16 vsc_offset;
+
+	/*
+	 * VF MMIO base physical address. This is needed to calculate/pass
+	 * per guest VF MMIO address (3rd 4K of IOMMU MMIO space)
+	 */
+	u64 vf_base_phys;
+	u64 vf_cntl_phys;
+
+	/* virtual addresses of vIOMMU VF/VF_CNTL BAR */
+	u8 __iomem *vf_base;
+	u8 __iomem *vfctrl_base;
+
 	/* pci domain of this IOMMU */
 	struct amd_iommu_pci_seg *pci_seg;
 
diff --git a/drivers/iommu/amd/amd_viommu.h b/drivers/iommu/amd/amd_viommu.h
index f08ab9ef23a9..d0c4fdd00809 100644
--- a/drivers/iommu/amd/amd_viommu.h
+++ b/drivers/iommu/amd/amd_viommu.h
@@ -10,6 +10,8 @@
 
 int amd_viommu_init(struct amd_iommu *iommu);
 
+void __init amd_viommu_uninit(struct amd_iommu *iommu);
+
 #else
 
 static inline int amd_viommu_init(struct amd_iommu *iommu)
@@ -17,6 +19,10 @@ static inline int amd_viommu_init(struct amd_iommu *iommu)
 	return -EOPNOTSUPP;
 }
 
+static inline void amd_viommu_uninit(struct amd_iommu *iommu)
+{
+}
+
 #endif /* CONFIG_AMD_IOMMU_IOMMUFD */
 
 #endif /* AMD_VIOMMU_H */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 5ac883429ced..6e69b3dd8b1e 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -459,7 +459,7 @@ static void iommu_disable(struct amd_iommu *iommu)
  * mapping and unmapping functions for the IOMMU MMIO space. Each AMD IOMMU in
  * the system has one.
  */
-static u8 __iomem * __init iommu_map_mmio_space(u64 address, u64 end)
+u8 __iomem * __init iommu_map_mmio_space(u64 address, u64 end)
 {
 	if (!request_mem_region(address, end, "amd_iommu")) {
 		pr_err("Can not reserve memory region %llx-%llx for mmio\n",
@@ -471,7 +471,7 @@ static u8 __iomem * __init iommu_map_mmio_space(u64 address, u64 end)
 	return (u8 __iomem *)ioremap(address, end);
 }
 
-static void __init iommu_unmap_mmio_space(struct amd_iommu *iommu)
+void __init iommu_unmap_mmio_space(struct amd_iommu *iommu)
 {
 	if (iommu->mmio_base)
 		iounmap(iommu->mmio_base);
@@ -1790,6 +1790,7 @@ static void __init free_iommu_one(struct amd_iommu *iommu)
 	free_iommu_buffers(iommu);
 	amd_iommu_free_ppr_log(iommu);
 	free_ga_log(iommu);
+	amd_viommu_uninit(iommu);
 	iommu_unmap_mmio_space(iommu);
 	amd_iommu_iopf_uninit(iommu);
 }
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index f4b5f96d4785..014ae16bf58b 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -7,9 +7,15 @@
 #define dev_fmt(fmt)    pr_fmt(fmt)
 
 #include <linux/iommu.h>
+#include <linux/amd-iommu.h>
+
+#include <linux/fs.h>
+#include <linux/cdev.h>
+#include <linux/ioctl.h>
 #include <linux/iommufd.h>
 #include <linux/amd-iommu.h>
 #include <uapi/linux/iommufd.h>
+#include <linux/mem_encrypt.h>
 
 #include <asm/iommu.h>
 #include <asm/set_memory.h>
@@ -18,12 +24,130 @@
 #include "amd_iommu.h"
 #include "amd_iommu_types.h"
 #include "amd_viommu.h"
+#include "../iommu-pages.h"
+
+LIST_HEAD(viommu_devid_map);
+
+static int viommu_init_pci_vsc(struct amd_iommu *iommu)
+{
+	iommu->vsc_offset = pci_find_capability(iommu->dev, PCI_CAP_ID_VNDR);
+	if (!iommu->vsc_offset)
+		return -ENODEV;
+
+	DUMP_printk("device:%s, vsc offset:%04x\n",
+		    pci_name(iommu->dev), iommu->vsc_offset);
+	return 0;
+}
+
+static void amd_viommu_gid_ida_init(struct amd_iommu *iommu)
+{
+	ida_init(&iommu->gid_ida);
+	iommu->gid_ida_inited = true;
+}
+
+static void amd_viommu_gid_ida_fini(struct amd_iommu *iommu)
+{
+	if (!iommu->gid_ida_inited)
+		return;
+
+	ida_destroy(&iommu->gid_ida);
+	iommu->gid_ida_inited = false;
+}
+
+static void __init amd_viommu_vf_vfcntl_unmap(struct amd_iommu *iommu)
+{
+	if (iommu->vfctrl_base) {
+		iounmap(iommu->vfctrl_base);
+		iommu->vfctrl_base = NULL;
+	}
+	if (iommu->vf_cntl_phys)
+		release_mem_region(iommu->vf_cntl_phys, VIOMMU_VF_CNTL_MMIO_MAP_SIZE);
+
+	if (iommu->vf_base) {
+		iounmap(iommu->vf_base);
+		iommu->vf_base = NULL;
+	}
+	if (iommu->vf_base_phys)
+		release_mem_region(iommu->vf_base_phys, VIOMMU_VF_MMIO_MAP_SIZE);
+}
+
+void __init amd_viommu_uninit(struct amd_iommu *iommu)
+{
+	amd_viommu_gid_ida_fini(iommu);
+	amd_viommu_vf_vfcntl_unmap(iommu);
+}
+
+static int __init viommu_vf_vfcntl_init(struct amd_iommu *iommu)
+{
+	u32 lo, hi;
+	u64 vf_phys, vf_cntl_phys;
+
+	/* Setting up VF and VF_CNTL MMIOs */
+	pci_read_config_dword(iommu->dev, iommu->vsc_offset + MMIO_VSC_VF_BAR_LO_OFFSET, &lo);
+	pci_read_config_dword(iommu->dev, iommu->vsc_offset + MMIO_VSC_VF_BAR_HI_OFFSET, &hi);
+	vf_phys = hi;
+	vf_phys = (vf_phys << 32) | lo;
+	if (!(vf_phys & 1)) {
+		pr_err(FW_BUG "vf_phys disabled\n");
+		return -EINVAL;
+	}
+
+	pci_read_config_dword(iommu->dev, iommu->vsc_offset + MMIO_VSC_VF_CNTL_BAR_LO_OFFSET, &lo);
+	pci_read_config_dword(iommu->dev, iommu->vsc_offset + MMIO_VSC_VF_CNTL_BAR_HI_OFFSET, &hi);
+	vf_cntl_phys = hi;
+	vf_cntl_phys = (vf_cntl_phys << 32) | lo;
+	if (!(vf_cntl_phys & 1)) {
+		pr_err(FW_BUG "vf_cntl_phys disabled\n");
+		return -EINVAL;
+	}
+
+	if (!vf_phys || !vf_cntl_phys) {
+		pr_err(FW_BUG "AMD-Vi: Unassigned VF resources.\n");
+		return -ENOMEM;
+	}
+
+	/* Mapping 256MB of VF and 4MB of VF_CNTL BARs */
+	vf_phys &= ~1ULL;
+	iommu->vf_base = iommu_map_mmio_space(vf_phys, VIOMMU_VF_MMIO_MAP_SIZE);
+	if (!iommu->vf_base) {
+		pr_err("Can't reserve vf_base\n");
+		return -ENOMEM;
+	}
+	iommu->vf_base_phys = vf_phys;
+
+	vf_cntl_phys &= ~1ULL;
+	iommu->vfctrl_base = iommu_map_mmio_space(vf_cntl_phys, VIOMMU_VF_CNTL_MMIO_MAP_SIZE);
+	if (!iommu->vfctrl_base) {
+		pr_err("Can't reserve vfctrl_base\n");
+		goto err_out;
+	}
+	iommu->vf_cntl_phys = vf_cntl_phys;
+
+	pr_debug("%s: IOMMU device:%s, vf_base:%#llx, vfctrl_base:%#llx\n",
+		 __func__, pci_name(iommu->dev), vf_phys, vf_cntl_phys);
+	return 0;
+err_out:
+	amd_viommu_uninit(iommu);
+	return -ENOMEM;
+}
 
 int __init amd_viommu_init(struct amd_iommu *iommu)
 {
+	int ret;
+
 	if (!amd_iommu_viommu ||
 	    !check_feature(FEATURE_VIOMMU))
 		return 0;
 
+	ret = viommu_init_pci_vsc(iommu);
+	if (ret)
+		return ret;
+
+	ret = viommu_vf_vfcntl_init(iommu);
+	if (ret)
+		return ret;
+
+	amd_viommu_gid_ida_init(iommu);
+
 	return 0;
 }
-- 
2.34.1



* [PATCH v2 07/26] iommu/amd: Add support for AMD vIOMMU VF MMIO region
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (5 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 06/26] iommu/amd: Map vIOMMU VF and VF Control MMIO BARs Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-06-01 12:51   ` Jason Gunthorpe
  2026-05-28  5:17 ` [PATCH v2 08/26] iommu/amd: Introduce Reset vMMIO Command Suravee Suthikulpanit
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit,
	Vasant Hegde

The AMD vIOMMU virtualizes guest MMIO registers in the 3rd 4K region.
Expose each guest's 4K VF MMIO page to userspace with
iommufd_viommu_alloc_mmap() and report the resulting mmap offset in
struct iommu_viommu_amd.

Co-developed-by: Vasant Hegde <Vasant.Hegde@amd.com>
Signed-off-by: Vasant Hegde <Vasant.Hegde@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  3 +++
 drivers/iommu/amd/amd_viommu.h      |  7 +++++++
 drivers/iommu/amd/iommufd.c         | 18 +++++++++++++++++-
 drivers/iommu/amd/viommu.c          | 11 +++++++++++
 include/uapi/linux/iommufd.h        |  2 ++
 5 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index e88e0bacd1a9..cc7049bbfa14 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -540,6 +540,9 @@ struct amd_iommu_viommu {
 	 * Indexed by guest domain ID.
 	 */
 	struct xarray gdomid_array;
+
+	/* Offset for mmap() of guest VF MMIO; set after iommufd_viommu_alloc_mmap(). */
+	unsigned long vfmmio_mmap_offset;
 };
 
 /*
diff --git a/drivers/iommu/amd/amd_viommu.h b/drivers/iommu/amd/amd_viommu.h
index d0c4fdd00809..447692b9101c 100644
--- a/drivers/iommu/amd/amd_viommu.h
+++ b/drivers/iommu/amd/amd_viommu.h
@@ -12,6 +12,8 @@ int amd_viommu_init(struct amd_iommu *iommu);
 
 void __init amd_viommu_uninit(struct amd_iommu *iommu);
 
+u64 amd_viommu_get_vfmmio_addr(struct amd_iommu *iommu, u16 gid);
+
 #else
 
 static inline int amd_viommu_init(struct amd_iommu *iommu)
@@ -23,6 +25,11 @@ static inline void amd_viommu_uninit(struct amd_iommu *iommu)
 {
 }
 
+static inline u64 amd_viommu_get_vfmmio_addr(struct amd_iommu *iommu, u16 gid)
+{
+	return 0;
+}
+
 #endif /* CONFIG_AMD_IOMMU_IOMMUFD */
 
 #endif /* AMD_VIOMMU_H */
diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index 785fa2d575f2..34524c1309c3 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -7,6 +7,7 @@
 
 #include "iommufd.h"
 #include "amd_iommu.h"
+#include "amd_viommu.h"
 #include "amd_iommu_types.h"
 
 static const struct iommufd_viommu_ops amd_viommu_ops;
@@ -44,11 +45,12 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 			    const struct iommu_user_data *user_data)
 {
 	int ret;
+	phys_addr_t page_base;
 	unsigned long flags;
 	struct iommu_viommu_amd data;
 	struct protection_domain *pdom = to_pdomain(parent);
-	struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
 	struct amd_iommu *iommu = container_of(viommu->iommu_dev, struct amd_iommu, iommu);
+	struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
 
 	xa_init_flags(&aviommu->gdomid_array, XA_FLAGS_ALLOC1);
 	aviommu->parent = pdom;
@@ -68,6 +70,16 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 	aviommu->gid = ret;
 	pr_debug("%s: gid=%#x\n", __func__, aviommu->gid);
 
+	page_base = amd_viommu_get_vfmmio_addr(iommu, aviommu->gid);
+
+	ret = iommufd_viommu_alloc_mmap(&aviommu->core,
+					page_base, SZ_4K,
+					(unsigned long *)&data.out_vfmmio_mmap_offset);
+	if (ret)
+		goto err_mmap;
+
+	aviommu->vfmmio_mmap_offset = data.out_vfmmio_mmap_offset;
+
 	ret = iommu_copy_struct_to_user(user_data, &data,
 					IOMMU_VIOMMU_TYPE_AMD,
 					reserved);
@@ -82,6 +94,8 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 
 	return 0;
 err_init:
+	iommufd_viommu_destroy_mmap(&aviommu->core, aviommu->vfmmio_mmap_offset);
+err_mmap:
 	amd_iommu_gid_free(iommu, aviommu->gid);
 err_gid:
 	return ret;
@@ -100,6 +114,8 @@ static void amd_iommufd_viommu_destroy(struct iommufd_viommu *viommu)
 	list_del(&aviommu->pdom_list);
 	spin_unlock_irqrestore(&pdom->lock, flags);
 	xa_destroy(&aviommu->gdomid_array);
+	if (aviommu->vfmmio_mmap_offset)
+		iommufd_viommu_destroy_mmap(&aviommu->core, aviommu->vfmmio_mmap_offset);
 	amd_iommu_gid_free(iommu, aviommu->gid);
 }
 
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index 014ae16bf58b..9e6eb2f977ec 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -131,6 +131,17 @@ static int __init viommu_vf_vfcntl_init(struct amd_iommu *iommu)
 	return -ENOMEM;
 }
 
+/*
+ * Returns VF MMIO BAR offset for the given guest ID which will be
+ * mapped to guest vIOMMU 3rd 4K MMIO address
+ */
+u64 amd_viommu_get_vfmmio_addr(struct amd_iommu *iommu, u16 gid)
+{
+	/* TODO: Add check for sVIOMMU and set gid[bit 15] */
+	return iommu->vf_base_phys + gid * VIOMMU_VF_MMIO_ENTRY_SIZE;
+}
+EXPORT_SYMBOL(amd_viommu_get_vfmmio_addr);
+
 int __init amd_viommu_init(struct amd_iommu *iommu)
 {
 	int ret;
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 83ed10610957..0b7a3e5b057c 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -1084,9 +1084,11 @@ struct iommu_viommu_tegra241_cmdqv {
 
 /**
  * struct iommu_viommu_amd - AMD vIOMMU Interface (IOMMU_VIOMMU_TYPE_AMD)
+ * @out_vfmmio_mmap_offset: (out) mmap offset for vIOMMU VF-MMIO
  * @reserved: Must be zero
  */
 struct iommu_viommu_amd {
+	__aligned_u64 out_vfmmio_mmap_offset;
 	__u32 reserved; /* must be last */
 };
 
-- 
2.34.1



* [PATCH v2 08/26] iommu/amd: Introduce Reset vMMIO Command
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (6 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 07/26] iommu/amd: Add support for AMD vIOMMU VF MMIO region Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 09/26] iommu/amd: Introduce domain for IOMMU Private Address (IPA) region Suravee Suthikulpanit
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Introduce a new IOMMU command, RESET_VMMIO, which resets the virtualized
MMIO registers of a particular guest. Issue it when a vIOMMU is initialized.
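
The command encoding used in the diff below can be modeled in userspace as
follows. This is a sketch under assumptions: struct iommu_cmd is reduced to
four 32-bit words, and placing the opcode in bits 31:28 of data[1] is an
assumption about what CMD_SET_TYPE() does in the driver. The bit positions
for "all" (bit 28) and "vcmd" (bit 31) are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CMD_RESET_VMMIO	0x0Au

/* Simplified model of a 128-bit IOMMU command. */
struct iommu_cmd {
	uint32_t data[4];
};

static void build_reset_vmmio(struct iommu_cmd *cmd, uint16_t gid,
			      bool vcmd, bool all)
{
	memset(cmd, 0, sizeof(*cmd));
	cmd->data[0] = gid;		/* Guest ID in the low 16 bits */
	if (all)
		cmd->data[0] |= 1u << 28;
	if (vcmd)
		cmd->data[0] |= 1u << 31;
	/* Assumption: opcode lives in data[1] bits 31:28. */
	cmd->data[1] |= CMD_RESET_VMMIO << 28;
}
```

The sketch uses unsigned constants for the shifts so that setting bit 31 does
not rely on shifting into the sign bit of a signed int.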

Reviewed-by: Weinan Liu <wnliu@google.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       |  1 +
 drivers/iommu/amd/amd_iommu_types.h |  1 +
 drivers/iommu/amd/iommu.c           | 22 ++++++++++++++++++++++
 drivers/iommu/amd/iommufd.c         |  3 +++
 4 files changed, 27 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 044bc9a634a1..2ce207529ea0 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -11,6 +11,7 @@
 
 #include "amd_iommu_types.h"
 
+void iommu_reset_vmmio(struct amd_iommu *iommu, u16 gid);
 extern int amd_iommu_evtlog_size;
 extern int amd_iommu_pprlog_size;
 
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index cc7049bbfa14..44fa1d6c64d6 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -218,6 +218,7 @@
 #define CMD_INV_IRT		0x05
 #define CMD_COMPLETE_PPR	0x07
 #define CMD_INV_ALL		0x08
+#define CMD_RESET_VMMIO		0x0A
 
 #define CMD_COMPL_WAIT_STORE_MASK	0x01
 #define CMD_COMPL_WAIT_INT_MASK		0x02
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 73fba8be40d1..6f5ecc48f4ad 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1428,6 +1428,18 @@ static void build_inv_irt(struct iommu_cmd *cmd, u16 devid)
 	CMD_SET_TYPE(cmd, CMD_INV_IRT);
 }
 
+static void build_reset_vmmio(struct iommu_cmd *cmd, u16 gid,
+			      bool vcmd, bool all)
+{
+	memset(cmd, 0, sizeof(*cmd));
+	cmd->data[0] = gid;
+	if (all)
+		cmd->data[0] |= (1 << 28);
+	if (vcmd)
+		cmd->data[0] |= (1 << 31);
+	CMD_SET_TYPE(cmd, CMD_RESET_VMMIO);
+}
+
 /*
  * Writes the command to the IOMMUs command buffer and informs the
  * hardware about the new command.
@@ -1668,6 +1680,16 @@ void amd_iommu_flush_all_caches(struct amd_iommu *iommu)
 	}
 }
 
+void iommu_reset_vmmio(struct amd_iommu *iommu, u16 gid)
+{
+	struct iommu_cmd cmd;
+
+	build_reset_vmmio(&cmd, gid, 1, 1);
+
+	iommu_queue_command(iommu, &cmd);
+	amd_iommu_completion_wait(iommu);
+}
+
 /*
  * Command send function for flushing on-device TLB
  */
diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index 34524c1309c3..42307ae71b24 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -80,6 +80,9 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 
 	aviommu->vfmmio_mmap_offset = data.out_vfmmio_mmap_offset;
 
+	/* Reset vIOMMU MMIOs to initialize the vIOMMU */
+	iommu_reset_vmmio(iommu, aviommu->gid);
+
 	ret = iommu_copy_struct_to_user(user_data, &data,
 					IOMMU_VIOMMU_TYPE_AMD,
 					reserved);
-- 
2.34.1



* [PATCH v2 09/26] iommu/amd: Introduce domain for IOMMU Private Address (IPA) region
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (7 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 08/26] iommu/amd: Introduce Reset vMMIO Command Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 10/26] iommu/amd: Assign IOMMU Private Address domain to IOMMU Suravee Suthikulpanit
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

AMD vIOMMU introduces the IOMMU Private Address (IPA) region, which is
used to manage data structures necessary for IOMMU virtualization within
the guest.

Introduce a per-IOMMU domain dedicated to the IPA region and store it in
struct amd_iommu::viommu_pdom. This domain uses the AMD IOMMU v1 page table
format.

For details, see the section "vIOMMU Private Address Space" of the IOMMU
specification [1].

[1] https://docs.amd.com/v/u/en-US/48882_3.10_PUB

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       |  5 +++++
 drivers/iommu/amd/amd_iommu_types.h |  3 +++
 drivers/iommu/amd/iommu.c           |  9 ++++----
 drivers/iommu/amd/viommu.c          | 35 +++++++++++++++++++++++++++++
 4 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 2ce207529ea0..1aa79a26a127 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -31,6 +31,7 @@ void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
 				  gfp_t gfp, size_t size);
 u8 __iomem * __init iommu_map_mmio_space(u64 address, u64 end);
 void __init iommu_unmap_mmio_space(struct amd_iommu *iommu);
+int iommu_flush_dte(struct amd_iommu *iommu, u16 devid);
 
 #ifdef CONFIG_AMD_IOMMU_DEBUGFS
 void amd_iommu_debugfs_setup(void);
@@ -39,6 +40,8 @@ static inline void amd_iommu_debugfs_setup(void) {}
 #endif
 
 extern bool amd_iommu_viommu;
+extern const struct pt_iommu_driver_ops amd_hw_driver_ops_v1;
+extern const struct iommu_domain_ops amdv1_ops;
 
 /* Needed for interrupt remapping */
 int amd_iommu_prepare(void);
@@ -56,6 +59,8 @@ extern bool amd_iommu_hatdis;
 /* Protection domain ops */
 void amd_iommu_init_identity_domain(void);
 struct protection_domain *protection_domain_alloc(void);
+struct iommu_domain *amd_iommu_domain_alloc_paging_v1(struct device *dev,
+						      u32 flags);
 struct iommu_domain *amd_iommu_domain_alloc_sva(struct device *dev,
 						struct mm_struct *mm);
 void amd_iommu_domain_free(struct iommu_domain *dom);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 44fa1d6c64d6..02d359f09148 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -814,6 +814,9 @@ struct amd_iommu {
 
 	struct ida gid_ida;		 /* guest IDs for this IOMMU */
 	bool gid_ida_inited;
+
+	/* HW vIOMMU support */
+	struct protection_domain *viommu_pdom;
 };
 
 static inline struct amd_iommu *dev_to_amd_iommu(struct device *dev)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6f5ecc48f4ad..d89664eba898 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1551,7 +1551,7 @@ static void domain_flush_complete(struct protection_domain *domain)
 		amd_iommu_completion_wait(pdom_iommu_info->iommu);
 }
 
-static int iommu_flush_dte(struct amd_iommu *iommu, u16 devid)
+int iommu_flush_dte(struct amd_iommu *iommu, u16 devid)
 {
 	struct iommu_cmd cmd;
 
@@ -2726,12 +2726,12 @@ static void amd_iommu_iotlb_sync(struct iommu_domain *domain,
 	iommu_put_pages_list(&gather->freelist);
 }
 
-static const struct pt_iommu_driver_ops amd_hw_driver_ops_v1 = {
+const struct pt_iommu_driver_ops amd_hw_driver_ops_v1 = {
 	.get_top_lock = amd_iommu_get_top_lock,
 	.change_top = amd_iommu_change_top,
 };
 
-static const struct iommu_domain_ops amdv1_ops = {
+const struct iommu_domain_ops amdv1_ops = {
 	IOMMU_PT_DOMAIN_OPS(amdv1),
 	.iotlb_sync_map = amd_iommu_iotlb_sync_map,
 	.flush_iotlb_all = amd_iommu_flush_iotlb_all,
@@ -2746,8 +2746,7 @@ static const struct iommu_dirty_ops amdv1_dirty_ops = {
 	.set_dirty_tracking = amd_iommu_set_dirty_tracking,
 };
 
-static struct iommu_domain *amd_iommu_domain_alloc_paging_v1(struct device *dev,
-							     u32 flags)
+struct iommu_domain *amd_iommu_domain_alloc_paging_v1(struct device *dev, u32 flags)
 {
 	struct pt_iommu_amdv1_cfg cfg = {};
 	struct protection_domain *domain;
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index 9e6eb2f977ec..63360eef6b0d 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -131,6 +131,37 @@ static int __init viommu_vf_vfcntl_init(struct amd_iommu *iommu)
 	return -ENOMEM;
 }
 
+static int viommu_private_space_init(struct amd_iommu *iommu)
+{
+	struct iommu_domain *dom;
+	struct protection_domain *pdom;
+	struct pt_iommu_amdv1_hw_info pt_info;
+
+	/*
+	 * Setup page table root pointer, Guest MMIO and
+	 * Cmdbuf Dirty Status regions.
+	 */
+	dom = amd_iommu_domain_alloc_paging_v1(&iommu->dev->dev, 0);
+	if (!dom) {
+		pr_err("%s: Failed to initialize private space\n", __func__);
+		goto err_out;
+	}
+
+	pdom = to_pdomain(dom);
+	iommu->viommu_pdom = pdom;
+
+	pt_iommu_amdv1_hw_info(&pdom->amdv1, &pt_info);
+	pr_debug("%s: devid=%#x, pte_root=%#llx\n",
+		 __func__, iommu->devid,
+		 (unsigned long long)pt_info.host_pt_root);
+
+	return 0;
+err_out:
+	if (dom)
+		amd_iommu_domain_free(dom);
+	return -ENOMEM;
+}
+
 /*
  * Returns VF MMIO BAR offset for the give guest ID which will be
  * mapped to guest vIOMMU 3rd 4K MMIO address
@@ -160,5 +191,9 @@ int __init amd_viommu_init(struct amd_iommu *iommu)
 
 	amd_viommu_gid_ida_init(iommu);
 
+	ret = viommu_private_space_init(iommu);
+	if (ret)
+		return ret;
+
 	return 0;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 10/26] iommu/amd: Assign IOMMU Private Address domain to IOMMU
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (8 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 09/26] iommu/amd: Introduce domain for IOMMU Private Address (IPA) region Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-06-01 12:59   ` Jason Gunthorpe
  2026-05-28  5:17 ` [PATCH v2 11/26] iommu/amd: Allocate and map vIOMMU private regions Suravee Suthikulpanit
                   ` (16 subsequent siblings)
  26 siblings, 1 reply; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Assign the IOMMU Private Address (IPA) domain to the AMD IOMMU by setting
the domain ID, page table mode, and v1 page table root pointer in the
Device Table Entry (DTE) indexed by the IOMMU's own device ID.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/viommu.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
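
As a reading aid, the DTE programming done by set_iommu_dte() can be
modelled in user space. This is only a sketch: the sk_ names are
hypothetical, and the bit positions (V = bit 0, TV = bit 1, mode in bits
11:9, host page table root in bits 51:12, IR = bit 61, IW = bit 62,
domain ID in bits 15:0 of the second quadword) are assumptions about the
existing DTE_* definitions, which this patch does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed DTE bit layout; not taken from this patch. */
#define SK_DTE_FLAG_V		(1ULL << 0)
#define SK_DTE_FLAG_TV		(1ULL << 1)
#define SK_DTE_MODE_SHIFT	9
#define SK_DTE_MODE_MASK	0x7ULL
#define SK_DTE_HOST_TRP_SHIFT	12
#define SK_DTE_HOST_TRP_MASK	(((1ULL << 40) - 1) << SK_DTE_HOST_TRP_SHIFT)
#define SK_DTE_FLAG_IR		(1ULL << 61)
#define SK_DTE_FLAG_IW		(1ULL << 62)
#define SK_DTE_DOMID_MASK	0xFFFFULL

/* Build the first DTE quadword: root pointer, paging mode, permissions. */
uint64_t sk_build_dte0(uint64_t host_pt_root, unsigned int mode)
{
	uint64_t dte0;

	/* The field holds the root address without its low 12 bits. */
	dte0 = ((host_pt_root >> 12) << SK_DTE_HOST_TRP_SHIFT) &
	       SK_DTE_HOST_TRP_MASK;
	dte0 |= ((uint64_t)mode & SK_DTE_MODE_MASK) << SK_DTE_MODE_SHIFT;
	dte0 |= SK_DTE_FLAG_IR | SK_DTE_FLAG_IW | SK_DTE_FLAG_V | SK_DTE_FLAG_TV;
	return dte0;
}

/* Replace only the domain ID field of the second DTE quadword. */
uint64_t sk_build_dte1(uint64_t old_dte1, uint16_t domid)
{
	return (old_dte1 & ~SK_DTE_DOMID_MASK) | domid;
}
```

In the patch itself, data[1] is stored before data[0], so the quadword
carrying the valid bits is written last, and the update is followed by a
DTE flush and a completion wait.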

diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index 63360eef6b0d..14426649074f 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -173,6 +173,35 @@ u64 amd_viommu_get_vfmmio_addr(struct amd_iommu *iommu, u16 gid)
 }
 EXPORT_SYMBOL(amd_viommu_get_vfmmio_addr);
 
+/* Set DTE for IOMMU device */
+static void set_iommu_dte(struct amd_iommu *iommu)
+{
+	u64 dte0, dte1;
+	u16 devid = iommu->devid;
+	struct pt_iommu_amdv1_hw_info pt_info;
+	struct protection_domain *pdom = iommu->viommu_pdom;
+	struct dev_table_entry *dev_table = get_dev_table(iommu);
+
+	pt_iommu_amdv1_hw_info(&pdom->amdv1, &pt_info);
+
+	pr_debug("%s: host_pt_root=%#llx, mode=%#x\n",
+		 __func__, pt_info.host_pt_root, pt_info.mode);
+
+	dte0 = FIELD_PREP(DTE_HOST_TRP, pt_info.host_pt_root >> 12);
+	dte0 |= (pt_info.mode & DEV_ENTRY_MODE_MASK) << DEV_ENTRY_MODE_SHIFT;
+	dte0 |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
+
+	dte1 = dev_table[devid].data[1];
+	dte1 &= ~DTE_DOMID_MASK;
+	dte1 |= pdom->id;
+
+	dev_table[devid].data[1] = dte1;
+	dev_table[devid].data[0] = dte0;
+
+	iommu_flush_dte(iommu, devid);
+	amd_iommu_completion_wait(iommu);
+}
+
 int __init amd_viommu_init(struct amd_iommu *iommu)
 {
 	int ret;
@@ -195,5 +224,7 @@ int __init amd_viommu_init(struct amd_iommu *iommu)
 	if (ret)
 		return ret;
 
+	set_iommu_dte(iommu);
+
 	return 0;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 11/26] iommu/amd: Allocate and map vIOMMU private regions
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (9 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 10/26] iommu/amd: Assign IOMMU Private Address domain to IOMMU Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-06-01 13:05   ` Jason Gunthorpe
  2026-05-28  5:17 ` [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers Suravee Suthikulpanit
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

The AMD IOMMU Private Address (IPA) region is allocated and mapped during
IOMMU driver initialization. According to the specification, 8MB is needed.
Since the hardware does not require the IPA region to be physically
contiguous, split it into four 2MB subregions to match hugepage
granularity, and map each subregion in the v1 page table of each IOMMU.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       |  3 ++
 drivers/iommu/amd/amd_iommu_types.h |  8 +++
 drivers/iommu/amd/iommu.c           | 16 ++++++
 drivers/iommu/amd/viommu.c          | 79 +++++++++++++++++++++++++++--
 4 files changed, 101 insertions(+), 5 deletions(-)
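
The subregion layout can be checked with a few lines of user-space C.
This is a minimal sketch: the three VIOMMU_PRIV_* values are copied from
this patch, while the sk_ helpers and the 4KB page size (a shift of 12)
are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined by this patch. */
#define VIOMMU_PRIV_REGION_BASE		0ULL
#define VIOMMU_PRIV_SUBREGION_CNT	4
#define VIOMMU_PRIV_SUBREGION_SIZE	0x200000ULL	/* 2MB */

/* Assumed 4KB base page size. */
#define SK_PAGE_SHIFT			12

/* IPA base of subregion i, matching the loop in viommu_private_space_init(). */
uint64_t sk_priv_subregion_base(unsigned int i)
{
	return VIOMMU_PRIV_REGION_BASE +
	       (uint64_t)i * VIOMMU_PRIV_SUBREGION_SIZE;
}

/* Total size covered by all subregions. */
uint64_t sk_priv_region_total(void)
{
	return (uint64_t)VIOMMU_PRIV_SUBREGION_CNT * VIOMMU_PRIV_SUBREGION_SIZE;
}

/* Page count passed to set_memory_uc()/set_memory_wb() per subregion. */
size_t sk_priv_subregion_pages(void)
{
	return (size_t)(VIOMMU_PRIV_SUBREGION_SIZE >> SK_PAGE_SHIFT);
}
```

With these values the four subregions start at IPA 0x0, 0x200000, 0x400000
and 0x600000, and together cover the 8MB the commit message asks for.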

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 1aa79a26a127..279f458becda 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -105,6 +105,9 @@ void amd_iommu_domain_flush_pages(struct protection_domain *domain,
 void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
 				     ioasid_t pasid, u64 address, size_t size);
 
+int amd_iommu_flush_private_vm_region(struct amd_iommu *iommu, struct protection_domain *pdom,
+				      u64 address, size_t size);
+
 #ifdef CONFIG_IRQ_REMAP
 int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
 #else
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 02d359f09148..a5e2f32590d1 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -424,6 +424,13 @@
 /* For vIOMMU, the GID is 16-bit. */
 #define VIOMMU_MAX_GID		0xFFFF
 
+/*
+ * Total IOMMU private region is 8MB (4 x 2MB-subregion)
+ */
+#define VIOMMU_PRIV_REGION_BASE		(0)
+#define VIOMMU_PRIV_SUBREGION_CNT	(4)
+#define VIOMMU_PRIV_SUBREGION_SIZE	(0x200000)  /* 2MB */
+
 /* Timeout stuff */
 #define LOOP_TIMEOUT		100000
 #define MMIO_STATUS_TIMEOUT	2000000
@@ -817,6 +824,7 @@ struct amd_iommu {
 
 	/* HW vIOMMU support */
 	struct protection_domain *viommu_pdom;
+	void *viommu_priv_region[VIOMMU_PRIV_SUBREGION_CNT];
 };
 
 static inline struct amd_iommu *dev_to_amd_iommu(struct device *dev)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index d89664eba898..8b441f68bc47 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1808,6 +1808,22 @@ static int domain_flush_pages_v1(struct protection_domain *pdom,
 	return ret;
 }
 
+int amd_iommu_flush_private_vm_region(struct amd_iommu *iommu, struct protection_domain *pdom,
+				      u64 address, size_t size)
+{
+	int ret;
+	struct iommu_cmd cmd;
+
+	build_inv_iommu_pages(&cmd, address, size, pdom->id, 0, false);
+
+	ret = iommu_queue_command(iommu, &cmd);
+	if (ret)
+		return ret;
+
+	amd_iommu_completion_wait(iommu);
+	return ret;
+}
+
 /*
  * TLB invalidation function which is called from the mapping functions.
  * It flushes range of PTEs of the domain.
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index 14426649074f..90ed2eb92aeb 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -131,8 +131,66 @@ static int __init viommu_vf_vfcntl_init(struct amd_iommu *iommu)
 	return -ENOMEM;
 }
 
+static void *alloc_private_subregion(struct amd_iommu *iommu, u64 base, size_t size)
+{
+	int ret;
+	void *region;
+	int nid = iommu && iommu->dev ? dev_to_node(&iommu->dev->dev) : NUMA_NO_NODE;
+
+	region = (void *)iommu_alloc_pages_node_sz(nid, GFP_KERNEL | __GFP_ZERO, size);
+	if (!region)
+		return NULL;
+
+	ret = set_memory_uc((unsigned long)region, size >> PAGE_SHIFT);
+	if (ret)
+		goto err_out;
+
+	ret = iommu_map(&iommu->viommu_pdom->domain, base,
+			iommu_virt_to_phys(region), size,
+			IOMMU_PROT_IR | IOMMU_PROT_IW, GFP_KERNEL);
+
+	if (ret)
+		goto cleanup_mem_attr;
+
+	pr_debug("%s: base=%#llx, size=%#lx, subregion=%#llx(%#llx)\n",
+		 __func__, base, size, (unsigned long long)region, iommu_virt_to_phys(region));
+
+	amd_iommu_flush_private_vm_region(iommu, iommu->viommu_pdom, base, size);
+
+	return region;
+cleanup_mem_attr:
+	set_memory_wb((unsigned long)region, size >> PAGE_SHIFT);
+err_out:
+	iommu_free_pages(region);
+	return NULL;
+}
+
+static void viommu_private_space_uninit(struct amd_iommu *iommu)
+{
+	int i;
+	struct iommu_domain *dom;
+
+	if (!iommu->viommu_pdom)
+		return;
+
+	for (i = 0; i < VIOMMU_PRIV_SUBREGION_CNT; i++) {
+		if (!iommu->viommu_priv_region[i])
+			continue;
+		set_memory_wb((unsigned long)iommu->viommu_priv_region[i],
+			      VIOMMU_PRIV_SUBREGION_SIZE >> PAGE_SHIFT);
+		iommu_free_pages(iommu->viommu_priv_region[i]);
+		iommu->viommu_priv_region[i] = NULL;
+	}
+
+	dom = &iommu->viommu_pdom->domain;
+	amd_iommu_domain_free(dom);
+	iommu->viommu_pdom = NULL;
+}
+
 static int viommu_private_space_init(struct amd_iommu *iommu)
 {
+	int i;
+	u64 base;
 	struct iommu_domain *dom;
 	struct protection_domain *pdom;
 	struct pt_iommu_amdv1_hw_info pt_info;
@@ -144,22 +202,33 @@ static int viommu_private_space_init(struct amd_iommu *iommu)
 	dom = amd_iommu_domain_alloc_paging_v1(&iommu->dev->dev, 0);
 	if (!dom) {
 		pr_err("%s: Failed to initialize private space\n", __func__);
-		goto err_out;
+		return -ENOMEM;
 	}
 
 	pdom = to_pdomain(dom);
 	iommu->viommu_pdom = pdom;
 
+	/*
+	 * Each private region requires 8MB of memory to be allocated
+	 * and mapped. Split the region into 4 x 2MB subregions.
+	 */
+	for (i = 0; i < VIOMMU_PRIV_SUBREGION_CNT; i++) {
+		base = VIOMMU_PRIV_REGION_BASE + (i * VIOMMU_PRIV_SUBREGION_SIZE);
+		iommu->viommu_priv_region[i] = alloc_private_subregion(iommu, base,
+								       VIOMMU_PRIV_SUBREGION_SIZE);
+		if (!iommu->viommu_priv_region[i]) {
+			pr_err("%s: Failed to allocate vIOMMU private subregion %d\n", __func__, i);
+			viommu_private_space_uninit(iommu);
+			return -ENOMEM;
+		}
+	}
+
 	pt_iommu_amdv1_hw_info(&pdom->amdv1, &pt_info);
 	pr_debug("%s: devid=%#x, pte_root=%#llx\n",
 		 __func__, iommu->devid,
 		 (unsigned long long)pt_info.host_pt_root);
 
 	return 0;
-err_out:
-	if (dom)
-		amd_iommu_domain_free(dom);
-	return -ENOMEM;
 }
 
 /*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (10 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 11/26] iommu/amd: Allocate and map vIOMMU private regions Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-30 20:44   ` Weinan Liu
                     ` (3 more replies)
  2026-05-28  5:17 ` [PATCH v2 13/26] iommu/amd: Add helper functions to manage DevID / DomID mapping tables Suravee Suthikulpanit
                   ` (14 subsequent siblings)
  26 siblings, 4 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Guest device ID and guest domain ID tables use dedicated slots in
the vIOMMU private address (IPA) region, indexed by guest ID (GID).

Add alloc_private_vm_region() and free_private_vm_region() to
allocate backing pages, map them through viommu_pdom, flush the
private VM mapping, and tear down on VM destroy.

Export amd_iommu_iotlb_sync() and use it for nested domain
iotlb_sync so nested attach paths can flush gathered IOTLB state.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

---
 drivers/iommu/amd/amd_iommu.h |  2 ++
 drivers/iommu/amd/iommu.c     |  4 +--
 drivers/iommu/amd/nested.c    |  1 +
 drivers/iommu/amd/viommu.c    | 55 +++++++++++++++++++++++++++++++++++
 4 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 279f458becda..1d8727c16840 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -105,6 +105,8 @@ void amd_iommu_domain_flush_pages(struct protection_domain *domain,
 void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
 				     ioasid_t pasid, u64 address, size_t size);
 
+void amd_iommu_iotlb_sync(struct iommu_domain *domain,
+			  struct iommu_iotlb_gather *gather);
 int amd_iommu_flush_private_vm_region(struct amd_iommu *iommu, struct protection_domain *pdom,
 				      u64 address, size_t size);
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 8b441f68bc47..e0433e65cfa5 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2729,8 +2729,8 @@ static void amd_iommu_flush_iotlb_all(struct iommu_domain *domain)
 	spin_unlock_irqrestore(&dom->lock, flags);
 }
 
-static void amd_iommu_iotlb_sync(struct iommu_domain *domain,
-				 struct iommu_iotlb_gather *gather)
+void amd_iommu_iotlb_sync(struct iommu_domain *domain,
+			  struct iommu_iotlb_gather *gather)
 {
 	struct protection_domain *dom = to_pdomain(domain);
 	unsigned long flags;
diff --git a/drivers/iommu/amd/nested.c b/drivers/iommu/amd/nested.c
index 5b902598e68a..15dc57cf7c5f 100644
--- a/drivers/iommu/amd/nested.c
+++ b/drivers/iommu/amd/nested.c
@@ -291,4 +291,5 @@ static void nested_domain_free(struct iommu_domain *dom)
 static const struct iommu_domain_ops nested_domain_ops = {
 	.attach_dev = nested_attach_device,
 	.free = nested_domain_free,
+	.iotlb_sync = amd_iommu_iotlb_sync,
 };
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index 90ed2eb92aeb..6dcb02b12a28 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -297,3 +297,58 @@ int __init amd_viommu_init(struct amd_iommu *iommu)
 
 	return 0;
 }
+
+static int __maybe_unused alloc_private_vm_region(struct amd_iommu *iommu, u64 **entry,
+						 u64 base, size_t size, u16 gid)
+{
+	int ret;
+	u64 addr = base + (gid * size);
+	int nid = iommu && iommu->dev ? dev_to_node(&iommu->dev->dev) : NUMA_NO_NODE;
+
+	*entry = (void *)iommu_alloc_pages_node_sz(nid, GFP_KERNEL | __GFP_ZERO, size);
+	if (!*entry)
+		return -ENOMEM;
+
+	ret = set_memory_uc((unsigned long)*entry, size >> PAGE_SHIFT);
+	if (ret)
+		goto err_out;
+
+	pr_debug("%s: entry=%#llx(%#llx), addr=%#llx, size=%#lx\n", __func__,
+		 (unsigned long long)*entry, iommu_virt_to_phys(*entry), addr, size);
+
+	ret = iommu_map(&iommu->viommu_pdom->domain, addr,
+			iommu_virt_to_phys(*entry), size,
+			IOMMU_PROT_IR | IOMMU_PROT_IW, GFP_KERNEL);
+	if (ret)
+		goto cleanup_mem_attr;
+
+	return amd_iommu_flush_private_vm_region(iommu, iommu->viommu_pdom, addr, size);
+cleanup_mem_attr:
+	set_memory_wb((unsigned long)*entry, size >> PAGE_SHIFT);
+err_out:
+	iommu_free_pages(*entry);
+	*entry = NULL;
+	return ret;
+}
+
+static void __maybe_unused free_private_vm_region(struct amd_iommu *iommu, u64 **entry,
+						  u64 base, size_t size, u16 gid)
+{
+	size_t unmapped;
+	u64 addr = base + (gid * size);
+
+	pr_debug("%s: entry=%#llx(%#llx), base=%#llx, addr=%#llx, size=%#lx\n",
+		 __func__, (unsigned long long)*entry,
+		 iommu_virt_to_phys(*entry), base, addr, size);
+
+	if (!iommu || !iommu->viommu_pdom)
+		return;
+
+	unmapped = iommu_unmap(&iommu->viommu_pdom->domain, addr, size);
+	if (unmapped != size)
+		pr_warn("%s: unmapped %#zx of %#lx at %#llx\n", __func__, unmapped, size, addr);
+
+	set_memory_wb((unsigned long)*entry, size >> PAGE_SHIFT);
+	iommu_free_pages(*entry);
+	*entry = NULL;
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 13/26] iommu/amd: Add helper functions to manage DevID / DomID mapping tables
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (11 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-30 21:26   ` Weinan Liu
  2026-05-28  5:17 ` [PATCH v2 14/26] iommu/amd: Introduce IOMMUFD vDevice support for AMD Suravee Suthikulpanit
                   ` (13 subsequent siblings)
  26 siblings, 1 reply; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Introduce amd_viommu_init_one() and amd_viommu_uninit_one(), which are
called when an IOMMUFD vIOMMU object is initialized and destroyed.
Currently, they manage the IPA mappings for the Device ID and Domain ID
mapping tables.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  3 ++
 drivers/iommu/amd/amd_viommu.h      | 13 ++++++++
 drivers/iommu/amd/iommu.c           |  1 +
 drivers/iommu/amd/iommufd.c         |  5 +++
 drivers/iommu/amd/viommu.c          | 52 +++++++++++++++++++++++++++++
 5 files changed, 74 insertions(+)
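
The per-guest slot addresses follow from addr = base + gid * size, the
expression used by alloc_private_vm_region(). The sketch below models
that arithmetic in user space. The base and entry-size values are copied
from this patch; the SK_/sk_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined by this patch (names shortened for the sketch). */
#define SK_DEVID_MAPPING_BASE		0x1000000000ULL
#define SK_DEVID_MAPPING_ENTRY_SIZE	(1ULL << 20)	/* 1MB per guest */
#define SK_DOMID_MAPPING_BASE		0x2000000000ULL
#define SK_DOMID_MAPPING_ENTRY_SIZE	(1ULL << 19)	/* 512KB per guest */
#define SK_MAX_GID			0xFFFF

/* IPA of the slot owned by guest 'gid' in a table starting at 'base'. */
uint64_t sk_slot_addr(uint64_t base, uint64_t size, uint16_t gid)
{
	return base + (uint64_t)gid * size;
}

uint64_t sk_devid_slot(uint16_t gid)
{
	return sk_slot_addr(SK_DEVID_MAPPING_BASE,
			    SK_DEVID_MAPPING_ENTRY_SIZE, gid);
}

uint64_t sk_domid_slot(uint16_t gid)
{
	return sk_slot_addr(SK_DOMID_MAPPING_BASE,
			    SK_DOMID_MAPPING_ENTRY_SIZE, gid);
}

/* First address past the slot of the highest guest ID. */
uint64_t sk_devid_window_end(void)
{
	return sk_devid_slot(SK_MAX_GID) + SK_DEVID_MAPPING_ENTRY_SIZE;
}

uint64_t sk_domid_window_end(void)
{
	return sk_domid_slot(SK_MAX_GID) + SK_DOMID_MAPPING_ENTRY_SIZE;
}
```

With a 16-bit GID the device ID window spans 64GB and ends exactly where
the domain ID window begins (0x2000000000), so the two tables are
adjacent but do not overlap.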

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index a5e2f32590d1..63410028bae3 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -549,6 +549,9 @@ struct amd_iommu_viommu {
 	 */
 	struct xarray gdomid_array;
 
+	u64 *devid_table;
+	u64 *domid_table;
+
 	/* Offset for mmap() of guest VF MMIO; set after iommufd_viommu_alloc_mmap(). */
 	unsigned long vfmmio_mmap_offset;
 };
diff --git a/drivers/iommu/amd/amd_viommu.h b/drivers/iommu/amd/amd_viommu.h
index 447692b9101c..8b57717c22a6 100644
--- a/drivers/iommu/amd/amd_viommu.h
+++ b/drivers/iommu/amd/amd_viommu.h
@@ -14,6 +14,10 @@ void __init amd_viommu_uninit(struct amd_iommu *iommu);
 
 u64 amd_viommu_get_vfmmio_addr(struct amd_iommu *iommu, u16 gid);
 
+int amd_viommu_init_one(struct amd_iommu *iommu, struct amd_iommu_viommu *viommu);
+
+void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iommu_viommu *viommu);
+
 #else
 
 static inline int amd_viommu_init(struct amd_iommu *iommu)
@@ -30,6 +34,15 @@ static inline u64 amd_viommu_get_vfmmio_addr(struct amd_iommu *iommu, u16 gid)
 	return 0;
 }
 
+static inline int amd_viommu_init_one(struct amd_iommu *iommu, struct amd_iommu_viommu *viommu)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iommu_viommu *viommu)
+{
+}
+
 #endif /* CONFIG_AMD_IOMMU_IOMMUFD */
 
 #endif /* AMD_VIOMMU_H */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e0433e65cfa5..44d20e598d85 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -43,6 +43,7 @@
 #include <linux/generic_pt/iommu.h>
 
 #include "amd_iommu.h"
+#include "amd_viommu.h"
 #include "iommufd.h"
 #include "../irq_remapping.h"
 #include "../iommu-pages.h"
diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index 42307ae71b24..efa9e1f49550 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -83,6 +83,10 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 	/* Reset vIOMMU MMIOs to initialize the vIOMMU */
 	iommu_reset_vmmio(iommu, aviommu->gid);
 
+	ret = amd_viommu_init_one(iommu, aviommu);
+	if (ret)
+		goto err_init;
+
 	ret = iommu_copy_struct_to_user(user_data, &data,
 					IOMMU_VIOMMU_TYPE_AMD,
 					reserved);
@@ -120,6 +124,7 @@ static void amd_iommufd_viommu_destroy(struct iommufd_viommu *viommu)
 	if (aviommu->vfmmio_mmap_offset)
 		iommufd_viommu_destroy_mmap(&aviommu->core, aviommu->vfmmio_mmap_offset);
 	amd_iommu_gid_free(iommu, aviommu->gid);
+	amd_viommu_uninit_one(iommu, aviommu);
 }
 
 /*
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index 6dcb02b12a28..3636093732ce 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -26,6 +26,20 @@
 #include "amd_viommu.h"
 #include "../iommu-pages.h"
 
+/*
+ * Guest Device ID Mapping Table
+ */
+#define VIOMMU_MAX_GDEVID	0xFFFF
+#define VIOMMU_DEVID_MAPPING_BASE	0x1000000000ULL
+#define VIOMMU_DEVID_MAPPING_ENTRY_SIZE	(1 << 20)
+
+/*
+ * Guest Domain ID Mapping Table
+ */
+#define VIOMMU_MAX_GDOMID	0xFFFF
+#define VIOMMU_DOMID_MAPPING_BASE	0x2000000000ULL
+#define VIOMMU_DOMID_MAPPING_ENTRY_SIZE	(1 << 19)
+
 LIST_HEAD(viommu_devid_map);
 
 static int viommu_init_pci_vsc(struct amd_iommu *iommu)
@@ -352,3 +366,41 @@ static void __maybe_unused free_private_vm_region(struct amd_iommu *iommu, u64 *
 	iommu_free_pages(*entry);
 	*entry = NULL;
 }
+
+void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iommu_viommu *aviommu)
+{
+	pr_debug("%s: gid=%u\n", __func__, aviommu->gid);
+
+	free_private_vm_region(iommu, &aviommu->devid_table,
+			       VIOMMU_DEVID_MAPPING_BASE,
+			       VIOMMU_DEVID_MAPPING_ENTRY_SIZE,
+			       aviommu->gid);
+	free_private_vm_region(iommu, &aviommu->domid_table,
+			       VIOMMU_DOMID_MAPPING_BASE,
+			       VIOMMU_DOMID_MAPPING_ENTRY_SIZE,
+			       aviommu->gid);
+}
+
+int amd_viommu_init_one(struct amd_iommu *iommu, struct amd_iommu_viommu *viommu)
+{
+	int ret;
+
+	ret = alloc_private_vm_region(iommu, &viommu->devid_table,
+				      VIOMMU_DEVID_MAPPING_BASE,
+				      VIOMMU_DEVID_MAPPING_ENTRY_SIZE,
+				      viommu->gid);
+	if (ret)
+		goto err_out;
+
+	ret = alloc_private_vm_region(iommu, &viommu->domid_table,
+				      VIOMMU_DOMID_MAPPING_BASE,
+				      VIOMMU_DOMID_MAPPING_ENTRY_SIZE,
+				      viommu->gid);
+	if (ret)
+		goto err_out;
+
+	return 0;
+err_out:
+	amd_viommu_uninit_one(iommu, viommu);
+	return -ENOMEM;
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 14/26] iommu/amd: Introduce IOMMUFD vDevice support for AMD
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (12 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 13/26] iommu/amd: Add helper functions to manage DevID / DomID mapping tables Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 15/26] iommu/amd: Introduce helper function for updating domain ID mapping table Suravee Suthikulpanit
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Initialize vDevice for AMD vIOMMU by setting up the Device ID Mapping
table using the guest device ID.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h | 12 +++++++++++
 drivers/iommu/amd/iommufd.c         | 33 +++++++++++++++++++++++++++++
 drivers/iommu/amd/nested.c          | 12 +++++++++++
 3 files changed, 57 insertions(+)
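
The three vIOMMU fields that set_dte_nested() ORs into DTE data[3] can
be sketched in user space as follows. The bit positions come from the
DTE_VIOMMU_* definitions added by this patch; the sk_ helper names are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as added to amd_iommu_types.h by this patch. */
#define SK_DTE_VIOMMU_EN_SHIFT		15
#define SK_DTE_VIOMMU_GDEVICEID_SHIFT	16	/* bits 31:16 */
#define SK_DTE_VIOMMU_GUESTID_SHIFT	32	/* bits 47:32 */

/* Compose the vImuEn, GDeviceID and GuestID fields of DTE data[3]. */
uint64_t sk_dte3_viommu_fields(uint16_t gid, uint16_t gdevid)
{
	uint64_t val = 1ULL << SK_DTE_VIOMMU_EN_SHIFT;

	val |= (uint64_t)gdevid << SK_DTE_VIOMMU_GDEVICEID_SHIFT;
	val |= (uint64_t)gid << SK_DTE_VIOMMU_GUESTID_SHIFT;
	return val;
}

/* Extract the guest device ID again, e.g. when decoding a dumped DTE. */
uint16_t sk_dte3_gdevid(uint64_t dte3)
{
	return (uint16_t)(dte3 >> SK_DTE_VIOMMU_GDEVICEID_SHIFT);
}

/* Extract the guest ID again. */
uint16_t sk_dte3_gid(uint64_t dte3)
{
	return (uint16_t)(dte3 >> SK_DTE_VIOMMU_GUESTID_SHIFT);
}
```

For example, guest ID 1 with guest device ID 0x8 yields 0x100088000,
which can help when reading DTE dumps while debugging.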

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 63410028bae3..836d5cd08134 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -388,6 +388,11 @@
 #define DTE_GPT_LEVEL_SHIFT	54
 #define DTE_GPT_LEVEL_MASK	GENMASK_ULL(55, 54)
 
+/* vIOMMU bit fields */
+#define DTE_VIOMMU_EN_SHIFT		15
+#define DTE_VIOMMU_GDEVICEID_MASK	GENMASK_ULL(31, 16)
+#define DTE_VIOMMU_GUESTID_MASK		GENMASK_ULL(47, 32)
+
 #define GCR3_VALID		0x01ULL
 
 /* DTE[128:179] | DTE[184:191] */
@@ -894,6 +899,9 @@ struct iommu_dev_data {
 	bool defer_attach;
 
 	struct ratelimit_state rs;        /* Ratelimit IOPF messages */
+
+	u16 gid;			/* Guest ID */
+	u16 gDevId;			/* Guest Device ID */
 };
 
 /* Map HPET and IOAPIC ids to the devid used by the IOMMU */
@@ -1121,6 +1129,10 @@ struct amd_irte_ops {
 	void (*clear_allocated)(struct irq_remap_table *, int);
 };
 
+struct amd_iommu_vdevice {
+	struct iommufd_vdevice core;
+};
+
 #ifdef CONFIG_IRQ_REMAP
 extern struct amd_irte_ops irte_32_ops;
 extern struct amd_irte_ops irte_128_ops;
diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index efa9e1f49550..eb43d02371a8 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -9,6 +9,7 @@
 #include "amd_iommu.h"
 #include "amd_viommu.h"
 #include "amd_iommu_types.h"
+#include "../iommufd/iommufd_private.h"
 
 static const struct iommufd_viommu_ops amd_viommu_ops;
 
@@ -127,6 +128,36 @@ static void amd_iommufd_viommu_destroy(struct iommufd_viommu *viommu)
 	amd_viommu_uninit_one(iommu, aviommu);
 }
 
+/*
+ * Called from drivers/iommu/iommufd/viommu.c: iommufd_vdevice_alloc_ioctl()
+ */
+static int _amd_viommu_vdevice_init(struct iommufd_vdevice *vdev)
+{
+	struct iommu_dev_data *dev_data;
+	struct pci_dev *pdev = to_pci_dev(vdev->idev->dev);
+	struct iommufd_viommu *viommu = vdev->viommu;
+	struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
+
+	if (!pdev) {
+		pr_err("%s: not a PCI device\n", __func__);
+		return -EINVAL;
+	}
+
+	dev_data = dev_iommu_priv_get(&pdev->dev);
+	if (!dev_data) {
+		pr_err("%s: Device not found (devid=%#x)\n",
+		       __func__, pci_dev_id(pdev));
+		return -EINVAL;
+	}
+
+	dev_data->gid = aviommu->gid;
+	dev_data->gDevId = vdev->virt_id;
+	pr_debug("%s: gid=%#x, hdev_id=%#x, gdev_id=%#x\n", __func__,
+			 dev_data->gid, pci_dev_id(pdev), dev_data->gDevId);
+
+	return 0;
+}
+
 /*
  * See include/linux/iommufd.h
  * struct iommufd_viommu_ops - vIOMMU specific operations
@@ -134,4 +165,6 @@ static void amd_iommufd_viommu_destroy(struct iommufd_viommu *viommu)
 static const struct iommufd_viommu_ops amd_viommu_ops = {
 	.alloc_domain_nested = amd_iommu_alloc_domain_nested,
 	.destroy = amd_iommufd_viommu_destroy,
+	.vdevice_size = VDEVICE_STRUCT_SIZE(struct amd_iommu_vdevice, core),
+	.vdevice_init = _amd_viommu_vdevice_init,
 };
diff --git a/drivers/iommu/amd/nested.c b/drivers/iommu/amd/nested.c
index 15dc57cf7c5f..08f74ebb3523 100644
--- a/drivers/iommu/amd/nested.c
+++ b/drivers/iommu/amd/nested.c
@@ -227,6 +227,18 @@ static void set_dte_nested(struct amd_iommu *iommu, struct iommu_domain *dom,
 
 	/* Guest paging mode */
 	new->data[2] |= gdte->dte[2] & DTE_GPT_LEVEL_MASK;
+
+	/* vImuEn */
+	new->data[3] |= 1ULL << DTE_VIOMMU_EN_SHIFT;
+
+	/* GDeviceID */
+	new->data[3] |= FIELD_PREP(DTE_VIOMMU_GDEVICEID_MASK,
+				   dev_data->gDevId);
+
+	/* GuestID */
+	new->data[3] |= FIELD_PREP(DTE_VIOMMU_GUESTID_MASK,
+				   dev_data->gid);
+
 }
 
 static int nested_attach_device(struct iommu_domain *dom, struct device *dev,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 15/26] iommu/amd: Introduce helper function for updating domain ID mapping table
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (13 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 14/26] iommu/amd: Introduce IOMMUFD vDevice support for AMD Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 16/26] iommu/amd: Introduce helper function for updating device " Suravee Suthikulpanit
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

AMD vIOMMU hardware uses the Domain ID mapping table to map a Guest Domain
ID (GDomID) to a Host Domain ID when it virtualises guest IOMMU commands.
It indexes the table with the GID and GDomID to look up the host domain ID.

The Linux IOMMU driver programs table entries through the VFCntlMMIO Guest
Domain Map Control Register.

Introduce amd_viommu_domain_id_update(), which sets an entry when a nested
domain is attached to a device. Entries are cleared during VM destroy.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_viommu.h |  2 ++
 drivers/iommu/amd/nested.c     |  5 ++++
 drivers/iommu/amd/viommu.c     | 46 ++++++++++++++++++++++++++++++++++
 3 files changed, 53 insertions(+)
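
The value written to the Guest Domain Map Control Register can be
modelled without hardware. The sketch below mirrors the field layout
that this patch defines through the DOMID_ENTRY_* macros (GDomID in bits
61:46, host domain ID in bits 29:14, valid in bit 0, write in bit 63).
The sk_ function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout as defined by the DOMID_ENTRY_* macros in this patch. */
#define SK_DOMID_ENTRY_GDOMID_SHIFT	46		/* bits 61:46 */
#define SK_DOMID_ENTRY_HDOMID_SHIFT	14		/* bits 29:14 */
#define SK_DOMID_ENTRY_VALID		(1ULL << 0)
#define SK_DOMID_ENTRY_WRITE		(1ULL << 63)

/* Compose the 64-bit value that maps gdom_id to hdom_id. */
uint64_t sk_domid_map_val(uint16_t hdom_id, uint16_t gdom_id)
{
	return ((uint64_t)gdom_id << SK_DOMID_ENTRY_GDOMID_SHIFT) |
	       ((uint64_t)hdom_id << SK_DOMID_ENTRY_HDOMID_SHIFT) |
	       SK_DOMID_ENTRY_WRITE | SK_DOMID_ENTRY_VALID;
}

/* Decode helpers, useful when checking a traced register write. */
uint16_t sk_domid_map_gdomid(uint64_t val)
{
	return (uint16_t)(val >> SK_DOMID_ENTRY_GDOMID_SHIFT);
}

uint16_t sk_domid_map_hdomid(uint64_t val)
{
	return (uint16_t)(val >> SK_DOMID_ENTRY_HDOMID_SHIFT);
}
```

Mapping guest domain 0x2 to host domain 0x10 gives 0x8000800000040001.
Note that viommu_clear_mapping() issues one such write per guest domain
ID, that is 65536 register writes per call.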

diff --git a/drivers/iommu/amd/amd_viommu.h b/drivers/iommu/amd/amd_viommu.h
index 8b57717c22a6..b6fd5ffc3b82 100644
--- a/drivers/iommu/amd/amd_viommu.h
+++ b/drivers/iommu/amd/amd_viommu.h
@@ -18,6 +18,8 @@ int amd_viommu_init_one(struct amd_iommu *iommu, struct amd_iommu_viommu *viommu
 
 void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iommu_viommu *viommu);
 
+int amd_viommu_domain_id_update(struct amd_iommu *iommu, u16 gid,
+				u16 hdom_id, u16 gdom_id);
 #else
 
 static inline int amd_viommu_init(struct amd_iommu *iommu)
diff --git a/drivers/iommu/amd/nested.c b/drivers/iommu/amd/nested.c
index 08f74ebb3523..ce16c8404c28 100644
--- a/drivers/iommu/amd/nested.c
+++ b/drivers/iommu/amd/nested.c
@@ -10,6 +10,7 @@
 #include <uapi/linux/iommufd.h>
 
 #include "amd_iommu.h"
+#include "amd_viommu.h"
 
 static const struct iommu_domain_ops nested_domain_ops;
 
@@ -245,6 +246,7 @@ static int nested_attach_device(struct iommu_domain *dom, struct device *dev,
 				struct iommu_domain *old)
 {
 	struct dev_table_entry new = {0};
+	struct nested_domain *ndom = to_ndomain(dom);
 	struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
 	struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
 	int ret = 0;
@@ -262,6 +264,9 @@ static int nested_attach_device(struct iommu_domain *dom, struct device *dev,
 
 	amd_iommu_update_dte(iommu, dev_data, &new);
 
+	ret = amd_viommu_domain_id_update(iommu, dev_data->gid,
+					  ndom->gdom_info->hdom_id, ndom->gdom_id);
+
 	mutex_unlock(&dev_data->mutex);
 
 	return ret;
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index 3636093732ce..3f7c17d400d0 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -40,6 +40,8 @@
 #define VIOMMU_DOMID_MAPPING_BASE	0x2000000000ULL
 #define VIOMMU_DOMID_MAPPING_ENTRY_SIZE	(1 << 19)
 
+#define VIOMMU_VFCTRL_GUEST_DID_MAP_CONTROL1_OFFSET	0x08
+
 LIST_HEAD(viommu_devid_map);
 
 static int viommu_init_pci_vsc(struct amd_iommu *iommu)
@@ -367,6 +369,22 @@ static void __maybe_unused free_private_vm_region(struct amd_iommu *iommu, u64 *
 	*entry = NULL;
 }
 
+static void viommu_clear_mapping(struct amd_iommu *iommu,
+				 struct amd_iommu_viommu *aviommu)
+{
+	int i;
+	u16 gid = aviommu->gid;
+
+	/*
+	 * IOMMU hardware uses the domain ID mapping table to map a gdom ID to an hdom ID.
+	 * If a mapping does not exist, the hardware logs an error in the event log.
+	 * Therefore, initialize every gdom ID entry to map to the parent domain ID so
+	 * that no lookup hits an unknown mapping.
+	 */
+	for (i = 0; i <= VIOMMU_MAX_GDOMID; i++)
+		amd_viommu_domain_id_update(iommu, gid, aviommu->parent->id, i);
+}
+
 void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iommu_viommu *aviommu)
 {
 	pr_debug("%s: gid=%u\n", __func__, aviommu->gid);
@@ -379,6 +397,7 @@ void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iommu_viommu *avi
 			       VIOMMU_DOMID_MAPPING_BASE,
 			       VIOMMU_DOMID_MAPPING_ENTRY_SIZE,
 			       aviommu->gid);
+	viommu_clear_mapping(iommu, aviommu);
 }
 
 int amd_viommu_init_one(struct amd_iommu *iommu, struct amd_iommu_viommu *viommu)
@@ -399,8 +418,35 @@ int amd_viommu_init_one(struct amd_iommu *iommu, struct amd_iommu_viommu *viommu
 	if (ret)
 		goto err_out;
 
+	viommu_clear_mapping(iommu, viommu);
+
 	return 0;
 err_out:
 	amd_viommu_uninit_one(iommu, viommu);
 	return -ENOMEM;
 }
+
+/*
+ * Program the DomID via VFCTRL registers
+ * This function will be called during VM init via VFIO.
+ */
+
+#define DOMID_ENTRY_GDOMID_MASK	GENMASK_ULL(61, 46)
+#define DOMID_ENTRY_HDOMID_MASK	GENMASK_ULL(29, 14)
+#define DOMID_ENTRY_VALID		BIT_ULL(0)
+#define DOMID_ENTRY_WRITE		BIT_ULL(63)
+
+int amd_viommu_domain_id_update(struct amd_iommu *iommu, u16 gid,
+				u16 hdom_id, u16 gdom_id)
+{
+	u64 val;
+	u8 __iomem *vfctrl = VIOMMU_VFCTRL_MMIO_BASE(iommu, gid);
+
+	val = FIELD_PREP(DOMID_ENTRY_GDOMID_MASK, gdom_id) |
+	      FIELD_PREP(DOMID_ENTRY_HDOMID_MASK, hdom_id) |
+	      DOMID_ENTRY_WRITE | DOMID_ENTRY_VALID;
+
+	writeq(val, vfctrl + VIOMMU_VFCTRL_GUEST_DID_MAP_CONTROL1_OFFSET);
+	return 0;
+}
+EXPORT_SYMBOL(amd_viommu_domain_id_update);
-- 
2.34.1


* [PATCH v2 16/26] iommu/amd: Introduce helper function for updating device ID mapping table
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (14 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 15/26] iommu/amd: Introduce helper function for updating domain ID mapping table Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 17/26] iommu/amd: Pass KVM FD from userspace when initializing vIOMMU Suravee Suthikulpanit
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

AMD vIOMMU hardware uses the Device ID mapping table to map a Guest Device
ID (GDevID) to a Host Device ID when it virtualizes guest IOMMU commands.
It uses the GID and GDevID to index into the table and look up the host
device ID.

The Linux IOMMU driver programs each table entry through the VFCntlMMIO
Guest Device Map Control Register.

Introduce amd_viommu_set_device_mapping(), which sets an entry when the
IOMMUFD vDevice is initialized, and clear_device_mapping(), which clears
an entry.  Entries are cleared during VM destroy.
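
For readers who want to check the entry layout outside the kernel, here is
a minimal user-space sketch of the 64-bit value that the patch writes to
the Guest Device Map Control register.  The bit positions are taken from
the DEVID_ENTRY_* masks in this patch; the helper names
(devid_entry_pack() and friends) are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions mirror the DEVID_ENTRY_* masks in this patch. */
#define DEVID_ENTRY_VALID		(1ULL << 0)
#define DEVID_ENTRY_HDEVID_SHIFT	14	/* GENMASK_ULL(29, 14) */
#define DEVID_ENTRY_GDEVID_SHIFT	46	/* GENMASK_ULL(61, 46) */
#define DEVID_ENTRY_WRITE		(1ULL << 63)

/* Hypothetical helper: compose the value written for one mapping. */
static uint64_t devid_entry_pack(uint16_t gdev_id, uint16_t hdev_id)
{
	return ((uint64_t)gdev_id << DEVID_ENTRY_GDEVID_SHIFT) |
	       ((uint64_t)hdev_id << DEVID_ENTRY_HDEVID_SHIFT) |
	       DEVID_ENTRY_WRITE | DEVID_ENTRY_VALID;
}

/* Hypothetical helper: extract bits 61:46 (guest device ID). */
static uint16_t devid_entry_gdevid(uint64_t val)
{
	return (uint16_t)(val >> DEVID_ENTRY_GDEVID_SHIFT);
}

/* Hypothetical helper: extract bits 29:14 (host device ID). */
static uint16_t devid_entry_hdevid(uint64_t val)
{
	return (uint16_t)(val >> DEVID_ENTRY_HDEVID_SHIFT);
}

/* Round-trip check: both IDs survive packing and unpacking. */
static void devid_entry_selfcheck(void)
{
	uint64_t val = devid_entry_pack(0x0010, 0x0300);

	assert(devid_entry_gdevid(val) == 0x0010);
	assert(devid_entry_hdevid(val) == 0x0300);
}
```

With this layout, clearing an entry as clear_device_mapping() does is the
same write with the host device ID field set to 0.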

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_viommu.h |  8 ++++++
 drivers/iommu/amd/iommufd.c    |  3 ++
 drivers/iommu/amd/viommu.c     | 52 ++++++++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+)

diff --git a/drivers/iommu/amd/amd_viommu.h b/drivers/iommu/amd/amd_viommu.h
index b6fd5ffc3b82..3a8f41baaab9 100644
--- a/drivers/iommu/amd/amd_viommu.h
+++ b/drivers/iommu/amd/amd_viommu.h
@@ -20,6 +20,9 @@ void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iommu_viommu *vio
 
 int amd_viommu_domain_id_update(struct amd_iommu *iommu, u16 gid,
 				u16 hdom_id, u16 gdom_id);
+
+void amd_viommu_set_device_mapping(struct amd_iommu *iommu, u16 hDevId,
+				   u16 guestId, u16 gDevId);
 #else
 
 static inline int amd_viommu_init(struct amd_iommu *iommu)
@@ -45,6 +48,11 @@ static inline void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iom
 {
 }
 
+static inline void amd_viommu_set_device_mapping(struct amd_iommu *iommu, u16 hDevId,
+						 u16 guestId, u16 gDevId)
+{
+}
+
 #endif /* CONFIG_AMD_IOMMU_IOMMUFD */
 
 #endif /* AMD_VIOMMU_H */
diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index eb43d02371a8..6020f2caf445 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -137,6 +137,7 @@ static int _amd_viommu_vdevice_init(struct iommufd_vdevice *vdev)
 	struct pci_dev *pdev = to_pci_dev(vdev->idev->dev);
 	struct iommufd_viommu *viommu = vdev->viommu;
 	struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
+	struct amd_iommu *iommu = container_of(viommu->iommu_dev, struct amd_iommu, iommu);
 
 	if (!pdev) {
 		pr_err("%s: not a PCI device\n", __func__);
@@ -155,6 +156,8 @@ static int _amd_viommu_vdevice_init(struct iommufd_vdevice *vdev)
 	pr_debug("%s: gid=%#x, hdev_id=%#x, gdev_id=%#x\n", __func__,
 			 dev_data->gid, pci_dev_id(pdev), dev_data->gDevId);
 
+	amd_viommu_set_device_mapping(iommu, pci_dev_id(pdev), dev_data->gid, dev_data->gDevId);
+
 	return 0;
 }
 
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index 3f7c17d400d0..b3150f7bcec3 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -40,6 +40,7 @@
 #define VIOMMU_DOMID_MAPPING_BASE	0x2000000000ULL
 #define VIOMMU_DOMID_MAPPING_ENTRY_SIZE	(1 << 19)
 
+#define VIOMMU_VFCTRL_GUEST_DID_MAP_CONTROL0_OFFSET	0x00
 #define VIOMMU_VFCTRL_GUEST_DID_MAP_CONTROL1_OFFSET	0x08
 
 LIST_HEAD(viommu_devid_map);
@@ -369,6 +370,53 @@ static void __maybe_unused free_private_vm_region(struct amd_iommu *iommu, u64 *
 	*entry = NULL;
 }
 
+#define DEVID_ENTRY_GDEVID_MASK		GENMASK_ULL(61, 46)
+#define DEVID_ENTRY_HDEVID_MASK		GENMASK_ULL(29, 14)
+#define DEVID_ENTRY_WRITE		BIT_ULL(63)
+#define DEVID_ENTRY_VALID		BIT_ULL(0)
+
+/*
+ * Program the DevID via VFCTRL registers
+ * This function will be called during VM init via VFIO.
+ */
+void amd_viommu_set_device_mapping(struct amd_iommu *iommu, u16 hDevId,
+				   u16 guestId, u16 gDevId)
+{
+	u64 val;
+	u8 __iomem *vfctrl;
+
+	pr_debug("%s: iommu_devid=%#x, gid=%#x, hDevId=%#x, gDevId=%#x\n",
+		__func__, pci_dev_id(iommu->dev), guestId, hDevId, gDevId);
+
+	val = FIELD_PREP(DEVID_ENTRY_GDEVID_MASK, gDevId) |
+	      FIELD_PREP(DEVID_ENTRY_HDEVID_MASK, hDevId) |
+	      DEVID_ENTRY_WRITE | DEVID_ENTRY_VALID;
+
+	vfctrl = VIOMMU_VFCTRL_MMIO_BASE(iommu, guestId);
+
+	writeq(val, vfctrl + VIOMMU_VFCTRL_GUEST_DID_MAP_CONTROL0_OFFSET);
+}
+
+/*
+ * Clear the DevID via VFCTRL registers
+ * This function will be called during VM destroy via VFIO.
+ */
+static void clear_device_mapping(struct amd_iommu *iommu, u16 guestId, u16 gDevId)
+{
+	u64 val;
+	u8 __iomem *vfctrl;
+
+	/*
+	 * Clear the DevID in VFCTRL registers
+	 */
+	val = FIELD_PREP(DEVID_ENTRY_GDEVID_MASK, gDevId) |
+	      FIELD_PREP(DEVID_ENTRY_HDEVID_MASK, 0) |
+	      DEVID_ENTRY_WRITE | DEVID_ENTRY_VALID;
+
+	vfctrl = VIOMMU_VFCTRL_MMIO_BASE(iommu, guestId);
+	writeq(val, vfctrl + VIOMMU_VFCTRL_GUEST_DID_MAP_CONTROL0_OFFSET);
+}
+
 static void viommu_clear_mapping(struct amd_iommu *iommu,
 				 struct amd_iommu_viommu *aviommu)
 {
@@ -383,6 +431,10 @@ static void viommu_clear_mapping(struct amd_iommu *iommu,
 	 */
 	for (i = 0; i <= VIOMMU_MAX_GDOMID; i++)
 		amd_viommu_domain_id_update(iommu, gid, aviommu->parent->id, i);
+
+	for (i = 0; i <= VIOMMU_MAX_GDEVID; i++)
+		clear_device_mapping(iommu, gid, i);
+
 }
 
 void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iommu_viommu *aviommu)
-- 
2.34.1


* [PATCH v2 17/26] iommu/amd: Pass KVM FD from userspace when initializing vIOMMU
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (15 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 16/26] iommu/amd: Introduce helper function for updating device " Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 18/26] iommu/amd: Add translation DTE and VFctrl TransDevID helpers Suravee Suthikulpanit
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

The AMD HW-vIOMMU feature requires the IOMMU driver to keep track of each
guest.  The driver uses this information to manage the following:

* GID: Each guest is assigned a unique 16-bit Guest ID (GID), which is
used to index into various data structures for configuring the hardware.

* Translation-devid: Each guest has one GPA->SPA mapping, which requires
one trans_devid to program the v1 page table in the DTE[trans_devid].

* Extended interrupt remapping: The AMD HW-vIOMMU uses AVIC to implement
extended interrupt remapping to virtualize Event/PPR log interrupts.
A KVM reference is needed to get the AVIC vm_id.

The KVM FD can be used for these purposes.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu_types.h |  2 ++
 drivers/iommu/amd/iommufd.c         | 27 ++++++++++++++++++++++++++-
 include/uapi/linux/iommufd.h        |  2 ++
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 836d5cd08134..e9b49f0b9051 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -547,6 +547,8 @@ struct amd_iommu_viommu {
 	struct protection_domain *parent; /* nest parent domain for this viommu */
 	struct list_head pdom_list;	  /* For protection_domain->viommu_list */
 	u16 gid;			  /* Guest ID for the vIOMMU */
+	u32 kvmfd;			  /* KVM fd from VMM */
+	void *kvm;			  /* Hold struct kvm pointer */
 
 	/*
 	 * Per-vIOMMU guest domain ID to host domain ID mapping.
diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index 6020f2caf445..7c7ef267088b 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -4,6 +4,8 @@
  */
 
 #include <linux/iommu.h>
+#include <linux/file.h>
+#include <linux/amd-iommu.h>
 
 #include "iommufd.h"
 #include "amd_iommu.h"
@@ -42,13 +44,27 @@ size_t amd_iommufd_get_viommu_size(struct device *dev, enum iommu_viommu_type vi
 	return VIOMMU_STRUCT_SIZE(struct amd_iommu_viommu, core);
 }
 
+static void *get_kvm_handler(u32 kvmfd)
+{
+	struct fd f;
+
+	f = fdget(kvmfd);
+
+	if (fd_empty(f)) {
+		pr_warn("%s: fdget failed\n", __func__);
+		return NULL;
+	}
+
+	return fd_file(f)->private_data;
+}
+
 int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *parent,
 			    const struct iommu_user_data *user_data)
 {
 	int ret;
 	phys_addr_t page_base;
 	unsigned long flags;
-	struct iommu_viommu_amd data;
+	struct iommu_viommu_amd data = {};
 	struct protection_domain *pdom = to_pdomain(parent);
 	struct amd_iommu *iommu = container_of(viommu->iommu_dev, struct amd_iommu, iommu);
 	struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
@@ -81,6 +97,13 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 
 	aviommu->vfmmio_mmap_offset = data.out_vfmmio_mmap_offset;
 
+	aviommu->kvm = get_kvm_handler(data.kvmfd);
+	if (aviommu->kvm == NULL) {
+		pr_err("%s: Failed to get KVM handler (kvmfd=%#x)\n", __func__, data.kvmfd);
+		ret = -EINVAL;
+		goto err_kvmfd;
+	}
+
 	/* Reset vIOMMU MMIOs to initialize the vIOMMU */
 	iommu_reset_vmmio(iommu, aviommu->gid);
 
@@ -94,6 +117,7 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 	if (ret)
 		goto err_init;
 
+	aviommu->kvmfd = data.kvmfd;
 	viommu->ops = &amd_viommu_ops;
 
 	spin_lock_irqsave(&pdom->lock, flags);
@@ -102,6 +126,7 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 
 	return 0;
 err_init:
+err_kvmfd:
 	iommufd_viommu_destroy_mmap(&aviommu->core, aviommu->vfmmio_mmap_offset);
 err_mmap:
 	amd_iommu_gid_free(iommu, aviommu->gid);
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 0b7a3e5b057c..69aa44bba95a 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -1085,10 +1085,12 @@ struct iommu_viommu_tegra241_cmdqv {
 /**
  * struct iommu_viommu_amd - AMD vIOMMU Interface (IOMMU_VIOMMU_TYPE_AMD)
  * @out_vfmmio_mmap_offset: (out) mmap offset for vIOMMU VF-MMIO
+ * @kvmfd: KVM FD handler
  * @reserved: Must be zero
  */
 struct iommu_viommu_amd {
 	__aligned_u64 out_vfmmio_mmap_offset;
+	__u32 kvmfd;
 	__u32 reserved; /* must be last */
 };
 
-- 
2.34.1


* [PATCH v2 18/26] iommu/amd: Add translation DTE and VFctrl TransDevID helpers
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (16 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 17/26] iommu/amd: Pass KVM FD from userspace when initializing vIOMMU Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-06-01 13:31   ` Jason Gunthorpe
  2026-05-28  5:17 ` [PATCH v2 19/26] iommu/amd: Add per-segment translate device ID pool Suravee Suthikulpanit
                   ` (8 subsequent siblings)
  26 siblings, 1 reply; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

The hardware vIOMMU uses a per-VM translate device ID (TransDevID) to
index the host device table when guest IOMMU traffic needs GPA->SPA
translation.  The VF Control guest miscellaneous register tells the
IOMMU which TransDevID to use; the corresponding device table entry
(DTE) points at the nested IOMMU v1 page table that performs the walk
from guest physical to system physical addresses.

Add amd_iommu_set_translate_dte() and amd_iommu_clear_translate_dte()
to install or clear that DTE for a given TransDevID slot, using the
AMDv1 page table root and mode from the protection domain.  Add
amd_iommu_update_vfctrl_mmio_translate_devid() to publish the
TransDevID in VFctrl guest-misc MMIO, and
VIOMMU_VFCTRL_GUEST_MISC_CONTROL_OFFSET for the register offset.
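
As a rough illustration of the register programming described above, the
user-space sketch below computes the value that
amd_iommu_update_vfctrl_mmio_translate_devid() writes and the byte offset
of the guest-misc register for a given GID.  The value encoding and the two
constants come from this patch; the assumption that guest N's VFctrl block
starts at N * VIOMMU_VFCTRL_MMIO_ENTRY_SIZE is mine, because
VIOMMU_VFCTRL_MMIO_BASE() is not defined in this patch.  Both helper names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIOMMU_VFCTRL_MMIO_ENTRY_SIZE		64	/* amd_iommu_types.h */
#define VIOMMU_VFCTRL_GUEST_MISC_CONTROL_OFFSET	0x10	/* added by this patch */

/* Hypothetical helper: the low 16 bits of devid land in bits 31:16. */
static uint64_t vfctrl_trans_devid_val(uint32_t devid)
{
	return ((uint64_t)devid & 0xFFFFULL) << 16;
}

/*
 * Hypothetical helper.  ASSUMPTION: each guest owns one 64-byte VFctrl
 * block and the blocks are laid out back to back, indexed by GID.
 */
static size_t vfctrl_guest_misc_offset(uint16_t gid)
{
	return (size_t)gid * VIOMMU_VFCTRL_MMIO_ENTRY_SIZE +
	       VIOMMU_VFCTRL_GUEST_MISC_CONTROL_OFFSET;
}
```

Note that the helper takes a u32 devid but only 16 bits reach the register,
so any upper bits are silently dropped.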

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu.h       |  7 ++++
 drivers/iommu/amd/amd_iommu_types.h |  1 +
 drivers/iommu/amd/iommu.c           | 51 +++++++++++++++++++++++++++++
 3 files changed, 59 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 1d8727c16840..d1640181b292 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -216,6 +216,13 @@ void amd_iommu_update_dte(struct amd_iommu *iommu,
 			  struct dev_table_entry *new);
 int amd_iommu_completion_wait(struct amd_iommu *iommu);
 
+void amd_iommu_set_translate_dte(struct amd_iommu *iommu, u16 gid,
+				 struct protection_domain *pdom,
+				 u32 devid);
+void amd_iommu_clear_translate_dte(struct amd_iommu *iommu, u16 gid, u32 devid);
+void amd_iommu_update_vfctrl_mmio_translate_devid(struct amd_iommu *iommu,
+						  u16 gid, u32 trans_devid);
+
 static inline void
 amd_iommu_make_clear_dte(struct iommu_dev_data *dev_data, struct dev_table_entry *new)
 {
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index e9b49f0b9051..2d7bc791dbd9 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -495,6 +495,7 @@ extern bool amdr_ivrs_remap_support;
 /* VIOMMU stuff */
 #define VIOMMU_VF_MMIO_ENTRY_SIZE		4096
 #define VIOMMU_VFCTRL_MMIO_ENTRY_SIZE		64
+#define VIOMMU_VFCTRL_GUEST_MISC_CONTROL_OFFSET	0x10
 
 /* Host ioremap/request_mem_region sizes for VF / VF_CNTL BARs */
 #define VIOMMU_VF_MMIO_MAP_SIZE		0x10000000UL
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 44d20e598d85..6c4c4f62ddde 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3242,6 +3242,57 @@ static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
 	return true;
 }
 
+#if IS_ENABLED(CONFIG_AMD_IOMMU_IOMMUFD)
+
+void amd_iommu_update_vfctrl_mmio_translate_devid(struct amd_iommu *iommu,
+						  u16 gid, u32 devid)
+{
+	writeq((devid & 0xFFFFULL) << 16,
+	       VIOMMU_VFCTRL_MMIO_BASE(iommu, gid) +
+	       VIOMMU_VFCTRL_GUEST_MISC_CONTROL_OFFSET);
+}
+
+void amd_iommu_set_translate_dte(struct amd_iommu *iommu, u16 gid,
+				 struct protection_domain *pdom,
+				 u32 devid)
+{
+	u64 tmp0 = 0ULL, tmp1 = 0ULL;
+	struct pt_iommu_amdv1_hw_info pt_info;
+	struct dev_table_entry *dev_table = get_dev_table(iommu);
+
+	pt_iommu_amdv1_hw_info(&pdom->amdv1, &pt_info);
+
+	pr_debug("%s: gid=%#x, iommu_devid=%#x, devid=%#x, host_pt_root=%#llx, mode=%#x\n",
+		 __func__, gid, iommu->devid, devid, pt_info.host_pt_root, pt_info.mode);
+
+	/* Setup DTE for v1 page table at the offset specified by devid */
+	tmp0 |=	FIELD_PREP(DTE_HOST_TRP, pt_info.host_pt_root >> 12);
+	tmp0 |= FIELD_PREP(DTE_MODE_MASK, pt_info.mode);
+	tmp0 |= (DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_TV | DTE_FLAG_V);
+	tmp1 |= FIELD_PREP(DTE_DOMID_MASK, pdom->id);
+
+	dev_table[devid].data[0] = tmp0;
+	dev_table[devid].data[1] = tmp1;
+
+	iommu_flush_dte(iommu, devid);
+	amd_iommu_completion_wait(iommu);
+}
+
+void amd_iommu_clear_translate_dte(struct amd_iommu *iommu, u16 gid, u32 devid)
+{
+	struct dev_table_entry *dev_table = get_dev_table(iommu);
+
+	pr_debug("%s: gid=%#x, iommu_devid=%#x, devid=%#x\n",
+		 __func__, gid, iommu->devid, devid);
+
+	dev_table[devid].data[0] = 0ULL;
+	dev_table[devid].data[1] = 0ULL;
+
+	iommu_flush_dte(iommu, devid);
+	amd_iommu_completion_wait(iommu);
+}
+#endif /* CONFIG_AMD_IOMMU_IOMMUFD */
+
 const struct iommu_ops amd_iommu_ops = {
 	.capable = amd_iommu_capable,
 	.hw_info = amd_iommufd_hw_info,
-- 
2.34.1


* [PATCH v2 19/26] iommu/amd: Add per-segment translate device ID pool
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (17 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 18/26] iommu/amd: Add translation DTE and VFctrl TransDevID helpers Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 20/26] iommu/amd: Reserve translate-device-id for PCI requestor aliases Suravee Suthikulpanit
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Track translate-device-id slots per PCI segment so real PCI device IDs can
be reserved for normal DTE programming and excluded from dynamic allocation
for vIOMMU translation DTEs.

Add amd_iommu_pci_seg_trans_devid_init/fini(), called during segment setup
and teardown, amd_iommu_trans_devid_reserve() for attach-time reservation,
and trans_devid.c, which implements the xarray-backed state machine.

Call the reserve hook from amd_iommu_attach_device() before programming
the DTE.
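
The reservation rules can be modelled in user space as below.  This is a
sketch only: a flat array stands in for the xarray, there is no locking,
and pool_alloc() uses a hypothetical lowest-free-id policy because
amd_iommu_trans_devid_alloc() is not part of this patch.  The state names
and the reserve semantics (FREE becomes RESERVED, RESERVED is idempotent,
ALLOCATED fails with -EBUSY) follow trans_devid.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

enum trans_devid_state {
	TRANS_DEVID_FREE = 0,
	TRANS_DEVID_RESERVED,
	TRANS_DEVID_ALLOCATED,
};

/* One state byte per 16-bit device ID; zero-initialized means all FREE. */
struct trans_devid_pool {
	uint8_t state[1 << 16];
};

/* Mirrors amd_iommu_trans_devid_reserve(): idempotent unless allocated. */
static int pool_reserve(struct trans_devid_pool *pool, uint16_t id)
{
	switch (pool->state[id]) {
	case TRANS_DEVID_ALLOCATED:
		return -EBUSY;
	case TRANS_DEVID_RESERVED:
		return 0;
	default:
		pool->state[id] = TRANS_DEVID_RESERVED;
		return 0;
	}
}

/* HYPOTHETICAL policy: hand out the lowest ID that is still FREE. */
static int pool_alloc(struct trans_devid_pool *pool)
{
	uint32_t id;

	for (id = 0; id < (1u << 16); id++) {
		if (pool->state[id] == TRANS_DEVID_FREE) {
			pool->state[id] = TRANS_DEVID_ALLOCATED;
			return (int)id;
		}
	}
	return -ENOSPC;
}
```

The -EBUSY case is the hot-plug clash described in the kernel-doc comment:
a device that shows up after its ID was already handed out for translation
cannot reserve it.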

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/Makefile          |  2 +-
 drivers/iommu/amd/amd_iommu.h       | 12 +++++
 drivers/iommu/amd/amd_iommu_types.h | 17 ++++++
 drivers/iommu/amd/init.c            |  3 ++
 drivers/iommu/amd/iommu.c           | 12 +++++
 drivers/iommu/amd/trans_devid.c     | 80 +++++++++++++++++++++++++++++
 6 files changed, 125 insertions(+), 1 deletion(-)
 create mode 100644 drivers/iommu/amd/trans_devid.c

diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index e1e824b9c7b0..12c3fe83e4ce 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-y += iommu.o init.o quirks.o ppr.o pasid.o
-obj-$(CONFIG_AMD_IOMMU_IOMMUFD) += iommufd.o nested.o viommu.o
+obj-$(CONFIG_AMD_IOMMU_IOMMUFD) += iommufd.o nested.o viommu.o trans_devid.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index d1640181b292..d411bc326241 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -216,6 +216,18 @@ void amd_iommu_update_dte(struct amd_iommu *iommu,
 			  struct dev_table_entry *new);
 int amd_iommu_completion_wait(struct amd_iommu *iommu);
 
+/* Per-segment translate-device-id pool (CONFIG_AMD_IOMMU_IOMMUFD) */
+#ifdef CONFIG_AMD_IOMMU_IOMMUFD
+void amd_iommu_pci_seg_trans_devid_init(struct amd_iommu_pci_seg *pci_seg);
+void amd_iommu_pci_seg_trans_devid_fini(struct amd_iommu_pci_seg *pci_seg);
+int amd_iommu_trans_devid_reserve(struct amd_iommu_pci_seg *pci_seg, u16 id);
+#else
+static inline void
+amd_iommu_pci_seg_trans_devid_init(struct amd_iommu_pci_seg *pci_seg) { }
+static inline void
+amd_iommu_pci_seg_trans_devid_fini(struct amd_iommu_pci_seg *pci_seg) { }
+#endif
+
 void amd_iommu_set_translate_dte(struct amd_iommu *iommu, u16 gid,
 				 struct protection_domain *pdom,
 				 u32 devid);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 2d7bc791dbd9..ffa338c8735f 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -615,6 +615,14 @@ PT_IOMMU_CHECK_DOMAIN(struct protection_domain, iommu, domain);
 PT_IOMMU_CHECK_DOMAIN(struct protection_domain, amdv1.iommu, domain);
 PT_IOMMU_CHECK_DOMAIN(struct protection_domain, amdv2.iommu, domain);
 
+#ifdef CONFIG_AMD_IOMMU_IOMMUFD
+enum trans_devid_state {
+	TRANS_DEVID_FREE = 0,
+	TRANS_DEVID_RESERVED,
+	TRANS_DEVID_ALLOCATED,
+};
+#endif
+
 /*
  * This structure contains information about one PCI segment in the system.
  */
@@ -676,6 +684,15 @@ struct amd_iommu_pci_seg {
 	 * parsing time.
 	 */
 	struct list_head unity_map;
+
+#ifdef CONFIG_AMD_IOMMU_IOMMUFD
+	/*
+	 * Per-segment translate-device-id allocation. The xarray is indexed by
+	 * the translate-device-id. The value is the state (enum trans_devid_state).
+	 */
+	struct mutex trans_devid_mutex;
+	struct xarray trans_devid_xa;
+#endif
 };
 
 /*
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 6e69b3dd8b1e..622bc0337eda 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1737,6 +1737,8 @@ static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id,
 	if (alloc_rlookup_table(pci_seg))
 		goto err_free_alias_table;
 
+	amd_iommu_pci_seg_trans_devid_init(pci_seg);
+
 	return pci_seg;
 
 err_free_alias_table:
@@ -1768,6 +1770,7 @@ static void __init free_pci_segments(void)
 
 	for_each_pci_segment_safe(pci_seg, next) {
 		list_del(&pci_seg->list);
+		amd_iommu_pci_seg_trans_devid_fini(pci_seg);
 		free_irq_lookup_table(pci_seg);
 		free_rlookup_table(pci_seg);
 		free_alias_table(pci_seg);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6c4c4f62ddde..2600af84c8ca 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3055,6 +3055,18 @@ static int amd_iommu_attach_device(struct iommu_domain *dom, struct device *dev,
 	if (dom->dirty_ops && !amd_iommu_hd_support(iommu))
 		return -EINVAL;
 
+#if IS_ENABLED(CONFIG_AMD_IOMMU_IOMMUFD)
+	/* Translate-device-id reservation must be done before setting up
+	 * the DTE for the device to make sure that the id has not been allocated
+	 * yet. (See amd_iommu_trans_devid_alloc().)
+	 */
+	ret = amd_iommu_trans_devid_reserve(iommu->pci_seg, dev_data->devid);
+	if (ret) {
+		pr_err("%s: Failed to reserve device id %#x\n", __func__, dev_data->devid);
+		return ret;
+	}
+#endif
+
 	if (dev_data->domain)
 		detach_device(dev);
 
diff --git a/drivers/iommu/amd/trans_devid.c b/drivers/iommu/amd/trans_devid.c
new file mode 100644
index 000000000000..f15cbaae9118
--- /dev/null
+++ b/drivers/iommu/amd/trans_devid.c
@@ -0,0 +1,80 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2025 Advanced Micro Devices, Inc.
+ *
+ * AMD vIOMMU translate-device-id pool per PCI segment.
+ */
+
+#include <linux/kernel.h>
+#include <linux/xarray.h>
+
+#include "amd_iommu.h"
+
+static inline enum trans_devid_state trans_devid_xa_get_state(void *entry)
+{
+	if (!entry)
+		return TRANS_DEVID_FREE;
+	if (WARN_ON_ONCE(!xa_is_value(entry)))
+		return TRANS_DEVID_FREE;
+	return (enum trans_devid_state)xa_to_value(entry);
+}
+
+static inline void *trans_devid_xa_mk_state(enum trans_devid_state s)
+{
+	return xa_mk_value((unsigned long)s);
+}
+
+void amd_iommu_pci_seg_trans_devid_init(struct amd_iommu_pci_seg *pci_seg)
+{
+	mutex_init(&pci_seg->trans_devid_mutex);
+	xa_init(&pci_seg->trans_devid_xa);
+}
+
+void amd_iommu_pci_seg_trans_devid_fini(struct amd_iommu_pci_seg *pci_seg)
+{
+	xa_destroy(&pci_seg->trans_devid_xa);
+}
+
+/**
+ * amd_iommu_trans_devid_reserve - occupy @id so it is never returned by alloc
+ *
+ * Reservation is done when attaching device to a domain (see amd_iommu_attach_device()).
+ *
+ * Note: Since PCI hot-plug devices are enumerated during runtime, they could clash
+ * with the translate-device-id allocation. In such case, amd_iommu_trans_devid_reserve()
+ * could fail with %-EBUSY. This can be avoided by reserving the hot-plug id range if it
+ * is known in advance.
+ *
+ * Return: 0 on success, %-EBUSY if @id is already allocated. A second reserve of
+ * an already-reserved @id succeeds.
+ */
+int amd_iommu_trans_devid_reserve(struct amd_iommu_pci_seg *pci_seg, u16 id)
+{
+	void *entry, *old;
+	int ret = 0;
+
+	mutex_lock(&pci_seg->trans_devid_mutex);
+	entry = xa_load(&pci_seg->trans_devid_xa, id);
+	switch (trans_devid_xa_get_state(entry)) {
+	case TRANS_DEVID_ALLOCATED:
+		ret = -EBUSY;
+		break;
+	case TRANS_DEVID_RESERVED:
+		break;
+	case TRANS_DEVID_FREE:
+		old = xa_store(&pci_seg->trans_devid_xa, id,
+			       trans_devid_xa_mk_state(TRANS_DEVID_RESERVED), GFP_KERNEL);
+		if (xa_is_err(old)) {
+			ret = xa_err(old);
+			break;
+		}
+		WARN_ON_ONCE(old);
+		break;
+	}
+	mutex_unlock(&pci_seg->trans_devid_mutex);
+
+	if (!ret)
+		pr_debug("%s: Reserved trans_devid %#x (seg %#x)\n", __func__, id,
+			 pci_seg->id);
+	return ret;
+}
-- 
2.34.1


* [PATCH v2 20/26] iommu/amd: Reserve translate-device-id for PCI requestor aliases
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (18 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 19/26] iommu/amd: Add per-segment translate device ID pool Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 21/26] iommu/amd: Map kvmfd to shared translate device ID for vIOMMU Suravee Suthikulpanit
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

The per-segment translate-device-id (trans_devid) pool hands out numeric
device-table indices used by the AMD vIOMMU / iommufd path (e.g. mapping a
KVM file descriptor to a shared trans_devid).  amd_iommu_attach_device()
already calls amd_iommu_trans_devid_reserve() for the struct device's own
PCI BDF so that id cannot later be returned by the allocator.

That is not sufficient on its own.  The AMD IOMMU driver programs identical
DMA translation device-table entries (DTEs) for every requestor ID that can
issue DMA on behalf of the same PCI function: the IVRS alias from
alias_table[] when it is not covered by the PCI DMA-alias walk (different
bus than the device), and every alias visited by pci_for_each_dma_alias().
Those alternate BDFs are not separate struct device attach targets, so they
never received a trans_devid reservation and could in principle collide
with a dynamically allocated trans_devid.

Introduce amd_iommu_trans_devid_reserve_pci_aliases() in trans_devid.c and
invoke it from amd_iommu_attach_device() immediately after the primary
amd_iommu_trans_devid_reserve() succeeds.  For PCI devices the helper
reserves the IVRS alias when it differs from the device BDF, then walks
pci_for_each_dma_alias() and reserves each alias BDF.  Repeated attach and
overlap with the primary BDF in the PCI walk are handled by the existing
idempotency of amd_iommu_trans_devid_reserve() (a second reserve of an
already-reserved id succeeds).
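
The control flow of the alias reservation can be sketched in user space as
follows.  All names here are hypothetical: for_each_alias() merely stands
in for pci_for_each_dma_alias(), relying on the same convention that the
walk stops at the first callback returning non-zero, and a byte array
replaces the per-segment xarray.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum { ID_FREE = 0, ID_RESERVED, ID_ALLOCATED };

/* Idempotent reserve: fails only if the ID was already allocated. */
static int reserve_id(uint8_t *state, uint16_t id)
{
	if (state[id] == ID_ALLOCATED)
		return -EBUSY;
	state[id] = ID_RESERVED;
	return 0;
}

typedef int (*alias_fn)(uint16_t alias, void *data);

/* Stand-in for pci_for_each_dma_alias(): stop on the first error. */
static int for_each_alias(const uint16_t *aliases, size_t n,
			  alias_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = fn(aliases[i], data);

		if (ret)
			return ret;
	}
	return 0;
}

/* Callback: reserve one alias in the state array passed as data. */
static int reserve_cb(uint16_t alias, void *data)
{
	return reserve_id(data, alias);
}

/* Reserve the IVRS alias if it differs, then every DMA alias. */
static int reserve_aliases(uint8_t *state, uint16_t devid,
			   uint16_t ivrs_alias,
			   const uint16_t *dma_aliases, size_t n)
{
	int ret;

	if (ivrs_alias != devid) {
		ret = reserve_id(state, ivrs_alias);
		if (ret)
			return ret;
	}
	return for_each_alias(dma_aliases, n, reserve_cb, state);
}
```

In this model, as in the patch, a failure part-way through the walk leaves
the earlier reservations in place; nothing is rolled back.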

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/amd_iommu.h   |  2 ++
 drivers/iommu/amd/iommu.c       |  7 +++++
 drivers/iommu/amd/trans_devid.c | 45 +++++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index d411bc326241..ddfc6329d235 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -221,6 +221,8 @@ int amd_iommu_completion_wait(struct amd_iommu *iommu);
 void amd_iommu_pci_seg_trans_devid_init(struct amd_iommu_pci_seg *pci_seg);
 void amd_iommu_pci_seg_trans_devid_fini(struct amd_iommu_pci_seg *pci_seg);
 int amd_iommu_trans_devid_reserve(struct amd_iommu_pci_seg *pci_seg, u16 id);
+int amd_iommu_trans_devid_reserve_pci_aliases(struct amd_iommu *iommu,
+					      struct device *dev);
 #else
 static inline void
 amd_iommu_pci_seg_trans_devid_init(struct amd_iommu_pci_seg *pci_seg) { }
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2600af84c8ca..2f9ca8f2d3c6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3065,6 +3065,13 @@ static int amd_iommu_attach_device(struct iommu_domain *dom, struct device *dev,
 		pr_err("%s: Failed to reserve device id %#x\n", __func__, dev_data->devid);
 		return ret;
 	}
+
+	ret = amd_iommu_trans_devid_reserve_pci_aliases(iommu, dev);
+	if (ret) {
+		pr_err("%s: Failed to reserve translate devid for alias of %#x (err %d)\n",
+		       __func__, dev_data->devid, ret);
+		return ret;
+	}
 #endif
 
 	if (dev_data->domain)
diff --git a/drivers/iommu/amd/trans_devid.c b/drivers/iommu/amd/trans_devid.c
index f15cbaae9118..76450d1735e2 100644
--- a/drivers/iommu/amd/trans_devid.c
+++ b/drivers/iommu/amd/trans_devid.c
@@ -6,6 +6,7 @@
  */
 
 #include <linux/kernel.h>
+#include <linux/pci.h>
 #include <linux/xarray.h>
 
 #include "amd_iommu.h"
@@ -78,3 +79,47 @@ int amd_iommu_trans_devid_reserve(struct amd_iommu_pci_seg *pci_seg, u16 id)
 			 pci_seg->id);
 	return ret;
 }
+
+static int reserve_trans_devid_each_dma_alias(struct pci_dev *pdev, u16 alias,
+					      void *data)
+{
+	struct amd_iommu_pci_seg *pci_seg = data;
+
+	(void)pdev;
+	return amd_iommu_trans_devid_reserve(pci_seg, alias);
+}
+
+/**
+ * amd_iommu_trans_devid_reserve_pci_aliases - reserve translate-device-ids for
+ * PCI DMA aliases and for the IVRS alias when it is not walked as a PCI DMA
+ * alias (different bus). Idempotent for repeated attach; see
+ * amd_iommu_trans_devid_reserve().
+ *
+ * Return: 0 on success or if @dev is not PCI; otherwise an errno from
+ * amd_iommu_trans_devid_reserve() or pci_for_each_dma_alias().
+ */
+int amd_iommu_trans_devid_reserve_pci_aliases(struct amd_iommu *iommu,
+					      struct device *dev)
+{
+	struct pci_dev *pdev;
+	struct amd_iommu_pci_seg *pci_seg;
+	u16 devid, ivrs_alias;
+	int ret;
+
+	if (!dev_is_pci(dev))
+		return 0;
+
+	pdev = to_pci_dev(dev);
+	pci_seg = iommu->pci_seg;
+	devid = pci_dev_id(pdev);
+
+	ivrs_alias = pci_seg->alias_table[devid];
+	if (ivrs_alias != devid) {
+		ret = amd_iommu_trans_devid_reserve(pci_seg, ivrs_alias);
+		if (ret)
+			return ret;
+	}
+
+	return pci_for_each_dma_alias(pdev, reserve_trans_devid_each_dma_alias,
+				      pci_seg);
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 21/26] iommu/amd: Map kvmfd to shared translate device ID for vIOMMU
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (19 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 20/26] iommu/amd: Reserve translate-device-id for PCI requestor aliases Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-06-01 13:35   ` Jason Gunthorpe
  2026-05-28  5:17 ` [PATCH v2 22/26] iommufd: Add hw_queue_init and split queue alloc paths Suravee Suthikulpanit
                   ` (5 subsequent siblings)
  26 siblings, 1 reply; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Add a per-PCI-segment xarray, indexed by kvmfd, of refcounted
struct amd_iommu_kvmfd_trans_entry so that all vIOMMUs for one VM share
one translate-device-id and GPA->SPA DTE.

Expose helper functions:
  * amd_iommu_get_trans_devid_by_kvmfd()
  * amd_iommu_free_trans_devid_by_kvmfd()
backed by the file-local trans_devid_alloc() and trans_devid_free(), and
extend segment init/fini for the kvmfd map.

Wire iommufd vIOMMU init/destroy to obtain the ID, program the translation
DTE and VFctrl TransDevID field, and tear down on error or uninit. Clear
translation state in amd_viommu_uninit_one().

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
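Note for readers (not part of the commit): the sharing scheme described
above can be modelled in plain userspace C. This is only a minimal sketch
under stated assumptions: every sketch_* name is hypothetical, fixed-size
arrays stand in for the xarrays, and locking is omitted. It shows the
top-down id scan and the per-kvmfd refcount, nothing more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; the patch itself uses xarrays and mutexes. */
#define SKETCH_ID_COUNT 65536	/* translate-device-ids are 16-bit */
#define SKETCH_MAX_VMS  8	/* arbitrary small table for the sketch */

struct sketch_vm_entry {
	bool     in_use;
	uint32_t kvmfd;
	unsigned refs;
	uint16_t trans_devid;
};

struct sketch_seg {
	bool id_taken[SKETCH_ID_COUNT];	/* reserved or allocated ids */
	struct sketch_vm_entry vms[SKETCH_MAX_VMS];
};

/* Mark an id as used by a real device so the allocator skips it. */
void sketch_reserve(struct sketch_seg *seg, uint16_t id)
{
	seg->id_taken[id] = true;
}

/* Scan from the highest id down; return the id, or -1 if none is free. */
int sketch_alloc_id(struct sketch_seg *seg)
{
	for (int id = SKETCH_ID_COUNT - 1; id >= 0; id--) {
		if (!seg->id_taken[id]) {
			seg->id_taken[id] = true;
			return id;
		}
	}
	return -1;
}

struct sketch_vm_entry *sketch_find(struct sketch_seg *seg, uint32_t kvmfd)
{
	for (int i = 0; i < SKETCH_MAX_VMS; i++)
		if (seg->vms[i].in_use && seg->vms[i].kvmfd == kvmfd)
			return &seg->vms[i];
	return NULL;
}

/* Look up or create the per-VM entry; 0 on success, -1 on failure. */
int sketch_get(struct sketch_seg *seg, uint32_t kvmfd, uint16_t *out)
{
	struct sketch_vm_entry *e = sketch_find(seg, kvmfd);
	int id;

	if (e) {			/* another vIOMMU of the same VM */
		e->refs++;
		*out = e->trans_devid;
		return 0;
	}
	for (int i = 0; i < SKETCH_MAX_VMS; i++) {
		if (seg->vms[i].in_use)
			continue;
		id = sketch_alloc_id(seg);
		if (id < 0)
			return -1;
		seg->vms[i] = (struct sketch_vm_entry){
			.in_use = true, .kvmfd = kvmfd,
			.refs = 1, .trans_devid = (uint16_t)id,
		};
		*out = (uint16_t)id;
		return 0;
	}
	return -1;			/* VM table full */
}

/* Drop one reference; the id returns to the pool on the last put. */
void sketch_put(struct sketch_seg *seg, uint32_t kvmfd)
{
	struct sketch_vm_entry *e = sketch_find(seg, kvmfd);

	if (!e)
		return;
	assert(e->refs > 0);
	if (--e->refs == 0) {
		seg->id_taken[e->trans_devid] = false;
		e->in_use = false;
	}
}
```

In this model, two sketch_get() calls with the same kvmfd return the same
id, and the id only becomes free again after the matching second
sketch_put().
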
 drivers/iommu/amd/amd_iommu.h       |   4 +
 drivers/iommu/amd/amd_iommu_types.h |  13 ++
 drivers/iommu/amd/iommu.c           |   2 +-
 drivers/iommu/amd/iommufd.c         |  18 ++-
 drivers/iommu/amd/trans_devid.c     | 194 +++++++++++++++++++++++++++-
 drivers/iommu/amd/viommu.c          |   3 +
 6 files changed, 230 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index ddfc6329d235..cf2b051948a3 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -223,6 +223,10 @@ void amd_iommu_pci_seg_trans_devid_fini(struct amd_iommu_pci_seg *pci_seg);
 int amd_iommu_trans_devid_reserve(struct amd_iommu_pci_seg *pci_seg, u16 id);
 int amd_iommu_trans_devid_reserve_pci_aliases(struct amd_iommu *iommu,
 					      struct device *dev);
+int amd_iommu_get_trans_devid_by_kvmfd(struct amd_iommu_pci_seg *pci_seg,
+				       u32 kvmfd, u16 *trans_devid);
+void amd_iommu_free_trans_devid_by_kvmfd(struct amd_iommu_pci_seg *pci_seg,
+					 u32 kvmfd);
 #else
 static inline void
 amd_iommu_pci_seg_trans_devid_init(struct amd_iommu_pci_seg *pci_seg) { }
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index ffa338c8735f..92c83c68b2b1 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -559,6 +559,7 @@ struct amd_iommu_viommu {
 
 	u64 *devid_table;
 	u64 *domid_table;
+	u16 trans_devid;
 
 	/* Offset for mmap() of guest VF MMIO; set after iommufd_viommu_alloc_mmap(). */
 	unsigned long vfmmio_mmap_offset;
@@ -621,6 +622,11 @@ enum trans_devid_state {
 	TRANS_DEVID_RESERVED,
 	TRANS_DEVID_ALLOCATED,
 };
+
+struct amd_iommu_kvmfd_trans_entry {
+	refcount_t refs;
+	u16 trans_devid;
+};
 #endif
 
 /*
@@ -692,6 +698,13 @@ struct amd_iommu_pci_seg {
 	 */
 	struct mutex trans_devid_mutex;
 	struct xarray trans_devid_xa;
+
+	/*
+	 * Per-segment kvmfd mapping. The xarray is indexed by the kvmfd.
+	 * Values are struct amd_iommu_kvmfd_trans_entry * or xa_mk_value(trans_devid).
+	 */
+	struct mutex kvmfd_xa_mutex;
+	struct xarray kvmfd_xa;
 #endif
 };
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2f9ca8f2d3c6..2530c8e5490c 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3058,7 +3058,7 @@ static int amd_iommu_attach_device(struct iommu_domain *dom, struct device *dev,
 #if IS_ENABLED(CONFIG_AMD_IOMMU_IOMMUFD)
 	/* Translate-device-id reservation must be done before setting up
 	 * the DTE for the device to make sure that the id has not been allocated
-	 * yet. (See amd_iommu_trans_devid_alloc().)
+	 * yet. (See trans_devid_alloc() in trans_devid.c.)
 	 */
 	ret = amd_iommu_trans_devid_reserve(iommu->pci_seg, dev_data->devid);
 	if (ret) {
diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index 7c7ef267088b..23a2c8c20365 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -64,6 +64,7 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 	int ret;
 	phys_addr_t page_base;
 	unsigned long flags;
+	u16 trans_devid;
 	struct iommu_viommu_amd data = {};
 	struct protection_domain *pdom = to_pdomain(parent);
 	struct amd_iommu *iommu = container_of(viommu->iommu_dev, struct amd_iommu, iommu);
@@ -104,9 +105,17 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 		goto err_kvmfd;
 	}
 
+	ret = amd_iommu_get_trans_devid_by_kvmfd(iommu->pci_seg, data.kvmfd,
+						 &trans_devid);
+	if (ret)
+		goto err_kvmfd;
+
 	/* Reset vIOMMU MMIOs to initialize the vIOMMU */
 	iommu_reset_vmmio(iommu, aviommu->gid);
 
+	amd_iommu_set_translate_dte(iommu, aviommu->gid, pdom, trans_devid);
+	amd_iommu_update_vfctrl_mmio_translate_devid(iommu, aviommu->gid, trans_devid);
+
 	ret = amd_viommu_init_one(iommu, aviommu);
 	if (ret)
 		goto err_init;
@@ -117,6 +126,7 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 	if (ret)
 		goto err_init;
 
+	aviommu->trans_devid = trans_devid;
 	aviommu->kvmfd = data.kvmfd;
 	viommu->ops = &amd_viommu_ops;
 
@@ -126,6 +136,9 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
 
 	return 0;
 err_init:
+	amd_iommu_update_vfctrl_mmio_translate_devid(iommu, aviommu->gid, 0);
+	amd_iommu_clear_translate_dte(iommu, aviommu->gid, trans_devid);
+	amd_iommu_free_trans_devid_by_kvmfd(iommu->pci_seg, data.kvmfd);
 err_kvmfd:
 	iommufd_viommu_destroy_mmap(&aviommu->core, aviommu->vfmmio_mmap_offset);
 err_mmap:
@@ -147,10 +160,11 @@ static void amd_iommufd_viommu_destroy(struct iommufd_viommu *viommu)
 	list_del(&aviommu->pdom_list);
 	spin_unlock_irqrestore(&pdom->lock, flags);
 	xa_destroy(&aviommu->gdomid_array);
-	if (aviommu->vfmmio_mmap_offset)
-		iommufd_viommu_destroy_mmap(&aviommu->core, aviommu->vfmmio_mmap_offset);
 	amd_iommu_gid_free(iommu, aviommu->gid);
 	amd_viommu_uninit_one(iommu, aviommu);
+	if (aviommu->vfmmio_mmap_offset)
+		iommufd_viommu_destroy_mmap(&aviommu->core, aviommu->vfmmio_mmap_offset);
+	amd_iommu_free_trans_devid_by_kvmfd(iommu->pci_seg, aviommu->kvmfd);
 }
 
 /*
diff --git a/drivers/iommu/amd/trans_devid.c b/drivers/iommu/amd/trans_devid.c
index 76450d1735e2..712e0aabcd94 100644
--- a/drivers/iommu/amd/trans_devid.c
+++ b/drivers/iommu/amd/trans_devid.c
@@ -2,15 +2,22 @@
 /*
  * Copyright (C) 2025 Advanced Micro Devices, Inc.
  *
- * AMD vIOMMU translate-device-id pool per PCI segment.
+ * AMD vIOMMU translate-device-id management.
+ *
+ * The id must be allocated from an unused range. It is used to program the vIOMMU VF Control
+ * register to specify the DTE that holds the GPA->SPA mapping (v1 page table).
  */
 
 #include <linux/kernel.h>
 #include <linux/pci.h>
+#include <linux/refcount.h>
+#include <linux/slab.h>
 #include <linux/xarray.h>
 
 #include "amd_iommu.h"
 
+static void trans_devid_free(struct amd_iommu_pci_seg *pci_seg, u16 id);
+
 static inline enum trans_devid_state trans_devid_xa_get_state(void *entry)
 {
 	if (!entry)
@@ -29,10 +36,32 @@ void amd_iommu_pci_seg_trans_devid_init(struct amd_iommu_pci_seg *pci_seg)
 {
 	mutex_init(&pci_seg->trans_devid_mutex);
 	xa_init(&pci_seg->trans_devid_xa);
+	mutex_init(&pci_seg->kvmfd_xa_mutex);
+	xa_init(&pci_seg->kvmfd_xa);
 }
 
 void amd_iommu_pci_seg_trans_devid_fini(struct amd_iommu_pci_seg *pci_seg)
 {
+	unsigned long index = 0;
+	void *e;
+
+	while ((e = xa_find(&pci_seg->kvmfd_xa, &index, ULONG_MAX, XA_PRESENT))) {
+		unsigned long cur = index;
+
+		if (xa_is_value(e))
+			trans_devid_free(pci_seg, (u16)xa_to_value(e));
+		else {
+			struct amd_iommu_kvmfd_trans_entry *entry = e;
+
+			trans_devid_free(pci_seg, entry->trans_devid);
+			kfree(entry);
+		}
+		xa_erase(&pci_seg->kvmfd_xa, cur);
+		if (cur == ULONG_MAX)
+			break;
+		index = cur + 1;
+	}
+	xa_destroy(&pci_seg->kvmfd_xa);
 	xa_destroy(&pci_seg->trans_devid_xa);
 }
 
@@ -123,3 +152,166 @@ int amd_iommu_trans_devid_reserve_pci_aliases(struct amd_iommu *iommu,
 	return pci_for_each_dma_alias(pdev, reserve_trans_devid_each_dma_alias,
 				      pci_seg);
 }
+
+/**
+ * trans_devid_alloc - allocate a translate-device-id for @pci_seg
+ *
+ * The trans_devid is allocated from the highest id down to the lowest id.
+ * PCI devices are generally enumerated from the beginning of the bus range,
+ * so ids in the high range are unlikely to be in use.
+ *
+ * Return: allocated id on success, negative errno on failure.
+ */
+static int trans_devid_alloc(struct amd_iommu_pci_seg *pci_seg)
+{
+	int id;
+
+	mutex_lock(&pci_seg->trans_devid_mutex);
+	for (id = U16_MAX; id >= 0; id--) {
+		void *entry, *old;
+
+		entry = xa_load(&pci_seg->trans_devid_xa, id);
+		if (entry)
+			continue;
+
+		old = xa_store(&pci_seg->trans_devid_xa, id,
+			       trans_devid_xa_mk_state(TRANS_DEVID_ALLOCATED), GFP_KERNEL);
+		if (xa_is_err(old)) {
+			int err = xa_err(old);
+
+			mutex_unlock(&pci_seg->trans_devid_mutex);
+			return err;
+		}
+		WARN_ON_ONCE(old);
+		mutex_unlock(&pci_seg->trans_devid_mutex);
+		pr_debug("%s: Allocated trans_devid %#x (seg %#x)\n", __func__, id,
+			 pci_seg->id);
+		return id;
+	}
+	pr_err("%s: No free trans_devid found (seg %#x)\n", __func__, pci_seg->id);
+	mutex_unlock(&pci_seg->trans_devid_mutex);
+	return -ENOSPC;
+}
+
+static void trans_devid_free(struct amd_iommu_pci_seg *pci_seg, u16 id)
+{
+	void *old;
+
+	mutex_lock(&pci_seg->trans_devid_mutex);
+	old = xa_erase(&pci_seg->trans_devid_xa, id);
+	if (WARN_ON_ONCE(!old || trans_devid_xa_get_state(old) == TRANS_DEVID_FREE))
+		goto out;
+	pr_debug("%s: Freed trans_devid %#x (seg %#x)\n", __func__, id, pci_seg->id);
+out:
+	mutex_unlock(&pci_seg->trans_devid_mutex);
+}
+
+/**
+ * amd_iommu_get_trans_devid_by_kvmfd - look up or allocate trans_devid for @kvmfd
+ *
+ * If an entry already exists for @kvmfd, bumps its refcount and returns the same
+ * @trans_devid. Otherwise allocates a new translate devid, inserts an entry with
+ * refcount 1, and returns it.
+ *
+ * Note: Each translate-device-id is allocated per VM (kvmfd) since there is one
+ * GPA->SPA mapping per VM. In case of multiple vIOMMUs, all vIOMMUs share the same
+ * translate-device-id.
+ *
+ * Return: 0 on success, %-ENOMEM on allocation failure, %-EIO if the map holds an
+ * unexpected entry type.
+ */
+int amd_iommu_get_trans_devid_by_kvmfd(struct amd_iommu_pci_seg *pci_seg, u32 kvmfd,
+				       u16 *trans_devid)
+{
+	struct amd_iommu_kvmfd_trans_entry *entry;
+	void *prev;
+	int id, ret = 0;
+
+	mutex_lock(&pci_seg->kvmfd_xa_mutex);
+	entry = xa_load(&pci_seg->kvmfd_xa, kvmfd);
+	if (entry) {
+		if (WARN_ON_ONCE(xa_is_value(entry))) {
+			ret = -EIO;
+			goto out_unlock;
+		}
+		refcount_inc(&entry->refs);
+		*trans_devid = entry->trans_devid;
+		pr_debug("%s: Got trans_devid %#x for kvmfd %#x (seg %#x)\n",
+			 __func__, *trans_devid, kvmfd, pci_seg->id);
+		goto out_unlock;
+	}
+
+	id = trans_devid_alloc(pci_seg);
+	if (id < 0) {
+		pr_err("%s: Failed to allocate trans_devid (kvmfd=%#x seg=%#x err=%d)\n",
+		       __func__, kvmfd, pci_seg->id, id);
+		ret = id;
+		goto out_unlock;
+	}
+
+	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry) {
+		trans_devid_free(pci_seg, id);
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	refcount_set(&entry->refs, 1);
+	entry->trans_devid = id;
+
+	prev = xa_store(&pci_seg->kvmfd_xa, kvmfd, entry, GFP_KERNEL);
+	if (xa_is_err(prev)) {
+		ret = xa_err(prev);
+		kfree(entry);
+		trans_devid_free(pci_seg, id);
+		goto out_unlock;
+	}
+	WARN_ON_ONCE(prev);
+
+	*trans_devid = id;
+	pr_debug("%s: Allocated trans_devid %#x for kvmfd %#x (seg %#x)\n",
+		 __func__, id, kvmfd, pci_seg->id);
+
+out_unlock:
+	mutex_unlock(&pci_seg->kvmfd_xa_mutex);
+	return ret;
+}
+
+/**
+ * amd_iommu_free_trans_devid_by_kvmfd - drop one reference for @kvmfd
+ *
+ * Decrements the per-kvmfd refcount. The translate devid is returned to the
+ * segment pool and the map entry is removed only when the refcount reaches zero.
+ */
+void amd_iommu_free_trans_devid_by_kvmfd(struct amd_iommu_pci_seg *pci_seg, u32 kvmfd)
+{
+	struct amd_iommu_kvmfd_trans_entry *entry;
+	u16 tid;
+
+	mutex_lock(&pci_seg->kvmfd_xa_mutex);
+	entry = xa_load(&pci_seg->kvmfd_xa, kvmfd);
+	if (!entry) {
+		mutex_unlock(&pci_seg->kvmfd_xa_mutex);
+		return;
+	}
+
+	if (WARN_ON_ONCE(xa_is_value(entry))) {
+		mutex_unlock(&pci_seg->kvmfd_xa_mutex);
+		return;
+	}
+
+	if (!refcount_dec_and_test(&entry->refs)) {
+		pr_debug("%s: kvmfd %#x, trans_devid %#x (seg %#x)\n",
+			 __func__, kvmfd, entry->trans_devid, pci_seg->id);
+		mutex_unlock(&pci_seg->kvmfd_xa_mutex);
+		return;
+	}
+
+	tid = entry->trans_devid;
+	trans_devid_free(pci_seg, tid);
+	xa_erase(&pci_seg->kvmfd_xa, kvmfd);
+	kfree(entry);
+	mutex_unlock(&pci_seg->kvmfd_xa_mutex);
+	pr_debug("%s: Freed trans_devid %#x for kvmfd %#x (seg %#x)\n", __func__, tid,
+		 kvmfd, pci_seg->id);
+}
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index b3150f7bcec3..d2c883e314f8 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -449,6 +449,9 @@ void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iommu_viommu *avi
 			       VIOMMU_DOMID_MAPPING_BASE,
 			       VIOMMU_DOMID_MAPPING_ENTRY_SIZE,
 			       aviommu->gid);
+
+	amd_iommu_update_vfctrl_mmio_translate_devid(iommu, aviommu->gid, 0);
+	amd_iommu_clear_translate_dte(iommu, aviommu->gid, aviommu->trans_devid);
 	viommu_clear_mapping(iommu, aviommu);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 22/26] iommufd: Add hw_queue_init and split queue alloc paths
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (20 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 21/26] iommu/amd: Map kvmfd to shared translate device ID for vIOMMU Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-29  0:14   ` Nicolin Chen
  2026-06-01 13:38   ` Jason Gunthorpe
  2026-05-28  5:17 ` [PATCH v2 23/26] iommu/amd: Add support for vIOMMU HW queues initialization Suravee Suthikulpanit
                   ` (4 subsequent siblings)
  26 siblings, 2 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Add iommufd_viommu_ops.hw_queue_init for vIOMMU backends whose
hardware uses a guest physical queue base from hw_queue->base_addr
instead of a host physical address.

Previously, HW queue alloc always went through
iommufd_hw_queue_alloc_phys(), an iommufd_access, and
hw_queue_init_phys(base_pa). AMD vIOMMU instead takes GPA from
userspace in hw_queue->base_addr and programs hardware without host PA
resolution. Splitting helpers and dispatching from the ioctl keeps
one uAPI while making the contract explicit. Each vIOMMU driver should
implement only one of hw_queue_init_phys or hw_queue_init.

Refactor iommufd_hw_queue_alloc_ioctl() so shared validation and
viommu lookup stay in the ioctl, while setup is delegated to
_iommufd_hw_queue_init_phys() or _iommufd_hw_queue_init().

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
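Note for readers (not part of the commit): the dispatch rule described
above can be sketched as a small standalone C model. All sketch_* names
are hypothetical, the types are simplified stand-ins for the iommufd
structures, and sketch_resolve_pa() is a made-up placeholder for the
pin-and-translate step on the phys path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the iommufd types. */
struct sketch_queue {
	uint64_t base_addr;	/* guest physical address from userspace */
	uint64_t length;
};

struct sketch_ops {
	/* backend wants a host physical address resolved by the core */
	int (*hw_queue_init_phys)(struct sketch_queue *q, uint32_t index,
				  uint64_t base_pa);
	/* backend programs hardware directly from q->base_addr (GPA) */
	int (*hw_queue_init)(struct sketch_queue *q, uint32_t index);
};

/* Placeholder for host PA resolution; the offset is invented. */
uint64_t sketch_resolve_pa(uint64_t gpa)
{
	return gpa + 0x100000000ULL;
}

/* Pick exactly one init path, preferring the phys op when present. */
int sketch_queue_alloc(const struct sketch_ops *ops, struct sketch_queue *q,
		       uint32_t index)
{
	if (!ops)
		return -EOPNOTSUPP;
	if (ops->hw_queue_init_phys)
		return ops->hw_queue_init_phys(q, index,
					       sketch_resolve_pa(q->base_addr));
	if (ops->hw_queue_init)
		return ops->hw_queue_init(q, index);
	return -EOPNOTSUPP;
}

/* Example backends that record the address they were handed. */
uint64_t sketch_last_addr;

int sketch_phys_backend(struct sketch_queue *q, uint32_t index,
			uint64_t base_pa)
{
	(void)q;
	(void)index;
	sketch_last_addr = base_pa;
	return 0;
}

int sketch_gpa_backend(struct sketch_queue *q, uint32_t index)
{
	(void)index;
	sketch_last_addr = q->base_addr;
	return 0;
}
```

A backend that sets only hw_queue_init sees the untranslated GPA, while a
backend that sets hw_queue_init_phys sees the resolved address. A backend
that sets neither gets -EOPNOTSUPP.
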
 drivers/iommu/iommufd/viommu.c | 125 +++++++++++++++++++++++----------
 include/linux/iommufd.h        |   5 +-
 2 files changed, 91 insertions(+), 39 deletions(-)

diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 4081deda9b33..bc58d240fafd 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -353,62 +353,35 @@ iommufd_hw_queue_alloc_phys(struct iommu_hw_queue_alloc *cmd,
 	return ERR_PTR(rc);
 }
 
-int iommufd_hw_queue_alloc_ioctl(struct iommufd_ucmd *ucmd)
+static int _iommufd_hw_queue_init_phys(struct iommufd_ucmd *ucmd,
+				       struct iommufd_viommu *viommu)
 {
 	struct iommu_hw_queue_alloc *cmd = ucmd->cmd;
 	struct iommufd_hw_queue *hw_queue;
-	struct iommufd_viommu *viommu;
 	struct iommufd_access *access;
 	size_t hw_queue_size;
 	phys_addr_t base_pa;
-	u64 last;
 	int rc;
 
-	if (cmd->flags || cmd->type == IOMMU_HW_QUEUE_TYPE_DEFAULT)
-		return -EOPNOTSUPP;
-	if (!cmd->length)
-		return -EINVAL;
-	if (check_add_overflow(cmd->nesting_parent_iova, cmd->length - 1,
-			       &last))
-		return -EOVERFLOW;
-
-	viommu = iommufd_get_viommu(ucmd, cmd->viommu_id);
-	if (IS_ERR(viommu))
-		return PTR_ERR(viommu);
-
-	if (!viommu->ops || !viommu->ops->get_hw_queue_size ||
-	    !viommu->ops->hw_queue_init_phys) {
-		rc = -EOPNOTSUPP;
-		goto out_put_viommu;
-	}
-
 	hw_queue_size = viommu->ops->get_hw_queue_size(viommu, cmd->type);
-	if (!hw_queue_size) {
-		rc = -EOPNOTSUPP;
-		goto out_put_viommu;
-	}
+	if (!hw_queue_size)
+		return -EOPNOTSUPP;
 
 	/*
 	 * It is a driver bug for providing a hw_queue_size smaller than the
 	 * core HW queue structure size
 	 */
-	if (WARN_ON_ONCE(hw_queue_size < sizeof(*hw_queue))) {
-		rc = -EOPNOTSUPP;
-		goto out_put_viommu;
-	}
+	if (WARN_ON_ONCE(hw_queue_size < sizeof(*hw_queue)))
+		return -EOPNOTSUPP;
 
 	hw_queue = (struct iommufd_hw_queue *)_iommufd_object_alloc_ucmd(
 		ucmd, hw_queue_size, IOMMUFD_OBJ_HW_QUEUE);
-	if (IS_ERR(hw_queue)) {
-		rc = PTR_ERR(hw_queue);
-		goto out_put_viommu;
-	}
+	if (IS_ERR(hw_queue))
+		return PTR_ERR(hw_queue);
 
 	access = iommufd_hw_queue_alloc_phys(cmd, viommu, &base_pa);
-	if (IS_ERR(access)) {
-		rc = PTR_ERR(access);
-		goto out_put_viommu;
-	}
+	if (IS_ERR(access))
+		return PTR_ERR(access);
 
 	hw_queue->viommu = viommu;
 	refcount_inc(&viommu->obj.users);
@@ -419,9 +392,85 @@ int iommufd_hw_queue_alloc_ioctl(struct iommufd_ucmd *ucmd)
 
 	rc = viommu->ops->hw_queue_init_phys(hw_queue, cmd->index, base_pa);
 	if (rc)
-		goto out_put_viommu;
+		return rc;
 
 	cmd->out_hw_queue_id = hw_queue->obj.id;
+	return rc;
+}
+
+static int _iommufd_hw_queue_init(struct iommufd_ucmd *ucmd,
+				  struct iommufd_viommu *viommu)
+{
+	struct iommu_hw_queue_alloc *cmd = ucmd->cmd;
+	struct iommufd_hw_queue *hw_queue;
+	size_t hw_queue_size;
+	int rc;
+
+	hw_queue_size = viommu->ops->get_hw_queue_size(viommu, cmd->type);
+	if (!hw_queue_size)
+		return -EOPNOTSUPP;
+
+	/*
+	 * It is a driver bug for providing a hw_queue_size smaller than the
+	 * core HW queue structure size
+	 */
+	if (WARN_ON_ONCE(hw_queue_size < sizeof(*hw_queue)))
+		return -EOPNOTSUPP;
+
+	hw_queue = (struct iommufd_hw_queue *)_iommufd_object_alloc_ucmd(
+		ucmd, hw_queue_size, IOMMUFD_OBJ_HW_QUEUE);
+	if (IS_ERR(hw_queue))
+		return PTR_ERR(hw_queue);
+
+	hw_queue->viommu = viommu;
+	refcount_inc(&viommu->obj.users);
+	hw_queue->access = NULL;
+	hw_queue->type = cmd->type;
+	hw_queue->length = cmd->length;
+	hw_queue->base_addr = cmd->nesting_parent_iova;
+
+	rc = viommu->ops->hw_queue_init(hw_queue, cmd->index);
+	if (rc)
+		return rc;
+
+	cmd->out_hw_queue_id = hw_queue->obj.id;
+	return rc;
+}
+
+int iommufd_hw_queue_alloc_ioctl(struct iommufd_ucmd *ucmd)
+{
+	struct iommu_hw_queue_alloc *cmd = ucmd->cmd;
+	struct iommufd_viommu *viommu;
+	u64 last;
+	int rc;
+
+	if (cmd->flags || cmd->type == IOMMU_HW_QUEUE_TYPE_DEFAULT)
+		return -EOPNOTSUPP;
+	if (!cmd->length)
+		return -EINVAL;
+	if (check_add_overflow(cmd->nesting_parent_iova, cmd->length - 1,
+			       &last))
+		return -EOVERFLOW;
+
+	viommu = iommufd_get_viommu(ucmd, cmd->viommu_id);
+	if (IS_ERR(viommu))
+		return PTR_ERR(viommu);
+
+	if (!viommu->ops || !viommu->ops->get_hw_queue_size) {
+		rc = -EOPNOTSUPP;
+		goto out_put_viommu;
+	}
+
+	if (viommu->ops->hw_queue_init_phys)
+		rc = _iommufd_hw_queue_init_phys(ucmd, viommu);
+	else if (viommu->ops->hw_queue_init)
+		rc = _iommufd_hw_queue_init(ucmd, viommu);
+	else
+		rc = -EOPNOTSUPP;
+
+	if (rc)
+		goto out_put_viommu;
+
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
 
 out_put_viommu:
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 6e7efe83bc5d..c0030677e13c 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -180,6 +180,9 @@ struct iommufd_hw_queue {
  *                      the physical location of the guest queue
  *                      If driver has a deinit function to revert what this op
  *                      does, it should set it to the @hw_queue->destroy pointer
+ * @hw_queue_init: Similar to hw_queue_init_phys, but driver providing this op
+ *                 indicates that HW accesses the guest queue memory via
+ *                 @hw_queue->base_addr.
  */
 struct iommufd_viommu_ops {
 	void (*destroy)(struct iommufd_viommu *viommu);
@@ -192,9 +195,9 @@ struct iommufd_viommu_ops {
 	int (*vdevice_init)(struct iommufd_vdevice *vdev);
 	size_t (*get_hw_queue_size)(struct iommufd_viommu *viommu,
 				    enum iommu_hw_queue_type queue_type);
-	/* AMD's HW will add hw_queue_init simply using @hw_queue->base_addr */
 	int (*hw_queue_init_phys)(struct iommufd_hw_queue *hw_queue, u32 index,
 				  phys_addr_t base_addr_pa);
+	int (*hw_queue_init)(struct iommufd_hw_queue *hw_queue, u32 index);
 };
 
 #if IS_ENABLED(CONFIG_IOMMUFD)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 23/26] iommu/amd: Add support for vIOMMU HW queues initialization
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (21 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 22/26] iommufd: Add hw_queue_init and split queue alloc paths Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 24/26] iommufd: Introduce vIOMMU command via VIOMMU_COMMAND ioctl Suravee Suthikulpanit
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

AMD HW vIOMMU supports virtualizing the Command Buffer, Event Log, and
PPR Log. Each can be initialized through
struct iommufd_viommu_ops.hw_queue_init, which communicates the base
address (GPA) and length of each queue to the AMD IOMMU driver so that it
can program the corresponding VF Control MMIO registers.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
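Note for readers (not part of the commit): the read-modify-write of a VF
Control register described above can be sketched in userspace C. This is
an illustration only. SKETCH_MASK() stands in for the kernel's
GENMASK_ULL(), sketch_pack_ctrl() is a hypothetical helper, and the length
argument is assumed to already be the 4-bit encoded value.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK_ULL(h, l). */
#define SKETCH_MASK(h, l) ((~0ULL >> (63 - (h))) & (~0ULL << (l)))

/*
 * Replace the base and length fields of a control register value and
 * keep every other bit of @old. @base_lo is the lowest bit of the
 * 40-bit base field: 12 for the Command/Event layout, 16 for the PPR
 * layout given in the patch comments.
 */
uint64_t sketch_pack_ctrl(uint64_t old, uint64_t gpa, uint64_t len,
			  unsigned base_lo)
{
	uint64_t base_mask = SKETCH_MASK(base_lo + 39, base_lo);
	uint64_t len_mask = SKETCH_MASK(3, 0);
	uint64_t pfn = (gpa & SKETCH_MASK(51, 12)) >> 12;	/* GPA[51:12] */

	old &= ~(base_mask | len_mask);
	return old | ((pfn << base_lo) & base_mask) | (len & len_mask);
}
```

For the Command and Event layouts the base lands back at bits [51:12], so
a 4K-aligned GPA appears unchanged in the register value. For the PPR
layout the same 40 bits are shifted up to [55:16].
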
 drivers/iommu/amd/amd_iommu_types.h |   8 +++
 drivers/iommu/amd/iommufd.c         | 108 ++++++++++++++++++++++++++++
 include/uapi/linux/iommufd.h        |   3 +
 3 files changed, 119 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 92c83c68b2b1..70d060a85e66 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -507,6 +507,10 @@ extern bool amdr_ivrs_remap_support;
 #define VIOMMU_VFCTRL_MMIO_BASE(iommu, guestId) \
 	(iommu->vfctrl_base + (guestId * VIOMMU_VFCTRL_MMIO_ENTRY_SIZE))
 
+#define VIOMMU_VFCTRL_MMIO_GUEST_COMMAND_CONTROL_OFFSET	0x20
+#define VIOMMU_VFCTRL_MMIO_GUEST_EVENT_CONTROL_OFFSET	0x28
+#define VIOMMU_VFCTRL_MMIO_GUEST_PPR_CONTROL_OFFSET	0x30
+
 struct amd_iommu;
 struct iommu_domain;
 struct irq_domain;
@@ -1166,6 +1170,10 @@ struct amd_iommu_vdevice {
 	struct iommufd_vdevice core;
 };
 
+struct amd_iommu_hw_queue {
+	struct iommufd_hw_queue core;
+};
+
 #ifdef CONFIG_IRQ_REMAP
 extern struct amd_irte_ops irte_32_ops;
 extern struct amd_irte_ops irte_128_ops;
diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index 23a2c8c20365..7e6381e5a06b 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -200,6 +200,112 @@ static int _amd_viommu_vdevice_init(struct iommufd_vdevice *vdev)
 	return 0;
 }
 
+static size_t _amd_viommu_get_hw_queue_size(struct iommufd_viommu *viommu,
+					    enum iommu_hw_queue_type queue_type)
+{
+	/* Currently do not support Eventlog B and PPRlog B */
+	if ((queue_type != IOMMU_HW_QUEUE_TYPE_AMD_CMD) &&
+	    (queue_type != IOMMU_HW_QUEUE_TYPE_AMD_EVT) &&
+	    (queue_type != IOMMU_HW_QUEUE_TYPE_AMD_PPR))
+		return 0;
+
+	return HW_QUEUE_STRUCT_SIZE(struct amd_iommu_hw_queue, core);
+}
+
+static int _amd_viommu_hw_queue_init(struct iommufd_hw_queue *hw_queue, u32 index)
+{
+	int ret = 0;
+	u64 val, vfctrl_mask, base;
+	u8 __iomem *vfctrl, *vf;
+	struct iommufd_viommu *viommu = hw_queue->viommu;
+	struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
+	struct amd_iommu *iommu = container_of(viommu->iommu_dev, struct amd_iommu, iommu);
+	int gid = aviommu->gid;
+
+	vf = VIOMMU_VF_MMIO_BASE(iommu, gid);
+	vfctrl = VIOMMU_VFCTRL_MMIO_BASE(iommu, gid);
+
+	switch (hw_queue->type) {
+	case IOMMU_HW_QUEUE_TYPE_AMD_CMD:
+	{
+		/*
+		 * Capture base and length from the guest Command Buffer Control register
+		 * and program them into the VF Ctrl MMIO Command Buffer Control register.
+		 *
+		 * Command Buffer Control register field mapping :
+		 * - ComBase[51:12] = vfctrl[51:12]
+		 * - ComLen[3:0] = vfctrl[3:0]
+		 *
+		 * Guest Command Buffer Control Register field mapping :
+		 * - ComBase[51:12] = vfctrl[51:12]
+		 * - ComLen[3:0] = vfctrl[3:0]
+		 */
+		vfctrl_mask = GENMASK_ULL(51, 12) | GENMASK_ULL(3, 0);
+		val = readq(vfctrl + VIOMMU_VFCTRL_MMIO_GUEST_COMMAND_CONTROL_OFFSET) &
+		      (~vfctrl_mask);
+		base = FIELD_GET(GENMASK_ULL(51, 12), hw_queue->base_addr);
+		val |= FIELD_PREP(GENMASK_ULL(51, 12), base) |
+		       FIELD_PREP(GENMASK_ULL(3, 0), hw_queue->length);
+		writeq(val, vfctrl + VIOMMU_VFCTRL_MMIO_GUEST_COMMAND_CONTROL_OFFSET);
+		break;
+	}
+	case IOMMU_HW_QUEUE_TYPE_AMD_EVT:
+	{
+		/*
+		 * Capture base and length from the guest Event Buffer Control register
+		 * and program them into the VF Ctrl MMIO Event Buffer Control register.
+		 *
+		 * Event Buffer Control register field mapping :
+		 * - EvtBase[51:12] = vfctrl[51:12]
+		 * - EvtLen[3:0] = vfctrl[3:0]
+		 *
+		 * Guest Event Buffer Control Register field mapping :
+		 * - EvtBase[51:12] = vfctrl[51:12]
+		 * - EvtLen[3:0] = vfctrl[3:0]
+		 */
+		vfctrl_mask = GENMASK_ULL(51, 12) | GENMASK_ULL(3, 0);
+		val = readq(vfctrl + VIOMMU_VFCTRL_MMIO_GUEST_EVENT_CONTROL_OFFSET) &
+		      (~vfctrl_mask);
+		base = FIELD_GET(GENMASK_ULL(51, 12), hw_queue->base_addr);
+		val |= FIELD_PREP(GENMASK_ULL(51, 12), base) |
+		       FIELD_PREP(GENMASK_ULL(3, 0), hw_queue->length);
+		writeq(val, vfctrl + VIOMMU_VFCTRL_MMIO_GUEST_EVENT_CONTROL_OFFSET);
+		break;
+	}
+	case IOMMU_HW_QUEUE_TYPE_AMD_PPR:
+	{
+		/*
+		 * Capture base and length from the guest PPR Buffer Control register
+		 * and program them into the VF Ctrl MMIO PPR Buffer Control register.
+		 *
+		 * PPR Buffer Control register field mapping :
+		 * - PPRBase[51:12] = vfctrl[55:16]
+		 * - PPRLen[3:0] = vfctrl[3:0]
+		 *
+		 * Guest PPR Buffer Control Register field mapping :
+		 * - PPRBase[51:12] = vfctrl[55:16]
+		 * - PPRLen[3:0] = vfctrl[3:0]
+		 */
+		vfctrl_mask = GENMASK_ULL(55, 16) | GENMASK_ULL(3, 0);
+		val = readq(vfctrl + VIOMMU_VFCTRL_MMIO_GUEST_PPR_CONTROL_OFFSET) & (~vfctrl_mask);
+		base = FIELD_GET(GENMASK_ULL(51, 12), hw_queue->base_addr);
+		val |= FIELD_PREP(GENMASK_ULL(55, 16), base) |
+		       FIELD_PREP(GENMASK_ULL(3, 0), hw_queue->length);
+		writeq(val, vfctrl + VIOMMU_VFCTRL_MMIO_GUEST_PPR_CONTROL_OFFSET);
+		break;
+	}
+	default:
+		pr_err("%s: Invalid type (%#x)\n", __func__, hw_queue->type);
+		return -EINVAL;
+	}
+
+	pr_debug("%s: iommu_devid=%#x, gid=%#x, type=%#x, addr=%#llx, len=%#lx, val=%#llx\n",
+		 __func__, iommu->devid, gid, hw_queue->type,
+		 hw_queue->base_addr, hw_queue->length, val);
+
+	return ret;
+}
+
 /*
  * See include/linux/iommufd.h
  * struct iommufd_viommu_ops - vIOMMU specific operations
@@ -209,4 +315,6 @@ static const struct iommufd_viommu_ops amd_viommu_ops = {
 	.destroy = amd_iommufd_viommu_destroy,
 	.vdevice_size = VDEVICE_STRUCT_SIZE(struct amd_iommu_vdevice, core),
 	.vdevice_init = _amd_viommu_vdevice_init,
+	.get_hw_queue_size = _amd_viommu_get_hw_queue_size,
+	.hw_queue_init = _amd_viommu_hw_queue_init,
 };
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 69aa44bba95a..e08de6ab8209 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -1326,6 +1326,9 @@ enum iommu_hw_queue_type {
 	 *   emulated vSMMU's IDR1.CMDQS to log2(huge page size / 16 bytes)
 	 */
 	IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV = 1,
+	IOMMU_HW_QUEUE_TYPE_AMD_CMD,
+	IOMMU_HW_QUEUE_TYPE_AMD_EVT,
+	IOMMU_HW_QUEUE_TYPE_AMD_PPR,
 };
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 24/26] iommufd: Introduce vIOMMU command via VIOMMU_COMMAND ioctl
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (22 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 23/26] iommu/amd: Add support for vIOMMU HW queues initialization Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-28  5:17 ` [PATCH v2 25/26] iommu/amd: Handle set/get command for AMD vIOMMU Suravee Suthikulpanit
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

Introduce a new IOMMU_VIOMMU_COMMAND ioctl to support vIOMMU-specific
commands. Initially, the ioctl supports set / get operations that write /
read a value at a specified index. This allows write / read access to a
particular index of a data array, such as a set of hardware registers.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
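Note for readers (not part of the commit): the set / get semantics
described above amount to indexed access into an array of 64-bit values.
A minimal userspace sketch follows. The sketch_* names, the register
count, and the error codes chosen here are assumptions for illustration
and are not taken from the uAPI added by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a set/get command on an indexed array. */
enum sketch_op {
	SKETCH_OP_SET,
	SKETCH_OP_GET,
};

struct sketch_cmd {
	uint32_t op;
	uint32_t index;
	uint64_t val64;		/* input for SET, output for GET */
};

#define SKETCH_NREGS 4		/* arbitrary size for the sketch */

int sketch_command(uint64_t regs[SKETCH_NREGS], struct sketch_cmd *cmd)
{
	if (cmd->index >= SKETCH_NREGS)
		return -EINVAL;

	switch (cmd->op) {
	case SKETCH_OP_SET:
		regs[cmd->index] = cmd->val64;
		return 0;
	case SKETCH_OP_GET:
		cmd->val64 = regs[cmd->index];
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

A GET after a SET on the same index returns the stored value in val64,
which mirrors the single in/out field in the command structure.
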
 drivers/iommu/iommufd/iommufd_private.h |  1 +
 drivers/iommu/iommufd/main.c            |  3 ++
 drivers/iommu/iommufd/viommu.c          | 39 +++++++++++++++++++++++++
 include/linux/iommufd.h                 |  5 ++++
 include/uapi/linux/iommufd.h            | 34 +++++++++++++++++++++
 5 files changed, 82 insertions(+)

diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 6ac1965199e9..5cf839a4405a 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -692,6 +692,7 @@ iommufd_viommu_find_veventq(struct iommufd_viommu *viommu,
 
 int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd);
 void iommufd_viommu_destroy(struct iommufd_object *obj);
+int iommufd_viommu_command_ioctl(struct iommufd_ucmd *ucmd);
 int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd);
 void iommufd_vdevice_destroy(struct iommufd_object *obj);
 void iommufd_vdevice_abort(struct iommufd_object *obj);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 8c6d43601afb..194f6409ce53 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -432,6 +432,7 @@ union ucmd_buffer {
 	struct iommu_veventq_alloc veventq;
 	struct iommu_vfio_ioas vfio_ioas;
 	struct iommu_viommu_alloc viommu;
+	struct iommu_viommu_command viommu_command;
 #ifdef CONFIG_IOMMUFD_TEST
 	struct iommu_test_cmd test;
 #endif
@@ -493,6 +494,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
 		 __reserved),
 	IOCTL_OP(IOMMU_VIOMMU_ALLOC, iommufd_viommu_alloc_ioctl,
 		 struct iommu_viommu_alloc, out_viommu_id),
+	IOCTL_OP(IOMMU_VIOMMU_COMMAND, iommufd_viommu_command_ioctl,
+		 struct iommu_viommu_command, val64),
 #ifdef CONFIG_IOMMUFD_TEST
 	IOCTL_OP(IOMMU_TEST_CMD, iommufd_test, struct iommu_test_cmd, last),
 #endif
diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index bc58d240fafd..53f9a7d2734a 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -477,3 +477,42 @@ int iommufd_hw_queue_alloc_ioctl(struct iommufd_ucmd *ucmd)
 	iommufd_put_object(ucmd->ictx, &viommu->obj);
 	return rc;
 }
+
+int iommufd_viommu_command_ioctl(struct iommufd_ucmd *ucmd)
+{
+	int rc = 0;
+	struct iommu_viommu_command *cmd = ucmd->cmd;
+	struct iommufd_viommu *viommu = iommufd_get_viommu(ucmd, cmd->object_id);
+
+	/* A failed lookup takes no reference, so return directly */
+	if (IS_ERR(viommu))
+		return PTR_ERR(viommu);
+
+	/* Reserved bits must be zero */
+	if (cmd->__reserved) {
+		rc = -EOPNOTSUPP;
+		goto out_put_viommu;
+	}
+
+	if (cmd->op == IOMMU_VIOMMU_COMMAND_OP_SET) {
+		rc = viommu->ops->set_command ?
+		     viommu->ops->set_command(viommu, cmd->index, cmd->val64) :
+		     -EOPNOTSUPP;
+	} else if (cmd->op == IOMMU_VIOMMU_COMMAND_OP_GET) {
+		rc = viommu->ops->get_command ?
+		     viommu->ops->get_command(viommu, cmd->index, &cmd->val64) :
+		     -EOPNOTSUPP;
+	} else {
+		rc = -EOPNOTSUPP;
+	}
+
+	if (rc)
+		goto out_put_viommu;
+
+	if (copy_to_user(&((struct iommu_viommu_command __user *)ucmd->ubuffer)->val64,
+			 &cmd->val64, sizeof(cmd->val64)))
+		rc = -EFAULT;
+out_put_viommu:
+	iommufd_put_object(ucmd->ictx, &viommu->obj);
+	return rc;
+}
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index c0030677e13c..ec3563f4c09c 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -183,6 +183,9 @@ struct iommufd_hw_queue {
  * @hw_queue_init: Similar to hw_queue_init_phys, but driver providing this op
  *                 indicates that HW accesses the guest queue memory via
  *                 @hw_queue->baseaddr.
+ * @set_command: Set data on the specified index of the specified vIOMMU.
+ * @get_command: Get data on the specified index of the specified vIOMMU.
+ *               On success, the value is returned via the provided value.
  */
 struct iommufd_viommu_ops {
 	void (*destroy)(struct iommufd_viommu *viommu);
@@ -198,6 +201,8 @@ struct iommufd_viommu_ops {
 	int (*hw_queue_init_phys)(struct iommufd_hw_queue *hw_queue, u32 index,
 				  phys_addr_t base_addr_pa);
 	int (*hw_queue_init)(struct iommufd_hw_queue *hw_queue, u32 index);
+	int (*set_command)(struct iommufd_viommu *viommu, u16 index, u64 value);
+	int (*get_command)(struct iommufd_viommu *viommu, u16 index, u64 *value);
 };
 
 #if IS_ENABLED(CONFIG_IOMMUFD)
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index e08de6ab8209..a2195b1dfabe 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -57,6 +57,7 @@ enum {
 	IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92,
 	IOMMUFD_CMD_VEVENTQ_ALLOC = 0x93,
 	IOMMUFD_CMD_HW_QUEUE_ALLOC = 0x94,
+	IOMMUFD_CMD_VIOMMU_COMMAND = 0x95,
 };
 
 /**
@@ -1131,6 +1132,39 @@ struct iommu_viommu_alloc {
 };
 #define IOMMU_VIOMMU_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VIOMMU_ALLOC)
 
+/**
+ * enum viommu_command_ops - viommu command operations
+ * @IOMMU_VIOMMU_COMMAND_OP_SET: Set the command's data
+ * @IOMMU_VIOMMU_COMMAND_OP_GET: Get the command's data
+ */
+enum viommu_command_ops {
+	IOMMU_VIOMMU_COMMAND_OP_SET = 0,
+	IOMMU_VIOMMU_COMMAND_OP_GET = 1,
+};
+
+/**
+ * struct iommu_viommu_command - iommu viommu command multiplexer
+ * @size: sizeof(struct iommu_viommu_command)
+ * @object_id: ID of the vIOMMU if required
+ * @op: One of enum viommu_command_ops
+ * @index: Command index to match with the value
+ * @__reserved: Must be 0
+ * @val64: Command data to set or data returned on get
+ *
+ * This multiplexer allows controlling commands on vIOMMU.
+ * IOMMU_VIOMMU_COMMAND_OP_SET will load a command and
+ * IOMMU_VIOMMU_COMMAND_OP_GET will return the current value.
+ */
+struct iommu_viommu_command {
+	__u32 size;
+	__u32 object_id;
+	__u16 op;
+	__u16 index;
+	__u32 __reserved;
+	__aligned_u64 val64;
+};
+#define IOMMU_VIOMMU_COMMAND _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VIOMMU_COMMAND)
+
 /**
  * struct iommu_vdevice_alloc - ioctl(IOMMU_VDEVICE_ALLOC)
  * @size: sizeof(struct iommu_vdevice_alloc)
-- 
2.34.1



* [PATCH v2 25/26] iommu/amd: Handle set/get command for AMD vIOMMU
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (23 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 24/26] iommufd: Introduce vIOMMU command via VIOMMU_COMMAND ioctl Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-05-29 23:20   ` Nicolin Chen
  2026-05-28  5:17 ` [PATCH v2 26/26] iommu/amd: Introduce logic to check and enable vIOMMU feature Suravee Suthikulpanit
  2026-06-01 13:30 ` [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Jason Gunthorpe
  26 siblings, 1 reply; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

The guest kernel programs the guest Command Buffer, Event Log, and PPR Log
(a.k.a. hardware queue) settings via the guest control MMIO register
(guest MMIO offset 0x18). Accesses to the register are trapped by the VMM
(QEMU), and the information is passed via the IOMMU_VIOMMU_COMMAND ioctl to
the host IOMMU driver through the struct iommufd_viommu_ops set_command()
and get_command() callbacks.

Provide AMD IOMMU driver hooks to handle set/get operations for the
guest control MMIO register, which use the index parameter as the AMD IOMMU
MMIO offset. The value parameter contains the value of the corresponding
guest MMIO register, which is converted to the format of the AMD vIOMMU VF
Control MMIO registers and then programmed into the hardware.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
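Note for reviewers: a standalone sketch of the bit translation for the
command buffer fields only, to make the mapping easier to follow. The guest
bit positions (CmdBufEn at bit 12, ComWaitEn at bit 4) are taken from the
inline comments in this patch, and the VFCTRL positions (bits 8 and 9 at
offset 20h) from the shifts it uses. All names prefixed with SK_ and the
helper functions are hypothetical, and no MMIO access is performed.

```c
#include <assert.h>
#include <stdint.h>

/* Guest control register bit positions, per the comments in the patch */
#define SK_CONTROL_COMWAIT_EN	4
#define SK_CONTROL_CMDBUF_EN	12

/* Bit positions in the VFCTRL register at offset 20h, per the patch */
#define SK_VFCTRL_CMDBUF_EN	8
#define SK_VFCTRL_COMWAIT_EN	9

/* Extract a field of mask `msk` at bit `from` and place it at bit `to` */
static uint64_t move_bits(uint64_t reg, unsigned int from, unsigned int to,
			  uint64_t msk)
{
	return ((reg >> from) & msk) << to;
}

/* Write direction: fold guest control bits into the VFCTRL 20h value */
static uint64_t guest_ctrl_to_vfctrl20(uint64_t old_vfctrl, uint64_t guest_ctrl)
{
	/* Clear both destination bits, keep everything else untouched */
	uint64_t val = old_vfctrl & ~(0x3ULL << SK_VFCTRL_CMDBUF_EN);

	val |= move_bits(guest_ctrl, SK_CONTROL_CMDBUF_EN, SK_VFCTRL_CMDBUF_EN, 1);
	val |= move_bits(guest_ctrl, SK_CONTROL_COMWAIT_EN, SK_VFCTRL_COMWAIT_EN, 1);
	return val;
}

/* Read direction: rebuild the guest-visible bits from the VFCTRL 20h value */
static uint64_t vfctrl20_to_guest_ctrl(uint64_t vfctrl)
{
	return move_bits(vfctrl, SK_VFCTRL_CMDBUF_EN, SK_CONTROL_CMDBUF_EN, 1) |
	       move_bits(vfctrl, SK_VFCTRL_COMWAIT_EN, SK_CONTROL_COMWAIT_EN, 1);
}
```

The Event Log and PPR Log registers follow the same read-modify-write
pattern with their own bit positions.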
 drivers/iommu/amd/Makefile          |   2 +-
 drivers/iommu/amd/amd_iommu_types.h |   5 +
 drivers/iommu/amd/amd_viommu.h      |  15 +++
 drivers/iommu/amd/iommufd.c         |   2 +
 drivers/iommu/amd/vfctrl_mmio.c     | 146 ++++++++++++++++++++++++++++
 5 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100644 drivers/iommu/amd/vfctrl_mmio.c

diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index 12c3fe83e4ce..afbefda87f57 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-y += iommu.o init.o quirks.o ppr.o pasid.o
-obj-$(CONFIG_AMD_IOMMU_IOMMUFD) += iommufd.o nested.o viommu.o trans_devid.o
+obj-$(CONFIG_AMD_IOMMU_IOMMUFD) += iommufd.o nested.o viommu.o trans_devid.o vfctrl_mmio.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 70d060a85e66..44a185bfa39e 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -191,10 +191,15 @@
 #define CONTROL_GAM_EN		25
 #define CONTROL_GALOG_EN	28
 #define CONTROL_GAINT_EN	29
+#define CONTROL_DUALPPRLOG_EN   30
+#define CONTROL_DUALEVTLOG_EN   32
+#define CONTROL_PPR_AUTO_RSP_EN 39
+#define CONTROL_BLKSTOPMRK_EN   41
 #define CONTROL_NUM_INT_REMAP_MODE	43
 #define CONTROL_NUM_INT_REMAP_MODE_MASK	0x03
 #define CONTROL_NUM_INT_REMAP_MODE_2K	0x01
 #define CONTROL_EPH_EN		45
+#define CONTROL_PPR_AUTO_RSP_AON 48
 #define CONTROL_XT_EN		50
 #define CONTROL_INTCAPXT_EN	51
 #define CONTROL_GCR3TRPMODE	58
diff --git a/drivers/iommu/amd/amd_viommu.h b/drivers/iommu/amd/amd_viommu.h
index 3a8f41baaab9..c4ef374b68ec 100644
--- a/drivers/iommu/amd/amd_viommu.h
+++ b/drivers/iommu/amd/amd_viommu.h
@@ -23,6 +23,11 @@ int amd_viommu_domain_id_update(struct amd_iommu *iommu, u16 gid,
 
 void amd_viommu_set_device_mapping(struct amd_iommu *iommu, u16 hDevId,
 				   u16 guestId, u16 gDevId);
+
+int amd_viommu_guest_mmio_write(struct iommufd_viommu *viommu, u16 offset, u64 value);
+
+int amd_viommu_guest_mmio_read(struct iommufd_viommu *viommu, u16 offset, u64 *value);
+
 #else
 
 static inline int amd_viommu_init(struct amd_iommu *iommu)
@@ -53,6 +58,16 @@ static inline void amd_viommu_set_device_mapping(struct amd_iommu *iommu, u16 hD
 {
 }
 
+static inline int amd_viommu_guest_mmio_write(struct iommufd_viommu *viommu, u16 offset, u64 value)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int amd_viommu_guest_mmio_read(struct iommufd_viommu *viommu, u16 offset, u64 *value)
+{
+	return -EOPNOTSUPP;
+}
+
 #endif /* CONFIG_AMD_IOMMU_IOMMUFD */
 
 #endif /* AMD_VIOMMU_H */
diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
index 7e6381e5a06b..53dda68c5a0a 100644
--- a/drivers/iommu/amd/iommufd.c
+++ b/drivers/iommu/amd/iommufd.c
@@ -317,4 +317,6 @@ static const struct iommufd_viommu_ops amd_viommu_ops = {
 	.vdevice_init = _amd_viommu_vdevice_init,
 	.get_hw_queue_size = _amd_viommu_get_hw_queue_size,
 	.hw_queue_init = _amd_viommu_hw_queue_init,
+	.set_command = amd_viommu_guest_mmio_write,
+	.get_command = amd_viommu_guest_mmio_read,
 };
diff --git a/drivers/iommu/amd/vfctrl_mmio.c b/drivers/iommu/amd/vfctrl_mmio.c
new file mode 100644
index 000000000000..ece94f2212ec
--- /dev/null
+++ b/drivers/iommu/amd/vfctrl_mmio.c
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 Advanced Micro Devices, Inc.
+ * Author: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
+ */
+
+#define pr_fmt(fmt)     "AMD-Vi: " fmt
+#define dev_fmt(fmt)    pr_fmt(fmt)
+
+#include <linux/iommu.h>
+#include <linux/amd-iommu.h>
+
+#include <linux/fs.h>
+#include <linux/cdev.h>
+#include <linux/ioctl.h>
+#include <linux/iommufd.h>
+#include <uapi/linux/iommufd.h>
+#include <linux/mem_encrypt.h>
+
+#include <asm/iommu.h>
+#include <asm/set_memory.h>
+
+#include "amd_iommu.h"
+#include "amd_iommu_types.h"
+#include "amd_viommu.h"
+#include "../iommu-pages.h"
+
+#define GET_CTRL_BITS(reg, bit, msk)	(((reg) >> (bit)) & (ULL(msk)))
+#define SET_CTRL_BITS(reg, bit1, bit2, msk) \
+	((((reg) >> (bit1)) & (ULL(msk))) << (bit2))
+
+int amd_viommu_guest_mmio_read(struct iommufd_viommu *viommu, u16 offset, u64 *value)
+{
+	u8 __iomem *vfctrl, *vf;
+	u64 val, tmp = 0;
+	struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
+	struct amd_iommu *iommu = container_of(viommu->iommu_dev, struct amd_iommu, iommu);
+	int gid = aviommu->gid;
+
+	vf = VIOMMU_VF_MMIO_BASE(iommu, gid);
+	vfctrl = VIOMMU_VFCTRL_MMIO_BASE(iommu, gid);
+
+	switch (offset) {
+	case MMIO_CONTROL_OFFSET:
+	{
+		/* VFCTRL offset 20h */
+		val = readq(vfctrl + 0x20);
+		tmp |= SET_CTRL_BITS(val, 8, CONTROL_CMDBUF_EN, 1); // [12]
+		tmp |= SET_CTRL_BITS(val, 9, CONTROL_COMWAIT_EN, 1); // [4]
+
+		/* VFCTRL offset 28h */
+		val = readq(vfctrl + 0x28);
+		tmp |= SET_CTRL_BITS(val, 8, CONTROL_EVT_LOG_EN, 1); // [2]
+		tmp |= SET_CTRL_BITS(val, 9, CONTROL_EVT_INT_EN, 1); // [3]
+		tmp |= SET_CTRL_BITS(val, 10, CONTROL_DUALEVTLOG_EN, 3); // [33:32]
+
+		/* VFCTRL offset 30h */
+		val = readq(vfctrl + 0x30);
+		tmp |= SET_CTRL_BITS(val, 8, CONTROL_PPRLOG_EN, 1); // [13]
+		tmp |= SET_CTRL_BITS(val, 9, CONTROL_PPRINT_EN, 1); // [14]
+		tmp |= SET_CTRL_BITS(val, 10, CONTROL_PPR_EN, 1); // [15]
+		tmp |= SET_CTRL_BITS(val, 11, CONTROL_DUALPPRLOG_EN, 3); // [31:30]
+		tmp |= SET_CTRL_BITS(val, 13, CONTROL_PPR_AUTO_RSP_EN, 1); // [39]
+		tmp |= SET_CTRL_BITS(val, 14, CONTROL_BLKSTOPMRK_EN, 1); // [41]
+		tmp |= SET_CTRL_BITS(val, 15, CONTROL_PPR_AUTO_RSP_AON, 1); // [42]
+
+		*value = tmp;
+		break;
+	}
+	default:
+		pr_err("%s: invalid offset=%#x\n", __func__, offset);
+		WARN_ON(1);
+		return -EINVAL;
+	}
+
+	pr_debug("%s: iommu_devid=%#x, gid=%u, offset=%#x, value=%#llx\n",
+		 __func__, iommu->devid, gid, offset, *value);
+	return 0;
+}
+EXPORT_SYMBOL(amd_viommu_guest_mmio_read);
+
+int amd_viommu_guest_mmio_write(struct iommufd_viommu *viommu, u16 offset, u64 value)
+{
+	u8 __iomem *vfctrl, *vf;
+	u64 val, tmp, ctrl = value;
+	struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
+	struct amd_iommu *iommu = container_of(viommu->iommu_dev, struct amd_iommu, iommu);
+	int gid = aviommu->gid;
+
+	pr_debug("%s: iommu_devid=%#x, gid=%u, offset=%#x, value=%#llx\n",
+		 __func__, iommu->devid, gid, offset, value);
+
+	vf = VIOMMU_VF_MMIO_BASE(iommu, gid);
+	vfctrl = VIOMMU_VFCTRL_MMIO_BASE(iommu, gid);
+
+	switch (offset) {
+	case MMIO_CONTROL_OFFSET:
+	{
+		/* VFCTRL offset 20h */
+		val = readq(vfctrl + 0x20);
+		val &= ~(0x3ULL << 8);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_CMDBUF_EN, 1); // [12]
+		val |= (tmp << 8);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_COMWAIT_EN, 1); // [4]
+		val |= (tmp << 9);
+		writeq(val, vfctrl + 0x20);
+
+		/* VFCTRL offset 28h */
+		val = readq(vfctrl + 0x28);
+		val &= ~(0xFULL << 8);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_EVT_LOG_EN, 1); // [2]
+		val |= (tmp << 8);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_EVT_INT_EN, 1); // [3]
+		val |= (tmp << 9);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_DUALEVTLOG_EN, 3); // [33:32]
+		val |= (tmp << 10);
+		writeq(val, vfctrl + 0x28);
+
+		/* VFCTRL offset 30h */
+		val = readq(vfctrl + 0x30);
+		val &= ~(0xFFULL << 8);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_PPRLOG_EN, 1); // [13]
+		val |= (tmp << 8);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_PPRINT_EN, 1); // [14]
+		val |= (tmp << 9);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_PPR_EN, 1); // [15]
+		val |= (tmp << 10);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_DUALPPRLOG_EN, 3); // [31:30]
+		val |= (tmp << 11);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_PPR_AUTO_RSP_EN, 1); // [39]
+		val |= (tmp << 13);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_BLKSTOPMRK_EN, 1); // [41]
+		val |= (tmp << 14);
+		tmp = GET_CTRL_BITS(ctrl, CONTROL_PPR_AUTO_RSP_AON, 1); // [42]
+		val |= (tmp << 15);
+		writeq(val, vfctrl + 0x30);
+		break;
+	}
+	default:
+		WARN_ON(1);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(amd_viommu_guest_mmio_write);
-- 
2.34.1



* [PATCH v2 26/26] iommu/amd: Introduce logic to check and enable vIOMMU feature
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (24 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 25/26] iommu/amd: Handle set/get command for AMD vIOMMU Suravee Suthikulpanit
@ 2026-05-28  5:17 ` Suravee Suthikulpanit
  2026-06-01 13:30 ` [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Jason Gunthorpe
  26 siblings, 0 replies; 51+ messages in thread
From: Suravee Suthikulpanit @ 2026-05-28  5:17 UTC (permalink / raw)
  To: linux-kernel, iommu, joro, jgg
  Cc: yi.l.liu, kevin.tian, nicolinc, vasant.hegde, jon.grimm,
	santosh.shukla, sairaj.arunkodilkar, jay.chen, wvw, wnliu,
	dantuluris, chriscli, kpsingh, Suravee Suthikulpanit

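Introduce iommu_feature_enable_and_check(), which sets a bit in the IOMMU
control register and reads it back to confirm that the bit took effect.
Use it in the new viommu_enable() helper to set and test GstBufferTRPMode
before enabling VCMD_EN and VIOMMU_EN, and call viommu_enable() at the end
of amd_viommu_init().

Also switch to enable vIOMMU by default.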

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
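Note for reviewers: a small standalone simulation of the set-and-test
pattern used by viommu_enable(). It assumes that a control bit the hardware
does not implement reads back as zero after being written; that behavior is
an assumption made for this sketch. struct fake_ctrl_reg and both helpers
are hypothetical stand-ins and do not touch real MMIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an MMIO control register */
struct fake_ctrl_reg {
	uint64_t value;
	uint64_t writable_mask;	/* bits the simulated hardware implements */
};

/* Simulated write: bits outside writable_mask are silently dropped */
static void fake_reg_write(struct fake_ctrl_reg *reg, uint64_t val)
{
	reg->value = val & reg->writable_mask;
}

/* Set one bit, then read it back to learn whether it took effect */
static bool feature_enable_and_check(struct fake_ctrl_reg *reg, unsigned int bit)
{
	fake_reg_write(reg, reg->value | (1ULL << bit));
	return (reg->value >> bit) & 1;
}
```

A caller would treat a false return as "feature not available" and bail out
before enabling anything that depends on it.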
 drivers/iommu/amd/amd_iommu.h       |  1 +
 drivers/iommu/amd/amd_iommu_types.h |  3 +++
 drivers/iommu/amd/init.c            | 12 +++++++++++-
 drivers/iommu/amd/viommu.c          | 24 +++++++++++++++++++++++-
 4 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index cf2b051948a3..e505c5420255 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -27,6 +27,7 @@ void amd_iommu_restart_ga_log(struct amd_iommu *iommu);
 void amd_iommu_restart_ppr_log(struct amd_iommu *iommu);
 void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid);
 void iommu_feature_enable(struct amd_iommu *iommu, u8 bit);
+bool iommu_feature_enable_and_check(struct amd_iommu *iommu, u8 bit);
 void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
 				  gfp_t gfp, size_t size);
 u8 __iomem * __init iommu_map_mmio_space(u64 address, u64 end);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 44a185bfa39e..7ff714ce79b8 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -201,9 +201,12 @@
 #define CONTROL_EPH_EN		45
 #define CONTROL_PPR_AUTO_RSP_AON 48
 #define CONTROL_XT_EN		50
+#define CONTROL_VCMD_EN         52
+#define CONTROL_VIOMMU_EN       53
 #define CONTROL_INTCAPXT_EN	51
 #define CONTROL_GCR3TRPMODE	58
 #define CONTROL_IRTCACHEDIS	59
+#define CONTROL_GSTBUFFERTRPMODE	60
 #define CONTROL_SNPAVIC_EN	61
 
 #define CTRL_INV_TO_MASK	7
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 622bc0337eda..a66a5089bf91 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -198,7 +198,7 @@ bool amdr_ivrs_remap_support __read_mostly;
 bool amd_iommu_force_isolation __read_mostly;
 
 /* VIOMMU enabling flag */
-bool amd_iommu_viommu;
+bool amd_iommu_viommu = true;
 
 unsigned long amd_iommu_pgsize_bitmap __ro_after_init = AMD_IOMMU_PGSIZES;
 
@@ -417,6 +417,16 @@ void iommu_feature_enable(struct amd_iommu *iommu, u8 bit)
 	iommu_feature_set(iommu, 1ULL, 1ULL, bit);
 }
 
+bool iommu_feature_enable_and_check(struct amd_iommu *iommu, u8 bit)
+{
+	u64 ctrl;
+
+	iommu_feature_enable(iommu, bit);
+
+	ctrl = readq(iommu->mmio_base + MMIO_CONTROL_OFFSET);
+	return (ctrl & (1ULL << bit));
+}
+
 static void iommu_feature_disable(struct amd_iommu *iommu, u8 bit)
 {
 	iommu_feature_set(iommu, 0ULL, 1ULL, bit);
diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
index d2c883e314f8..4dcd781bc35a 100644
--- a/drivers/iommu/amd/viommu.c
+++ b/drivers/iommu/amd/viommu.c
@@ -45,6 +45,18 @@
 
 LIST_HEAD(viommu_devid_map);
 
+static int viommu_enable(struct amd_iommu *iommu)
+{
+	/* The GstBufferTRPMode feature is checked by set and test */
+	if (!iommu_feature_enable_and_check(iommu, CONTROL_GSTBUFFERTRPMODE))
+		return -EINVAL;
+
+	iommu_feature_enable(iommu, CONTROL_VCMD_EN);
+	iommu_feature_enable(iommu, CONTROL_VIOMMU_EN);
+
+	return 0;
+}
+
 static int viommu_init_pci_vsc(struct amd_iommu *iommu)
 {
 	iommu->vsc_offset = pci_find_capability(iommu->dev, PCI_CAP_ID_VNDR);
@@ -308,11 +320,21 @@ int __init amd_viommu_init(struct amd_iommu *iommu)
 
 	ret = viommu_private_space_init(iommu);
 	if (ret)
-		return ret;
+		goto err_unmap_vf;
 
 	set_iommu_dte(iommu);
 
+	ret = viommu_enable(iommu);
+	if (ret)
+		goto err_private_space;
+
 	return 0;
+
+err_private_space:
+	viommu_private_space_uninit(iommu);
+err_unmap_vf:
+	amd_viommu_uninit(iommu);
+	return ret;
 }
 
 static int __maybe_unused alloc_private_vm_region(struct amd_iommu *iommu, u64 **entry,
-- 
2.34.1



* Re: [PATCH v2 22/26] iommufd: Add hw_queue_init and split queue alloc paths
  2026-05-28  5:17 ` [PATCH v2 22/26] iommufd: Add hw_queue_init and split queue alloc paths Suravee Suthikulpanit
@ 2026-05-29  0:14   ` Nicolin Chen
  2026-06-03  0:30     ` Suthikulpanit, Suravee
  2026-06-01 13:38   ` Jason Gunthorpe
  1 sibling, 1 reply; 51+ messages in thread
From: Nicolin Chen @ 2026-05-29  0:14 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, jgg, yi.l.liu, kevin.tian,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:34AM +0000, Suravee Suthikulpanit wrote:
> Splitting helpers and dispatching from the ioctl keeps
> one uAPI while making the contract explicit.

Why split? Those two new ioctl functions look quite redundant.

I imagined that the original ioctl function just needed:
[...]
-	struct iommufd_access *access;
+	struct iommufd_access *access = NULL;
[...]
 	if (!viommu->ops || !viommu->ops->get_hw_queue_size ||
-	    !viommu->ops->hw_queue_init_phys) {
+	    (!viommu->ops->hw_queue_init_phys && !viommu->ops->hw_queue_init)) {
		rc = -EOPNOTSUPP;
		goto out_put_viommu;
	}
[...]
-	access = iommufd_hw_queue_alloc_phys(cmd, viommu, &base_pa);
-	if (IS_ERR(access)) {
-		rc = PTR_ERR(access);
-		goto out_put_viommu;
-	}
+	if (viommu->ops->hw_queue_init_phys) {
+		access = iommufd_hw_queue_alloc_phys(cmd, viommu, &base_pa);
+		if (IS_ERR(access)) {
+			rc = PTR_ERR(access);
+			goto out_put_viommu;
+		}
+	}
[...]
-	rc = viommu->ops->hw_queue_init_phys(hw_queue, cmd->index, base_pa);
+	if (viommu->ops->hw_queue_init_phys)
+		rc = viommu->ops->hw_queue_init_phys(hw_queue, cmd->index,
+						     base_pa);
+	else
+		rc = viommu->ops->hw_queue_init(hw_queue, cmd->index);

and then it should work?

Nicolin


* Re: [PATCH v2 25/26] iommu/amd: Handle set/get command for AMD vIOMMU
  2026-05-28  5:17 ` [PATCH v2 25/26] iommu/amd: Handle set/get command for AMD vIOMMU Suravee Suthikulpanit
@ 2026-05-29 23:20   ` Nicolin Chen
  2026-06-03  3:53     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 51+ messages in thread
From: Nicolin Chen @ 2026-05-29 23:20 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, jgg, yi.l.liu, kevin.tian,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:37AM +0000, Suravee Suthikulpanit wrote:
> Guest kernel programs guest Command Buffer, Event Log, and PPR Log
> (a.k.a hardware queue) settings via guest control MMIO register
> (guest MMIO offset 0x18). Accesses to the register is trapped by VMM (QEMU)
> and information is passed as IOMMU_VIOMMU_OPTION to host IOMMU driver via
> struct iommufd_viommu_ops set_command() and get_command().
> 
> Provides AMD IOMMU driver hooks to handle set/get operations for the
> guest control MMIO register, which uses key parameter as AMD IOMMU MMIO
> offset. The value parameter contains the value of the corresponding guest
> MMIO register, which is converted to the format of AMD vIOMMU VF Control
> MMIO registers then programed onto the hardware.

So the whole set_command/get_command thing exists to access a control
register via MMIO? I doubt this is the right ioctl, as it can be abused.

Looking at the details:

> +	switch (offset) {
> +	case MMIO_CONTROL_OFFSET:
> +	{
> +		/* VFCTRL offset 20h */
> +		val = readq(vfctrl + 0x20);
> +		val &= ~(0x3ULL << 8);
> +		tmp = GET_CTRL_BITS(ctrl, CONTROL_CMDBUF_EN, 1); // [12]
> +		val |= (tmp << 8);

This is to enable a hw_queue. So it could be marked when a hw_queue
is allocated; IMHO, VMM should only allocate the hw_queue, when the
guest really uses (enables) the buffer.

> +		tmp = GET_CTRL_BITS(ctrl, CONTROL_COMWAIT_EN, 1); // [4]
> +		val |= (tmp << 9);
> +		writeq(val, vfctrl + 0x20);

The spec suggests setting this together with CmdBufEn, so it could be a
vendor-specific (or hw_queue-type-specific) flag in struct
iommu_hw_queue_alloc?

And similar comments to the event log and pri log buffers as well.

Overall, VMM should control the timings of allocating these viommu
hw_queue objects; then kernel would just enable them accordingly.

Nicolin


* Re: [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers
  2026-05-28  5:17 ` [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers Suravee Suthikulpanit
@ 2026-05-30 20:44   ` Weinan Liu
  2026-06-01 13:08   ` Jason Gunthorpe
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 51+ messages in thread
From: Weinan Liu @ 2026-05-30 20:44 UTC (permalink / raw)
  To: suravee.suthikulpanit
  Cc: chriscli, dantuluris, iommu, jay.chen, jgg, jon.grimm, joro,
	kevin.tian, kpsingh, linux-kernel, nicolinc, sairaj.arunkodilkar,
	santosh.shukla, vasant.hegde, wnliu, wvw, yi.l.liu

On Wed, May 27, 2026 at 10:19 PM Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> wrote:
>
> +static void __maybe_unused free_private_vm_region(struct amd_iommu *iommu, u64 **entry,
> +                                                 u64 base, size_t size, u16 gid)
> +{
> +       size_t unmapped;
> +       u64 addr = base + (gid * size);
> +
> +       pr_debug("%s: entry=%#llx(%#llx), base=%#llx, addr=%#llx, size=%#lx\n",
> +                __func__, (unsigned long  long)*entry,
> +                iommu_virt_to_phys(*entry), base, addr, size);
> +
> +       if (!iommu || !iommu->viommu_pdom)
> +               return;

Should check that the page pointer *entry is non-NULL before operating on it.
*entry will be NULL if the caller hit an error in alloc_private_vm_region(),
in which case the code below attempts to unmap and free a NULL pointer:


> +
> +       unmapped = iommu_unmap(&iommu->viommu_pdom->domain, addr, size);
> +       if (unmapped != size)
> +               pr_warn("%s: unmapped %#zx of %#lx at %#llx\n", __func__, unmapped, size, addr);
> +
> +       set_memory_wb((unsigned long)*entry, size >> PAGE_SHIFT);
> +       iommu_free_pages(*entry);
> +       *entry = NULL;
> +}

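The guard suggested above could look roughly like the standalone sketch
below. It is not the driver code: free_region_sketch() and regions_freed
are hypothetical names, and plain malloc()/free() stand in for the page
allocation and unmap steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts how many regions were actually torn down (illustration only) */
static int regions_freed;

static void free_region_sketch(void **entry)
{
	/* Nothing was allocated, or it was already freed: skip teardown */
	if (!entry || !*entry)
		return;

	free(*entry);
	*entry = NULL;	/* makes a repeated call a harmless no-op */
	regions_freed++;
}
```

With such a check in place, the error path of the allocation helper can
call the free helper unconditionally.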

* Re: [PATCH v2 13/26] iommu/amd: Add helper functions to manage DevID / DomID mapping tables
  2026-05-28  5:17 ` [PATCH v2 13/26] iommu/amd: Add helper functions to manage DevID / DomID mapping tables Suravee Suthikulpanit
@ 2026-05-30 21:26   ` Weinan Liu
  0 siblings, 0 replies; 51+ messages in thread
From: Weinan Liu @ 2026-05-30 21:26 UTC (permalink / raw)
  To: suravee.suthikulpanit
  Cc: chriscli, dantuluris, iommu, jay.chen, jgg, jon.grimm, joro,
	kevin.tian, kpsingh, linux-kernel, nicolinc, sairaj.arunkodilkar,
	santosh.shukla, vasant.hegde, wnliu, wvw, yi.l.liu

On Wed, May 27, 2026 at 10:19 PM Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> wrote:

> diff --git a/drivers/iommu/amd/iommufd.c b/drivers/iommu/amd/iommufd.c
> index 42307ae71b24..efa9e1f49550 100644
> --- a/drivers/iommu/amd/iommufd.c
> +++ b/drivers/iommu/amd/iommufd.c
> @@ -83,6 +83,10 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
>         /* Reset vIOMMU MMIOs to initialize the vIOMMU */
>         iommu_reset_vmmio(iommu, aviommu->gid);
>
> +       ret = amd_viommu_init_one(iommu, aviommu);
> +       if (ret)
> +               goto err_init;
> +
>         ret = iommu_copy_struct_to_user(user_data, &data,
>                                         IOMMU_VIOMMU_TYPE_AMD,
>                                         reserved);
> @@ -120,6 +124,7 @@ static void amd_iommufd_viommu_destroy(struct iommufd_viommu *viommu)
>         if (aviommu->vfmmio_mmap_offset)
>                 iommufd_viommu_destroy_mmap(&aviommu->core, aviommu->vfmmio_mmap_offset);
>         amd_iommu_gid_free(iommu, aviommu->gid);
> +       amd_viommu_uninit_one(iommu, aviommu);
>  }
>
The finalization order should be the reverse of the initialization order.

If amd_iommu_gid_free() is called before amd_viommu_uninit_one(), the gid could be reallocated to a new
vIOMMU instance before the cleanup is complete.
Please consider moving amd_viommu_uninit_one() before the GID free call.


> diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
> index 6dcb02b12a28..3636093732ce 100644
> --- a/drivers/iommu/amd/viommu.c
> +++ b/drivers/iommu/amd/viommu.c
> +
[...]
> +void amd_viommu_uninit_one(struct amd_iommu *iommu, struct amd_iommu_viommu *aviommu)
> +{
> +       pr_debug("%s: gid=%u\n", __func__, aviommu->gid);
> +
> +       free_private_vm_region(iommu, &aviommu->devid_table,
> +                              VIOMMU_DEVID_MAPPING_BASE,
> +                              VIOMMU_DEVID_MAPPING_ENTRY_SIZE,
> +                              aviommu->gid);
> +       free_private_vm_region(iommu, &aviommu->domid_table,
> +                              VIOMMU_DOMID_MAPPING_BASE,
> +                              VIOMMU_DOMID_MAPPING_ENTRY_SIZE,
> +                              aviommu->gid);
> +}
> +


* Re: [PATCH v2 03/26] iommu/amd: Detect and initialize AMD vIOMMU feature
  2026-05-28  5:17 ` [PATCH v2 03/26] iommu/amd: Detect and initialize AMD vIOMMU feature Suravee Suthikulpanit
@ 2026-06-01 12:43   ` Jason Gunthorpe
  2026-06-05  8:45     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 12:43 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:15AM +0000, Suravee Suthikulpanit wrote:
> The feature is advertised w/ EFR[VIOMMUSup]. Please see the AMD IOMMU
> specification[1] for more detail.
> 
> Introduce a new global variable amd_iommu_viommu, which is used to
> control the feature enablement in the driver. Currently, the feature
> is default to disabled. Once the feature is fully supported, it will be
> changed to enabled by default along with a command-line option to disable
> if needed.

Still no to command line option. Just don't use iommufd if you don't
want to use it. It must have no cost until iommufd activates it.

Jason


* Re: [PATCH v2 04/26] iommu/amd: Introduce IOMMUFD vIOMMU support for AMD
  2026-05-28  5:17 ` [PATCH v2 04/26] iommu/amd: Introduce IOMMUFD vIOMMU support for AMD Suravee Suthikulpanit
@ 2026-06-01 12:44   ` Jason Gunthorpe
  2026-06-05  8:55     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 12:44 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:16AM +0000, Suravee Suthikulpanit wrote:

> +/**
> + * struct iommu_viommu_amd - AMD vIOMMU Interface (IOMMU_VIOMMU_TYPE_AMD)
> + * @reserved: Must be zero
> + */
> +struct iommu_viommu_amd {
> +	__u32 reserved; /* must be last */
> +};

Do not structure patches like this, introduce the full and complete
ABI structure here.

Jason


* Re: [PATCH v2 07/26] iommu/amd: Add support for AMD vIOMMU VF MMIO region
  2026-05-28  5:17 ` [PATCH v2 07/26] iommu/amd: Add support for AMD vIOMMU VF MMIO region Suravee Suthikulpanit
@ 2026-06-01 12:51   ` Jason Gunthorpe
  0 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 12:51 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:19AM +0000, Suravee Suthikulpanit wrote:
> @@ -68,6 +70,16 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
>  	aviommu->gid = ret;
>  	pr_debug("%s: gid=%#x", __func__, aviommu->gid);
>  
> +	page_base = amd_viommu_get_vfmmio_addr(iommu, aviommu->gid);
> +
> +	ret = iommufd_viommu_alloc_mmap(&aviommu->core,
> +					page_base, SZ_4K,
> +					(unsigned long *)&data.out_vfmmio_mmap_offset);

This should not be cast like this.

> +	if (ret)
> +		goto err_mmap;
> +
> +	aviommu->vfmmio_mmap_offset = data.out_vfmmio_mmap_offset;
> +
>  	ret = iommu_copy_struct_to_user(user_data, &data,
>  					IOMMU_VIOMMU_TYPE_AMD,
>  					reserved);
> @@ -82,6 +94,8 @@ int amd_iommufd_viommu_init(struct iommufd_viommu *viommu, struct iommu_domain *
>  
>  	return 0;
>  err_init:
> +	iommufd_viommu_destroy_mmap(&aviommu->core, aviommu->vfmmio_mmap_offset);
> +err_mmap:
>  	amd_iommu_gid_free(iommu, aviommu->gid);
>  err_gid:
>  	return ret;
> @@ -100,6 +114,8 @@ static void amd_iommufd_viommu_destroy(struct iommufd_viommu *viommu)
>  	list_del(&aviommu->pdom_list);
>  	spin_unlock_irqrestore(&pdom->lock, flags);
>  	xa_destroy(&aviommu->gdomid_array);
> +	if (aviommu->vfmmio_mmap_offset)
> +		iommufd_viommu_destroy_mmap(&aviommu->core, aviommu->vfmmio_mmap_offset);

I guess mmap_offset could be zero legitimately
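
A minimal userspace sketch of both points above, assuming the allocator returns the offset through an unsigned long * (as the quoted cast implies). Every name here (viommu_state, mock_alloc_mmap, setup_mmap, teardown_mmap) is hypothetical and not part of the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver state; not the real struct. */
struct viommu_state {
	unsigned long vfmmio_mmap_offset;
	bool vfmmio_mmap_valid;	/* tracks the allocation instead of testing offset != 0 */
};

/* Hypothetical allocator that hands back the offset via unsigned long *. */
static int mock_alloc_mmap(unsigned long *offset)
{
	*offset = 0;	/* zero is treated as a legitimate offset here */
	return 0;
}

/* Fill a fixed-width output field through a correctly typed local, no pointer cast. */
static int setup_mmap(struct viommu_state *st, uint64_t *out_offset)
{
	unsigned long offset;
	int ret = mock_alloc_mmap(&offset);

	if (ret)
		return ret;

	st->vfmmio_mmap_offset = offset;
	st->vfmmio_mmap_valid = true;
	*out_offset = offset;	/* plain assignment widens where unsigned long is 32-bit */
	return 0;
}

/* Returns true when a mapping existed and was torn down. */
static bool teardown_mmap(struct viommu_state *st)
{
	if (!st->vfmmio_mmap_valid)
		return false;

	st->vfmmio_mmap_valid = false;
	return true;
}
```

With an explicit flag, an offset of zero no longer looks like "nothing to destroy".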

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 10/26] iommu/amd: Assign IOMMU Private Address domain to IOMMU
  2026-05-28  5:17 ` [PATCH v2 10/26] iommu/amd: Assign IOMMU Private Address domain to IOMMU Suravee Suthikulpanit
@ 2026-06-01 12:59   ` Jason Gunthorpe
  0 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 12:59 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:22AM +0000, Suravee Suthikulpanit wrote:
> By setting the domain ID, pagetable mode, and IOMMU v1 page table in the
> IOMMU Device Table Entry (DTE) indexed using the device ID of the
> AMD IOMMU.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  drivers/iommu/amd/viommu.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/drivers/iommu/amd/viommu.c b/drivers/iommu/amd/viommu.c
> index 63360eef6b0d..14426649074f 100644
> --- a/drivers/iommu/amd/viommu.c
> +++ b/drivers/iommu/amd/viommu.c
> @@ -173,6 +173,35 @@ u64 amd_viommu_get_vfmmio_addr(struct amd_iommu *iommu, u16 gid)
>  }
>  EXPORT_SYMBOL(amd_viommu_get_vfmmio_addr);
>  
> +/* Set DTE for IOMMU device */
> +static void set_iommu_dte(struct amd_iommu *iommu)
> +{
> +	u64 dte0, dte1;
> +	u16 devid = iommu->devid;
> +	struct pt_iommu_amdv1_hw_info pt_info;
> +	struct protection_domain *pdom = iommu->viommu_pdom;
> +	struct dev_table_entry *dev_table = get_dev_table(iommu);
> +
> +	pt_iommu_amdv1_hw_info(&pdom->amdv1, &pt_info);
> +
> +	pr_debug("%s: host_pt_root=%#llx, mode=%#x\n",
> +		 __func__, pt_info.host_pt_root, pt_info.mode);
> +
> +	dte0 = FIELD_PREP(DTE_HOST_TRP, pt_info.host_pt_root >> 12);
> +	dte0 |= (pt_info.mode & DEV_ENTRY_MODE_MASK) << DEV_ENTRY_MODE_SHIFT;
> +	dte0 |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
> +
> +	dte1 = dev_table[devid].data[1];
> +	dte1 &= ~DTE_DOMID_MASK;
> +	dte1 |= pdom->id;

Why is this editing the DTE in place!?

The special DTE should be entirely deterministic. Call
amd_iommu_make_clear_dte(), and use amd_iommu_update_dte() to write
it like every other DTE.
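
A rough userspace sketch of "build the whole entry, then write it through one path". The struct, bit positions, and both helpers (mock_make_dte, mock_update_dte) are hypothetical stand-ins and do not reflect the real DTE layout or the real driver helpers:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 256-bit entry; field positions below are illustrative only. */
struct mock_dte {
	uint64_t data[4];
};

#define MOCK_DTE_V		(1ULL << 0)
#define MOCK_DTE_TV		(1ULL << 1)
#define MOCK_DOMID_MASK		0xffffULL

/* Build the entry from zero so no stale bits from the old entry survive. */
static struct mock_dte mock_make_dte(uint64_t pt_root, uint16_t domid)
{
	struct mock_dte dte;

	memset(&dte, 0, sizeof(dte));
	dte.data[0] = (pt_root & ~0xfffULL) | MOCK_DTE_V | MOCK_DTE_TV;
	dte.data[1] = domid & MOCK_DOMID_MASK;
	return dte;
}

/* Single write path standing in for the driver's common update helper. */
static void mock_update_dte(struct mock_dte *table, uint16_t devid,
			    const struct mock_dte *new_dte)
{
	table[devid] = *new_dte;
}
```

Because the entry is never read back and patched, its final contents depend only on the inputs, which is the deterministic behaviour asked for above.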

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 11/26] iommu/amd: Allocate and map vIOMMU private regions
  2026-05-28  5:17 ` [PATCH v2 11/26] iommu/amd: Allocate and map vIOMMU private regions Suravee Suthikulpanit
@ 2026-06-01 13:05   ` Jason Gunthorpe
  0 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 13:05 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:23AM +0000, Suravee Suthikulpanit wrote:

> --- a/drivers/iommu/amd/viommu.c
> +++ b/drivers/iommu/amd/viommu.c
> @@ -131,8 +131,66 @@ static int __init viommu_vf_vfcntl_init(struct amd_iommu *iommu)
>  	return -ENOMEM;
>  }
>  
> +static void *alloc_private_subregion(struct amd_iommu *iommu, u64 base, size_t size)
> +{
> +	int ret;
> +	void *region;
> +	int nid = iommu && iommu->dev ? dev_to_node(&iommu->dev->dev) : NUMA_NO_NODE;
> +
> +	region = (void *)iommu_alloc_pages_node_sz(nid, GFP_KERNEL | __GFP_ZERO, size);
> +	if (!region)
> +		return NULL;
> +
> +	ret = set_memory_uc((unsigned long)region, size >> PAGE_SHIFT);
> +	if (ret)
> +		goto err_out;

Why?

> +	ret = iommu_map(&iommu->viommu_pdom->domain, base,
> +			iommu_virt_to_phys(region), size,
> +			IOMMU_PROT_IR | IOMMU_PROT_IW, GFP_KERNEL);
> +
> +	if (ret)
> +		goto cleanup_mem_attr;
> +
> +	pr_debug("%s: base=%#llx, size=%#lx, subregion=%#llx(%#llx)\n",
> +		 __func__, base, size, (unsigned long long)region, iommu_virt_to_phys(region));
> +
> +	amd_iommu_flush_private_vm_region(iommu, iommu->viommu_pdom, base, size);

Why? Is there suddenly negative caching for this mode?

> +	return region;
> +cleanup_mem_attr:
> +	set_memory_wb((unsigned long)region, size >> PAGE_SHIFT);
> +err_out:
> +	iommu_free_pages(region);
> +	return NULL;
> +}
> +
> +static void viommu_private_space_uninit(struct amd_iommu *iommu)
> +{
> +	int i;
> +	struct iommu_domain *dom;
> +
> +	if (!iommu->viommu_pdom)
> +		return;
> +
> +	for (i = 0; i < VIOMMU_PRIV_SUBREGION_CNT; i++) {
> +		if (!iommu->viommu_priv_region[i])
> +			continue;
> +		set_memory_wb((unsigned long)iommu->viommu_priv_region[i],
> +			      VIOMMU_PRIV_SUBREGION_SIZE >> PAGE_SHIFT);
> +		iommu_free_pages(iommu->viommu_priv_region[i]);
> +		iommu->viommu_priv_region[i] = NULL;
> +	}
> +
> +	dom = &iommu->viommu_pdom->domain;
> +	amd_iommu_domain_free(dom);
> +	iommu->viommu_pdom = NULL;
> +}

Shouldn't something flush the DID before freeing the domain?

>  static int viommu_private_space_init(struct amd_iommu *iommu)
>  {
> +	int i;
> +	u64 base;
>  	struct iommu_domain *dom;
>  	struct protection_domain *pdom;
>  	struct pt_iommu_amdv1_hw_info pt_info;
> @@ -144,22 +202,33 @@ static int viommu_private_space_init(struct amd_iommu *iommu)
>  	dom = amd_iommu_domain_alloc_paging_v1(&iommu->dev->dev, 0);
>  	if (!dom) {
>  		pr_err("%s: Failed to initialize private space\n", __func__);
> -		goto err_out;
> +		return -ENOMEM;
>  	}
>  
>  	pdom = to_pdomain(dom);
>  	iommu->viommu_pdom = pdom;
>  
> +	/*
> +	 * Each private region requires to 8MB of memory to be allocated
> +	 * and mapped. Split the region into 4 x 2MB-subregion.
> +	 */
> +	for (i = 0; i < VIOMMU_PRIV_SUBREGION_CNT; i++) {
> +		base = VIOMMU_PRIV_REGION_BASE + (i * VIOMMU_PRIV_SUBREGION_SIZE);
> +		iommu->viommu_priv_region[i] = alloc_private_subregion(iommu, base,
> +								       VIOMMU_PRIV_SUBREGION_SIZE);
> +		if (!iommu->viommu_priv_region[i]) {
> +			pr_err("%s: Failed to allocate vIOMMU private subregion %d\n", __func__, i);
> +			viommu_private_space_uninit(iommu);
> +			return -ENOMEM;
> +		}
> +	}
> +
>  	pt_iommu_amdv1_hw_info(&pdom->amdv1, &pt_info);
>  	pr_debug("%s: devid=%#x, pte_root=%#llx\n",
>  		 __func__, iommu->devid,
>  		 (unsigned long long)pt_info.host_pt_root);
>  
>  	return 0;
> -err_out:
> -	if (dom)
> -		amd_iommu_domain_free(dom);
> -	return -ENOMEM;

Why is the error handling being deleted now? You should organize your
patches to avoid churn like this.

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers
  2026-05-28  5:17 ` [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers Suravee Suthikulpanit
  2026-05-30 20:44   ` Weinan Liu
@ 2026-06-01 13:08   ` Jason Gunthorpe
  2026-06-01 13:11   ` Jason Gunthorpe
  2026-06-01 18:16   ` Weinan Liu
  3 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 13:08 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:24AM +0000, Suravee Suthikulpanit wrote:
> +static int __maybe_unused alloc_private_vm_region(struct amd_iommu *iommu, u64 **entry,
> +						 u64 base, size_t size, u16 gid)
> +{
> +	int ret;
> +	u64 addr = base + (gid * size);
> +	int nid = iommu && iommu->dev ? dev_to_node(&iommu->dev->dev) : NUMA_NO_NODE;
> +
> +	*entry = (void *)iommu_alloc_pages_node_sz(nid, GFP_KERNEL | __GFP_ZERO, size);
> +	if (!*entry)
> +		return -ENOMEM;

No cast

> +	ret = set_memory_uc((unsigned long)*entry, size >> PAGE_SHIFT);
> +	if (ret)
> +		goto err_out;
> +
> +	pr_debug("%s: entry=%#llx(%#llx), addr=%#llx, size=%#lx\n", __func__,
> +		 (unsigned long  long)*entry, iommu_virt_to_phys(*entry), addr, size);
> +
> +	ret = iommu_map(&iommu->viommu_pdom->domain, addr,
> +			iommu_virt_to_phys(*entry), size,
> +			IOMMU_PROT_IR | IOMMU_PROT_IW, GFP_KERNEL);
> +	if (ret)
> +		goto cleanup_mem_attr;
> +
> +	return amd_iommu_flush_private_vm_region(iommu, iommu->viommu_pdom, addr, size);
> +cleanup_mem_attr:
> +	set_memory_wb((unsigned long)*entry, size >> PAGE_SHIFT);
> +err_out:
> +	iommu_free_pages(*entry);
> +	*entry = NULL;
> +	return ret;
> +}

This all seems duplicated from the prior patch, you should try to make
one helper that does this alloc and map sequence.

> +static void __maybe_unused free_private_vm_region(struct amd_iommu *iommu, u64 **entry,
> +						  u64 base, size_t size, u16 gid)
> +{
> +	size_t unmapped;
> +	u64 addr = base + (gid * size);
> +
> +	pr_debug("%s: entry=%#llx(%#llx), base=%#llx, addr=%#llx, size=%#lx\n",
> +		 __func__, (unsigned long  long)*entry,
> +		 iommu_virt_to_phys(*entry), base, addr, size);
> +
> +	if (!iommu || !iommu->viommu_pdom)
> +		return;
> +
> +	unmapped = iommu_unmap(&iommu->viommu_pdom->domain, addr, size);
> +	if (unmapped != size)
> +		pr_warn("%s: unmapped %#zx of %#lx at %#llx\n", __func__, unmapped, size, addr);

No flush? Or does the gather work for the special domain?

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers
  2026-05-28  5:17 ` [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers Suravee Suthikulpanit
  2026-05-30 20:44   ` Weinan Liu
  2026-06-01 13:08   ` Jason Gunthorpe
@ 2026-06-01 13:11   ` Jason Gunthorpe
  2026-06-01 18:16   ` Weinan Liu
  3 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 13:11 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:24AM +0000, Suravee Suthikulpanit wrote:
> Export amd_iommu_iotlb_sync() and use it for nested domain
> iotlb_sync so nested attach paths can flush gathered IOTLB state.

[ ..]

> --- a/drivers/iommu/amd/nested.c
> +++ b/drivers/iommu/amd/nested.c
> @@ -291,4 +291,5 @@ static void nested_domain_free(struct iommu_domain *dom)
>  static const struct iommu_domain_ops nested_domain_ops = {
>  	.attach_dev = nested_attach_device,
>  	.free = nested_domain_free,
> +	.iotlb_sync = amd_iommu_iotlb_sync,
>  };


I don't get it, what is all of this for?

iotlb_sync should never be called on an IOMMU_DOMAIN_NESTED.

"nested attach paths" should never have gathered IOTLB state.

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support
  2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
                   ` (25 preceding siblings ...)
  2026-05-28  5:17 ` [PATCH v2 26/26] iommu/amd: Introduce logic to check and enable vIOMMU feature Suravee Suthikulpanit
@ 2026-06-01 13:30 ` Jason Gunthorpe
  2026-06-03  4:13   ` Suthikulpanit, Suravee
  2026-06-03  6:41   ` Suthikulpanit, Suravee
  26 siblings, 2 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 13:30 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:12AM +0000, Suravee Suthikulpanit wrote:

> [1] IOMMU Specification: https://docs.amd.com/v/u/en-US/48882_3.10_PUB

This link doesn't work

And I'm not sure the older PDF has everything, I haven't figured out
where in the spec this TransDevID stuff comes from

> [2] Linux git tree: https://github.com/AMDESE/linux-iommu/tree/linux-7.1.0-rc4-amd-viommu_upstream_v2

Build test all the kconfig combinations, I noticed problems.

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 18/26] iommu/amd: Add translation DTE and VFctrl TransDevID helpers
  2026-05-28  5:17 ` [PATCH v2 18/26] iommu/amd: Add translation DTE and VFctrl TransDevID helpers Suravee Suthikulpanit
@ 2026-06-01 13:31   ` Jason Gunthorpe
  0 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 13:31 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:30AM +0000, Suravee Suthikulpanit wrote:

> +void amd_iommu_set_translate_dte(struct amd_iommu *iommu, u16 gid,
> +				 struct protection_domain *pdom,
> +				 u32 devid)
> +{
> +	u64 tmp0 = 0ULL, tmp1 = 0ULL;
> +	struct pt_iommu_amdv1_hw_info pt_info;
> +	struct dev_table_entry *dev_table = get_dev_table(iommu);
> +
> +	pt_iommu_amdv1_hw_info(&pdom->amdv1, &pt_info);
> +
> +	pr_debug("%s: gid=%#x, iommu_devid=%#x, devid=%#x, host_pt_root=%#llx, mode=%#x\n",
> +		 __func__, gid, iommu->devid, devid, pt_info.host_pt_root, pt_info.mode);
> +
> +	/* Setup DTE for v1 page table at the offset specified by devid */
> +	tmp0 |=	FIELD_PREP(DTE_HOST_TRP, pt_info.host_pt_root >> 12);
> +	tmp0 |= FIELD_PREP(DTE_MODE_MASK, pt_info.mode);
> +	tmp0 |= (DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_TV | DTE_FLAG_V);
> +	tmp1 |= FIELD_PREP(DTE_DOMID_MASK, pdom->id);
> +
> +	dev_table[devid].data[0] = tmp0;
> +	dev_table[devid].data[1] = tmp1;

Please stop editing DTEs in place.

This looks like it is writing some otherwise unused DTE, it should
fully initialize it and use the normal flow to write it.

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 21/26] iommu/amd: Map kvmfd to shared translate device ID for vIOMMU
  2026-05-28  5:17 ` [PATCH v2 21/26] iommu/amd: Map kvmfd to shared translate device ID for vIOMMU Suravee Suthikulpanit
@ 2026-06-01 13:35   ` Jason Gunthorpe
  0 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 13:35 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:33AM +0000, Suravee Suthikulpanit wrote:
> Add a refcounted per-PCI-segment kvmfd xarray,
> amd_iommu_kvmfd_trans_entry, so all vIOMMUs for one VM share one
> translate-device-id and GPA->SPA DTE.

What? There should be no KVM in the viommu implementation.

I can't understand why you did this.

AFAICT the HW needs some unused devid "TransDevID" for some
purpose. This is pretty sketchy, but OK..

The ID should be affiliated with every nest parent domain, or maybe
every vIOMMU.

It has nothing to do with KVM. You can't assume anything about how
iommufd is constructing the viommu based on the KVM.

The only acceptable use of KVM in iommufd land is to get some CPU
specific shared information which this is not doing.

So no to any KVM stuff in this series.

In an AMD system there should be exactly one viommu per kvm per
physical instance anyhow.

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 22/26] iommufd: Add hw_queue_init and split queue alloc paths
  2026-05-28  5:17 ` [PATCH v2 22/26] iommufd: Add hw_queue_init and split queue alloc paths Suravee Suthikulpanit
  2026-05-29  0:14   ` Nicolin Chen
@ 2026-06-01 13:38   ` Jason Gunthorpe
  1 sibling, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 13:38 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Thu, May 28, 2026 at 05:17:34AM +0000, Suravee Suthikulpanit wrote:
> Add iommufd_viommu_ops.hw_queue_init for vIOMMU backends whose
> hardware uses a guest physical queue base from hw_queue->base_addr
> instead of a host physical address.
> 
> Previously, HW queue alloc always went through
> iommufd_hw_queue_alloc_phys(), an iommufd_access, and
> hw_queue_init_phys(base_pa). AMD vIOMMU instead takes GPA from
> userspace in hw_queue->base_addr and programs hardware without host PA
> resolution. Splitting helpers and dispatching from the ioctl keeps
> one uAPI while making the contract explicit. Each vIOMMU driver should
> implement only one of hw_queue_init_phys or hw_queue_init.
> 
> Refactor iommufd_hw_queue_alloc_ioctl() so shared validation and
> viommu lookup stay in the ioctl, while setup is delegated to
> _iommufd_hw_queue_init_phys() or _iommufd_hw_queue_init().

This patch series is getting pretty big, getting the viommu working
alone is enough for one series, I would split getting the direct
assigned queues working to a following series

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers
  2026-05-28  5:17 ` [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers Suravee Suthikulpanit
                     ` (2 preceding siblings ...)
  2026-06-01 13:11   ` Jason Gunthorpe
@ 2026-06-01 18:16   ` Weinan Liu
  3 siblings, 0 replies; 51+ messages in thread
From: Weinan Liu @ 2026-06-01 18:16 UTC (permalink / raw)
  To: suravee.suthikulpanit
  Cc: chriscli, dantuluris, iommu, jay.chen, jgg, jon.grimm, joro,
	kevin.tian, kpsingh, linux-kernel, nicolinc, sairaj.arunkodilkar,
	santosh.shukla, vasant.hegde, wnliu, wvw, yi.l.liu

On Mon, Jun 1, 2026 at 6:11 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Thu, May 28, 2026 at 05:17:24AM +0000, Suravee Suthikulpanit wrote:
> > Export amd_iommu_iotlb_sync() and use it for nested domain
> > iotlb_sync so nested attach paths can flush gathered IOTLB state.
>
> [ ..]
>
> > --- a/drivers/iommu/amd/nested.c
> > +++ b/drivers/iommu/amd/nested.c
> > @@ -291,4 +291,5 @@ static void nested_domain_free(struct iommu_domain *dom)
> >  static const struct iommu_domain_ops nested_domain_ops = {
> >       .attach_dev = nested_attach_device,
> >       .free = nested_domain_free,
> > +     .iotlb_sync = amd_iommu_iotlb_sync,
> >  };
>
>
> I don't get it, what is all of this for?
>
> iotlb_sync should never be called on an IOMMU_DOMAIN_NESTED.
>
> "nested attach paths" should never have gathered IOTLB state.
>
> Jason

BTW, amd_iommu_iotlb_sync() is designed to flush a protection_domain and cannot be used for a nested_domain.

Specifically, the current implementation explicitly casts the domain to a protection_domain:

```
void amd_iommu_iotlb_sync(struct iommu_domain *domain,
                          struct iommu_iotlb_gather *gather)
{
        struct protection_domain *dom = to_pdomain(domain);
```

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 22/26] iommufd: Add hw_queue_init and split queue alloc paths
  2026-05-29  0:14   ` Nicolin Chen
@ 2026-06-03  0:30     ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 51+ messages in thread
From: Suthikulpanit, Suravee @ 2026-06-03  0:30 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: linux-kernel, iommu, joro, jgg, yi.l.liu, kevin.tian,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh



On 5/29/2026 7:14 AM, Nicolin Chen wrote:
> On Thu, May 28, 2026 at 05:17:34AM +0000, Suravee Suthikulpanit wrote:
>> Splitting helpers and dispatching from the ioctl keeps
>> one uAPI while making the contract explicit.
> 
> Why split? Those two new ioctl functions look quite redundant..
> 
> I imagined that the original ioctl function just needed:
> [...]
> -	struct iommufd_access *access;
> +	struct iommufd_access *access = NULL;
> [...]
>   	if (!viommu->ops || !viommu->ops->get_hw_queue_size ||
> -	    !viommu->ops->hw_queue_init_phys) {
> +	    (!viommu->ops->hw_queue_init_phys && !viommu->ops->hw_queue_init)) {
> 		rc = -EOPNOTSUPP;
> 		goto out_put_viommu;
> 	}
> [...]
> -	access = iommufd_hw_queue_alloc_phys(cmd, viommu, &base_pa);
> -	if (IS_ERR(access)) {
> -		rc = PTR_ERR(access);
> -		goto out_put_viommu;
> -	}
> +	if (viommu->ops->hw_queue_init_phys) {
> +		access = iommufd_hw_queue_alloc_phys(cmd, viommu, &base_pa);
> +		if (IS_ERR(access)) {
> +			rc = PTR_ERR(access);
> +			goto out_put_viommu;
> +		}
> +	}
> [...]
> 	rc = viommu->ops->hw_queue_init_phys(hw_queue, cmd->index, base_pa);
> + 	if (viommu->ops->hw_queue_init_phys)
> +		rc = viommu->ops->hw_queue_init_phys(hw_queue, cmd->index,
> +						     base_pa);
> +	else
> +		rc = viommu->ops->hw_queue_init(hw_queue, cmd->index);
> 
> and then it should work?

Good point. I'll clean this up and send out a new version.
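
For reference, the single-path dispatch suggested above could be sketched in plain userspace C roughly as follows. The mock_* types and helpers, and the fixed address translation, are hypothetical; only the two op names come from the thread:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queue object: base_addr is what userspace passed in. */
struct mock_queue {
	uint64_t base_addr;
	uint64_t programmed;	/* address the backend ended up programming */
};

/* A backend implements exactly one of the two ops. */
struct mock_ops {
	int (*hw_queue_init_phys)(struct mock_queue *q, uint32_t index, uint64_t base_pa);
	int (*hw_queue_init)(struct mock_queue *q, uint32_t index);
};

/* Hypothetical translation standing in for the host PA lookup step. */
static uint64_t mock_resolve_pa(uint64_t base_addr)
{
	return base_addr + 0x100000000ULL;
}

/* One alloc path: resolve a host PA only when the backend asks for one. */
static int mock_queue_alloc(const struct mock_ops *ops, struct mock_queue *q,
			    uint32_t index)
{
	if (!ops || (!ops->hw_queue_init_phys && !ops->hw_queue_init))
		return -EOPNOTSUPP;

	if (ops->hw_queue_init_phys)
		return ops->hw_queue_init_phys(q, index, mock_resolve_pa(q->base_addr));

	return ops->hw_queue_init(q, index);
}

/* Backend that wants a host physical address. */
static int mock_init_phys(struct mock_queue *q, uint32_t index, uint64_t base_pa)
{
	(void)index;
	q->programmed = base_pa;
	return 0;
}

/* Backend that programs the guest address from base_addr directly. */
static int mock_init_gpa(struct mock_queue *q, uint32_t index)
{
	(void)index;
	q->programmed = q->base_addr;
	return 0;
}
```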

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 25/26] iommu/amd: Handle set/get command for AMD vIOMMU
  2026-05-29 23:20   ` Nicolin Chen
@ 2026-06-03  3:53     ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 51+ messages in thread
From: Suthikulpanit, Suravee @ 2026-06-03  3:53 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: linux-kernel, iommu, joro, jgg, yi.l.liu, kevin.tian,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh



On 5/30/2026 6:20 AM, Nicolin Chen wrote:
> On Thu, May 28, 2026 at 05:17:37AM +0000, Suravee Suthikulpanit wrote:
>> Guest kernel programs guest Command Buffer, Event Log, and PPR Log
>> (a.k.a hardware queue) settings via guest control MMIO register
>> (guest MMIO offset 0x18). Accesses to the register are trapped by VMM (QEMU)
>> and information is passed as IOMMU_VIOMMU_OPTION to host IOMMU driver via
>> struct iommufd_viommu_ops set_command() and get_command().
>>
>> Provides AMD IOMMU driver hooks to handle set/get operations for the
>> guest control MMIO register, which uses key parameter as AMD IOMMU MMIO
>> offset. The value parameter contains the value of the corresponding guest
>> MMIO register, which is converted to the format of AMD vIOMMU VF Control
>> MMIO registers then programmed into the hardware.
> 
> So the whole set_command/get_command thing is to MMIO a control
> register? I doubt this is a right ioctl(s) as it can be abused.

Correct. As part of the AMD vIOMMU architecture, MMIO registers in the
1st 4K range are control registers, which can be grouped into Command
Buffer, Event Log buffer, and PPR Log buffer. Accesses to these are
trapped by QEMU and communicated to the AMD IOMMU driver to configure
the vIOMMU hardware.

> Looking at the details:
> 
>> +	switch (offset) {
>> +	case MMIO_CONTROL_OFFSET:
>> +	{
>> +		/* VFCTRL offset 20h */
>> +		val = readq(vfctrl + 0x20);
>> +		val &= ~(0x3ULL << 8);
>> +		tmp = GET_CTRL_BITS(ctrl, CONTROL_CMDBUF_EN, 1); // [12]
>> +		val |= (tmp << 8);
> 
> This is to enable a hw_queue. So it could be marked when a hw_queue
> is allocated; IMHO, VMM should only allocate the hw_queue, when the
> guest really uses (enables) the buffer.

There is more than one register controlling each buffer. For some of
them we can assume they will be set last (e.g. the CONTROL_CMDBUF_EN,
CONTROL_EVT_LOG_EN, CONTROL_PPRLOG_EN bits), and use them to trigger
hw_queue allocation.

My original thought was to be explicit and separate the programming into:
   * Guest sets up the buffer base/size -> IOMMUFD allocates the hw_queue
   * Guest enables/disables the buffer -> IOMMUFD enables/disables the
     control bits of each buffer.

Combining these steps into just hw_queue init/destroy might be okay.

> 
>> +		tmp = GET_CTRL_BITS(ctrl, CONTROL_COMWAIT_EN, 1); // [4]
>> +		val |= (tmp << 9);
>> +		writeq(val, vfctrl + 0x20);
> 
> The spec suggests setting this with CmdBufEn, so it may be a vendor
> (or hw_queue type) specific flag in struct iommu_hw_queue_alloc?
> 
> And similar comments to the event log and pri log buffers as well.

Lemme see if we can integrate this for all hw_queue.

> Overall, VMM should control the timings of allocating these viommu
> hw_queue objects; then kernel would just enable them accordingly.
> 
> Nicolin
> 

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support
  2026-06-01 13:30 ` [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Jason Gunthorpe
@ 2026-06-03  4:13   ` Suthikulpanit, Suravee
  2026-06-03  6:41   ` Suthikulpanit, Suravee
  1 sibling, 0 replies; 51+ messages in thread
From: Suthikulpanit, Suravee @ 2026-06-03  4:13 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh



On 6/1/2026 8:30 PM, Jason Gunthorpe wrote:
> On Thu, May 28, 2026 at 05:17:12AM +0000, Suravee Suthikulpanit wrote:
> 
>> [1] IOMMU Specification: https://docs.amd.com/v/u/en-US/48882_3.10_PUB
> 
> This link doesn't work

The AMD documentation team recently updated the doc and removed the old
link :( Here is the most recent one.

https://docs.amd.com/v/u/en-US/48882_3.11_IOMMU_PUB

> And I'm not sure the older PDF has everything, I haven't figured out
> where in the spec this TransDevID stuff comes from

The term TransDevID is not mentioned in the spec. It's basically what
we need to program into the VFCntlMMIO Offset {16'b[GuestID],
6'b01_0000} Guest Miscellaneous Control Register bits [31:16] (DeviceID:
The system deviceID which has the entire guest GPA to SPA translation).

The hardware expects the driver to set up an unused DTE with the
translation (Host Page Table Root Pointer [31:12], Mode[2:0], TV, V
bits). So, I am referring to the device ID used as the index of that
unused DTE as TransDevID.
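
As a worked example of the arithmetic above, a small sketch assuming the braces denote bit concatenation (GuestID above a 6-bit field of 0b01_0000) and assuming the register is accessed as 64 bits. Both function names are hypothetical:

```c
#include <stdint.h>

/*
 * {16'b[GuestID], 6'b01_0000}: GuestID sits above a 6-bit field holding
 * 0b01_0000 (0x10), so each guest gets a 64-byte register window.
 */
static uint32_t guest_misc_ctrl_offset(uint16_t gid)
{
	return ((uint32_t)gid << 6) | 0x10;
}

/* Place a device ID into bits [31:16], leaving all other bits untouched. */
static uint64_t set_trans_devid(uint64_t reg, uint16_t devid)
{
	reg &= ~(0xffffULL << 16);
	return reg | ((uint64_t)devid << 16);
}
```

Under those assumptions, guest ID 1 maps to offset 0x50, and writing device ID 0x1234 into a register holding 0xdeadbeef yields 0x1234beef.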

Suravee


>> [2] Linux git tree: https://github.com/AMDESE/linux-iommu/tree/linux-7.1.0-rc4-amd-viommu_upstream_v2
> 
> Build test all the kconfig combinations, I noticed problems.
> 
> Jason


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support
  2026-06-01 13:30 ` [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Jason Gunthorpe
  2026-06-03  4:13   ` Suthikulpanit, Suravee
@ 2026-06-03  6:41   ` Suthikulpanit, Suravee
  2026-06-03 12:13     ` Jason Gunthorpe
  2026-06-05  8:41     ` Suthikulpanit, Suravee
  1 sibling, 2 replies; 51+ messages in thread
From: Suthikulpanit, Suravee @ 2026-06-03  6:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh



On 6/1/2026 8:30 PM, Jason Gunthorpe wrote:
>> [2] Linux git tree: https://github.com/AMDESE/linux-iommu/tree/linux-7.1.0-rc4-amd-viommu_upstream_v2
> Build test all the kconfig combinations, I noticed problems.
> 
> Jason

I have build tested w/:
- CONFIG_IOMMUFD (m/n)
- CONFIG_IOMMUFD (m) + CONFIG_AMD_IOMMU_IOMMUFD (y/n)

And they all have passed. Could you elaborate on "all the kconfig 
combinations"?

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support
  2026-06-03  6:41   ` Suthikulpanit, Suravee
@ 2026-06-03 12:13     ` Jason Gunthorpe
  2026-06-05  8:41     ` Suthikulpanit, Suravee
  1 sibling, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2026-06-03 12:13 UTC (permalink / raw)
  To: Suthikulpanit, Suravee
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh

On Wed, Jun 03, 2026 at 01:41:34PM +0700, Suthikulpanit, Suravee wrote:
> 
> 
> On 6/1/2026 8:30 PM, Jason Gunthorpe wrote:
> > > [2] Linux git tree: https://github.com/AMDESE/linux-iommu/tree/linux-7.1.0-rc4-amd-viommu_upstream_v2
> > Build test all the kconfig combinations, I noticed problems.
> > 
> > Jason
> 
> I have build tested w/:
> - CONFIG_IOMMUFD (m/n)
> - CONFIG_IOMMUFD (m) + CONFIG_AMD_IOMMU_IOMMUFD (y/n)
> 
> And they all have passed. Could you elaborate on "all the kconfig
> combinations"?

I don't remember, just my default built kconfig didn't work, there was
a stray ; in one of the #ifdef'd off blocks

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support
  2026-06-03  6:41   ` Suthikulpanit, Suravee
  2026-06-03 12:13     ` Jason Gunthorpe
@ 2026-06-05  8:41     ` Suthikulpanit, Suravee
  1 sibling, 0 replies; 51+ messages in thread
From: Suthikulpanit, Suravee @ 2026-06-05  8:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh



On 6/3/2026 1:41 PM, Suthikulpanit, Suravee wrote:
> 
> 
> On 6/1/2026 8:30 PM, Jason Gunthorpe wrote:
>>> [2] Linux git tree: https://github.com/AMDESE/linux-iommu/tree/linux-7.1.0-rc4-amd-viommu_upstream_v2
>> Build test all the kconfig combinations, I noticed problems.
>>
>> Jason
> 
> I have build tested w/:
> - CONFIG_IOMMUFD (m/n)
> - CONFIG_IOMMUFD (m) + CONFIG_AMD_IOMMU_IOMMUFD (y/n)
> 
> And they all have passed. Could you elaborate on "all the kconfig 
> combinations"?
> 
> Thanks,
> Suravee

I found the issue. Please ignore this.

Suravee

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 03/26] iommu/amd: Detect and initialize AMD vIOMMU feature
  2026-06-01 12:43   ` Jason Gunthorpe
@ 2026-06-05  8:45     ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 51+ messages in thread
From: Suthikulpanit, Suravee @ 2026-06-05  8:45 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh



On 6/1/2026 7:43 PM, Jason Gunthorpe wrote:
> On Thu, May 28, 2026 at 05:17:15AM +0000, Suravee Suthikulpanit wrote:
>> The feature is advertised w/ EFR[VIOMMUSup]. Please see the AMD IOMMU
>> specification[1] for more detail.
>>
>> Introduce a new global variable amd_iommu_viommu, which is used to
>> control the feature enablement in the driver. Currently, the feature
>> is disabled by default. Once the feature is fully supported, it will be
>> enabled by default, along with a command-line option to disable it
>> if needed.
> 
> Still no to command line option. Just don't use iommufd if you don't
> want to use it. It must have no cost until iommufd activates it.
> 
> Jason

Noted. Just do not set CONFIG_AMD_IOMMU_IOMMUFD to remove viommu support.

Suravee


* Re: [PATCH v2 04/26] iommu/amd: Introduce IOMMUFD vIOMMU support for AMD
  2026-06-01 12:44   ` Jason Gunthorpe
@ 2026-06-05  8:55     ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 51+ messages in thread
From: Suthikulpanit, Suravee @ 2026-06-05  8:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-kernel, iommu, joro, yi.l.liu, kevin.tian, nicolinc,
	vasant.hegde, jon.grimm, santosh.shukla, sairaj.arunkodilkar,
	jay.chen, wvw, wnliu, dantuluris, chriscli, kpsingh



On 6/1/2026 7:44 PM, Jason Gunthorpe wrote:
> On Thu, May 28, 2026 at 05:17:16AM +0000, Suravee Suthikulpanit wrote:
> 
>> +/**
>> + * struct iommu_viommu_amd - AMD vIOMMU Interface (IOMMU_VIOMMU_TYPE_AMD)
>> + * @reserved: Must be zero
>> + */
>> +struct iommu_viommu_amd {
>> +	__u32 reserved; /* must be last */
>> +};
> 
> Do not structure patches like this, introduce the full and complete
> ABI structure here.
> 
> Jason

Noted. Thanks.
Suravee


end of thread, other threads:[~2026-06-05  8:55 UTC | newest]

Thread overview: 51+ messages
2026-05-28  5:17 [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 01/26] iommu/amd: Make amd_iommu_completion_wait() non-static Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 02/26] iommu/amd: Introduce vIOMMU-specific events and event Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 03/26] iommu/amd: Detect and initialize AMD vIOMMU feature Suravee Suthikulpanit
2026-06-01 12:43   ` Jason Gunthorpe
2026-06-05  8:45     ` Suthikulpanit, Suravee
2026-05-28  5:17 ` [PATCH v2 04/26] iommu/amd: Introduce IOMMUFD vIOMMU support for AMD Suravee Suthikulpanit
2026-06-01 12:44   ` Jason Gunthorpe
2026-06-05  8:55     ` Suthikulpanit, Suravee
2026-05-28  5:17 ` [PATCH v2 05/26] iommu/amd: Allocate Guest IDs for IOMMUFD vIOMMU instances Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 06/26] iommu/amd: Map vIOMMU VF and VF Control MMIO BARs Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 07/26] iommu/amd: Add support for AMD vIOMMU VF MMIO region Suravee Suthikulpanit
2026-06-01 12:51   ` Jason Gunthorpe
2026-05-28  5:17 ` [PATCH v2 08/26] iommu/amd: Introduce Reset vMMIO Command Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 09/26] iommu/amd: Introduce domain for IOMMU Private Address (IPA) region Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 10/26] iommu/amd: Assign IOMMU Private Address domain to IOMMU Suravee Suthikulpanit
2026-06-01 12:59   ` Jason Gunthorpe
2026-05-28  5:17 ` [PATCH v2 11/26] iommu/amd: Allocate and map vIOMMU private regions Suravee Suthikulpanit
2026-06-01 13:05   ` Jason Gunthorpe
2026-05-28  5:17 ` [PATCH v2 12/26] iommu/amd: Add per-VM private IPA alloc/map helpers Suravee Suthikulpanit
2026-05-30 20:44   ` Weinan Liu
2026-06-01 13:08   ` Jason Gunthorpe
2026-06-01 13:11   ` Jason Gunthorpe
2026-06-01 18:16   ` Weinan Liu
2026-05-28  5:17 ` [PATCH v2 13/26] iommu/amd: Add helper functions to manage DevID / DomID mapping tables Suravee Suthikulpanit
2026-05-30 21:26   ` Weinan Liu
2026-05-28  5:17 ` [PATCH v2 14/26] iommu/amd: Introduce IOMMUFD vDevice support for AMD Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 15/26] iommu/amd: Introduce helper function for updating domain ID mapping table Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 16/26] iommu/amd: Introduce helper function for updating device " Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 17/26] iommu/amd: Pass KVM FD from userspace when initializing vIOMMU Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 18/26] iommu/amd: Add translation DTE and VFctrl TransDevID helpers Suravee Suthikulpanit
2026-06-01 13:31   ` Jason Gunthorpe
2026-05-28  5:17 ` [PATCH v2 19/26] iommu/amd: Add per-segment translate device ID pool Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 20/26] iommu/amd: Reserve translate-device-id for PCI requestor aliases Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 21/26] iommu/amd: Map kvmfd to shared translate device ID for vIOMMU Suravee Suthikulpanit
2026-06-01 13:35   ` Jason Gunthorpe
2026-05-28  5:17 ` [PATCH v2 22/26] iommufd: Add hw_queue_init and split queue alloc paths Suravee Suthikulpanit
2026-05-29  0:14   ` Nicolin Chen
2026-06-03  0:30     ` Suthikulpanit, Suravee
2026-06-01 13:38   ` Jason Gunthorpe
2026-05-28  5:17 ` [PATCH v2 23/26] iommu/amd: Add support for vIOMMU HW queues initialization Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 24/26] iommufd: Introduce vIOMMU command via VIOMMU_COMMAND ioctl Suravee Suthikulpanit
2026-05-28  5:17 ` [PATCH v2 25/26] iommu/amd: Handle set/get command for AMD vIOMMU Suravee Suthikulpanit
2026-05-29 23:20   ` Nicolin Chen
2026-06-03  3:53     ` Suthikulpanit, Suravee
2026-05-28  5:17 ` [PATCH v2 26/26] iommu/amd: Introduce logic to check and enable vIOMMU feature Suravee Suthikulpanit
2026-06-01 13:30 ` [PATCH v2 00/26] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support Jason Gunthorpe
2026-06-03  4:13   ` Suthikulpanit, Suravee
2026-06-03  6:41   ` Suthikulpanit, Suravee
2026-06-03 12:13     ` Jason Gunthorpe
2026-06-05  8:41     ` Suthikulpanit, Suravee
