* [PATCH v2 01/19] memory: Add permissions in IOMMUAccessFlags
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 02/19] intel_iommu: Declare supported PASID size CLEMENT MATHIEU--DRIF
` (19 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
This will be necessary for devices implementing ATS.
We also define a new macro IOMMU_ACCESS_FLAG_FULL in addition to
IOMMU_ACCESS_FLAG to support more access flags.
IOMMU_ACCESS_FLAG is kept for convenience and backward compatibility.
The added flags (defined by the PCIe 5 specification) are:
- Execute Requested
- Privileged Mode Requested
- Global
- Untranslated Only
IOMMU_ACCESS_FLAG sets the additional flags to 0.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
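A minimal sketch of how a caller might combine the new flags. The enum and
macros are copied from this patch so the snippet compiles on its own; the two
example_* helpers are hypothetical and are not part of the series.

```c
#include <stdbool.h>

/* Local copy of the flags from this patch, so the sketch is standalone. */
typedef enum {
    IOMMU_NONE = 0,
    IOMMU_RO = 1,
    IOMMU_WO = 2,
    IOMMU_RW = 3,
    IOMMU_EXEC = 4,
    IOMMU_PRIV = 8,
    IOMMU_GLOBAL = 16,
    IOMMU_UNTRANSLATED_ONLY = 32,
} IOMMUAccessFlags;

#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | \
                                 ((w) ? IOMMU_WO : 0))
#define IOMMU_ACCESS_FLAG_FULL(r, w, x, p, g, uo) \
    (IOMMU_ACCESS_FLAG(r, w) | \
     ((x) ? IOMMU_EXEC : 0) | \
     ((p) ? IOMMU_PRIV : 0) | \
     ((g) ? IOMMU_GLOBAL : 0) | \
     ((uo) ? IOMMU_UNTRANSLATED_ONLY : 0))

/* Hypothetical helper: flags for a read request that also asks for execute. */
static inline int example_exec_request_flags(void)
{
    return IOMMU_ACCESS_FLAG_FULL(true, false, true, false, false, false);
}

/* Hypothetical helper: true if every requested bit is set in granted. */
static inline bool example_perm_allows(int granted, int requested)
{
    return (granted & requested) == requested;
}
```

Existing callers of IOMMU_ACCESS_FLAG(r, w) keep producing values in the
0..3 range, since the four new bits stay clear.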
include/exec/memory.h | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 3ee1901b52..56c3a3515e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -110,15 +110,34 @@ struct MemoryRegionSection {
typedef struct IOMMUTLBEntry IOMMUTLBEntry;
-/* See address_space_translate: bit 0 is read, bit 1 is write. */
+/*
+ * See address_space_translate:
+ * - bit 0 : read
+ * - bit 1 : write
+ * - bit 2 : exec
+ * - bit 3 : priv
+ * - bit 4 : global
+ * - bit 5 : untranslated only
+ */
typedef enum {
IOMMU_NONE = 0,
IOMMU_RO = 1,
IOMMU_WO = 2,
IOMMU_RW = 3,
+ IOMMU_EXEC = 4,
+ IOMMU_PRIV = 8,
+ IOMMU_GLOBAL = 16,
+ IOMMU_UNTRANSLATED_ONLY = 32,
} IOMMUAccessFlags;
-#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ? IOMMU_WO : 0))
+#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | \
+ ((w) ? IOMMU_WO : 0))
+#define IOMMU_ACCESS_FLAG_FULL(r, w, x, p, g, uo) \
+ (IOMMU_ACCESS_FLAG(r, w) | \
+ ((x) ? IOMMU_EXEC : 0) | \
+ ((p) ? IOMMU_PRIV : 0) | \
+ ((g) ? IOMMU_GLOBAL : 0) | \
+ ((uo) ? IOMMU_UNTRANSLATED_ONLY : 0))
struct IOMMUTLBEntry {
AddressSpace *target_as;
--
2.47.1
* [PATCH v2 02/19] intel_iommu: Declare supported PASID size
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 01/19] memory: Add permissions in IOMMUAccessFlags CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 03/19] memory: Allow to store the PASID in IOMMUTLBEntry CLEMENT MATHIEU--DRIF
` (18 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
The PSS field of the ECAP register stores the supported PASID size minus 1.
This commit sets it to 19, declaring support for 20-bit PASIDs.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
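A small sketch of the "size minus 1" encoding, assuming PSS is the 5-bit field
starting at bit 35, as the shift in this patch suggests. VTD_ECAP_PSS and
VTD_ECAP_PASID are copied from the patch; the EX_* names and the helper are
hypothetical.

```c
#include <stdint.h>

#define VTD_ECAP_PSS   (19ULL << 35)   /* from this patch */
#define VTD_ECAP_PASID (1ULL << 40)    /* from intel_iommu_internal.h */

#define EX_ECAP_PSS_SHIFT 35           /* hypothetical name */
#define EX_ECAP_PSS_MASK  0x1fULL      /* hypothetical name, assumes 5 bits */

/* Hypothetical helper: PASID width in bits, or 0 if PASID is unsupported. */
static inline unsigned example_pasid_width(uint64_t ecap)
{
    if (!(ecap & VTD_ECAP_PASID)) {
        return 0;
    }
    /* The field holds (width - 1), so add 1 back. */
    return (unsigned)((ecap >> EX_ECAP_PSS_SHIFT) & EX_ECAP_PSS_MASK) + 1;
}
```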
hw/i386/intel_iommu.c | 2 +-
hw/i386/intel_iommu_internal.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index f366c223d0..1d5ff8f4f6 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4574,7 +4574,7 @@ static void vtd_cap_init(IntelIOMMUState *s)
}
if (s->pasid) {
- s->ecap |= VTD_ECAP_PASID;
+ s->ecap |= VTD_ECAP_PASID | VTD_ECAP_PSS;
}
}
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index e8b211e8b0..238f1f443f 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -192,6 +192,7 @@
#define VTD_ECAP_SC (1ULL << 7)
#define VTD_ECAP_MHMV (15ULL << 20)
#define VTD_ECAP_SRS (1ULL << 31)
+#define VTD_ECAP_PSS (19ULL << 35)
#define VTD_ECAP_PASID (1ULL << 40)
#define VTD_ECAP_SMTS (1ULL << 43)
#define VTD_ECAP_SLTS (1ULL << 46)
--
2.47.1
* [PATCH v2 03/19] memory: Allow to store the PASID in IOMMUTLBEntry
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 01/19] memory: Add permissions in IOMMUAccessFlags CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 02/19] intel_iommu: Declare supported PASID size CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 04/19] intel_iommu: Fill the PASID field when creating an IOMMUTLBEntry CLEMENT MATHIEU--DRIF
` (17 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
This will be useful for devices that support ATS
and need to store entries in an ATC (device IOTLB).
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
include/exec/memory.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 56c3a3515e..9889b97abb 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -145,6 +145,7 @@ struct IOMMUTLBEntry {
hwaddr translated_addr;
hwaddr addr_mask; /* 0xfff = 4k translation */
IOMMUAccessFlags perm;
+ uint32_t pasid;
};
/*
--
2.47.1
* [PATCH v2 04/19] intel_iommu: Fill the PASID field when creating an IOMMUTLBEntry
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (2 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 03/19] memory: Allow to store the PASID in IOMMUTLBEntry CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 05/19] pcie: Add helper to declare PASID capability for a pcie device CLEMENT MATHIEU--DRIF
` (16 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
The PASID value must be used by devices as a key (or part of a key)
when populating their ATC with the IOTLB entries returned by the IOMMU.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
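One way a device could build such a key is sketched below. The ExampleAtcKey
type and both helpers are hypothetical; only the pasid, iova and addr_mask
inputs mirror IOMMUTLBEntry fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ATC key: the PASID plus the page-aligned input address. */
typedef struct ExampleAtcKey {
    uint32_t pasid;
    uint64_t iova_page;
} ExampleAtcKey;

/* Build a key from the fields an IOMMUTLBEntry would carry. */
static inline ExampleAtcKey example_atc_key(uint32_t pasid, uint64_t iova,
                                            uint64_t addr_mask)
{
    ExampleAtcKey key = {
        .pasid = pasid,
        .iova_page = iova & ~addr_mask, /* drop the in-page offset */
    };
    return key;
}

/* Two entries alias only if both the PASID and the page match. */
static inline bool example_atc_key_equal(ExampleAtcKey a, ExampleAtcKey b)
{
    return a.pasid == b.pasid && a.iova_page == b.iova_page;
}
```

Without the PASID in the key, the same virtual address in two different
address spaces would collide in the device cache.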
hw/i386/intel_iommu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 1d5ff8f4f6..c58e18a56c 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2511,6 +2511,7 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
.translated_addr = 0,
.addr_mask = size - 1,
.perm = IOMMU_NONE,
+ .pasid = vtd_as->pasid,
},
};
memory_region_notify_iommu(&vtd_as->iommu, 0, event);
@@ -3098,6 +3099,7 @@ static void do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
event.entry.iova = addr;
event.entry.perm = IOMMU_NONE;
event.entry.translated_addr = 0;
+ event.entry.pasid = vtd_dev_as->pasid;
memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
}
@@ -3680,6 +3682,7 @@ static IOMMUTLBEntry vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
IOMMUTLBEntry iotlb = {
/* We'll fill in the rest later. */
.target_as = &address_space_memory,
+ .pasid = vtd_as->pasid,
};
bool success;
--
2.47.1
* [PATCH v2 05/19] pcie: Add helper to declare PASID capability for a pcie device
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (3 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 04/19] intel_iommu: Fill the PASID field when creating an IOMMUTLBEntry CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 06/19] pcie: Helper functions to check if PASID is enabled CLEMENT MATHIEU--DRIF
` (15 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
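Since this patch has no description, here is a sketch of how the capability
register value appears to be assembled in pcie_pasid_init. The width shift (8)
and maximum width (20) come from this patch; the EXEC and PRIV bit values are
assumed to match PCI_PASID_CAP_EXEC and PCI_PASID_CAP_PRIV from the Linux
pci_regs.h header and should be checked against it. All EX_* names and the
helper are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_PASID_CAP_EXEC        0x0002 /* assumed: Execute Permission Supported */
#define EX_PASID_CAP_PRIV        0x0004 /* assumed: Privileged Mode Supported */
#define EX_PASID_CAP_WIDTH_SHIFT 8      /* PCI_PASID_CAP_WIDTH_SHIFT in this patch */
#define EX_PASID_MAX_WIDTH       20     /* PCI_EXT_CAP_PASID_MAX_WIDTH in this patch */

/* Hypothetical helper mirroring the register packing in pcie_pasid_init. */
static inline uint16_t example_pasid_cap_reg(uint8_t pasid_width,
                                             bool exec_perm, bool priv_mod)
{
    uint16_t reg;

    assert(pasid_width <= EX_PASID_MAX_WIDTH);

    reg = (uint16_t)(pasid_width << EX_PASID_CAP_WIDTH_SHIFT);
    reg |= exec_perm ? EX_PASID_CAP_EXEC : 0;
    reg |= priv_mod ? EX_PASID_CAP_PRIV : 0;
    return reg;
}
```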
hw/pci/pcie.c | 24 ++++++++++++++++++++++++
include/hw/pci/pcie.h | 6 +++++-
include/hw/pci/pcie_regs.h | 5 +++++
3 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 1b12db6fa2..f42a256f15 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -1214,3 +1214,27 @@ void pcie_acs_reset(PCIDevice *dev)
pci_set_word(dev->config + dev->exp.acs_cap + PCI_ACS_CTRL, 0);
}
}
+
+/* PASID */
+void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
+ bool exec_perm, bool priv_mod)
+{
+ assert(pasid_width <= PCI_EXT_CAP_PASID_MAX_WIDTH);
+ static const uint16_t control_reg_rw_mask = 0x07;
+ uint16_t capability_reg = pasid_width;
+
+ pcie_add_capability(dev, PCI_EXT_CAP_ID_PASID, PCI_PASID_VER, offset,
+ PCI_EXT_CAP_PASID_SIZEOF);
+
+ capability_reg <<= PCI_PASID_CAP_WIDTH_SHIFT;
+ capability_reg |= exec_perm ? PCI_PASID_CAP_EXEC : 0;
+ capability_reg |= priv_mod ? PCI_PASID_CAP_PRIV : 0;
+ pci_set_word(dev->config + offset + PCI_PASID_CAP, capability_reg);
+
+ /* Everything is disabled by default */
+ pci_set_word(dev->config + offset + PCI_PASID_CTRL, 0);
+
+ pci_set_word(dev->wmask + offset + PCI_PASID_CTRL, control_reg_rw_mask);
+
+ dev->exp.pasid_cap = offset;
+}
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index b8d59732bc..aa040c3e97 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -72,8 +72,9 @@ struct PCIExpressDevice {
uint16_t aer_cap;
PCIEAERLog aer_log;
- /* Offset of ATS capability in config space */
+ /* Offset of ATS and PASID capabilities in config space */
uint16_t ats_cap;
+ uint16_t pasid_cap;
/* ACS */
uint16_t acs_cap;
@@ -152,4 +153,7 @@ void pcie_cap_slot_unplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
Error **errp);
void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
DeviceState *dev, Error **errp);
+
+void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
+ bool exec_perm, bool priv_mod);
#endif /* QEMU_PCIE_H */
diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h
index 9d3b6868dc..4d9cf4a29c 100644
--- a/include/hw/pci/pcie_regs.h
+++ b/include/hw/pci/pcie_regs.h
@@ -86,6 +86,11 @@ typedef enum PCIExpLinkWidth {
#define PCI_ARI_VER 1
#define PCI_ARI_SIZEOF 8
+/* PASID */
+#define PCI_PASID_VER 1
+#define PCI_EXT_CAP_PASID_MAX_WIDTH 20
+#define PCI_PASID_CAP_WIDTH_SHIFT 8
+
/* AER */
#define PCI_ERR_VER 2
#define PCI_ERR_SIZEOF 0x48
--
2.47.1
* [PATCH v2 06/19] pcie: Helper functions to check if PASID is enabled
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (4 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 05/19] pcie: Add helper to declare PASID capability for a pcie device CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 07/19] pcie: Helper function to check if ATS " CLEMENT MATHIEU--DRIF
` (14 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
pcie_pasid_enabled first checks whether the capability is present.
If so, it reads the control register in configuration space to
determine whether the feature is enabled.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
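The check can be illustrated with a standalone mock of the configuration
space. ExampleDev and the example_* helpers are hypothetical stand-ins for
PCIDevice, pci_get_word and pcie_pasid_enabled; the control register offset
(0x06) and enable bit (0x0001) are assumed to match PCI_PASID_CTRL and
PCI_PASID_CTRL_ENABLE.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_PASID_CTRL        0x06   /* assumed offset of the control register */
#define EX_PASID_CTRL_ENABLE 0x0001 /* assumed PASID Enable bit */

/* Hypothetical, reduced stand-in for PCIDevice. */
typedef struct ExampleDev {
    bool is_express;
    uint16_t pasid_cap;   /* capability offset, 0 when absent */
    uint8_t config[4096]; /* PCIe extended configuration space */
} ExampleDev;

/* Configuration space is little-endian. */
static inline uint16_t example_get_word(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static inline bool example_pasid_enabled(const ExampleDev *dev)
{
    /* No capability means the feature cannot be enabled. */
    if (!dev->is_express || !dev->pasid_cap) {
        return false;
    }
    return (example_get_word(dev->config + dev->pasid_cap + EX_PASID_CTRL) &
            EX_PASID_CTRL_ENABLE) != 0;
}
```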
hw/pci/pcie.c | 9 +++++++++
include/hw/pci/pcie.h | 2 ++
2 files changed, 11 insertions(+)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index f42a256f15..8186d64234 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -1238,3 +1238,12 @@ void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
dev->exp.pasid_cap = offset;
}
+
+bool pcie_pasid_enabled(const PCIDevice *dev)
+{
+ if (!pci_is_express(dev) || !dev->exp.pasid_cap) {
+ return false;
+ }
+ return (pci_get_word(dev->config + dev->exp.pasid_cap + PCI_PASID_CTRL) &
+ PCI_PASID_CTRL_ENABLE) != 0;
+}
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index aa040c3e97..63604ccc6e 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -156,4 +156,6 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
bool exec_perm, bool priv_mod);
+
+bool pcie_pasid_enabled(const PCIDevice *dev);
#endif /* QEMU_PCIE_H */
--
2.47.1
* [PATCH v2 07/19] pcie: Helper function to check if ATS is enabled
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (5 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 06/19] pcie: Helper functions to check if PASID is enabled CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 08/19] pci: Cache the bus mastering status in the device CLEMENT MATHIEU--DRIF
` (13 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
pcie_ats_enabled first checks whether the capability is present.
If so, it reads the control register in configuration space to
determine whether the feature is enabled.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/pci/pcie.c | 9 +++++++++
include/hw/pci/pcie.h | 1 +
2 files changed, 10 insertions(+)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 8186d64234..3b8fd6f33c 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -1247,3 +1247,12 @@ bool pcie_pasid_enabled(const PCIDevice *dev)
return (pci_get_word(dev->config + dev->exp.pasid_cap + PCI_PASID_CTRL) &
PCI_PASID_CTRL_ENABLE) != 0;
}
+
+bool pcie_ats_enabled(const PCIDevice *dev)
+{
+ if (!pci_is_express(dev) || !dev->exp.ats_cap) {
+ return false;
+ }
+ return (pci_get_word(dev->config + dev->exp.ats_cap + PCI_ATS_CTRL) &
+ PCI_ATS_CTRL_ENABLE) != 0;
+}
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index 63604ccc6e..7e7b8baa6e 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -158,4 +158,5 @@ void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
bool exec_perm, bool priv_mod);
bool pcie_pasid_enabled(const PCIDevice *dev);
+bool pcie_ats_enabled(const PCIDevice *dev);
#endif /* QEMU_PCIE_H */
--
2.47.1
* [PATCH v2 08/19] pci: Cache the bus mastering status in the device
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (6 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 07/19] pcie: Helper function to check if ATS " CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 09/19] pci: Add IOMMU operations to get memory regions with PASID CLEMENT MATHIEU--DRIF
` (12 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
The cached is_master value is necessary to know if a device is
allowed to issue ATS requests or not.
This behavior is implemented in an upcoming patch.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
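A reduced sketch of the value being cached: the Bus Master Enable bit of the
command register, gated by the device being enabled. ExampleBusDev and the
helpers are hypothetical; 0x4 is the Bus Master Enable bit that
PCI_COMMAND_MASTER names.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_PCI_COMMAND_MASTER 0x4 /* Bus Master Enable bit (PCI_COMMAND_MASTER) */

/* Hypothetical, reduced stand-in for PCIDevice. */
typedef struct ExampleBusDev {
    uint16_t command; /* PCI command register */
    bool enabled;     /* device-level enable, as in PCIDevice.enabled */
    bool is_master;   /* cached status */
} ExampleBusDev;

/* Recompute the cache whenever the command register or enable state changes. */
static inline void example_update_master(ExampleBusDev *d)
{
    d->is_master = (d->command & EX_PCI_COMMAND_MASTER) && d->enabled;
}

/* Later consumers can test the cached flag instead of re-reading config space. */
static inline bool example_may_issue_ats(const ExampleBusDev *d)
{
    return d->is_master;
}
```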
hw/pci/pci.c | 25 +++++++++++++++----------
include/hw/pci/pci_device.h | 1 +
2 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 2afa423925..164bb22e05 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -134,6 +134,12 @@ static GSequence *pci_acpi_index_list(void)
return used_acpi_index_list;
}
+static void pci_set_master(PCIDevice *d, bool enable)
+{
+ memory_region_set_enabled(&d->bus_master_enable_region, enable);
+ d->is_master = enable; /* cache the status */
+}
+
static void pci_init_bus_master(PCIDevice *pci_dev)
{
AddressSpace *dma_as = pci_device_iommu_address_space(pci_dev);
@@ -141,7 +147,7 @@ static void pci_init_bus_master(PCIDevice *pci_dev)
memory_region_init_alias(&pci_dev->bus_master_enable_region,
OBJECT(pci_dev), "bus master",
dma_as->root, 0, memory_region_size(dma_as->root));
- memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
+ pci_set_master(pci_dev, false);
memory_region_add_subregion(&pci_dev->bus_master_container_region, 0,
&pci_dev->bus_master_enable_region);
}
@@ -727,9 +733,8 @@ static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
pci_bridge_update_mappings(PCI_BRIDGE(s));
}
- memory_region_set_enabled(&s->bus_master_enable_region,
- pci_get_word(s->config + PCI_COMMAND)
- & PCI_COMMAND_MASTER);
+ pci_set_master(s, pci_get_word(s->config + PCI_COMMAND)
+ & PCI_COMMAND_MASTER);
g_free(config);
return 0;
@@ -1684,9 +1689,10 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
if (ranges_overlap(addr, l, PCI_COMMAND, 2)) {
pci_update_irq_disabled(d, was_irq_disabled);
- memory_region_set_enabled(&d->bus_master_enable_region,
- (pci_get_word(d->config + PCI_COMMAND)
- & PCI_COMMAND_MASTER) && d->enabled);
+ pci_set_master(d,
+ (pci_get_word(d->config + PCI_COMMAND) &
+ PCI_COMMAND_MASTER) &&
+ d->enabled);
}
msi_write_config(d, addr, val_in, l);
@@ -2974,9 +2980,8 @@ void pci_set_enabled(PCIDevice *d, bool state)
d->enabled = state;
pci_update_mappings(d);
- memory_region_set_enabled(&d->bus_master_enable_region,
- (pci_get_word(d->config + PCI_COMMAND)
- & PCI_COMMAND_MASTER) && d->enabled);
+ pci_set_master(d, (pci_get_word(d->config + PCI_COMMAND)
+ & PCI_COMMAND_MASTER) && d->enabled);
if (!d->enabled) {
pci_device_reset(d);
}
diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index add208edfa..40606baa5d 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -88,6 +88,7 @@ struct PCIDevice {
char name[64];
PCIIORegion io_regions[PCI_NUM_REGIONS];
AddressSpace bus_master_as;
+ bool is_master;
MemoryRegion bus_master_container_region;
MemoryRegion bus_master_enable_region;
--
2.47.1
* [PATCH v2 09/19] pci: Add IOMMU operations to get memory regions with PASID
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (7 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 08/19] pci: Cache the bus mastering status in the device CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 10/19] intel_iommu: Implement the get_memory_region_pasid iommu operation CLEMENT MATHIEU--DRIF
` (11 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
The region returned by this operation will be used as the input region
for ATS.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
include/hw/pci/pci.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 4002bbeebd..644551550b 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -391,6 +391,22 @@ typedef struct PCIIOMMUOps {
* @devfn: device and function number
*/
AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+ /**
+ * @get_memory_region_pasid: get the iommu memory region for a given
+ * device and pasid
+ *
+ * @bus: the #PCIBus being accessed.
+ *
+ * @opaque: the data passed to pci_setup_iommu().
+ *
+ * @devfn: device and function number
+ *
+ * @pasid: the pasid associated with the requested memory region
+ */
+ IOMMUMemoryRegion * (*get_memory_region_pasid)(PCIBus *bus,
+ void *opaque,
+ int devfn,
+ uint32_t pasid);
/**
* @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
*
--
2.47.1
* [PATCH v2 10/19] intel_iommu: Implement the get_memory_region_pasid iommu operation
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (8 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 09/19] pci: Add IOMMU operations to get memory regions with PASID CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 11/19] memory: Store user data pointer in the IOMMU notifiers CLEMENT MATHIEU--DRIF
` (10 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/i386/intel_iommu.c | 17 ++++++++++++++++-
include/hw/i386/intel_iommu.h | 2 +-
2 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index c58e18a56c..021834c41f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4202,7 +4202,7 @@ static const MemoryRegionOps vtd_mem_ir_fault_ops = {
};
VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
- int devfn, unsigned int pasid)
+ int devfn, uint32_t pasid)
{
/*
* We can't simply use sid here since the bus number might not be
@@ -4719,8 +4719,23 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
return &vtd_as->as;
}
+static IOMMUMemoryRegion *vtd_get_memory_region_pasid(PCIBus *bus,
+ void *opaque,
+ int devfn,
+ uint32_t pasid)
+{
+ IntelIOMMUState *s = opaque;
+ VTDAddressSpace *vtd_as;
+
+ assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
+
+ vtd_as = vtd_find_add_as(s, bus, devfn, pasid);
+ return &vtd_as->iommu;
+}
+
static PCIIOMMUOps vtd_iommu_ops = {
.get_address_space = vtd_host_dma_iommu,
+ .get_memory_region_pasid = vtd_get_memory_region_pasid,
.set_iommu_device = vtd_dev_set_iommu_device,
.unset_iommu_device = vtd_dev_unset_iommu_device,
};
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index e95477e855..08f71c262e 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -324,6 +324,6 @@ struct IntelIOMMUState {
* create a new one if none exists
*/
VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
- int devfn, unsigned int pasid);
+ int devfn, uint32_t pasid);
#endif
--
2.47.1
* [PATCH v2 11/19] memory: Store user data pointer in the IOMMU notifiers
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (9 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 10/19] intel_iommu: Implement the get_memory_region_pasid iommu operation CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 12/19] pci: Add a pci-level initialization function for iommu notifiers CLEMENT MATHIEU--DRIF
` (9 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
This will help developers of SVM devices track state across notifications.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
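The pattern this enables can be sketched with a mock notifier. None of the
Example* types below exist in QEMU; they only imitate the shape of
IOMMUNotifier with the new opaque field to show how a callback could recover
per-device state.

```c
#include <stdint.h>

typedef struct ExampleNotifier ExampleNotifier;
typedef void (*ExampleNotify)(ExampleNotifier *n, uint64_t iova);

/* Hypothetical, reduced stand-in for IOMMUNotifier. */
struct ExampleNotifier {
    ExampleNotify notify;
    void *opaque; /* user data pointer, as added by this patch */
};

/* Hypothetical per-device state that the callback wants to update. */
typedef struct ExampleDevState {
    unsigned invalidations;
    uint64_t last_iova;
} ExampleDevState;

static void example_notifier_init(ExampleNotifier *n, ExampleNotify fn,
                                  void *opaque)
{
    n->notify = fn;
    n->opaque = opaque;
}

/* The callback reaches its state through the notifier, with no globals. */
static void example_on_invalidate(ExampleNotifier *n, uint64_t iova)
{
    ExampleDevState *st = n->opaque;

    st->invalidations++;
    st->last_iova = iova;
}
```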
include/exec/memory.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9889b97abb..468b003bf1 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -204,6 +204,7 @@ struct IOMMUNotifier {
hwaddr start;
hwaddr end;
int iommu_idx;
+ void *opaque;
QLIST_ENTRY(IOMMUNotifier) node;
};
typedef struct IOMMUNotifier IOMMUNotifier;
--
2.47.1
* [PATCH v2 12/19] pci: Add a pci-level initialization function for iommu notifiers
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (10 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 11/19] memory: Store user data pointer in the IOMMU notifiers CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 13/19] atc: Generic ATC that can be used by PCIe devices that support SVM CLEMENT MATHIEU--DRIF
` (8 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
We add a convenient way to initialize a device-iotlb notifier.
This is meant to be used by ATS-capable devices.
pci_device_iommu_memory_region_pasid is introduced in this commit and
will be used in several other SVM-related functions exposed in
the PCI API.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
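The gating condition in pci_device_iommu_memory_region_pasid can be read as
the predicate sketched below. EX_NO_PASID is assumed to have the same value as
PCI_NO_PASID (UINT32_MAX in current QEMU, to the best of my knowledge); the
helper itself is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_NO_PASID UINT32_MAX /* assumed value of PCI_NO_PASID */

/*
 * Hypothetical predicate: a region is only looked up when the device is a
 * bus master and, if a real PASID is given, PASID is enabled on the device.
 */
static inline bool example_region_lookup_allowed(bool is_master,
                                                 bool pasid_enabled,
                                                 uint32_t pasid)
{
    if (!is_master) {
        return false;
    }
    if (pasid != EX_NO_PASID && !pasid_enabled) {
        return false;
    }
    return true;
}
```

When the predicate fails, the function in this patch returns NULL and
pci_iommu_init_iotlb_notifier reports failure by returning false.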
hw/pci/pci.c | 40 ++++++++++++++++++++++++++++++++++++++++
include/hw/pci/pci.h | 15 +++++++++++++++
2 files changed, 55 insertions(+)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 164bb22e05..be29c0375f 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2825,6 +2825,46 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
return &address_space_memory;
}
+static IOMMUMemoryRegion *pci_device_iommu_memory_region_pasid(PCIDevice *dev,
+ uint32_t pasid)
+{
+ PCIBus *bus;
+ PCIBus *iommu_bus;
+ int devfn;
+
+ /*
+ * This function is for internal use in the module,
+ * we can call it with PCI_NO_PASID
+ */
+ if (!dev->is_master ||
+ ((pasid != PCI_NO_PASID) && !pcie_pasid_enabled(dev))) {
+ return NULL;
+ }
+
+ pci_device_get_iommu_bus_devfn(dev, &bus, &iommu_bus, &devfn);
+ if (iommu_bus && iommu_bus->iommu_ops->get_memory_region_pasid) {
+ return iommu_bus->iommu_ops->get_memory_region_pasid(bus,
+ iommu_bus->iommu_opaque, devfn, pasid);
+ }
+ return NULL;
+}
+
+bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n, IOMMUNotify fn,
+ void *opaque)
+{
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+ if (!iommu_mr) {
+ return false;
+ }
+ iommu_notifier_init(n, fn, IOMMU_NOTIFIER_DEVIOTLB_EVENTS, 0, HWADDR_MAX,
+ memory_region_iommu_attrs_to_index(iommu_mr,
+ MEMTXATTRS_UNSPECIFIED));
+ n->opaque = opaque;
+ return true;
+}
+
bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
Error **errp)
{
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 644551550b..a11366e08d 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -446,6 +446,21 @@ bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
Error **errp);
void pci_device_unset_iommu_device(PCIDevice *dev);
+/**
+ * pci_iommu_init_iotlb_notifier: initialize an IOMMU notifier
+ *
+ * This function is used by devices before registering an IOTLB notifier
+ *
+ * @dev: the device
+ * @pasid: the pasid of the address space to watch
+ * @n: the notifier to initialize
+ * @fn: the callback to be installed
+ * @opaque: user pointer that can be used to store a state
+ */
+bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n, IOMMUNotify fn,
+ void *opaque);
+
/**
* pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
*
--
2.47.1
* [PATCH v2 13/19] atc: Generic ATC that can be used by PCIe devices that support SVM
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (11 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 12/19] pci: Add a pci-level initialization function for iommu notifiers CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 14/19] atc: Add unit tests CLEMENT MATHIEU--DRIF
` (7 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
As SVM-capable devices will need to cache translations, we provide
a first implementation.
This cache uses a two-level design based on hash tables.
The first level is indexed by a PASID and the second by a virtual address.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
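The arithmetic used when creating the cache can be tried in isolation with the
sketch below. It covers only the power-of-two check, the level offset
(log2(page_size / 8)) and the 64-to-32-bit address hash; it does not reproduce
the GLib hash tables. All example_* names are hypothetical.

```c
#include <stdint.h>

/* log2(val), or UINT8_MAX if val is zero or not a power of 2. */
static inline uint8_t example_ilog2(uint64_t val)
{
    uint8_t result = 0;

    if (val == 0) {
        return UINT8_MAX; /* rejected explicitly so the loop always ends */
    }
    while (val != 1) {
        if (val & 1) {
            return UINT8_MAX;
        }
        val >>= 1;
        result += 1;
    }
    return result;
}

/*
 * Index bits per page-table level for 8-byte entries:
 * log2(page_size / 8) = log2(page_size) - 3. Returns -1 on invalid input.
 */
static inline int example_level_offset(uint64_t page_size)
{
    uint8_t log_page_size = example_ilog2(page_size);

    if (log_page_size == UINT8_MAX || log_page_size <= 3) {
        return -1;
    }
    return log_page_size - 3;
}

/* Fold a 64-bit address into 32 bits by XORing its two halves. */
static inline uint32_t example_addr_hash(uint64_t addr)
{
    return (uint32_t)((addr >> 32) ^ (addr & 0xffffffffU));
}
```

For 4 KiB pages this gives a level offset of 9, which matches the 512 entries
held by one page of 8-byte table entries.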
util/atc.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++
util/atc.h | 117 ++++++++++++++++++++++++++
util/meson.build | 1 +
3 files changed, 329 insertions(+)
create mode 100644 util/atc.c
create mode 100644 util/atc.h
diff --git a/util/atc.c b/util/atc.c
new file mode 100644
index 0000000000..584ce045db
--- /dev/null
+++ b/util/atc.c
@@ -0,0 +1,211 @@
+/*
+ * QEMU emulation of an ATC
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "util/atc.h"
+
+
+#define PAGE_TABLE_ENTRY_SIZE 8
+
+/* a pasid is hashed using the identity function */
+static guint atc_pasid_key_hash(gconstpointer v)
+{
+ return (guint)(uintptr_t)v; /* pasid */
+}
+
+/* pasid equality */
+static gboolean atc_pasid_key_equal(gconstpointer v1, gconstpointer v2)
+{
+ return v1 == v2;
+}
+
+/* Hash function for IOTLB entries */
+static guint atc_addr_key_hash(gconstpointer v)
+{
+ hwaddr addr = (hwaddr)v;
+ return (guint)((addr >> 32) ^ (addr & 0xffffffffU));
+}
+
+/* Equality test for IOTLB entries */
+static gboolean atc_addr_key_equal(gconstpointer v1, gconstpointer v2)
+{
+ return (hwaddr)v1 == (hwaddr)v2;
+}
+
+static void atc_address_space_free(void *as)
+{
+ g_hash_table_unref(as);
+}
+
+/* return log2(val), or UINT8_MAX if val is not a power of 2 */
+static uint8_t ilog2(uint64_t val)
+{
+ uint8_t result = 0;
+ while (val != 1) {
+ if (val & 1) {
+ return UINT8_MAX;
+ }
+
+ val >>= 1;
+ result += 1;
+ }
+ return result;
+}
+
+ATC *atc_new(uint64_t page_size, uint8_t address_width)
+{
+ ATC *atc;
+ uint8_t log_page_size = ilog2(page_size);
+ /* number of address bits used to store all the intermediate indexes */
+ uint64_t addr_lookup_indexes_size;
+
+ if (log_page_size == UINT8_MAX) {
+ return NULL;
+ }
+ /*
+ * We only support page table entries of 8 (PAGE_TABLE_ENTRY_SIZE) bytes
+ * log2(page_size / 8) = log2(page_size) - 3
+ * is the level offset
+ */
+ if (log_page_size <= 3) {
+ return NULL;
+ }
+
+ atc = g_new0(ATC, 1);
+ atc->address_spaces = g_hash_table_new_full(atc_pasid_key_hash,
+ atc_pasid_key_equal,
+ NULL, atc_address_space_free);
+ atc->level_offset = log_page_size - 3;
+ /* at this point, we know that page_size is a power of 2 */
+ atc->min_addr_mask = page_size - 1;
+ addr_lookup_indexes_size = address_width - log_page_size;
+ if ((addr_lookup_indexes_size % atc->level_offset) != 0) {
+ goto error;
+ }
+ atc->levels = addr_lookup_indexes_size / atc->level_offset;
+ atc->page_size = page_size;
+ return atc;
+
+error:
+ g_free(atc);
+ return NULL;
+}
+
+static inline GHashTable *atc_get_address_space_cache(ATC *atc, uint32_t pasid)
+{
+ return g_hash_table_lookup(atc->address_spaces,
+ (gconstpointer)(uintptr_t)pasid);
+}
+
+void atc_create_address_space_cache(ATC *atc, uint32_t pasid)
+{
+ GHashTable *as_cache;
+
+ as_cache = atc_get_address_space_cache(atc, pasid);
+ if (!as_cache) {
+ as_cache = g_hash_table_new_full(atc_addr_key_hash,
+ atc_addr_key_equal,
+ NULL, g_free);
+ g_hash_table_replace(atc->address_spaces,
+ (gpointer)(uintptr_t)pasid, as_cache);
+ }
+}
+
+void atc_delete_address_space_cache(ATC *atc, uint32_t pasid)
+{
+ g_hash_table_remove(atc->address_spaces, (gpointer)(uintptr_t)pasid);
+}
+
+int atc_update(ATC *atc, IOMMUTLBEntry *entry)
+{
+ IOMMUTLBEntry *value;
+ GHashTable *as_cache = atc_get_address_space_cache(atc, entry->pasid);
+ if (!as_cache) {
+ return -ENODEV;
+ }
+ value = g_memdup2(entry, sizeof(*value));
+ g_hash_table_replace(as_cache, (gpointer)(entry->iova), value);
+ return 0;
+}
+
+IOMMUTLBEntry *atc_lookup(ATC *atc, uint32_t pasid, hwaddr addr)
+{
+ IOMMUTLBEntry *entry;
+ hwaddr mask = atc->min_addr_mask;
+ hwaddr key = addr & (~mask);
+ GHashTable *as_cache = atc_get_address_space_cache(atc, pasid);
+
+ if (!as_cache) {
+ return NULL;
+ }
+
+ /*
+ * Iterate over the possible page sizes and try to find a hit
+ */
+ for (uint8_t level = 0; level < atc->levels; ++level) {
+ entry = g_hash_table_lookup(as_cache, (gconstpointer)key);
+ if (entry && (mask == entry->addr_mask)) {
+ return entry;
+ }
+ mask = (mask << atc->level_offset) | ((1ULL << atc->level_offset) - 1);
+ key = addr & (~mask);
+ }
+
+ return NULL;
+}
+
+static gboolean atc_invalidate_entry_predicate(gpointer key, gpointer value,
+ gpointer user_data)
+{
+ IOMMUTLBEntry *entry = (IOMMUTLBEntry *)value;
+ IOMMUTLBEntry *target = (IOMMUTLBEntry *)user_data;
+ hwaddr target_mask = ~target->addr_mask;
+ hwaddr entry_mask = ~entry->addr_mask;
+ return ((target->iova & target_mask) == (entry->iova & target_mask)) ||
+ ((target->iova & entry_mask) == (entry->iova & entry_mask));
+}
+
+void atc_invalidate(ATC *atc, IOMMUTLBEntry *entry)
+{
+ GHashTable *as_cache = atc_get_address_space_cache(atc, entry->pasid);
+ if (!as_cache) {
+ return;
+ }
+ g_hash_table_foreach_remove(as_cache,
+ atc_invalidate_entry_predicate,
+ entry);
+}
+
+void atc_destroy(ATC *atc)
+{
+ g_hash_table_unref(atc->address_spaces);
+}
+
+size_t atc_get_max_number_of_pages(ATC *atc, hwaddr addr, size_t length)
+{
+ hwaddr page_mask = ~(atc->min_addr_mask);
+ size_t result = (length / atc->page_size);
+ if ((((addr & page_mask) + length - 1) & page_mask) !=
+ ((addr + length - 1) & page_mask)) {
+ result += 1;
+ }
+ return result + (length % atc->page_size != 0 ? 1 : 0);
+}
+
+void atc_reset(ATC *atc)
+{
+ g_hash_table_remove_all(atc->address_spaces);
+}
diff --git a/util/atc.h b/util/atc.h
new file mode 100644
index 0000000000..8be95f5cca
--- /dev/null
+++ b/util/atc.h
@@ -0,0 +1,117 @@
+/*
+ * QEMU emulation of an ATC
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef UTIL_ATC_H
+#define UTIL_ATC_H
+
+#include "qemu/osdep.h"
+#include "exec/memory.h"
+
+typedef struct ATC {
+ GHashTable *address_spaces; /* Key : pasid, value : GHashTable */
+ hwaddr min_addr_mask;
+ uint64_t page_size;
+ uint8_t levels;
+ uint8_t level_offset;
+} ATC;
+
+/*
+ * atc_new: Create an ATC.
+ *
+ * Return an ATC or NULL if the creation failed
+ *
+ * @page_size: size of the smallest page in bytes (must be a power of 2)
+ * @address_width: width of the virtual addresses used by the IOMMU (in bits)
+ */
+ATC *atc_new(uint64_t page_size, uint8_t address_width);
+
+/*
+ * atc_update: Insert or update an entry in the cache
+ *
+ * Return 0 if the operation succeeds, a negative error code otherwise
+ *
+ * The insertion will fail if the address space associated with this pasid
+ * has not been created with atc_create_address_space_cache
+ *
+ * @atc: the ATC to update
+ * @entry: the tlb entry to insert into the cache
+ */
+int atc_update(ATC *atc, IOMMUTLBEntry *entry);
+
+/*
+ * atc_create_address_space_cache: declare a new address space
+ * identified by a PASID
+ *
+ * @atc: the ATC to update
+ * @pasid: the pasid of the address space to be created
+ */
+void atc_create_address_space_cache(ATC *atc, uint32_t pasid);
+
+/*
+ * atc_delete_address_space_cache: delete an address space
+ * identified by a PASID
+ *
+ * @atc: the ATC to update
+ * @pasid: the pasid of the address space to be deleted
+ */
+void atc_delete_address_space_cache(ATC *atc, uint32_t pasid);
+
+/*
+ * atc_lookup: query the cache in a given address space
+ *
+ * @atc: the ATC to query
+ * @pasid: the pasid of the address space to query
+ * @addr: the virtual address to translate
+ */
+IOMMUTLBEntry *atc_lookup(ATC *atc, uint32_t pasid, hwaddr addr);
+
+/*
+ * atc_invalidate: invalidate an entry in the cache
+ *
+ * @atc: the ATC to update
+ * @entry: the entry to invalidate
+ */
+void atc_invalidate(ATC *atc, IOMMUTLBEntry *entry);
+
+/*
+ * atc_destroy: delete an ATC
+ *
+ * @atc: the cache to be deleted
+ */
+void atc_destroy(ATC *atc);
+
+/*
+ * atc_get_max_number_of_pages: get the number of pages a memory operation
+ * will access if all the pages concerned have the minimum size.
+ *
+ * This function can be used to determine the size of the result array to be
+ * allocated when issuing an ATS request.
+ *
+ * @atc: the cache
+ * @addr: start address
+ * @length: number of bytes accessed from addr
+ */
+size_t atc_get_max_number_of_pages(ATC *atc, hwaddr addr, size_t length);
+
+/*
+ * atc_reset: invalidates all the entries stored in the ATC
+ *
+ * @atc: the cache
+ */
+void atc_reset(ATC *atc);
+
+#endif
diff --git a/util/meson.build b/util/meson.build
index 5d8bef9891..f2dec01300 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -93,6 +93,7 @@ if have_block
util_ss.add(files('hbitmap.c'))
util_ss.add(files('hexdump.c'))
util_ss.add(files('iova-tree.c'))
+ util_ss.add(files('atc.c'))
util_ss.add(files('iov.c'))
util_ss.add(files('nvdimm-utils.c'))
util_ss.add(files('block-helpers.c'))
--
2.47.1
* [PATCH v2 14/19] atc: Add unit tests
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (12 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 13/19] atc: Generic ATC that can be used by PCIe devices that support SVM CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 15/19] memory: Add an API for ATS support CLEMENT MATHIEU--DRIF
` (6 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
tests/unit/meson.build | 1 +
tests/unit/test-atc.c | 527 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 528 insertions(+)
create mode 100644 tests/unit/test-atc.c
diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index d5248ae51d..810197d5e1 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -48,6 +48,7 @@ tests = {
'test-qapi-util': [],
'test-interval-tree': [],
'test-fifo': [],
+ 'test-atc': [],
}
if have_system or have_tools
diff --git a/tests/unit/test-atc.c b/tests/unit/test-atc.c
new file mode 100644
index 0000000000..0d1c1b7ca7
--- /dev/null
+++ b/tests/unit/test-atc.c
@@ -0,0 +1,527 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "util/atc.h"
+
+static inline bool tlb_entry_equal(IOMMUTLBEntry *e1, IOMMUTLBEntry *e2)
+{
+ if (!e1 || !e2) {
+ return !e1 && !e2;
+ }
+ return e1->iova == e2->iova &&
+ e1->addr_mask == e2->addr_mask &&
+ e1->pasid == e2->pasid &&
+ e1->perm == e2->perm &&
+ e1->target_as == e2->target_as &&
+ e1->translated_addr == e2->translated_addr;
+}
+
+static void assert_lookup_equals(ATC *atc, IOMMUTLBEntry *target,
+ uint32_t pasid, hwaddr iova)
+{
+ IOMMUTLBEntry *result;
+ result = atc_lookup(atc, pasid, iova);
+ g_assert(tlb_entry_equal(result, target));
+}
+
+static void check_creation(uint64_t page_size, uint8_t address_width,
+ uint8_t levels, uint8_t level_offset,
+ bool should_work) {
+ ATC *atc = atc_new(page_size, address_width);
+ if (atc) {
+ g_assert(atc->levels == levels);
+ g_assert(atc->level_offset == level_offset);
+
+ atc_destroy(atc);
+ g_assert(should_work);
+ } else {
+ g_assert(!should_work);
+ }
+}
+
+static void test_creation_parameters(void)
+{
+ check_creation(8, 39, 3, 9, false);
+ check_creation(4095, 39, 3, 9, false);
+ check_creation(4097, 39, 3, 9, false);
+ check_creation(8192, 48, 0, 0, false);
+
+ check_creation(4096, 38, 0, 0, false);
+ check_creation(4096, 39, 3, 9, true);
+ check_creation(4096, 40, 0, 0, false);
+ check_creation(4096, 47, 0, 0, false);
+ check_creation(4096, 48, 4, 9, true);
+ check_creation(4096, 49, 0, 0, false);
+ check_creation(4096, 56, 0, 0, false);
+ check_creation(4096, 57, 5, 9, true);
+ check_creation(4096, 58, 0, 0, false);
+
+ check_creation(16384, 35, 0, 0, false);
+ check_creation(16384, 36, 2, 11, true);
+ check_creation(16384, 37, 0, 0, false);
+ check_creation(16384, 46, 0, 0, false);
+ check_creation(16384, 47, 3, 11, true);
+ check_creation(16384, 48, 0, 0, false);
+ check_creation(16384, 57, 0, 0, false);
+ check_creation(16384, 58, 4, 11, true);
+ check_creation(16384, 59, 0, 0, false);
+}
+
+static void test_single_entry(void)
+{
+ IOMMUTLBEntry entry = {
+ .iova = 0x123456789000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 5,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xdeadbeefULL,
+ };
+
+ ATC *atc = atc_new(4096, 48);
+ g_assert(atc);
+
+ assert_lookup_equals(atc, NULL, entry.pasid,
+ entry.iova + (entry.addr_mask / 2));
+
+ atc_create_address_space_cache(atc, entry.pasid);
+ g_assert(atc_update(atc, &entry) == 0);
+
+ assert_lookup_equals(atc, NULL, entry.pasid + 1,
+ entry.iova + (entry.addr_mask / 2));
+ assert_lookup_equals(atc, &entry, entry.pasid,
+ entry.iova + (entry.addr_mask / 2));
+
+ atc_destroy(atc);
+}
+
+static void test_single_entry_2(void)
+{
+ static uint64_t page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0xabcdef200000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eedULL,
+ };
+
+ ATC *atc = atc_new(page_size, 48);
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_update(atc, &e1);
+
+ assert_lookup_equals(atc, NULL, e1.pasid, 0xabcdef201000ULL);
+
+ atc_destroy(atc);
+}
+
+static void test_page_boundaries(void)
+{
+ static const uint32_t pasid = 5;
+ static const hwaddr page_size = 4096;
+
+ /* 2 consecutive entries */
+ IOMMUTLBEntry e1 = {
+ .iova = 0x123456789000ULL,
+ .addr_mask = page_size - 1,
+ .pasid = pasid,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xdeadbeefULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = e1.iova + page_size,
+ .addr_mask = page_size - 1,
+ .pasid = pasid,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x900df00dULL,
+ };
+
+ ATC *atc = atc_new(page_size, 48);
+
+ atc_create_address_space_cache(atc, e1.pasid);
+ /* creating the address space twice should not be a problem */
+ atc_create_address_space_cache(atc, e1.pasid);
+
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova - 1);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova + e1.addr_mask);
+ g_assert((e1.iova + e1.addr_mask + 1) == e2.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova + e2.addr_mask);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova + e2.addr_mask + 1);
+
+ assert_lookup_equals(atc, NULL, e1.pasid + 10, e1.iova);
+ assert_lookup_equals(atc, NULL, e2.pasid + 10, e2.iova);
+ atc_destroy(atc);
+}
+
+static void test_huge_page(void)
+{
+ static const uint32_t pasid = 5;
+ static const hwaddr page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0x123456600000ULL,
+ .addr_mask = 0x1fffffULL,
+ .pasid = pasid,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xdeadbeefULL,
+ };
+ hwaddr addr;
+
+ ATC *atc = atc_new(page_size, 48);
+
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_update(atc, &e1);
+
+ for (addr = e1.iova; addr <= e1.iova + e1.addr_mask; addr += page_size) {
+ assert_lookup_equals(atc, &e1, e1.pasid, addr);
+ }
+ /* addr is now out of the huge page */
+ assert_lookup_equals(atc, NULL, e1.pasid, addr);
+ atc_destroy(atc);
+}
+
+static void test_pasid(void)
+{
+ hwaddr addr = 0xaaaaaaaaa000ULL;
+ IOMMUTLBEntry e1 = {
+ .iova = addr,
+ .addr_mask = 0xfffULL,
+ .pasid = 8,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xdeadbeefULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = addr,
+ .addr_mask = 0xfffULL,
+ .pasid = 2,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xb001ULL,
+ };
+ uint16_t i;
+
+ ATC *atc = atc_new(4096, 48);
+
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_create_address_space_cache(atc, e2.pasid);
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+
+ for (i = 0; i <= MAX(e1.pasid, e2.pasid) + 1; ++i) {
+ if (i == e1.pasid || i == e2.pasid) {
+ continue;
+ }
+ assert_lookup_equals(atc, NULL, i, addr);
+ }
+ assert_lookup_equals(atc, &e1, e1.pasid, addr);
+ assert_lookup_equals(atc, &e2, e2.pasid, addr);
+ atc_destroy(atc);
+}
+
+static void test_large_address(void)
+{
+ IOMMUTLBEntry e1 = {
+ .iova = 0xaaaaaaaaa000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 8,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = 0x1f00baaaaabf000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = e1.pasid,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xdeadbeefULL,
+ };
+
+ ATC *atc = atc_new(4096, 57);
+
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ atc_destroy(atc);
+}
+
+static void test_bigger_page(void)
+{
+ IOMMUTLBEntry e1 = {
+ .iova = 0xaabbccdde000ULL,
+ .addr_mask = 0x1fffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+ hwaddr i;
+
+ ATC *atc = atc_new(8192, 43);
+
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_update(atc, &e1);
+
+ i = e1.iova & (~e1.addr_mask);
+ assert_lookup_equals(atc, NULL, e1.pasid, i - 1);
+ while (i <= e1.iova + e1.addr_mask) {
+ assert_lookup_equals(atc, &e1, e1.pasid, i);
+ ++i;
+ }
+ assert_lookup_equals(atc, NULL, e1.pasid, i);
+ atc_destroy(atc);
+}
+
+static void test_unknown_pasid(void)
+{
+ IOMMUTLBEntry e1 = {
+ .iova = 0xaabbccfff000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+
+ ATC *atc = atc_new(4096, 48);
+ g_assert(atc_update(atc, &e1) != 0);
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+ atc_destroy(atc);
+}
+
+static void test_invalidation(void)
+{
+ static uint64_t page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0xaabbccddf000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = 0xffe00000ULL,
+ .addr_mask = 0x1fffffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xb000001ULL,
+ };
+ IOMMUTLBEntry e3;
+
+ ATC *atc = atc_new(page_size, 48);
+ atc_create_address_space_cache(atc, e1.pasid);
+
+ atc_update(atc, &e1);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ atc_invalidate(atc, &e1);
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ atc_invalidate(atc, &e2);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+
+ /* invalidate a huge page by invalidating a small region */
+ for (hwaddr addr = e2.iova; addr <= (e2.iova + e2.addr_mask);
+ addr += page_size) {
+ atc_update(atc, &e2);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ e3 = (IOMMUTLBEntry){
+ .iova = addr,
+ .addr_mask = page_size - 1,
+ .pasid = e2.pasid,
+ .perm = IOMMU_RW,
+ .translated_addr = 0,
+ };
+ atc_invalidate(atc, &e3);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+ }
+ atc_destroy(atc);
+}
+
+static void test_delete_address_space_cache(void)
+{
+ static uint64_t page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0xaabbccddf000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = e1.iova,
+ .addr_mask = 0xfffULL,
+ .pasid = 2,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+
+ ATC *atc = atc_new(page_size, 48);
+ atc_create_address_space_cache(atc, e1.pasid);
+
+ atc_update(atc, &e1);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ atc_invalidate(atc, &e2); /* unknown pasid: this is a nop */
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+
+ atc_create_address_space_cache(atc, e2.pasid);
+ atc_update(atc, &e2);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ atc_invalidate(atc, &e1);
+ /* e1 has been removed but e2 is still there */
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+
+ atc_update(atc, &e1);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+
+ atc_delete_address_space_cache(atc, e2.pasid);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+ atc_destroy(atc);
+}
+
+static void test_invalidate_entire_address_space(void)
+{
+ static uint64_t page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0x1000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eedULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = 0xfffffffff000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xbeefULL,
+ };
+ IOMMUTLBEntry e3 = {
+ .iova = 0,
+ .addr_mask = 0xffffffffffffffffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0,
+ };
+
+ ATC *atc = atc_new(page_size, 48);
+ atc_create_address_space_cache(atc, e1.pasid);
+
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ atc_invalidate(atc, &e3);
+ /* both e1 and e2 have been removed */
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+
+ atc_destroy(atc);
+}
+
+static void test_reset(void)
+{
+ static uint64_t page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0x1000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eedULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = 0xfffffffff000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 2,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xbeefULL,
+ };
+
+ ATC *atc = atc_new(page_size, 48);
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_create_address_space_cache(atc, e2.pasid);
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+
+ atc_reset(atc);
+
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+ atc_destroy(atc);
+}
+
+static void test_get_max_number_of_pages(void)
+{
+ static uint64_t page_size = 4096;
+ hwaddr base = 0xc0fee000; /* aligned */
+ ATC *atc = atc_new(page_size, 48);
+ g_assert(atc_get_max_number_of_pages(atc, base, page_size / 2) == 1);
+ g_assert(atc_get_max_number_of_pages(atc, base, page_size) == 1);
+ g_assert(atc_get_max_number_of_pages(atc, base, page_size + 1) == 2);
+
+ g_assert(atc_get_max_number_of_pages(atc, base + 10, 1) == 1);
+ g_assert(atc_get_max_number_of_pages(atc, base + 10, page_size - 10) == 1);
+ g_assert(atc_get_max_number_of_pages(atc, base + 10,
+ page_size - 10 + 1) == 2);
+ g_assert(atc_get_max_number_of_pages(atc, base + 10,
+ page_size - 10 + 2) == 2);
+
+ g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 1) == 1);
+ g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 2) == 2);
+ g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 3) == 2);
+
+ g_assert(atc_get_max_number_of_pages(atc, base + 10, page_size * 20) == 21);
+ g_assert(atc_get_max_number_of_pages(atc, base + 10,
+ (page_size * 20) + (page_size - 10))
+ == 21);
+ g_assert(atc_get_max_number_of_pages(atc, base + 10,
+ (page_size * 20) +
+ (page_size - 10 + 1)) == 22);
+}
+
+int main(int argc, char **argv)
+{
+ g_test_init(&argc, &argv, NULL);
+ g_test_add_func("/atc/test_creation_parameters", test_creation_parameters);
+ g_test_add_func("/atc/test_single_entry", test_single_entry);
+ g_test_add_func("/atc/test_single_entry_2", test_single_entry_2);
+ g_test_add_func("/atc/test_page_boundaries", test_page_boundaries);
+ g_test_add_func("/atc/test_huge_page", test_huge_page);
+ g_test_add_func("/atc/test_pasid", test_pasid);
+ g_test_add_func("/atc/test_large_address", test_large_address);
+ g_test_add_func("/atc/test_bigger_page", test_bigger_page);
+ g_test_add_func("/atc/test_unknown_pasid", test_unknown_pasid);
+ g_test_add_func("/atc/test_invalidation", test_invalidation);
+ g_test_add_func("/atc/test_delete_address_space_cache",
+ test_delete_address_space_cache);
+ g_test_add_func("/atc/test_invalidate_entire_address_space",
+ test_invalidate_entire_address_space);
+ g_test_add_func("/atc/test_reset", test_reset);
+ g_test_add_func("/atc/test_get_max_number_of_pages",
+ test_get_max_number_of_pages);
+ return g_test_run();
+}
--
2.47.1
* [PATCH v2 15/19] memory: Add an API for ATS support
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (13 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 14/19] atc: Add unit tests CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 16/19] pci: Add a pci-level API for ATS CLEMENT MATHIEU--DRIF
` (5 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
An IOMMU has to implement iommu_ats_request_translation to support ATS.
Devices can use IOMMU_TLB_ENTRY_TRANSLATION_ERROR to check the TLB
entries returned by a translation request.
We decided not to use the existing translation operation for two reasons.
First, ATS is designed to translate ranges rather than isolated addresses.
Second, we need ATS-specific parameters.
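
As an illustration of how a device might filter the entries it gets back,
here is a minimal standalone sketch. The sketch_* types and names are
hypothetical stand-ins for IOMMUTLBEntry and IOMMUAccessFlags (only the
flag values are taken from this series), and the macro copies the test
performed by IOMMU_TLB_ENTRY_TRANSLATION_ERROR: an entry granting neither
read nor write access is treated as a failed translation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the IOMMUAccessFlags values used below. */
enum sketch_access_flags {
    SKETCH_NONE = 0,
    SKETCH_RO = 1,
    SKETCH_WO = 2,
    SKETCH_RW = 3,
    SKETCH_EXEC = 4,
};

/* Hypothetical, reduced stand-in for IOMMUTLBEntry. */
struct sketch_tlb_entry {
    uint64_t iova;
    uint64_t translated_addr;
    uint64_t addr_mask;
    unsigned int perm;
};

/* Same test as IOMMU_TLB_ENTRY_TRANSLATION_ERROR: no R and no W bit set. */
#define SKETCH_TLB_ENTRY_TRANSLATION_ERROR(entry) \
    ((((entry)->perm) & SKETCH_RW) == SKETCH_NONE)

/* Count the entries of a translation result that can actually be used. */
static size_t sketch_count_usable(const struct sketch_tlb_entry *result,
                                  size_t n)
{
    size_t usable = 0;

    assert(result != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!SKETCH_TLB_ENTRY_TRANSLATION_ERROR(&result[i])) {
            usable++;
        }
    }
    return usable;
}
```

Note that an entry carrying only the exec bit is still reported as an
error by this test, since the macro looks at the read and write bits only.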
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
include/exec/memory.h | 26 ++++++++++++++++++++++++++
system/memory.c | 21 +++++++++++++++++++++
2 files changed, 47 insertions(+)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 468b003bf1..042d4ea5be 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -148,6 +148,10 @@ struct IOMMUTLBEntry {
uint32_t pasid;
};
+/* Check if an IOMMU TLB entry indicates a translation error */
+#define IOMMU_TLB_ENTRY_TRANSLATION_ERROR(entry) ((((entry)->perm) & IOMMU_RW) \
+ == IOMMU_NONE)
+
/*
* Bitmap for different IOMMUNotifier capabilities. Each notifier can
* register with one or multiple IOMMU Notifier capability bit(s).
@@ -525,6 +529,20 @@ struct IOMMUMemoryRegionClass {
* @iommu: the IOMMUMemoryRegion
*/
int (*num_indexes)(IOMMUMemoryRegion *iommu);
+
+ /**
+ * @iommu_ats_request_translation:
+ * This method must be implemented if the IOMMU has ATS enabled
+ *
+ * @see pci_ats_request_translation_pasid
+ */
+ ssize_t (*iommu_ats_request_translation)(IOMMUMemoryRegion *iommu,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write,
+ IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count);
};
typedef struct RamDiscardListener RamDiscardListener;
@@ -1882,6 +1900,14 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n);
void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
IOMMUNotifier *n);
+ssize_t memory_region_iommu_ats_request_translation(IOMMUMemoryRegion *iommu_mr,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write,
+ IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count);
+
/**
* memory_region_iommu_get_attr: return an IOMMU attr if get_attr() is
* defined on the IOMMU.
diff --git a/system/memory.c b/system/memory.c
index b17b5538ff..0a379a72bb 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2011,6 +2011,27 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
memory_region_update_iommu_notify_flags(iommu_mr, NULL);
}
+ssize_t memory_region_iommu_ats_request_translation(IOMMUMemoryRegion *iommu_mr,
+ bool priv_req,
+ bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write,
+ IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count)
+{
+ IOMMUMemoryRegionClass *imrc =
+ memory_region_get_iommu_class_nocheck(iommu_mr);
+
+ if (!imrc->iommu_ats_request_translation) {
+ return -ENODEV;
+ }
+
+ return imrc->iommu_ats_request_translation(iommu_mr, priv_req, exec_req,
+ addr, length, no_write, result,
+ result_length, err_count);
+}
+
void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
const IOMMUTLBEvent *event)
{
--
2.47.1
* [PATCH v2 16/19] pci: Add a pci-level API for ATS
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (14 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 15/19] memory: Add an API for ATS support CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 17/19] intel_iommu: Set address mask when a translation fails and adjust W permission CLEMENT MATHIEU--DRIF
` (4 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Devices implementing ATS can send translation requests using
pci_ats_request_translation_pasid.
Invalidation events are sent back to the device through the IOMMU
notifier managed with pci_register_iommu_tlb_event_notifier and
pci_unregister_iommu_tlb_event_notifier.
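
A caller has to provide a result buffer, and the request returns -ENOMEM
when that buffer is too small. The worst case is one entry per
minimum-size page touched by the range, which is what
atc_get_max_number_of_pages is meant to compute. The sketch below is a
standalone alternative formulation of that count, not the patch's code;
sketch_max_pages is a hypothetical name and the zero-length handling is
a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of page_size-sized pages touched by [addr, addr + length).
 * page_size must be a power of 2.
 */
static size_t sketch_max_pages(uint64_t page_size, uint64_t addr,
                               size_t length)
{
    uint64_t page_mask;
    uint64_t first_page;
    uint64_t last_page;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    if (length == 0) {
        return 0;
    }
    page_mask = ~(page_size - 1);
    first_page = addr & page_mask;                /* page of first byte */
    last_page = (addr + length - 1) & page_mask;  /* page of last byte */
    return (size_t)((last_page - first_page) / page_size) + 1;
}
```

A device would then allocate that many IOMMUTLBEntry slots and pass the
same number as result_length.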
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/pci/pci.c | 46 +++++++++++++++++++++++++++++++++++++++
include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 98 insertions(+)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index be29c0375f..0ccd0656b7 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2896,6 +2896,52 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
}
}
+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write, IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count)
+{
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+
+ assert(result_length);
+
+ if (!iommu_mr || !pcie_ats_enabled(dev)) {
+ return -EPERM;
+ }
+ return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
+ exec_req, addr, length,
+ no_write, result,
+ result_length,
+ err_count);
+}
+
+int pci_register_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n)
+{
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+ if (!iommu_mr) {
+ return -EPERM;
+ }
+ return memory_region_register_iommu_notifier(MEMORY_REGION(iommu_mr), n,
+ &error_fatal);
+}
+
+int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n)
+{
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+ if (!iommu_mr) {
+ return -EPERM;
+ }
+ memory_region_unregister_iommu_notifier(MEMORY_REGION(iommu_mr), n);
+ return 0;
+}
+
void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
{
/*
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index a11366e08d..592e72aee9 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -461,6 +461,58 @@ bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
IOMMUNotifier *n, IOMMUNotify fn,
void *opaque);
+/**
+ * pci_ats_request_translation_pasid: perform an ATS request
+ *
+ * Return the number of translations stored in @result in case of success,
+ * a negative error code otherwise.
+ * -ENOMEM is returned when the result buffer is not large enough to store
+ * all the translations.
+ *
+ * @dev: the ATS-capable PCI device
+ * @pasid: the pasid of the address space in which the translation will be made
+ * @priv_req: privileged mode bit (PASID TLP)
+ * @exec_req: execute request bit (PASID TLP)
+ * @addr: start address of the memory range to be translated
+ * @length: length of the memory range in bytes
+ * @no_write: request a read-only access translation (if supported by the IOMMU)
+ * @result: buffer in which the TLB entries will be stored
+ * @result_length: capacity of @result, in number of entries
+ * @err_count: number of untranslated subregions
+ */
+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write, IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count);
+
+/**
+ * pci_register_iommu_tlb_event_notifier: register a notifier for changes to
+ * IOMMU translation entries in a specific address space.
+ *
+ * Returns 0 on success, or a negative errno otherwise.
+ *
+ * @dev: the device that wants to get notified
+ * @pasid: the pasid of the address space to track
+ * @n: the notifier to register
+ */
+int pci_register_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n);
+
+/**
+ * pci_unregister_iommu_tlb_event_notifier: unregister a notifier that has been
+ * registered with pci_register_iommu_tlb_event_notifier
+ *
+ * Returns 0 on success, or a negative errno otherwise.
+ *
+ * @dev: the device that wants to unsubscribe
+ * @pasid: the pasid of the address space to be untracked
+ * @n: the notifier to unregister
+ */
+int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n);
+
/**
* pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
*
--
2.47.1
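
The return-value contract documented in pci.h above (entry count on
success, -ENOMEM when the result buffer is too small) suggests one way a
caller might size its buffer: retry with a larger one. The following is
only a sketch under stated assumptions. It is standalone C, not QEMU code:
FakeTLBEntry, ats_request_fn, ats_request_with_retry and fake_request are
hypothetical names, and the toy backend simply returns one entry per 4 KiB
page.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* Simplified stand-in for IOMMUTLBEntry (hypothetical, not the QEMU type). */
typedef struct {
    uint64_t iova;
    uint64_t addr_mask;
} FakeTLBEntry;

/* Same contract as the PCI-level call: entry count, or a negative errno. */
typedef ssize_t (*ats_request_fn)(uint64_t addr, size_t length,
                                  FakeTLBEntry *result, size_t result_length,
                                  uint32_t *err_count);

/* Toy backend: one entry per 4 KiB page, -ENOMEM if the buffer is short. */
static ssize_t fake_request(uint64_t addr, size_t length,
                            FakeTLBEntry *result, size_t result_length,
                            uint32_t *err_count)
{
    size_t pages = (length + 0xfff) >> 12; /* assumes a page-aligned addr */

    *err_count = 0;
    if (pages > result_length) {
        return -ENOMEM;
    }
    for (size_t i = 0; i < pages; i++) {
        result[i].iova = addr + ((uint64_t)i << 12);
        result[i].addr_mask = 0xfff;
    }
    return (ssize_t)pages;
}

/*
 * Call @request, doubling the buffer each time it reports -ENOMEM.
 * On success the caller owns *@out and must free() it.
 */
static ssize_t ats_request_with_retry(ats_request_fn request, uint64_t addr,
                                      size_t length, FakeTLBEntry **out,
                                      uint32_t *err_count)
{
    size_t capacity = 1;

    assert(out);
    *out = NULL;
    for (;;) {
        FakeTLBEntry *buf = calloc(capacity, sizeof(*buf));
        ssize_t ret;

        if (!buf) {
            return -ENOMEM;
        }
        ret = request(addr, length, buf, capacity, err_count);
        if (ret >= 0) {
            *out = buf;
            return ret;
        }
        free(buf);
        /* Give up on any other error, or if doubling would overflow. */
        if (ret != -ENOMEM || capacity > SIZE_MAX / 2 / sizeof(*buf)) {
            return ret;
        }
        capacity *= 2;
    }
}
```

A real device would more likely keep a fixed-size buffer sized from its
maximum request length; the doubling strategy is used here only because it
makes the -ENOMEM path easy to see.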
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 17/19] intel_iommu: Set address mask when a translation fails and adjust W permission
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (15 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 16/19] pci: Add a pci-level API for ATS CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 18/19] intel_iommu: Return page walk level even when the translation fails CLEMENT MATHIEU--DRIF
` (3 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
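Implement the behavior defined in section 10.2.3.5 of the PCIe spec rev 5:
when a translation fails, the returned entry still carries an address mask
describing the range it covers, and the W permission is cleared from the
entry when the request was not a write.
This is needed by devices that support ATS.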
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/i386/intel_iommu.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 021834c41f..530b75a9a3 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2100,7 +2100,8 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
uint8_t bus_num = pci_bus_num(bus);
VTDContextCacheEntry *cc_entry;
uint64_t pte, page_mask;
- uint32_t level, pasid = vtd_as->pasid;
+ uint32_t level = UINT32_MAX;
+ uint32_t pasid = vtd_as->pasid;
uint16_t source_id = PCI_BUILD_BDF(bus_num, devfn);
int ret_fr;
bool is_fpd_set = false;
@@ -2259,14 +2260,19 @@ out:
entry->iova = addr & page_mask;
entry->translated_addr = vtd_get_pte_addr(pte, s->aw_bits) & page_mask;
entry->addr_mask = ~page_mask;
- entry->perm = access_flags;
+ entry->perm = (is_write ? access_flags : (access_flags & (~IOMMU_WO)));
return true;
error:
vtd_iommu_unlock(s);
entry->iova = 0;
entry->translated_addr = 0;
- entry->addr_mask = 0;
+ /*
+ * Set the mask for ATS (the range must be present even when the
+ * translation fails: PCIe rev 5, section 10.2.3.5)
+ */
+ entry->addr_mask = (level != UINT32_MAX) ?
+ (~vtd_pt_level_page_mask(level)) : (~VTD_PAGE_MASK_4K);
entry->perm = IOMMU_NONE;
return false;
}
--
2.47.1
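
As a rough illustration of the two adjustments made by this patch, the
sketch below derives the failure mask from the page-walk level and drops
the W bit for non-write requests. It is standalone C, not QEMU code:
level_page_mask, failure_addr_mask and adjust_perm are hypothetical names,
and the 12-bit base shift with 9 index bits per level is an assumption
modelled on the usual VT-d paging layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12          /* assumed: 4 KiB base page */
#define LEVEL_BITS    9           /* assumed: 9 index bits per paging level */
#define LEVEL_UNKNOWN UINT32_MAX  /* the walk never produced a level */

enum { PERM_NONE = 0, PERM_RO = 1, PERM_WO = 2, PERM_RW = 3 };

/* Mask selecting the page-frame bits for a level (1 = 4 KiB leaf). */
static uint64_t level_page_mask(uint32_t level)
{
    assert(level >= 1 && level <= 5);
    return ~((1ULL << (PAGE_SHIFT_4K + (level - 1) * LEVEL_BITS)) - 1);
}

/*
 * addr_mask reported for a failed translation: the size of the page at the
 * level where the walk stopped, or 4 KiB when no level is known.
 */
static uint64_t failure_addr_mask(uint32_t level)
{
    return (level != LEVEL_UNKNOWN) ? ~level_page_mask(level)
                                    : ~level_page_mask(1);
}

/* Report W only when the request itself asked for write access. */
static int adjust_perm(int access_flags, bool is_write)
{
    return is_write ? access_flags : (access_flags & ~PERM_WO);
}
```

With these assumptions a walk that stops at level 2 yields a mask of
0x1fffff (2 MiB) and one that stops at level 3 yields 0x3fffffff (1 GiB).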
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 18/19] intel_iommu: Return page walk level even when the translation fails
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (16 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 17/19] intel_iommu: Set address mask when a translation fails and adjust W permission CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 19/19] intel_iommu: Add support for ATS CLEMENT MATHIEU--DRIF
` (2 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
We use this information in vtd_do_iommu_translate to populate the
IOMMUTLBEntry and indicate the correct page mask. This prevents ATS
devices from sending many useless translation requests when a megapage
or gigapage iova is not mapped to a physical address.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/i386/intel_iommu.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 530b75a9a3..3c31dc1047 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1995,9 +1995,9 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
uint32_t pasid)
{
dma_addr_t addr = vtd_get_iova_pgtbl_base(s, ce, pasid);
- uint32_t level = vtd_get_iova_level(s, ce, pasid);
uint32_t offset;
uint64_t flpte, flag_ad = VTD_FL_A;
+ *flpte_level = vtd_get_iova_level(s, ce, pasid);
if (!vtd_iova_fl_check_canonical(s, iova, ce, pasid)) {
error_report_once("%s: detected non canonical IOVA (iova=0x%" PRIx64 ","
@@ -2006,11 +2006,11 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
}
while (true) {
- offset = vtd_iova_level_offset(iova, level);
+ offset = vtd_iova_level_offset(iova, *flpte_level);
flpte = vtd_get_pte(addr, offset);
if (flpte == (uint64_t)-1) {
- if (level == vtd_get_iova_level(s, ce, pasid)) {
+ if (*flpte_level == vtd_get_iova_level(s, ce, pasid)) {
/* Invalid programming of pasid-entry */
return -VTD_FR_PASID_ENTRY_FSPTPTR_INV;
} else {
@@ -2036,15 +2036,15 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
if (is_write && !(flpte & VTD_FL_RW)) {
return -VTD_FR_SM_WRITE;
}
- if (vtd_flpte_nonzero_rsvd(flpte, level)) {
+ if (vtd_flpte_nonzero_rsvd(flpte, *flpte_level)) {
error_report_once("%s: detected flpte reserved non-zero "
"iova=0x%" PRIx64 ", level=0x%" PRIx32
"flpte=0x%" PRIx64 ", pasid=0x%" PRIX32 ")",
- __func__, iova, level, flpte, pasid);
+ __func__, iova, *flpte_level, flpte, pasid);
return -VTD_FR_FS_PAGING_ENTRY_RSVD;
}
- if (vtd_is_last_pte(flpte, level) && is_write) {
+ if (vtd_is_last_pte(flpte, *flpte_level) && is_write) {
flag_ad |= VTD_FL_D;
}
@@ -2052,14 +2052,13 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
return -VTD_FR_FS_BIT_UPDATE_FAILED;
}
- if (vtd_is_last_pte(flpte, level)) {
+ if (vtd_is_last_pte(flpte, *flpte_level)) {
*flptep = flpte;
- *flpte_level = level;
return 0;
}
addr = vtd_get_pte_addr(flpte, aw_bits);
- level--;
+ (*flpte_level)--;
}
}
--
2.47.1
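
To put a number on the "many useless translation requests" mentioned in
the commit message, the sketch below counts how many responses are needed
to cover a range when each response describes one naturally aligned block
of (addr_mask + 1) bytes. It is standalone arithmetic, not QEMU code, and
requests_needed is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of aligned blocks of (addr_mask + 1) bytes that overlap
 * [addr, addr + length). addr_mask must have the form 2^n - 1 and the
 * range must not wrap around the 64-bit address space.
 */
static uint64_t requests_needed(uint64_t addr, uint64_t length,
                                uint64_t addr_mask)
{
    uint64_t block = addr_mask + 1;

    assert(block != 0 && (addr_mask & block) == 0);
    if (length == 0) {
        return 0;
    }
    return (addr + length - 1) / block - addr / block + 1;
}
```

For an unmapped 2 MiB region starting at 0x200000, a 4 KiB mask (0xfff)
gives requests_needed(0x200000, 0x200000, 0xfff) == 512, while the 2 MiB
mask (0x1fffff) returned once the level is known gives a single request.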
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 19/19] intel_iommu: Add support for ATS
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (17 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 18/19] intel_iommu: Return page walk level even when the translation fails CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-02-19 6:10 ` [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
2025-02-20 21:13 ` Michael S. Tsirkin
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/i386/intel_iommu.c | 71 ++++++++++++++++++++++++++++++++--
hw/i386/intel_iommu_internal.h | 1 +
2 files changed, 69 insertions(+), 3 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 3c31dc1047..698e1286da 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4159,12 +4159,10 @@ static void vtd_report_ir_illegal_access(VTDAddressSpace *vtd_as,
bool is_fpd_set = false;
VTDContextEntry ce;
- assert(vtd_as->pasid != PCI_NO_PASID);
-
/* Try out best to fetch FPD, we can't do anything more */
if (vtd_dev_to_context_entry(s, bus_n, vtd_as->devfn, &ce) == 0) {
is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD;
- if (!is_fpd_set && s->root_scalable) {
+ if (!is_fpd_set && s->root_scalable && vtd_as->pasid != PCI_NO_PASID) {
vtd_ce_get_pasid_fpd(s, &ce, &is_fpd_set, vtd_as->pasid);
}
}
@@ -4738,6 +4736,71 @@ static IOMMUMemoryRegion *vtd_get_memory_region_pasid(PCIBus *bus,
return &vtd_as->iommu;
}
+static IOMMUTLBEntry vtd_iommu_ats_do_translate(IOMMUMemoryRegion *iommu,
+ hwaddr addr,
+ IOMMUAccessFlags flags,
+ int iommu_idx)
+{
+ IOMMUTLBEntry entry;
+ VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
+
+ if (vtd_is_interrupt_addr(addr)) {
+ vtd_report_ir_illegal_access(vtd_as, addr, flags & IOMMU_WO);
+ entry.iova = 0;
+ entry.translated_addr = 0;
+ entry.addr_mask = ~VTD_PAGE_MASK_4K;
+ entry.perm = IOMMU_NONE;
+ entry.pasid = PCI_NO_PASID;
+ } else {
+ entry = vtd_iommu_translate(iommu, addr, flags, iommu_idx);
+ }
+ return entry;
+}
+
+static ssize_t vtd_iommu_ats_request_translation(IOMMUMemoryRegion *iommu,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write,
+ IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count)
+{
+ IOMMUAccessFlags flags = IOMMU_ACCESS_FLAG_FULL(true, !no_write, exec_req,
+ priv_req, false, false);
+ ssize_t res_index = 0;
+ hwaddr target_address = addr + length;
+ IOMMUTLBEntry entry;
+
+ *err_count = 0;
+
+ while ((addr < target_address) && (res_index < result_length)) {
+ entry = vtd_iommu_ats_do_translate(iommu, addr, flags, 0);
+ if (!IOMMU_TLB_ENTRY_TRANSLATION_ERROR(&entry)) { /* Translation done */
+ /*
+ * 4.1.2 : Global Mapping (G) : Remapping hardware provides a value
+ * of 0 in this field
+ */
+ entry.perm &= ~IOMMU_GLOBAL;
+ } else {
+ *err_count += 1;
+ }
+ result[res_index] = entry;
+ res_index += 1;
+ addr = (addr & (~entry.addr_mask)) + (entry.addr_mask + 1);
+ }
+
+ /* Buffer too small */
+ if (addr < target_address) {
+ return -ENOMEM;
+ }
+ return res_index;
+}
+
+static uint64_t vtd_get_min_page_size(IOMMUMemoryRegion *iommu)
+{
+ return VTD_PAGE_SIZE;
+}
+
static PCIIOMMUOps vtd_iommu_ops = {
.get_address_space = vtd_host_dma_iommu,
.get_memory_region_pasid = vtd_get_memory_region_pasid,
@@ -4915,6 +4978,8 @@ static void vtd_iommu_memory_region_class_init(ObjectClass *klass,
imrc->translate = vtd_iommu_translate;
imrc->notify_flag_changed = vtd_iommu_notify_flag_changed;
imrc->replay = vtd_iommu_replay;
+ imrc->iommu_ats_request_translation = vtd_iommu_ats_request_translation;
+ imrc->get_min_page_size = vtd_get_min_page_size;
}
static const TypeInfo vtd_iommu_memory_region_info = {
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 238f1f443f..7e2071cd4d 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -192,6 +192,7 @@
#define VTD_ECAP_SC (1ULL << 7)
#define VTD_ECAP_MHMV (15ULL << 20)
#define VTD_ECAP_SRS (1ULL << 31)
+#define VTD_ECAP_NWFS (1ULL << 33)
#define VTD_ECAP_PSS (19ULL << 35)
#define VTD_ECAP_PASID (1ULL << 40)
#define VTD_ECAP_SMTS (1ULL << 43)
--
2.47.1
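
The request loop added by this patch can be exercised outside QEMU with a
toy page table. The sketch below keeps the same shape (translate, record
the entry, count failures, advance past the block the entry covers, report
-ENOMEM when the buffer fills before the range ends), but everything in it
is hypothetical: SketchEntry, sketch_translate and sketch_request are
illustration-only names, and the "every odd 4 KiB page is unmapped" rule
exists only to produce some failures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

typedef struct {
    uint64_t iova;
    uint64_t addr_mask;
    bool valid;     /* false models a failed translation */
} SketchEntry;

/* Toy page table: 4 KiB pages, every odd page number is unmapped. */
static SketchEntry sketch_translate(uint64_t addr)
{
    SketchEntry e = {
        .iova = addr & ~0xfffULL,
        .addr_mask = 0xfff,
        .valid = ((addr >> 12) & 1) == 0,
    };
    return e;
}

static ssize_t sketch_request(uint64_t addr, size_t length,
                              SketchEntry *result, size_t result_length,
                              uint32_t *err_count)
{
    uint64_t end = addr + length;
    size_t idx = 0;

    assert(result_length);
    *err_count = 0;

    while (addr < end && idx < result_length) {
        SketchEntry e = sketch_translate(addr);

        if (!e.valid) {
            *err_count += 1;
        }
        /* Failed entries are stored too, so the caller sees their mask. */
        result[idx++] = e;
        /* Jump to the first byte after the block this entry covers. */
        addr = (addr & ~e.addr_mask) + (e.addr_mask + 1);
    }

    /* Ran out of slots before reaching the end of the range. */
    if (addr < end) {
        return -ENOMEM;
    }
    return (ssize_t)idx;
}
```

Note that a failed translation still consumes a result slot: the returned
count covers both successful and failed entries, and err_count says how
many of them failed.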
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v2 00/19] intel_iommu: Add ATS support
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (18 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 19/19] intel_iommu: Add support for ATS CLEMENT MATHIEU--DRIF
@ 2025-02-19 6:10 ` CLEMENT MATHIEU--DRIF
2025-02-20 21:13 ` Michael S. Tsirkin
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-02-19 6:10 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com
Kindly ping
Thanks everyone
>cmd
On 20/01/2025 18:41, CLEMENT MATHIEU--DRIF wrote:
> From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
>
> This patch set belongs to a list of series that add SVM support for VT-d.
>
> Here we focus on implementing ATS support in the IOMMU and adding a
> PCI-level API to be used by virtual devices.
>
> This work is based on the VT-d specification version 4.1 (March 2023).
>
> Here is a link to our GitHub repository where you can find the following elements:
> - Qemu with all the patches for SVM
> - ATS
> - PRI
> - Device IOTLB invalidations
> - Requests with already pre-translated addresses
> - A demo device
> - A simple driver for the demo device
> - A userspace program (for testing and demonstration purposes)
>
> https://github.com/BullSequana/Qemu-in-guest-SVM-demo
>
> ===============
>
> Context and design notes
> ''''''''''''''''''''''''
>
> The main purpose of this work is to enable vVT-d users to make
> translation requests to the vIOMMU as described in the PCIe Gen 5.0
> specification (section 10). Moreover, we aim to implement a
> PCI/Memory-level framework that could be used by other vIOMMUs
> to implement the same features.
>
> What is ATS?
> ''''''''''''
>
> ATS (Address Translation Service) is a PCIe-level protocol that
> enables PCIe devices to query an IOMMU for virtual to physical
> address translations in a specific address space (such as a userland
> process address space). When a device receives translation responses
> from an IOMMU, it may decide to store them in an internal cache,
> often known as "ATC" (Address Translation Cache) or "Device IOTLB".
> To keep page tables and caches consistent, the IOMMU is allowed to
> send asynchronous invalidation requests to its client devices.
>
> To avoid introducing an unnecessarily complex API, this series simply
> exposes 3 functions. The first 2 are a pair of setup functions that
> are called to install and remove the ATS invalidation callback during
> the initialization phase of a process. The third one will be
> used to request translations. The callback setup API introduced in
> this series calls the IOMMUNotifier API under the hood.
>
> API design
> ''''''''''
>
> - int pci_register_iommu_tlb_event_notifier(PCIDevice *dev,
> uint32_t pasid,
> IOMMUNotifier *n);
>
> - int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
> IOMMUNotifier *n);
>
> - ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
> bool priv_req, bool exec_req,
> hwaddr addr, size_t length,
> bool no_write,
> IOMMUTLBEntry *result,
> size_t result_length,
> uint32_t *err_count);
>
> Although device developers may want to implement custom ATC for
> testing or performance measurement purposes, we provide a generic
> implementation as a utility module.
>
> Overview
> ''''''''
>
> Here are the interactions between an ATS-capable PCIe device and the vVT-d:
>
>
>
> ┌───────────┐ ┌────────────┐
> │Device │ │PCI / Memory│
> │ │ pci_ats_request_│abstraction │ iommu_ats_
> │ │ translation_ │ │ request_
> │┌─────────┐│ pasid │ AS lookup │ translation
> ││Logic ││────────────────>│╶╶╶╶╶╶╶╶╶╶╶>│──────┐
> │└─────────┘│<────────────────│<╶╶╶╶╶╶╶╶╶╶╶│<──┐ │
> │┌─────────┐│ │ │ │ │
> ││inv func ││<───────┐ │ │ │ │
> │└─────────┘│ │ │ │ │ │
> │ │ │ │ │ │ │ │
> │ ∨ │ │ │ │ │ │
> │┌─────────┐│ │ │ │ │ │
> ││ATC ││ │ │ │ │ │
> │└─────────┘│ │ │ │ │ │
> └───────────┘ │ └────────────┘ │ │
> │ │ │
> │ │ │
> │ │ │
> │ │ │
> │ ┌────────────────────┼──┼─┐
> │ │vVT-d │ │ │
> │ │ │ │ │
> │ │ │ │ │
> │ │ │ │ │
> │ │ │ │ │
> │ │ │ ∨ │
> │ │┌───────────────────────┐│
> │ ││Translation logic ││
> │ │└───────────────────────┘│
> └────┼────────────┐ │
> │ │ │
> │┌───────────────────────┐│
> ││ Invalidation queue ││
> │└───────────∧───────────┘│
> └────────────┼────────────┘
> │
> │
> │
> ┌────────────────────────┐
> │Kernel driver │
> │ │
> └────────────────────────┘
>
> v2
> Rebase on master after merge of Zhenzhong's FLTS series
> Rename the series as it is now based on master.
>
> Changes after review by Michael:
> - Split long lines in memory.h
> - Change patch encoding (no UTF-8)
>
> Changes after review by Zhenzhong:
> - Rework "Fill the PASID field when creating an IOMMUTLBEntry"
>
>
>
> Clement Mathieu--Drif (19):
> memory: Add permissions in IOMMUAccessFlags
> intel_iommu: Declare supported PASID size
> memory: Allow to store the PASID in IOMMUTLBEntry
> intel_iommu: Fill the PASID field when creating an IOMMUTLBEntry
> pcie: Add helper to declare PASID capability for a pcie device
> pcie: Helper functions to check if PASID is enabled
> pcie: Helper function to check if ATS is enabled
> pci: Cache the bus mastering status in the device
> pci: Add IOMMU operations to get memory regions with PASID
> intel_iommu: Implement the get_memory_region_pasid iommu operation
> memory: Store user data pointer in the IOMMU notifiers
> pci: Add a pci-level initialization function for iommu notifiers
> atc: Generic ATC that can be used by PCIe devices that support SVM
> atc: Add unit tests
> memory: Add an API for ATS support
> pci: Add a pci-level API for ATS
> intel_iommu: Set address mask when a translation fails and adjust W
> permission
> intel_iommu: Return page walk level even when the translation fails
> intel_iommu: Add support for ATS
>
> hw/i386/intel_iommu.c | 122 ++++++--
> hw/i386/intel_iommu_internal.h | 2 +
> hw/pci/pci.c | 111 ++++++-
> hw/pci/pcie.c | 42 +++
> include/exec/memory.h | 51 +++-
> include/hw/i386/intel_iommu.h | 2 +-
> include/hw/pci/pci.h | 83 ++++++
> include/hw/pci/pci_device.h | 1 +
> include/hw/pci/pcie.h | 9 +-
> include/hw/pci/pcie_regs.h | 5 +
> system/memory.c | 21 ++
> tests/unit/meson.build | 1 +
> tests/unit/test-atc.c | 527 +++++++++++++++++++++++++++++++++
> util/atc.c | 211 +++++++++++++
> util/atc.h | 117 ++++++++
> util/meson.build | 1 +
> 16 files changed, 1275 insertions(+), 31 deletions(-)
> create mode 100644 tests/unit/test-atc.c
> create mode 100644 util/atc.c
> create mode 100644 util/atc.h
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 00/19] intel_iommu: Add ATS support
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (19 preceding siblings ...)
2025-02-19 6:10 ` [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
@ 2025-02-20 21:13 ` Michael S. Tsirkin
2025-02-21 7:54 ` CLEMENT MATHIEU--DRIF
20 siblings, 1 reply; 23+ messages in thread
From: Michael S. Tsirkin @ 2025-02-20 21:13 UTC (permalink / raw)
To: CLEMENT MATHIEU--DRIF
Cc: qemu-devel@nongnu.org, jasowang@redhat.com,
zhenzhong.duan@intel.com, kevin.tian@intel.com,
yi.l.liu@intel.com, joao.m.martins@oracle.com, peterx@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com
On Mon, Jan 20, 2025 at 05:41:32PM +0000, CLEMENT MATHIEU--DRIF wrote:
> From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
>
> This patch set belongs to a list of series that add SVM support for VT-d.
>
> Here we focus on implementing ATS support in the IOMMU and adding a
> PCI-level API to be used by virtual devices.
>
> This work is based on the VT-d specification version 4.1 (March 2023).
>
> Here is a link to our GitHub repository where you can find the following elements:
> - Qemu with all the patches for SVM
> - ATS
> - PRI
> - Device IOTLB invalidations
> - Requests with already pre-translated addresses
> - A demo device
> - A simple driver for the demo device
> - A userspace program (for testing and demonstration purposes)
>
> https://github.com/BullSequana/Qemu-in-guest-SVM-demo
Fails build:
https://gitlab.com/mstredhat/qemu/-/jobs/9200372388
In function ‘vtd_iommu_ats_do_translate’,
inlined from ‘vtd_iommu_ats_request_translation’ at ../hw/i386/intel_iommu.c:4778:17:
../hw/i386/intel_iommu.c:4758:12: error: ‘entry.target_as’ may be used uninitialized [-Werror=maybe-uninitialized]
4758 | return entry;
| ^~~~~
../hw/i386/intel_iommu.c: In function ‘vtd_iommu_ats_request_translation’:
../hw/i386/intel_iommu.c:4745:19: note: ‘entry’ declared here
4745 | IOMMUTLBEntry entry;
| ^~~~~
cc1: all warnings being treated as errors
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 00/19] intel_iommu: Add ATS support
2025-02-20 21:13 ` Michael S. Tsirkin
@ 2025-02-21 7:54 ` CLEMENT MATHIEU--DRIF
0 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-02-21 7:54 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: qemu-devel@nongnu.org, jasowang@redhat.com,
zhenzhong.duan@intel.com, kevin.tian@intel.com,
yi.l.liu@intel.com, joao.m.martins@oracle.com, peterx@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com
On 20/02/2025 22:13, Michael S. Tsirkin wrote:
> On Mon, Jan 20, 2025 at 05:41:32PM +0000, CLEMENT MATHIEU--DRIF wrote:
>> From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
>>
>> This patch set belongs to a list of series that add SVM support for VT-d.
>>
>> Here we focus on implementing ATS support in the IOMMU and adding a
>> PCI-level API to be used by virtual devices.
>>
>> This work is based on the VT-d specification version 4.1 (March 2023).
>>
>> Here is a link to our GitHub repository where you can find the following elements:
>> - Qemu with all the patches for SVM
>> - ATS
>> - PRI
>> - Device IOTLB invalidations
>> - Requests with already pre-translated addresses
>> - A demo device
>> - A simple driver for the demo device
>> - A userspace program (for testing and demonstration purposes)
>>
>> https://github.com/BullSequana/Qemu-in-guest-SVM-demo
>
> Fails build:
>
> https://gitlab.com/mstredhat/qemu/-/jobs/9200372388
>
> In function ‘vtd_iommu_ats_do_translate’,
> inlined from ‘vtd_iommu_ats_request_translation’ at ../hw/i386/intel_iommu.c:4778:17:
> ../hw/i386/intel_iommu.c:4758:12: error: ‘entry.target_as’ may be used uninitialized [-Werror=maybe-uninitialized]
> 4758 | return entry;
> | ^~~~~
> ../hw/i386/intel_iommu.c: In function ‘vtd_iommu_ats_request_translation’:
> ../hw/i386/intel_iommu.c:4745:19: note: ‘entry’ declared here
> 4745 | IOMMUTLBEntry entry;
> | ^~~~~
> cc1: all warnings being treated as errors
>
Uh, looks like the error is only present in non-debug mode.
I'll send a v3
Thanks Michael
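
For context on the warning above: the interrupt-address branch of
vtd_iommu_ats_do_translate assigns every field of the entry except
target_as before returning the struct by value. One possible way to avoid
that class of warning, shown here only as a sketch and not necessarily
what v3 does, is to build the error entry with a designated initializer so
that no member is left indeterminate. SketchTLBEntry, SKETCH_NO_PASID and
sketch_error_entry are hypothetical stand-ins, not QEMU definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for IOMMUTLBEntry (hypothetical field layout). */
typedef struct {
    void *target_as;          /* stand-in for the AddressSpace pointer */
    uint64_t iova;
    uint64_t translated_addr;
    uint64_t addr_mask;
    int perm;
    uint32_t pasid;
} SketchTLBEntry;

#define SKETCH_NO_PASID UINT32_MAX  /* hypothetical "no PASID" sentinel */

static SketchTLBEntry sketch_error_entry(void *as)
{
    /*
     * Members not named in a designated initializer are zero-initialized,
     * so iova and translated_addr are 0 and nothing is left unset when the
     * struct is returned by value.
     */
    SketchTLBEntry entry = {
        .target_as = as,
        .addr_mask = 0xfff,   /* 4 KiB range, as in the error path */
        .perm = 0,            /* no access */
        .pasid = SKETCH_NO_PASID,
    };

    assert(as != NULL);
    return entry;
}
```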
>
>
>> ===============
>>
>> Context and design notes
>> ''''''''''''''''''''''''
>>
>> The main purpose of this work is to enable vVT-d users to make
>> translation requests to the vIOMMU as described in the PCIe Gen 5.0
>> specification (section 10). Moreover, we aim to implement a
>> PCI/Memory-level framework that could be used by other vIOMMUs
>> to implement the same features.
>>
>> What is ATS?
>> ''''''''''''
>>
>> ATS (Address Translation Service) is a PCIe-level protocol that
>> enables PCIe devices to query an IOMMU for virtual to physical
>> address translations in a specific address space (such as a userland
>> process address space). When a device receives translation responses
>> from an IOMMU, it may decide to store them in an internal cache,
>> often known as "ATC" (Address Translation Cache) or "Device IOTLB".
>> To keep page tables and caches consistent, the IOMMU is allowed to
>> send asynchronous invalidation requests to its client devices.
>>
>> To avoid introducing an unnecessarily complex API, this series simply
>> exposes 3 functions. The first 2 are a pair of setup functions that
>> are called to install and remove the ATS invalidation callback during
>> the initialization phase of a process. The third one will be
>> used to request translations. The callback setup API introduced in
>> this series calls the IOMMUNotifier API under the hood.
>>
>> API design
>> ''''''''''
>>
>> - int pci_register_iommu_tlb_event_notifier(PCIDevice *dev,
>> uint32_t pasid,
>> IOMMUNotifier *n);
>>
>> - int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
>> IOMMUNotifier *n);
>>
>> - ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
>> bool priv_req, bool exec_req,
>> hwaddr addr, size_t length,
>> bool no_write,
>> IOMMUTLBEntry *result,
>> size_t result_length,
>> uint32_t *err_count);
>>
>> Although device developers may want to implement custom ATC for
>> testing or performance measurement purposes, we provide a generic
>> implementation as a utility module.
>>
>> Overview
>> ''''''''
>>
>> Here are the interactions between an ATS-capable PCIe device and the vVT-d:
>>
>>
>>
>> ┌───────────┐                 ┌────────────┐
>> │Device     │                 │PCI / Memory│
>> │           │ pci_ats_request_│abstraction │ iommu_ats_
>> │           │ translation_    │            │ request_
>> │┌─────────┐│ pasid           │ AS lookup  │ translation
>> ││Logic    ││────────────────>│╶╶╶╶╶╶╶╶╶╶╶>│──────┐
>> │└─────────┘│<────────────────│<╶╶╶╶╶╶╶╶╶╶╶│<──┐  │
>> │┌─────────┐│                 │            │   │  │
>> ││inv func ││<───────┐        │            │   │  │
>> │└─────────┘│        │        │            │   │  │
>> │     │     │        │        │            │   │  │
>> │     ∨     │        │        │            │   │  │
>> │┌─────────┐│        │        │            │   │  │
>> ││ATC      ││        │        │            │   │  │
>> │└─────────┘│        │        │            │   │  │
>> └───────────┘        │        └────────────┘   │  │
>>                      │                         │  │
>>                      │                         │  │
>>                      │                         │  │
>>                      │                         │  │
>>                      │    ┌────────────────────┼──┼─┐
>>                      │    │vVT-d               │  │ │
>>                      │    │                    │  │ │
>>                      │    │                    │  │ │
>>                      │    │                    │  │ │
>>                      │    │                    │  │ │
>>                      │    │                    │  ∨ │
>>                      │    │┌───────────────────────┐│
>>                      │    ││Translation logic      ││
>>                      │    │└───────────────────────┘│
>>                      └────┼────────────┐            │
>>                           │            │            │
>>                           │┌───────────────────────┐│
>>                           ││  Invalidation queue   ││
>>                           │└───────────∧───────────┘│
>>                           └────────────┼────────────┘
>>                                        │
>>                                        │
>>                                        │
>>                           ┌────────────────────────┐
>>                           │Kernel driver           │
>>                           │                        │
>>                           └────────────────────────┘
>>
>> v2
>> Rebase on master after merge of Zhenzhong's FLTS series
>> Rename the series as it is now based on master.
>>
>> Changes after review by Michael:
>> - Split long lines in memory.h
>> - Change patch encoding (no UTF-8)
>>
>> Changes after review by Zhenzhong:
>> - Rework "Fill the PASID field when creating an IOMMUTLBEntry"
>>
>>
>>
>> Clement Mathieu--Drif (19):
>> memory: Add permissions in IOMMUAccessFlags
>> intel_iommu: Declare supported PASID size
>> memory: Allow to store the PASID in IOMMUTLBEntry
>> intel_iommu: Fill the PASID field when creating an IOMMUTLBEntry
>> pcie: Add helper to declare PASID capability for a pcie device
>> pcie: Helper functions to check if PASID is enabled
>> pcie: Helper function to check if ATS is enabled
>> pci: Cache the bus mastering status in the device
>> pci: Add IOMMU operations to get memory regions with PASID
>> intel_iommu: Implement the get_memory_region_pasid iommu operation
>> memory: Store user data pointer in the IOMMU notifiers
>> pci: Add a pci-level initialization function for iommu notifiers
>> atc: Generic ATC that can be used by PCIe devices that support SVM
>> atc: Add unit tests
>> memory: Add an API for ATS support
>> pci: Add a pci-level API for ATS
>> intel_iommu: Set address mask when a translation fails and adjust W
>> permission
>> intel_iommu: Return page walk level even when the translation fails
>> intel_iommu: Add support for ATS
>>
>> hw/i386/intel_iommu.c | 122 ++++++--
>> hw/i386/intel_iommu_internal.h | 2 +
>> hw/pci/pci.c | 111 ++++++-
>> hw/pci/pcie.c | 42 +++
>> include/exec/memory.h | 51 +++-
>> include/hw/i386/intel_iommu.h | 2 +-
>> include/hw/pci/pci.h | 83 ++++++
>> include/hw/pci/pci_device.h | 1 +
>> include/hw/pci/pcie.h | 9 +-
>> include/hw/pci/pcie_regs.h | 5 +
>> system/memory.c | 21 ++
>> tests/unit/meson.build | 1 +
>> tests/unit/test-atc.c | 527 +++++++++++++++++++++++++++++++++
>> util/atc.c | 211 +++++++++++++
>> util/atc.h | 117 ++++++++
>> util/meson.build | 1 +
>> 16 files changed, 1275 insertions(+), 31 deletions(-)
>> create mode 100644 tests/unit/test-atc.c
>> create mode 100644 util/atc.c
>> create mode 100644 util/atc.h
>>
>> --
>> 2.47.1