* [PATCH v2 00/19] intel_iommu: Add ATS support
@ 2025-01-20 17:41 CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 01/19] memory: Add permissions in IOMMUAccessFlags CLEMENT MATHIEU--DRIF
` (20 more replies)
0 siblings, 21 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
This patch set belongs to a list of series that add SVM support for VT-d.
Here we focus on implementing ATS support in the IOMMU and adding a
PCI-level API to be used by virtual devices.
This work is based on the VT-d specification version 4.1 (March 2023).
Here is a link to our GitHub repository where you can find the following elements:
- QEMU with all the patches for SVM
- ATS
- PRI
- Device IOTLB invalidations
- Requests with already pre-translated addresses
- A demo device
- A simple driver for the demo device
- A userspace program (for testing and demonstration purposes)
https://github.com/BullSequana/Qemu-in-guest-SVM-demo
===============
Context and design notes
''''''''''''''''''''''''
The main purpose of this work is to enable vVT-d users to make
translation requests to the vIOMMU as described in the PCIe Gen 5.0
specification (section 10). Moreover, we aim to implement a
PCI/Memory-level framework that could be used by other vIOMMUs
to implement the same features.
What is ATS?
''''''''''''
ATS (Address Translation Service) is a PCIe-level protocol that
enables PCIe devices to query an IOMMU for virtual-to-physical
address translations in a specific address space (such as a userland
process address space). When a device receives translation responses
from an IOMMU, it may decide to store them in an internal cache,
often known as "ATC" (Address Translation Cache) or "Device IOTLB".
To keep page tables and caches consistent, the IOMMU is allowed to
send asynchronous invalidation requests to its client devices.
To avoid introducing an unnecessarily complex API, this series exposes
only three functions. The first two form a pair of setup functions that
install and remove the ATS invalidation callback; they are meant to be
called during the initialization phase of a process. The third one is
used to request translations. The callback setup API introduced in
this series calls the IOMMUNotifier API under the hood.
API design
''''''''''
- int pci_register_iommu_tlb_event_notifier(PCIDevice *dev,
uint32_t pasid,
IOMMUNotifier *n);
- int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
IOMMUNotifier *n);
- ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
bool priv_req, bool exec_req,
hwaddr addr, size_t length,
bool no_write,
IOMMUTLBEntry *result,
size_t result_length,
uint32_t *err_count);
Although device developers may want to implement a custom ATC for
testing or performance measurement purposes, we provide a generic
implementation as a utility module.
Overview
''''''''
Here are the interactions between an ATS-capable PCIe device and the vVT-d:
┌───────────┐ ┌────────────┐
│Device │ │PCI / Memory│
│ │ pci_ats_request_│abstraction │ iommu_ats_
│ │ translation_ │ │ request_
│┌─────────┐│ pasid │ AS lookup │ translation
││Logic ││────────────────>│╶╶╶╶╶╶╶╶╶╶╶>│──────┐
│└─────────┘│<────────────────│<╶╶╶╶╶╶╶╶╶╶╶│<──┐ │
│┌─────────┐│ │ │ │ │
││inv func ││<───────┐ │ │ │ │
│└─────────┘│ │ │ │ │ │
│ │ │ │ │ │ │ │
│ ∨ │ │ │ │ │ │
│┌─────────┐│ │ │ │ │ │
││ATC ││ │ │ │ │ │
│└─────────┘│ │ │ │ │ │
└───────────┘ │ └────────────┘ │ │
│ │ │
│ │ │
│ │ │
│ │ │
│ ┌────────────────────┼──┼─┐
│ │vVT-d │ │ │
│ │ │ │ │
│ │ │ │ │
│ │ │ │ │
│ │ │ │ │
│ │ │ ∨ │
│ │┌───────────────────────┐│
│ ││Translation logic ││
│ │└───────────────────────┘│
└────┼────────────┐ │
│ │ │
│┌───────────────────────┐│
││ Invalidation queue ││
│└───────────∧───────────┘│
└────────────┼────────────┘
│
│
│
┌────────────────────────┐
│Kernel driver │
│ │
└────────────────────────┘
v2
Rebase on master after merge of Zhenzhong's FLTS series
Rename the series as it is now based on master.
Changes after review by Michael:
- Split long lines in memory.h
- Change patch encoding (no UTF-8)
Changes after review by Zhenzhong:
- Rework "Fill the PASID field when creating an IOMMUTLBEntry"
Clement Mathieu--Drif (19):
memory: Add permissions in IOMMUAccessFlags
intel_iommu: Declare supported PASID size
memory: Allow to store the PASID in IOMMUTLBEntry
intel_iommu: Fill the PASID field when creating an IOMMUTLBEntry
pcie: Add helper to declare PASID capability for a pcie device
pcie: Helper functions to check if PASID is enabled
pcie: Helper function to check if ATS is enabled
pci: Cache the bus mastering status in the device
pci: Add IOMMU operations to get memory regions with PASID
intel_iommu: Implement the get_memory_region_pasid iommu operation
memory: Store user data pointer in the IOMMU notifiers
pci: Add a pci-level initialization function for iommu notifiers
atc: Generic ATC that can be used by PCIe devices that support SVM
atc: Add unit tests
memory: Add an API for ATS support
pci: Add a pci-level API for ATS
intel_iommu: Set address mask when a translation fails and adjust W
permission
intel_iommu: Return page walk level even when the translation fails
intel_iommu: Add support for ATS
hw/i386/intel_iommu.c | 122 ++++++--
hw/i386/intel_iommu_internal.h | 2 +
hw/pci/pci.c | 111 ++++++-
hw/pci/pcie.c | 42 +++
include/exec/memory.h | 51 +++-
include/hw/i386/intel_iommu.h | 2 +-
include/hw/pci/pci.h | 83 ++++++
include/hw/pci/pci_device.h | 1 +
include/hw/pci/pcie.h | 9 +-
include/hw/pci/pcie_regs.h | 5 +
system/memory.c | 21 ++
tests/unit/meson.build | 1 +
tests/unit/test-atc.c | 527 +++++++++++++++++++++++++++++++++
util/atc.c | 211 +++++++++++++
util/atc.h | 117 ++++++++
util/meson.build | 1 +
16 files changed, 1275 insertions(+), 31 deletions(-)
create mode 100644 tests/unit/test-atc.c
create mode 100644 util/atc.c
create mode 100644 util/atc.h
--
2.47.1
* [PATCH v2 01/19] memory: Add permissions in IOMMUAccessFlags
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
This will be necessary for devices implementing ATS.
We also define a new macro IOMMU_ACCESS_FLAG_FULL in addition to
IOMMU_ACCESS_FLAG to support more access flags.
IOMMU_ACCESS_FLAG is kept for convenience and backward compatibility.
Here are the flags added (defined by the PCIe 5 specification):
- Execute Requested
- Privileged Mode Requested
- Global
- Untranslated Only
IOMMU_ACCESS_FLAG sets the additional flags to 0.
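As a rough illustration of how the two macros relate, the sketch below copies the flag values from this patch into a standalone file. The helper `ats_request_flags` is hypothetical (it is not part of the series) and only shows one plausible way a caller might map ATS request attributes onto the flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copy of the values introduced by this patch, for illustration. */
typedef enum {
    IOMMU_NONE = 0,
    IOMMU_RO = 1,
    IOMMU_WO = 2,
    IOMMU_RW = 3,
    IOMMU_EXEC = 4,
    IOMMU_PRIV = 8,
    IOMMU_GLOBAL = 16,
    IOMMU_UNTRANSLATED_ONLY = 32,
} IOMMUAccessFlags;

#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | \
                                 ((w) ? IOMMU_WO : 0))
#define IOMMU_ACCESS_FLAG_FULL(r, w, x, p, g, uo) \
    (IOMMU_ACCESS_FLAG(r, w) | \
     ((x) ? IOMMU_EXEC : 0) | \
     ((p) ? IOMMU_PRIV : 0) | \
     ((g) ? IOMMU_GLOBAL : 0) | \
     ((uo) ? IOMMU_UNTRANSLATED_ONLY : 0))

/*
 * Hypothetical helper (an assumption, not code from the series):
 * build the flags for a translation request that always asks for
 * read access and optionally for write, execute and privileged access.
 */
static int ats_request_flags(bool no_write, bool exec_req, bool priv_req)
{
    return IOMMU_ACCESS_FLAG_FULL(true, !no_write, exec_req, priv_req,
                                  false, false);
}
```

With all extra arguments set to false, IOMMU_ACCESS_FLAG_FULL yields the same value as IOMMU_ACCESS_FLAG, which is why existing callers of the two-argument macro are unaffected.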
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
include/exec/memory.h | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 3ee1901b52..56c3a3515e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -110,15 +110,34 @@ struct MemoryRegionSection {
typedef struct IOMMUTLBEntry IOMMUTLBEntry;
-/* See address_space_translate: bit 0 is read, bit 1 is write. */
+/*
+ * See address_space_translate:
+ * - bit 0 : read
+ * - bit 1 : write
+ * - bit 2 : exec
+ * - bit 3 : priv
+ * - bit 4 : global
+ * - bit 5 : untranslated only
+ */
typedef enum {
IOMMU_NONE = 0,
IOMMU_RO = 1,
IOMMU_WO = 2,
IOMMU_RW = 3,
+ IOMMU_EXEC = 4,
+ IOMMU_PRIV = 8,
+ IOMMU_GLOBAL = 16,
+ IOMMU_UNTRANSLATED_ONLY = 32,
} IOMMUAccessFlags;
-#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ? IOMMU_WO : 0))
+#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | \
+ ((w) ? IOMMU_WO : 0))
+#define IOMMU_ACCESS_FLAG_FULL(r, w, x, p, g, uo) \
+ (IOMMU_ACCESS_FLAG(r, w) | \
+ ((x) ? IOMMU_EXEC : 0) | \
+ ((p) ? IOMMU_PRIV : 0) | \
+ ((g) ? IOMMU_GLOBAL : 0) | \
+ ((uo) ? IOMMU_UNTRANSLATED_ONLY : 0))
struct IOMMUTLBEntry {
AddressSpace *target_as;
--
2.47.1
* [PATCH v2 02/19] intel_iommu: Declare supported PASID size
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
The PSS field of the ECAP register stores the supported PASID size minus 1.
This commit sets the field to 19, thereby declaring support for 20-bit PASIDs.
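The "size minus 1" encoding can be sketched as follows. The shift matches the `19ULL << 35` value used in this patch; the macro and function names are assumptions made for the example, and the 5-bit field width (bits 39:35) is taken from the VT-d register layout rather than from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed names; the patch itself only defines VTD_ECAP_PSS. */
#define TOY_ECAP_PSS_SHIFT 35
#define TOY_ECAP_PSS_MASK  (0x1fULL << TOY_ECAP_PSS_SHIFT) /* bits 39:35 */

/* Encode a PASID width in bits (1..32) into the PSS field. */
static uint64_t toy_ecap_encode_pss(unsigned pasid_bits)
{
    return ((uint64_t)(pasid_bits - 1) << TOY_ECAP_PSS_SHIFT) &
           TOY_ECAP_PSS_MASK;
}

/* Decode the PSS field back into a PASID width in bits. */
static unsigned toy_ecap_pasid_bits(uint64_t ecap)
{
    return (unsigned)((ecap & TOY_ECAP_PSS_MASK) >> TOY_ECAP_PSS_SHIFT) + 1;
}
```

Encoding a width of 20 gives the same bit pattern as VTD_ECAP_PSS in the diff below.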
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/i386/intel_iommu.c | 2 +-
hw/i386/intel_iommu_internal.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index f366c223d0..1d5ff8f4f6 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4574,7 +4574,7 @@ static void vtd_cap_init(IntelIOMMUState *s)
}
if (s->pasid) {
- s->ecap |= VTD_ECAP_PASID;
+ s->ecap |= VTD_ECAP_PASID | VTD_ECAP_PSS;
}
}
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index e8b211e8b0..238f1f443f 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -192,6 +192,7 @@
#define VTD_ECAP_SC (1ULL << 7)
#define VTD_ECAP_MHMV (15ULL << 20)
#define VTD_ECAP_SRS (1ULL << 31)
+#define VTD_ECAP_PSS (19ULL << 35)
#define VTD_ECAP_PASID (1ULL << 40)
#define VTD_ECAP_SMTS (1ULL << 43)
#define VTD_ECAP_SLTS (1ULL << 46)
--
2.47.1
* [PATCH v2 03/19] memory: Allow to store the PASID in IOMMUTLBEntry
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
This will be useful for devices that support ATS
and need to store entries in an ATC (device IOTLB).
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
include/exec/memory.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 56c3a3515e..9889b97abb 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -145,6 +145,7 @@ struct IOMMUTLBEntry {
hwaddr translated_addr;
hwaddr addr_mask; /* 0xfff = 4k translation */
IOMMUAccessFlags perm;
+ uint32_t pasid;
};
/*
--
2.47.1
* [PATCH v2 04/19] intel_iommu: Fill the PASID field when creating an IOMMUTLBEntry
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
The PASID value must be used by devices as a key (or part of a key)
when populating their ATC with the IOTLB entries returned by the IOMMU.
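To illustrate why the PASID has to be part of the key, here is a minimal direct-mapped cache sketch. It is not the generic ATC added later in this series; every `Toy*`/`toy_*` name, the slot count and the fixed 4 KiB page size are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_ATC_SLOTS  64
#define TOY_PAGE_SHIFT 12 /* assume 4 KiB pages only */

typedef struct {
    bool valid;
    uint32_t pasid;          /* first half of the key */
    uint64_t iova_page;      /* second half of the key */
    uint64_t translated_page;
} ToyATCEntry;

typedef struct {
    ToyATCEntry slots[TOY_ATC_SLOTS];
} ToyATC;

static unsigned toy_atc_slot(uint32_t pasid, uint64_t iova_page)
{
    return (unsigned)((iova_page ^ pasid) % TOY_ATC_SLOTS);
}

/* Store a translation returned by the IOMMU for (pasid, iova). */
static void toy_atc_insert(ToyATC *atc, uint32_t pasid, uint64_t iova,
                           uint64_t translated)
{
    uint64_t page = iova >> TOY_PAGE_SHIFT;
    ToyATCEntry *e = &atc->slots[toy_atc_slot(pasid, page)];

    e->valid = true;
    e->pasid = pasid;
    e->iova_page = page;
    e->translated_page = translated >> TOY_PAGE_SHIFT;
}

/* A hit requires both the PASID and the page number to match. */
static bool toy_atc_lookup(const ToyATC *atc, uint32_t pasid, uint64_t iova,
                           uint64_t *translated)
{
    uint64_t page = iova >> TOY_PAGE_SHIFT;
    const ToyATCEntry *e = &atc->slots[toy_atc_slot(pasid, page)];

    if (!e->valid || e->pasid != pasid || e->iova_page != page) {
        return false;
    }
    *translated = (e->translated_page << TOY_PAGE_SHIFT) |
                  (iova & ((1ULL << TOY_PAGE_SHIFT) - 1));
    return true;
}
```

Two processes can map the same virtual address to different physical pages, so a lookup for the same IOVA under a different PASID must miss.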
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/i386/intel_iommu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 1d5ff8f4f6..c58e18a56c 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2511,6 +2511,7 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
.translated_addr = 0,
.addr_mask = size - 1,
.perm = IOMMU_NONE,
+ .pasid = vtd_as->pasid,
},
};
memory_region_notify_iommu(&vtd_as->iommu, 0, event);
@@ -3098,6 +3099,7 @@ static void do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
event.entry.iova = addr;
event.entry.perm = IOMMU_NONE;
event.entry.translated_addr = 0;
+ event.entry.pasid = vtd_dev_as->pasid;
memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
}
@@ -3680,6 +3682,7 @@ static IOMMUTLBEntry vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
IOMMUTLBEntry iotlb = {
/* We'll fill in the rest later. */
.target_as = &address_space_memory,
+ .pasid = vtd_as->pasid,
};
bool success;
--
2.47.1
* [PATCH v2 05/19] pcie: Add helper to declare PASID capability for a pcie device
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/pci/pcie.c | 24 ++++++++++++++++++++++++
include/hw/pci/pcie.h | 6 +++++-
include/hw/pci/pcie_regs.h | 5 +++++
3 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 1b12db6fa2..f42a256f15 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -1214,3 +1214,27 @@ void pcie_acs_reset(PCIDevice *dev)
pci_set_word(dev->config + dev->exp.acs_cap + PCI_ACS_CTRL, 0);
}
}
+
+/* PASID */
+void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
+ bool exec_perm, bool priv_mod)
+{
+ assert(pasid_width <= PCI_EXT_CAP_PASID_MAX_WIDTH);
+ static const uint16_t control_reg_rw_mask = 0x07;
+ uint16_t capability_reg = pasid_width;
+
+ pcie_add_capability(dev, PCI_EXT_CAP_ID_PASID, PCI_PASID_VER, offset,
+ PCI_EXT_CAP_PASID_SIZEOF);
+
+ capability_reg <<= PCI_PASID_CAP_WIDTH_SHIFT;
+ capability_reg |= exec_perm ? PCI_PASID_CAP_EXEC : 0;
+ capability_reg |= priv_mod ? PCI_PASID_CAP_PRIV : 0;
+ pci_set_word(dev->config + offset + PCI_PASID_CAP, capability_reg);
+
+ /* Everything is disabled by default */
+ pci_set_word(dev->config + offset + PCI_PASID_CTRL, 0);
+
+ pci_set_word(dev->wmask + offset + PCI_PASID_CTRL, control_reg_rw_mask);
+
+ dev->exp.pasid_cap = offset;
+}
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index b8d59732bc..aa040c3e97 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -72,8 +72,9 @@ struct PCIExpressDevice {
uint16_t aer_cap;
PCIEAERLog aer_log;
- /* Offset of ATS capability in config space */
+ /* Offset of ATS and PASID capabilities in config space */
uint16_t ats_cap;
+ uint16_t pasid_cap;
/* ACS */
uint16_t acs_cap;
@@ -152,4 +153,7 @@ void pcie_cap_slot_unplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
Error **errp);
void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
DeviceState *dev, Error **errp);
+
+void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
+ bool exec_perm, bool priv_mod);
#endif /* QEMU_PCIE_H */
diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h
index 9d3b6868dc..4d9cf4a29c 100644
--- a/include/hw/pci/pcie_regs.h
+++ b/include/hw/pci/pcie_regs.h
@@ -86,6 +86,11 @@ typedef enum PCIExpLinkWidth {
#define PCI_ARI_VER 1
#define PCI_ARI_SIZEOF 8
+/* PASID */
+#define PCI_PASID_VER 1
+#define PCI_EXT_CAP_PASID_MAX_WIDTH 20
+#define PCI_PASID_CAP_WIDTH_SHIFT 8
+
/* AER */
#define PCI_ERR_VER 2
#define PCI_ERR_SIZEOF 0x48
--
2.47.1
* [PATCH v2 06/19] pcie: Helper functions to check if PASID is enabled
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
pcie_pasid_enabled checks whether the PASID capability is
present. If so, it reads the configuration space to get
the status of the feature (enabled or not).
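The two-step check can be sketched outside QEMU with a toy config space. The `Toy*` types are hypothetical, and the register offset (0x06) and enable bit (0x0001) are assumed to match the standard PASID capability layout; the real helper additionally checks that the device is a PCIe device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the PASID extended capability. */
#define TOY_PASID_CTRL        0x06   /* PASID Control register offset */
#define TOY_PASID_CTRL_ENABLE 0x0001 /* PASID Enable bit */

typedef struct {
    uint8_t config[4096]; /* PCIe extended config space */
    uint16_t pasid_cap;   /* capability offset, 0 if not present */
} ToyDevice;

/* Read a little-endian 16-bit value from config space. */
static uint16_t toy_get_word(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static bool toy_pasid_enabled(const ToyDevice *dev)
{
    /* Step 1: is the capability declared at all? */
    if (!dev->pasid_cap) {
        return false;
    }
    /* Step 2: has the guest set the enable bit in the control register? */
    return (toy_get_word(dev->config + dev->pasid_cap + TOY_PASID_CTRL) &
            TOY_PASID_CTRL_ENABLE) != 0;
}
```

A device that declares the capability still reports "disabled" until the guest driver writes the enable bit, which matches the "everything is disabled by default" behavior of the capability setup helper.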
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/pci/pcie.c | 9 +++++++++
include/hw/pci/pcie.h | 2 ++
2 files changed, 11 insertions(+)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index f42a256f15..8186d64234 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -1238,3 +1238,12 @@ void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
dev->exp.pasid_cap = offset;
}
+
+bool pcie_pasid_enabled(const PCIDevice *dev)
+{
+ if (!pci_is_express(dev) || !dev->exp.pasid_cap) {
+ return false;
+ }
+ return (pci_get_word(dev->config + dev->exp.pasid_cap + PCI_PASID_CTRL) &
+ PCI_PASID_CTRL_ENABLE) != 0;
+}
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index aa040c3e97..63604ccc6e 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -156,4 +156,6 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
bool exec_perm, bool priv_mod);
+
+bool pcie_pasid_enabled(const PCIDevice *dev);
#endif /* QEMU_PCIE_H */
--
2.47.1
* [PATCH v2 07/19] pcie: Helper function to check if ATS is enabled
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
pcie_ats_enabled checks whether the ATS capability is
present. If so, it reads the configuration space to get
the status of the feature (enabled or not).
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/pci/pcie.c | 9 +++++++++
include/hw/pci/pcie.h | 1 +
2 files changed, 10 insertions(+)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 8186d64234..3b8fd6f33c 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -1247,3 +1247,12 @@ bool pcie_pasid_enabled(const PCIDevice *dev)
return (pci_get_word(dev->config + dev->exp.pasid_cap + PCI_PASID_CTRL) &
PCI_PASID_CTRL_ENABLE) != 0;
}
+
+bool pcie_ats_enabled(const PCIDevice *dev)
+{
+ if (!pci_is_express(dev) || !dev->exp.ats_cap) {
+ return false;
+ }
+ return (pci_get_word(dev->config + dev->exp.ats_cap + PCI_ATS_CTRL) &
+ PCI_ATS_CTRL_ENABLE) != 0;
+}
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index 63604ccc6e..7e7b8baa6e 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -158,4 +158,5 @@ void pcie_pasid_init(PCIDevice *dev, uint16_t offset, uint8_t pasid_width,
bool exec_perm, bool priv_mod);
bool pcie_pasid_enabled(const PCIDevice *dev);
+bool pcie_ats_enabled(const PCIDevice *dev);
#endif /* QEMU_PCIE_H */
--
2.47.1
* [PATCH v2 08/19] pci: Cache the bus mastering status in the device
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
The cached is_master value is necessary to know if a device is
allowed to issue ATS requests or not.
This behavior is implemented in an upcoming patch.
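The idea is to funnel every bus-master state change through one helper so the cached flag cannot drift from the command register. The standalone sketch below mirrors that pattern with hypothetical `Toy*` names; `toy_may_issue_ats` is only a guess at how a later patch might consume the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PCI_COMMAND_MASTER 0x4 /* Bus Master Enable bit */

typedef struct {
    uint16_t command; /* PCI command register */
    bool enabled;     /* device-level enable */
    bool is_master;   /* cached bus mastering status */
} ToyPCIDevice;

/* Single place where the cached status is updated. */
static void toy_set_master(ToyPCIDevice *d, bool enable)
{
    d->is_master = enable;
}

/* Config-space write to the command register. */
static void toy_write_command(ToyPCIDevice *d, uint16_t val)
{
    d->command = val;
    toy_set_master(d, (d->command & TOY_PCI_COMMAND_MASTER) && d->enabled);
}

/* Enabling or disabling the device also refreshes the cache. */
static void toy_set_enabled(ToyPCIDevice *d, bool state)
{
    d->enabled = state;
    toy_set_master(d, (d->command & TOY_PCI_COMMAND_MASTER) && d->enabled);
}

/* Hypothetical consumer: ATS requests are refused unless bus master. */
static bool toy_may_issue_ats(const ToyPCIDevice *d)
{
    return d->is_master;
}
```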
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/pci/pci.c | 25 +++++++++++++++----------
include/hw/pci/pci_device.h | 1 +
2 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 2afa423925..164bb22e05 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -134,6 +134,12 @@ static GSequence *pci_acpi_index_list(void)
return used_acpi_index_list;
}
+static void pci_set_master(PCIDevice *d, bool enable)
+{
+ memory_region_set_enabled(&d->bus_master_enable_region, enable);
+ d->is_master = enable; /* cache the status */
+}
+
static void pci_init_bus_master(PCIDevice *pci_dev)
{
AddressSpace *dma_as = pci_device_iommu_address_space(pci_dev);
@@ -141,7 +147,7 @@ static void pci_init_bus_master(PCIDevice *pci_dev)
memory_region_init_alias(&pci_dev->bus_master_enable_region,
OBJECT(pci_dev), "bus master",
dma_as->root, 0, memory_region_size(dma_as->root));
- memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
+ pci_set_master(pci_dev, false);
memory_region_add_subregion(&pci_dev->bus_master_container_region, 0,
&pci_dev->bus_master_enable_region);
}
@@ -727,9 +733,8 @@ static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
pci_bridge_update_mappings(PCI_BRIDGE(s));
}
- memory_region_set_enabled(&s->bus_master_enable_region,
- pci_get_word(s->config + PCI_COMMAND)
- & PCI_COMMAND_MASTER);
+ pci_set_master(s, pci_get_word(s->config + PCI_COMMAND)
+ & PCI_COMMAND_MASTER);
g_free(config);
return 0;
@@ -1684,9 +1689,10 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
if (ranges_overlap(addr, l, PCI_COMMAND, 2)) {
pci_update_irq_disabled(d, was_irq_disabled);
- memory_region_set_enabled(&d->bus_master_enable_region,
- (pci_get_word(d->config + PCI_COMMAND)
- & PCI_COMMAND_MASTER) && d->enabled);
+ pci_set_master(d,
+ (pci_get_word(d->config + PCI_COMMAND) &
+ PCI_COMMAND_MASTER) &&
+ d->enabled);
}
msi_write_config(d, addr, val_in, l);
@@ -2974,9 +2980,8 @@ void pci_set_enabled(PCIDevice *d, bool state)
d->enabled = state;
pci_update_mappings(d);
- memory_region_set_enabled(&d->bus_master_enable_region,
- (pci_get_word(d->config + PCI_COMMAND)
- & PCI_COMMAND_MASTER) && d->enabled);
+ pci_set_master(d, (pci_get_word(d->config + PCI_COMMAND)
+ & PCI_COMMAND_MASTER) && d->enabled);
if (!d->enabled) {
pci_device_reset(d);
}
diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index add208edfa..40606baa5d 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -88,6 +88,7 @@ struct PCIDevice {
char name[64];
PCIIORegion io_regions[PCI_NUM_REGIONS];
AddressSpace bus_master_as;
+ bool is_master;
MemoryRegion bus_master_container_region;
MemoryRegion bus_master_enable_region;
--
2.47.1
* [PATCH v2 09/19] pci: Add IOMMU operations to get memory regions with PASID
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
The region returned by this operation will be used as the input region
for ATS.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
include/hw/pci/pci.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 4002bbeebd..644551550b 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -391,6 +391,22 @@ typedef struct PCIIOMMUOps {
* @devfn: device and function number
*/
AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+ /**
+ * @get_memory_region_pasid: get the iommu memory region for a given
+ * device and pasid
+ *
+ * @bus: the #PCIBus being accessed.
+ *
+ * @opaque: the data passed to pci_setup_iommu().
+ *
+ * @devfn: device and function number
+ *
+ * @pasid: the pasid associated with the requested memory region
+ */
+ IOMMUMemoryRegion * (*get_memory_region_pasid)(PCIBus *bus,
+ void *opaque,
+ int devfn,
+ uint32_t pasid);
/**
* @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
*
--
2.47.1
* [PATCH v2 10/19] intel_iommu: Implement the get_memory_region_pasid iommu operation
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/i386/intel_iommu.c | 17 ++++++++++++++++-
include/hw/i386/intel_iommu.h | 2 +-
2 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index c58e18a56c..021834c41f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4202,7 +4202,7 @@ static const MemoryRegionOps vtd_mem_ir_fault_ops = {
};
VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
- int devfn, unsigned int pasid)
+ int devfn, uint32_t pasid)
{
/*
* We can't simply use sid here since the bus number might not be
@@ -4719,8 +4719,23 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
return &vtd_as->as;
}
+static IOMMUMemoryRegion *vtd_get_memory_region_pasid(PCIBus *bus,
+ void *opaque,
+ int devfn,
+ uint32_t pasid)
+{
+ IntelIOMMUState *s = opaque;
+ VTDAddressSpace *vtd_as;
+
+ assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
+
+ vtd_as = vtd_find_add_as(s, bus, devfn, pasid);
+ return &vtd_as->iommu;
+}
+
static PCIIOMMUOps vtd_iommu_ops = {
.get_address_space = vtd_host_dma_iommu,
+ .get_memory_region_pasid = vtd_get_memory_region_pasid,
.set_iommu_device = vtd_dev_set_iommu_device,
.unset_iommu_device = vtd_dev_unset_iommu_device,
};
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index e95477e855..08f71c262e 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -324,6 +324,6 @@ struct IntelIOMMUState {
* create a new one if none exists
*/
VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
- int devfn, unsigned int pasid);
+ int devfn, uint32_t pasid);
#endif
--
2.47.1
* [PATCH v2 11/19] memory: Store user data pointer in the IOMMU notifiers
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
This will help developers of SVM devices track per-notifier state.
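As a rough illustration of why a user data pointer in the notifier is convenient, the sketch below shows a callback recovering its device state through `opaque`. It is a standalone mock-up, not QEMU code: DemoNotifier, DemoDeviceState, demo_notifier_init, demo_notifier_fire and demo_on_invalidate are hypothetical stand-ins that only mirror the shape of IOMMUNotifier.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in for IOMMUNotifier: only the fields needed for this sketch. */
typedef struct DemoNotifier DemoNotifier;
typedef void (*DemoNotify)(DemoNotifier *n, uint64_t iova);

struct DemoNotifier {
    DemoNotify notify;
    uint64_t start;
    uint64_t end;
    void *opaque;   /* user data pointer, like the field added here */
};

/* Hypothetical per-device state that the callback wants to reach. */
typedef struct DemoDeviceState {
    unsigned invalidations_seen;
    uint64_t last_iova;
} DemoDeviceState;

/* Callback: the device state is recovered from the notifier itself. */
static void demo_on_invalidate(DemoNotifier *n, uint64_t iova)
{
    DemoDeviceState *s = n->opaque;

    s->invalidations_seen++;
    s->last_iova = iova;
}

/* Watch the whole address range and remember the user pointer. */
static void demo_notifier_init(DemoNotifier *n, DemoNotify fn, void *opaque)
{
    n->notify = fn;
    n->start = 0;
    n->end = UINT64_MAX;
    n->opaque = opaque;
}

/* Deliver an event if the address falls inside the watched range. */
static void demo_notifier_fire(DemoNotifier *n, uint64_t iova)
{
    if (iova >= n->start && iova <= n->end) {
        n->notify(n, iova);
    }
}
```

Without the pointer, the callback would need container_of-style tricks or a global to find the device that owns the notifier.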
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
include/exec/memory.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9889b97abb..468b003bf1 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -204,6 +204,7 @@ struct IOMMUNotifier {
hwaddr start;
hwaddr end;
int iommu_idx;
+ void *opaque;
QLIST_ENTRY(IOMMUNotifier) node;
};
typedef struct IOMMUNotifier IOMMUNotifier;
--
2.47.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 12/19] pci: Add a pci-level initialization function for iommu notifiers
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (10 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 11/19] memory: Store user data pointer in the IOMMU notifiers CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 13/19] atc: Generic ATC that can be used by PCIe devices that support SVM CLEMENT MATHIEU--DRIF
` (8 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
We add a convenient way to initialize a device-IOTLB notifier.
This is meant to be used by ATS-capable devices.
pci_device_iommu_memory_region_pasid is introduced in this commit and
will be used in several other SVM-related functions exposed in
the PCI API.
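The conditions under which the initialization is refused can be sketched as follows. This is a simplified restatement of the checks in the diff below, using hypothetical stand-in types (DemoPCIDevice, DEMO_NO_PASID, demo_can_init_notifier); the real function also resolves the IOMMU bus and asks it for a memory region.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PCI_NO_PASID; the value is an assumption of this sketch. */
#define DEMO_NO_PASID UINT32_MAX

/* Hypothetical, reduced view of a PCI device. */
typedef struct DemoPCIDevice {
    bool is_master;      /* bus mastering enabled */
    bool pasid_enabled;  /* PASID capability enabled */
    bool has_iommu_mr;   /* the IOMMU can provide a region for the PASID */
} DemoPCIDevice;

/* Returns true when a device-IOTLB notifier could be initialized. */
static bool demo_can_init_notifier(const DemoPCIDevice *dev, uint32_t pasid)
{
    if (!dev->is_master) {
        return false;
    }
    /* A real PASID is only accepted when the capability is enabled. */
    if (pasid != DEMO_NO_PASID && !dev->pasid_enabled) {
        return false;
    }
    return dev->has_iommu_mr;
}
```

A caller is expected to check the boolean result before registering the notifier.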
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/pci/pci.c | 40 ++++++++++++++++++++++++++++++++++++++++
include/hw/pci/pci.h | 15 +++++++++++++++
2 files changed, 55 insertions(+)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 164bb22e05..be29c0375f 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2825,6 +2825,46 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
return &address_space_memory;
}
+static IOMMUMemoryRegion *pci_device_iommu_memory_region_pasid(PCIDevice *dev,
+ uint32_t pasid)
+{
+ PCIBus *bus;
+ PCIBus *iommu_bus;
+ int devfn;
+
+ /*
+ * This function is for internal use in the module,
+ * we can call it with PCI_NO_PASID
+ */
+ if (!dev->is_master ||
+ ((pasid != PCI_NO_PASID) && !pcie_pasid_enabled(dev))) {
+ return NULL;
+ }
+
+ pci_device_get_iommu_bus_devfn(dev, &bus, &iommu_bus, &devfn);
+ if (iommu_bus && iommu_bus->iommu_ops->get_memory_region_pasid) {
+ return iommu_bus->iommu_ops->get_memory_region_pasid(bus,
+ iommu_bus->iommu_opaque, devfn, pasid);
+ }
+ return NULL;
+}
+
+bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n, IOMMUNotify fn,
+ void *opaque)
+{
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+ if (!iommu_mr) {
+ return false;
+ }
+ iommu_notifier_init(n, fn, IOMMU_NOTIFIER_DEVIOTLB_EVENTS, 0, HWADDR_MAX,
+ memory_region_iommu_attrs_to_index(iommu_mr,
+ MEMTXATTRS_UNSPECIFIED));
+ n->opaque = opaque;
+ return true;
+}
+
bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
Error **errp)
{
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 644551550b..a11366e08d 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -446,6 +446,21 @@ bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
Error **errp);
void pci_device_unset_iommu_device(PCIDevice *dev);
+/**
+ * pci_iommu_init_iotlb_notifier: initialize an IOMMU notifier
+ *
+ * This function is used by devices before registering an IOTLB notifier
+ *
+ * @dev: the device
+ * @pasid: the pasid of the address space to watch
+ * @n: the notifier to initialize
+ * @fn: the callback to be installed
+ * @opaque: user pointer that can be used to store a state
+ */
+bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n, IOMMUNotify fn,
+ void *opaque);
+
/**
* pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
*
--
2.47.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 13/19] atc: Generic ATC that can be used by PCIe devices that support SVM
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (11 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 12/19] pci: Add a pci-level initialization function for iommu notifiers CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 15/19] memory: Add an API for ATS support CLEMENT MATHIEU--DRIF
` (7 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
As SVM-capable devices will need to cache translations, we provide
a first implementation.
This cache uses a two-level design based on hash tables.
The first level is indexed by a PASID and the second by a virtual address.
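To make the geometry and the per-level lookup keys more concrete, here is a small standalone sketch. It does not use GLib or QEMU types; DemoGeometry, demo_geometry and demo_level_mask are hypothetical names, and the extra check that the address width exceeds log2(page_size) is an addition of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the values derived at creation time. */
typedef struct DemoGeometry {
    uint8_t levels;        /* number of page sizes tried on lookup */
    uint8_t level_offset;  /* address bits consumed by each level */
} DemoGeometry;

/* Returns false when the parameters do not give a whole number of levels. */
static bool demo_geometry(uint64_t page_size, uint8_t address_width,
                          DemoGeometry *g)
{
    uint8_t log_page_size = 0;

    if (page_size == 0 || (page_size & (page_size - 1)) != 0) {
        return false;   /* not a power of two */
    }
    while ((page_size >> log_page_size) != 1) {
        log_page_size++;
    }
    if (log_page_size <= 3 || address_width <= log_page_size) {
        return false;
    }
    /* 8-byte page table entries: log2(page_size / 8) bits per level */
    g->level_offset = log_page_size - 3;
    if ((address_width - log_page_size) % g->level_offset != 0) {
        return false;
    }
    g->levels = (address_width - log_page_size) / g->level_offset;
    return true;
}

/* Address mask covered by one entry at a given level (0 = smallest page). */
static uint64_t demo_level_mask(uint64_t page_size, uint8_t level_offset,
                                uint8_t level)
{
    uint64_t mask = page_size - 1;

    for (uint8_t i = 0; i < level; i++) {
        mask = (mask << level_offset) | ((UINT64_C(1) << level_offset) - 1);
    }
    return mask;
}
```

With 4 KiB pages and 48-bit addresses this gives 4 levels of 9 bits, and a lookup probes the second-level table with the address masked for 4 KiB, then 2 MiB, then 1 GiB pages, and so on, accepting a hit only when the stored entry has the same mask.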
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
util/atc.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++
util/atc.h | 117 ++++++++++++++++++++++++++
util/meson.build | 1 +
3 files changed, 329 insertions(+)
create mode 100644 util/atc.c
create mode 100644 util/atc.h
diff --git a/util/atc.c b/util/atc.c
new file mode 100644
index 0000000000..584ce045db
--- /dev/null
+++ b/util/atc.c
@@ -0,0 +1,211 @@
+/*
+ * QEMU emulation of an ATC
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "util/atc.h"
+
+
+#define PAGE_TABLE_ENTRY_SIZE 8
+
+/* a pasid is hashed using the identity function */
+static guint atc_pasid_key_hash(gconstpointer v)
+{
+ return (guint)(uintptr_t)v; /* pasid */
+}
+
+/* pasid equality */
+static gboolean atc_pasid_key_equal(gconstpointer v1, gconstpointer v2)
+{
+ return v1 == v2;
+}
+
+/* Hash function for IOTLB entries */
+static guint atc_addr_key_hash(gconstpointer v)
+{
+ hwaddr addr = (hwaddr)v;
+ return (guint)((addr >> 32) ^ (addr & 0xffffffffU));
+}
+
+/* Equality test for IOTLB entries */
+static gboolean atc_addr_key_equal(gconstpointer v1, gconstpointer v2)
+{
+ return (hwaddr)v1 == (hwaddr)v2;
+}
+
+static void atc_address_space_free(void *as)
+{
+ g_hash_table_unref(as);
+}
+
+/* return log2(val), or UINT8_MAX if val is not a power of 2 */
+static uint8_t ilog2(uint64_t val)
+{
+ uint8_t result = 0;
+ while (val != 1) {
+ if (val & 1) {
+ return UINT8_MAX;
+ }
+
+ val >>= 1;
+ result += 1;
+ }
+ return result;
+}
+
+ATC *atc_new(uint64_t page_size, uint8_t address_width)
+{
+ ATC *atc;
+ uint8_t log_page_size = ilog2(page_size);
+ /* number of bits used to store all the intermediate indexes */
+ uint64_t addr_lookup_indexes_size;
+
+ if (log_page_size == UINT8_MAX) {
+ return NULL;
+ }
+ /*
+ * We only support page table entries of 8 (PAGE_TABLE_ENTRY_SIZE) bytes
+ * log2(page_size / 8) = log2(page_size) - 3
+ * is the level offset
+ */
+ if (log_page_size <= 3) {
+ return NULL;
+ }
+
+ atc = g_new0(ATC, 1);
+ atc->address_spaces = g_hash_table_new_full(atc_pasid_key_hash,
+ atc_pasid_key_equal,
+ NULL, atc_address_space_free);
+ atc->level_offset = log_page_size - 3;
+ /* at this point, we know that page_size is a power of 2 */
+ atc->min_addr_mask = page_size - 1;
+ addr_lookup_indexes_size = address_width - log_page_size;
+ if ((addr_lookup_indexes_size % atc->level_offset) != 0) {
+ goto error;
+ }
+ atc->levels = addr_lookup_indexes_size / atc->level_offset;
+ atc->page_size = page_size;
+ return atc;
+
+error:
+ g_free(atc);
+ return NULL;
+}
+
+static inline GHashTable *atc_get_address_space_cache(ATC *atc, uint32_t pasid)
+{
+ return g_hash_table_lookup(atc->address_spaces,
+ (gconstpointer)(uintptr_t)pasid);
+}
+
+void atc_create_address_space_cache(ATC *atc, uint32_t pasid)
+{
+ GHashTable *as_cache;
+
+ as_cache = atc_get_address_space_cache(atc, pasid);
+ if (!as_cache) {
+ as_cache = g_hash_table_new_full(atc_addr_key_hash,
+ atc_addr_key_equal,
+ NULL, g_free);
+ g_hash_table_replace(atc->address_spaces,
+ (gpointer)(uintptr_t)pasid, as_cache);
+ }
+}
+
+void atc_delete_address_space_cache(ATC *atc, uint32_t pasid)
+{
+ g_hash_table_remove(atc->address_spaces, (gpointer)(uintptr_t)pasid);
+}
+
+int atc_update(ATC *atc, IOMMUTLBEntry *entry)
+{
+ IOMMUTLBEntry *value;
+ GHashTable *as_cache = atc_get_address_space_cache(atc, entry->pasid);
+ if (!as_cache) {
+ return -ENODEV;
+ }
+ value = g_memdup2(entry, sizeof(*value));
+ g_hash_table_replace(as_cache, (gpointer)(entry->iova), value);
+ return 0;
+}
+
+IOMMUTLBEntry *atc_lookup(ATC *atc, uint32_t pasid, hwaddr addr)
+{
+ IOMMUTLBEntry *entry;
+ hwaddr mask = atc->min_addr_mask;
+ hwaddr key = addr & (~mask);
+ GHashTable *as_cache = atc_get_address_space_cache(atc, pasid);
+
+ if (!as_cache) {
+ return NULL;
+ }
+
+ /*
+ * Iterate over the possible page sizes and try to find a hit
+ */
+ for (uint8_t level = 0; level < atc->levels; ++level) {
+ entry = g_hash_table_lookup(as_cache, (gconstpointer)key);
+ if (entry && (mask == entry->addr_mask)) {
+ return entry;
+ }
+ mask = (mask << atc->level_offset) | ((1ULL << atc->level_offset) - 1);
+ key = addr & (~mask);
+ }
+
+ return NULL;
+}
+
+static gboolean atc_invalidate_entry_predicate(gpointer key, gpointer value,
+ gpointer user_data)
+{
+ IOMMUTLBEntry *entry = (IOMMUTLBEntry *)value;
+ IOMMUTLBEntry *target = (IOMMUTLBEntry *)user_data;
+ hwaddr target_mask = ~target->addr_mask;
+ hwaddr entry_mask = ~entry->addr_mask;
+ return ((target->iova & target_mask) == (entry->iova & target_mask)) ||
+ ((target->iova & entry_mask) == (entry->iova & entry_mask));
+}
+
+void atc_invalidate(ATC *atc, IOMMUTLBEntry *entry)
+{
+ GHashTable *as_cache = atc_get_address_space_cache(atc, entry->pasid);
+ if (!as_cache) {
+ return;
+ }
+ g_hash_table_foreach_remove(as_cache,
+ atc_invalidate_entry_predicate,
+ entry);
+}
+
+void atc_destroy(ATC *atc)
+{
+ g_hash_table_unref(atc->address_spaces);
+}
+
+size_t atc_get_max_number_of_pages(ATC *atc, hwaddr addr, size_t length)
+{
+ hwaddr page_mask = ~(atc->min_addr_mask);
+ size_t result = (length / atc->page_size);
+ if ((((addr & page_mask) + length - 1) & page_mask) !=
+ ((addr + length - 1) & page_mask)) {
+ result += 1;
+ }
+ return result + (length % atc->page_size != 0 ? 1 : 0);
+}
+
+void atc_reset(ATC *atc)
+{
+ g_hash_table_remove_all(atc->address_spaces);
+}
diff --git a/util/atc.h b/util/atc.h
new file mode 100644
index 0000000000..8be95f5cca
--- /dev/null
+++ b/util/atc.h
@@ -0,0 +1,117 @@
+/*
+ * QEMU emulation of an ATC
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef UTIL_ATC_H
+#define UTIL_ATC_H
+
+#include "qemu/osdep.h"
+#include "exec/memory.h"
+
+typedef struct ATC {
+ GHashTable *address_spaces; /* Key : pasid, value : GHashTable */
+ hwaddr min_addr_mask;
+ uint64_t page_size;
+ uint8_t levels;
+ uint8_t level_offset;
+} ATC;
+
+/*
+ * atc_new: Create an ATC.
+ *
+ * Return an ATC or NULL if the creation failed
+ *
+ * @page_size: size of the smallest page in bytes (must be a power of 2)
+ * @address_width: width of the virtual addresses used by the IOMMU (in bits)
+ */
+ATC *atc_new(uint64_t page_size, uint8_t address_width);
+
+/*
+ * atc_update: Insert or update an entry in the cache
+ *
+ * Return 0 if the operation succeeds, a negative error code otherwise
+ *
+ * The insertion will fail if the address space associated with this pasid
+ * has not been created with atc_create_address_space_cache
+ *
+ * @atc: the ATC to update
+ * @entry: the tlb entry to insert into the cache
+ */
+int atc_update(ATC *atc, IOMMUTLBEntry *entry);
+
+/*
+ * atc_create_address_space_cache: declare a new address space
+ * identified by a PASID
+ *
+ * @atc: the ATC to update
+ * @pasid: the pasid of the address space to be created
+ */
+void atc_create_address_space_cache(ATC *atc, uint32_t pasid);
+
+/*
+ * atc_delete_address_space_cache: delete an address space
+ * identified by a PASID
+ *
+ * @atc: the ATC to update
+ * @pasid: the pasid of the address space to be deleted
+ */
+void atc_delete_address_space_cache(ATC *atc, uint32_t pasid);
+
+/*
+ * atc_lookup: query the cache in a given address space
+ *
+ * @atc: the ATC to query
+ * @pasid: the pasid of the address space to query
+ * @addr: the virtual address to translate
+ */
+IOMMUTLBEntry *atc_lookup(ATC *atc, uint32_t pasid, hwaddr addr);
+
+/*
+ * atc_invalidate: invalidate an entry in the cache
+ *
+ * @atc: the ATC to update
+ * @entry: the entry to invalidate
+ */
+void atc_invalidate(ATC *atc, IOMMUTLBEntry *entry);
+
+/*
+ * atc_destroy: delete an ATC
+ *
+ * @atc: the cache to be deleted
+ */
+void atc_destroy(ATC *atc);
+
+/*
+ * atc_get_max_number_of_pages: get the number of pages a memory operation
+ * will access if all the pages concerned have the minimum size.
+ *
+ * This function can be used to determine the size of the result array to be
+ * allocated when issuing an ATS request.
+ *
+ * @atc: the cache
+ * @addr: start address
+ * @length: number of bytes accessed from addr
+ */
+size_t atc_get_max_number_of_pages(ATC *atc, hwaddr addr, size_t length);
+
+/*
+ * atc_reset: invalidates all the entries stored in the ATC
+ *
+ * @atc: the cache
+ */
+void atc_reset(ATC *atc);
+
+#endif
diff --git a/util/meson.build b/util/meson.build
index 5d8bef9891..f2dec01300 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -93,6 +93,7 @@ if have_block
util_ss.add(files('hbitmap.c'))
util_ss.add(files('hexdump.c'))
util_ss.add(files('iova-tree.c'))
+ util_ss.add(files('atc.c'))
util_ss.add(files('iov.c'))
util_ss.add(files('nvdimm-utils.c'))
util_ss.add(files('block-helpers.c'))
--
2.47.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 14/19] atc: Add unit tests
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (13 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 15/19] memory: Add an API for ATS support CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 16/19] pci: Add a pci-level API for ATS CLEMENT MATHIEU--DRIF
` (5 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
tests/unit/meson.build | 1 +
tests/unit/test-atc.c | 527 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 528 insertions(+)
create mode 100644 tests/unit/test-atc.c
diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index d5248ae51d..810197d5e1 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -48,6 +48,7 @@ tests = {
'test-qapi-util': [],
'test-interval-tree': [],
'test-fifo': [],
+ 'test-atc': [],
}
if have_system or have_tools
diff --git a/tests/unit/test-atc.c b/tests/unit/test-atc.c
new file mode 100644
index 0000000000..0d1c1b7ca7
--- /dev/null
+++ b/tests/unit/test-atc.c
@@ -0,0 +1,527 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "util/atc.h"
+
+static inline bool tlb_entry_equal(IOMMUTLBEntry *e1, IOMMUTLBEntry *e2)
+{
+ if (!e1 || !e2) {
+ return !e1 && !e2;
+ }
+ return e1->iova == e2->iova &&
+ e1->addr_mask == e2->addr_mask &&
+ e1->pasid == e2->pasid &&
+ e1->perm == e2->perm &&
+ e1->target_as == e2->target_as &&
+ e1->translated_addr == e2->translated_addr;
+}
+
+static void assert_lookup_equals(ATC *atc, IOMMUTLBEntry *target,
+ uint32_t pasid, hwaddr iova)
+{
+ IOMMUTLBEntry *result;
+ result = atc_lookup(atc, pasid, iova);
+ g_assert(tlb_entry_equal(result, target));
+}
+
+static void check_creation(uint64_t page_size, uint8_t address_width,
+ uint8_t levels, uint8_t level_offset,
+ bool should_work) {
+ ATC *atc = atc_new(page_size, address_width);
+ if (atc) {
+ g_assert(atc->levels == levels);
+ g_assert(atc->level_offset == level_offset);
+
+ atc_destroy(atc);
+ g_assert(should_work);
+ } else {
+ g_assert(!should_work);
+ }
+}
+
+static void test_creation_parameters(void)
+{
+ check_creation(8, 39, 3, 9, false);
+ check_creation(4095, 39, 3, 9, false);
+ check_creation(4097, 39, 3, 9, false);
+ check_creation(8192, 48, 0, 0, false);
+
+ check_creation(4096, 38, 0, 0, false);
+ check_creation(4096, 39, 3, 9, true);
+ check_creation(4096, 40, 0, 0, false);
+ check_creation(4096, 47, 0, 0, false);
+ check_creation(4096, 48, 4, 9, true);
+ check_creation(4096, 49, 0, 0, false);
+ check_creation(4096, 56, 0, 0, false);
+ check_creation(4096, 57, 5, 9, true);
+ check_creation(4096, 58, 0, 0, false);
+
+ check_creation(16384, 35, 0, 0, false);
+ check_creation(16384, 36, 2, 11, true);
+ check_creation(16384, 37, 0, 0, false);
+ check_creation(16384, 46, 0, 0, false);
+ check_creation(16384, 47, 3, 11, true);
+ check_creation(16384, 48, 0, 0, false);
+ check_creation(16384, 57, 0, 0, false);
+ check_creation(16384, 58, 4, 11, true);
+ check_creation(16384, 59, 0, 0, false);
+}
+
+static void test_single_entry(void)
+{
+ IOMMUTLBEntry entry = {
+ .iova = 0x123456789000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 5,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xdeadbeefULL,
+ };
+
+ ATC *atc = atc_new(4096, 48);
+ g_assert(atc);
+
+ assert_lookup_equals(atc, NULL, entry.pasid,
+ entry.iova + (entry.addr_mask / 2));
+
+ atc_create_address_space_cache(atc, entry.pasid);
+ g_assert(atc_update(atc, &entry) == 0);
+
+ assert_lookup_equals(atc, NULL, entry.pasid + 1,
+ entry.iova + (entry.addr_mask / 2));
+ assert_lookup_equals(atc, &entry, entry.pasid,
+ entry.iova + (entry.addr_mask / 2));
+
+ atc_destroy(atc);
+}
+
+static void test_single_entry_2(void)
+{
+ static uint64_t page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0xabcdef200000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eedULL,
+ };
+
+ ATC *atc = atc_new(page_size , 48);
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_update(atc, &e1);
+
+ assert_lookup_equals(atc, NULL, e1.pasid, 0xabcdef201000ULL);
+
+ atc_destroy(atc);
+}
+
+static void test_page_boundaries(void)
+{
+ static const uint32_t pasid = 5;
+ static const hwaddr page_size = 4096;
+
+ /* 2 consecutive entries */
+ IOMMUTLBEntry e1 = {
+ .iova = 0x123456789000ULL,
+ .addr_mask = page_size - 1,
+ .pasid = pasid,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xdeadbeefULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = e1.iova + page_size,
+ .addr_mask = page_size - 1,
+ .pasid = pasid,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x900df00dULL,
+ };
+
+ ATC *atc = atc_new(page_size, 48);
+
+ atc_create_address_space_cache(atc, e1.pasid);
+ /* creating the address space twice should not be a problem */
+ atc_create_address_space_cache(atc, e1.pasid);
+
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova - 1);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova + e1.addr_mask);
+ g_assert((e1.iova + e1.addr_mask + 1) == e2.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova + e2.addr_mask);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova + e2.addr_mask + 1);
+
+ assert_lookup_equals(atc, NULL, e1.pasid + 10, e1.iova);
+ assert_lookup_equals(atc, NULL, e2.pasid + 10, e2.iova);
+ atc_destroy(atc);
+}
+
+static void test_huge_page(void)
+{
+ static const uint32_t pasid = 5;
+ static const hwaddr page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0x123456600000ULL,
+ .addr_mask = 0x1fffffULL,
+ .pasid = pasid,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xdeadbeefULL,
+ };
+ hwaddr addr;
+
+ ATC *atc = atc_new(page_size, 48);
+
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_update(atc, &e1);
+
+ for (addr = e1.iova; addr <= e1.iova + e1.addr_mask; addr += page_size) {
+ assert_lookup_equals(atc, &e1, e1.pasid, addr);
+ }
+ /* addr is now out of the huge page */
+ assert_lookup_equals(atc, NULL, e1.pasid, addr);
+ atc_destroy(atc);
+}
+
+static void test_pasid(void)
+{
+ hwaddr addr = 0xaaaaaaaaa000ULL;
+ IOMMUTLBEntry e1 = {
+ .iova = addr,
+ .addr_mask = 0xfffULL,
+ .pasid = 8,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xdeadbeefULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = addr,
+ .addr_mask = 0xfffULL,
+ .pasid = 2,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xb001ULL,
+ };
+ uint16_t i;
+
+ ATC *atc = atc_new(4096, 48);
+
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_create_address_space_cache(atc, e2.pasid);
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+
+ for (i = 0; i <= MAX(e1.pasid, e2.pasid) + 1; ++i) {
+ if (i == e1.pasid || i == e2.pasid) {
+ continue;
+ }
+ assert_lookup_equals(atc, NULL, i, addr);
+ }
+ assert_lookup_equals(atc, &e1, e1.pasid, addr);
+ assert_lookup_equals(atc, &e2, e2.pasid, addr);
+ atc_destroy(atc);
+}
+
+static void test_large_address(void)
+{
+ IOMMUTLBEntry e1 = {
+ .iova = 0xaaaaaaaaa000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 8,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = 0x1f00baaaaabf000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = e1.pasid,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xdeadbeefULL,
+ };
+
+ ATC *atc = atc_new(4096, 57);
+
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ atc_destroy(atc);
+}
+
+static void test_bigger_page(void)
+{
+ IOMMUTLBEntry e1 = {
+ .iova = 0xaabbccdde000ULL,
+ .addr_mask = 0x1fffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+ hwaddr i;
+
+ ATC *atc = atc_new(8192, 43);
+
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_update(atc, &e1);
+
+ i = e1.iova & (~e1.addr_mask);
+ assert_lookup_equals(atc, NULL, e1.pasid, i - 1);
+ while (i <= e1.iova + e1.addr_mask) {
+ assert_lookup_equals(atc, &e1, e1.pasid, i);
+ ++i;
+ }
+ assert_lookup_equals(atc, NULL, e1.pasid, i);
+ atc_destroy(atc);
+}
+
+static void test_unknown_pasid(void)
+{
+ IOMMUTLBEntry e1 = {
+ .iova = 0xaabbccfff000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+
+ ATC *atc = atc_new(4096, 48);
+ g_assert(atc_update(atc, &e1) != 0);
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+ atc_destroy(atc);
+}
+
+static void test_invalidation(void)
+{
+ static uint64_t page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0xaabbccddf000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = 0xffe00000ULL,
+ .addr_mask = 0x1fffffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xb000001ULL,
+ };
+ IOMMUTLBEntry e3;
+
+ ATC *atc = atc_new(page_size , 48);
+ atc_create_address_space_cache(atc, e1.pasid);
+
+ atc_update(atc, &e1);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ atc_invalidate(atc, &e1);
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ atc_invalidate(atc, &e2);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+
+ /* invalidate a huge page by invalidating a small region */
+ for (hwaddr addr = e2.iova; addr <= (e2.iova + e2.addr_mask);
+ addr += page_size) {
+ atc_update(atc, &e2);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ e3 = (IOMMUTLBEntry){
+ .iova = addr,
+ .addr_mask = page_size - 1,
+ .pasid = e2.pasid,
+ .perm = IOMMU_RW,
+ .translated_addr = 0,
+ };
+ atc_invalidate(atc, &e3);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+ }
+ atc_destroy(atc);
+}
+
+static void test_delete_address_space_cache(void)
+{
+ static uint64_t page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0xaabbccddf000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = e1.iova,
+ .addr_mask = 0xfffULL,
+ .pasid = 2,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eeeeeedULL,
+ };
+
+ ATC *atc = atc_new(page_size , 48);
+ atc_create_address_space_cache(atc, e1.pasid);
+
+ atc_update(atc, &e1);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ atc_invalidate(atc, &e2); /* unknown pasid: this is a nop */
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+
+ atc_create_address_space_cache(atc, e2.pasid);
+ atc_update(atc, &e2);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ atc_invalidate(atc, &e1);
+ /* e1 has been removed but e2 is still there */
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+
+ atc_update(atc, &e1);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+
+ atc_delete_address_space_cache(atc, e2.pasid);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+ atc_destroy(atc);
+}
+
+static void test_invalidate_entire_address_space(void)
+{
+ static uint64_t page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0x1000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eedULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = 0xfffffffff000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xbeefULL,
+ };
+ IOMMUTLBEntry e3 = {
+ .iova = 0,
+ .addr_mask = 0xffffffffffffffffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0,
+ };
+
+ ATC *atc = atc_new(page_size , 48);
+ atc_create_address_space_cache(atc, e1.pasid);
+
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+ atc_invalidate(atc, &e3);
+ /* both e1 and e2 have been removed */
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+
+ atc_destroy(atc);
+}
+
+static void test_reset(void)
+{
+ static uint64_t page_size = 4096;
+ IOMMUTLBEntry e1 = {
+ .iova = 0x1000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 1,
+ .perm = IOMMU_RW,
+ .translated_addr = 0x5eedULL,
+ };
+ IOMMUTLBEntry e2 = {
+ .iova = 0xfffffffff000ULL,
+ .addr_mask = 0xfffULL,
+ .pasid = 2,
+ .perm = IOMMU_RW,
+ .translated_addr = 0xbeefULL,
+ };
+
+ ATC *atc = atc_new(page_size , 48);
+ atc_create_address_space_cache(atc, e1.pasid);
+ atc_create_address_space_cache(atc, e2.pasid);
+ atc_update(atc, &e1);
+ atc_update(atc, &e2);
+
+ assert_lookup_equals(atc, &e1, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, &e2, e2.pasid, e2.iova);
+
+ atc_reset(atc);
+
+ assert_lookup_equals(atc, NULL, e1.pasid, e1.iova);
+ assert_lookup_equals(atc, NULL, e2.pasid, e2.iova);
+ atc_destroy(atc);
+}
+
+static void test_get_max_number_of_pages(void)
+{
+ static uint64_t page_size = 4096;
+ hwaddr base = 0xc0fee000; /* aligned */
+ ATC *atc = atc_new(page_size , 48);
+ g_assert(atc_get_max_number_of_pages(atc, base, page_size / 2) == 1);
+ g_assert(atc_get_max_number_of_pages(atc, base, page_size) == 1);
+ g_assert(atc_get_max_number_of_pages(atc, base, page_size + 1) == 2);
+
+ g_assert(atc_get_max_number_of_pages(atc, base + 10, 1) == 1);
+ g_assert(atc_get_max_number_of_pages(atc, base + 10, page_size - 10) == 1);
+ g_assert(atc_get_max_number_of_pages(atc, base + 10,
+ page_size - 10 + 1) == 2);
+ g_assert(atc_get_max_number_of_pages(atc, base + 10,
+ page_size - 10 + 2) == 2);
+
+ g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 1) == 1);
+ g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 2) == 2);
+ g_assert(atc_get_max_number_of_pages(atc, base + page_size - 1, 3) == 2);
+
+ g_assert(atc_get_max_number_of_pages(atc, base + 10, page_size * 20) == 21);
+ g_assert(atc_get_max_number_of_pages(atc, base + 10,
+ (page_size * 20) + (page_size - 10))
+ == 21);
+ g_assert(atc_get_max_number_of_pages(atc, base + 10,
+ (page_size * 20) +
+ (page_size - 10 + 1)) == 22);
+}
+
+int main(int argc, char **argv)
+{
+ g_test_init(&argc, &argv, NULL);
+ g_test_add_func("/atc/test_creation_parameters", test_creation_parameters);
+ g_test_add_func("/atc/test_single_entry", test_single_entry);
+ g_test_add_func("/atc/test_single_entry_2", test_single_entry_2);
+ g_test_add_func("/atc/test_page_boundaries", test_page_boundaries);
+ g_test_add_func("/atc/test_huge_page", test_huge_page);
+ g_test_add_func("/atc/test_pasid", test_pasid);
+ g_test_add_func("/atc/test_large_address", test_large_address);
+ g_test_add_func("/atc/test_bigger_page", test_bigger_page);
+ g_test_add_func("/atc/test_unknown_pasid", test_unknown_pasid);
+ g_test_add_func("/atc/test_invalidation", test_invalidation);
+ g_test_add_func("/atc/test_delete_address_space_cache",
+ test_delete_address_space_cache);
+ g_test_add_func("/atc/test_invalidate_entire_address_space",
+ test_invalidate_entire_address_space);
+ g_test_add_func("/atc/test_reset", test_reset);
+ g_test_add_func("/atc/test_get_max_number_of_pages",
+ test_get_max_number_of_pages);
+ return g_test_run();
+}
--
2.47.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 15/19] memory: Add an API for ATS support
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (12 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 13/19] atc: Generic ATC that can be used by PCIe devices that support SVM CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 14/19] atc: Add unit tests CLEMENT MATHIEU--DRIF
` (6 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
An IOMMU has to implement iommu_ats_request_translation to support ATS.
Devices can use IOMMU_TLB_ENTRY_TRANSLATION_ERROR to check the TLB
entries returned by a translation request.
We decided not to use the existing translation operation for two reasons.
First, ATS is designed to translate ranges and not isolated addresses.
Second, we need ATS-specific parameters.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
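Note: the sketch below illustrates how a device could scan the entries
returned by a translation request with a check shaped like
IOMMU_TLB_ENTRY_TRANSLATION_ERROR. It is a standalone model, not QEMU
code: the DEMO_* names, DemoTLBEntry and demo_count_errors are
hypothetical stand-ins, and DEMO_IOMMU_EXTRA only represents "some
permission bit other than R/W".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for IOMMUAccessFlags values (assumed for illustration). */
enum {
    DEMO_IOMMU_NONE  = 0,
    DEMO_IOMMU_RO    = 1,
    DEMO_IOMMU_WO    = 2,
    DEMO_IOMMU_RW    = 3,
    DEMO_IOMMU_EXTRA = 1 << 4, /* hypothetical non-R/W permission bit */
};

/* Reduced stand-in for IOMMUTLBEntry. */
typedef struct DemoTLBEntry {
    uint64_t iova;
    uint64_t translated_addr;
    uint64_t addr_mask;
    int perm;
} DemoTLBEntry;

/* Same shape as the macro in the patch: no R and no W means error. */
#define DEMO_TLB_ENTRY_TRANSLATION_ERROR(entry) \
    ((((entry)->perm) & DEMO_IOMMU_RW) == DEMO_IOMMU_NONE)

/* Count the entries of a response buffer that carry no R/W permission. */
static size_t demo_count_errors(const DemoTLBEntry *entries, size_t n)
{
    size_t errors = 0;

    for (size_t i = 0; i < n; i++) {
        if (DEMO_TLB_ENTRY_TRANSLATION_ERROR(&entries[i])) {
            errors++;
        }
    }
    return errors;
}
```

Masking with the R/W bits means an entry that only carries other
permission bits is still treated as a failed translation.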
include/exec/memory.h | 26 ++++++++++++++++++++++++++
system/memory.c | 21 +++++++++++++++++++++
2 files changed, 47 insertions(+)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 468b003bf1..042d4ea5be 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -148,6 +148,10 @@ struct IOMMUTLBEntry {
uint32_t pasid;
};
+/* Check if an IOMMU TLB entry indicates a translation error */
+#define IOMMU_TLB_ENTRY_TRANSLATION_ERROR(entry) ((((entry)->perm) & IOMMU_RW) \
+ == IOMMU_NONE)
+
/*
* Bitmap for different IOMMUNotifier capabilities. Each notifier can
* register with one or multiple IOMMU Notifier capability bit(s).
@@ -525,6 +529,20 @@ struct IOMMUMemoryRegionClass {
* @iommu: the IOMMUMemoryRegion
*/
int (*num_indexes)(IOMMUMemoryRegion *iommu);
+
+ /**
+ * @iommu_ats_request_translation:
+ * This method must be implemented if the IOMMU has ATS enabled
+ *
+ * @see pci_ats_request_translation_pasid
+ */
+ ssize_t (*iommu_ats_request_translation)(IOMMUMemoryRegion *iommu,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write,
+ IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count);
};
typedef struct RamDiscardListener RamDiscardListener;
@@ -1882,6 +1900,14 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n);
void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
IOMMUNotifier *n);
+ssize_t memory_region_iommu_ats_request_translation(IOMMUMemoryRegion *iommu_mr,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write,
+ IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count);
+
/**
* memory_region_iommu_get_attr: return an IOMMU attr if get_attr() is
* defined on the IOMMU.
diff --git a/system/memory.c b/system/memory.c
index b17b5538ff..0a379a72bb 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2011,6 +2011,27 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
memory_region_update_iommu_notify_flags(iommu_mr, NULL);
}
+ssize_t memory_region_iommu_ats_request_translation(IOMMUMemoryRegion *iommu_mr,
+ bool priv_req,
+ bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write,
+ IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count)
+{
+ IOMMUMemoryRegionClass *imrc =
+ memory_region_get_iommu_class_nocheck(iommu_mr);
+
+ if (!imrc->iommu_ats_request_translation) {
+ return -ENODEV;
+ }
+
+ return imrc->iommu_ats_request_translation(iommu_mr, priv_req, exec_req,
+ addr, length, no_write, result,
+ result_length, err_count);
+}
+
void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
const IOMMUTLBEvent *event)
{
--
2.47.1
* [PATCH v2 16/19] pci: Add a pci-level API for ATS
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (14 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 14/19] atc: Add unit tests CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 17/19] intel_iommu: Set address mask when a translation fails and adjust W permission CLEMENT MATHIEU--DRIF
` (4 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Devices implementing ATS can send translation requests using
pci_ats_request_translation_pasid.
Invalidation events are sent back to the device through the IOMMU
notifier managed with pci_register_iommu_tlb_event_notifier and
pci_unregister_iommu_tlb_event_notifier.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
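Note: pci_ats_request_translation_pasid returns -ENOMEM when @result
cannot hold all the translations, so a caller has to size the buffer
before issuing a request. One possible approach, sketched below, is to
size it for the worst case in which every translation covers only the
smallest page. demo_max_pages is a hypothetical helper written for this
note, not a function of the series, and it assumes page_size is a power
of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pages of size page_size touched by
 * the byte range [addr, addr + length). An empty range touches none.
 */
static size_t demo_max_pages(uint64_t addr, size_t length, uint64_t page_size)
{
    uint64_t page_mask = ~(page_size - 1);
    uint64_t first_page, last_page;

    if (length == 0) {
        return 0;
    }
    first_page = addr & page_mask;                /* page of first byte */
    last_page = (addr + length - 1) & page_mask;  /* page of last byte */
    return (size_t)((last_page - first_page) / page_size) + 1;
}
```

For instance, a range that starts 10 bytes into a page and is 20 pages
long touches 21 pages, so a 21-entry buffer would be enough for it even
if the IOMMU only returns smallest-page translations.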
hw/pci/pci.c | 46 +++++++++++++++++++++++++++++++++++++++
include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 98 insertions(+)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index be29c0375f..0ccd0656b7 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2896,6 +2896,52 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
}
}
+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write, IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count)
+{
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+
+ assert(result_length);
+
+ if (!iommu_mr || !pcie_ats_enabled(dev)) {
+ return -EPERM;
+ }
+ return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
+ exec_req, addr, length,
+ no_write, result,
+ result_length,
+ err_count);
+}
+
+int pci_register_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n)
+{
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+ if (!iommu_mr) {
+ return -EPERM;
+ }
+ return memory_region_register_iommu_notifier(MEMORY_REGION(iommu_mr), n,
+ &error_fatal);
+}
+
+int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n)
+{
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+ if (!iommu_mr) {
+ return -EPERM;
+ }
+ memory_region_unregister_iommu_notifier(MEMORY_REGION(iommu_mr), n);
+ return 0;
+}
+
void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
{
/*
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index a11366e08d..592e72aee9 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -461,6 +461,58 @@ bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
IOMMUNotifier *n, IOMMUNotify fn,
void *opaque);
+/**
+ * pci_ats_request_translation_pasid: perform an ATS request
+ *
+ * Return the number of translations stored in @result in case of success,
+ * a negative error code otherwise.
+ * -ENOMEM is returned when the result buffer is not large enough to store
+ * all the translations.
+ *
+ * @dev: the ATS-capable PCI device
+ * @pasid: the pasid of the address space in which the translation will be made
+ * @priv_req: privileged mode bit (PASID TLP)
+ * @exec_req: execute request bit (PASID TLP)
+ * @addr: start address of the memory range to be translated
+ * @length: length of the memory range in bytes
+ * @no_write: request a read-only access translation (if supported by the IOMMU)
+ * @result: buffer in which the TLB entries will be stored
+ * @result_length: result buffer length
+ * @err_count: number of untranslated subregions
+ */
+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write, IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count);
+
+/**
+ * pci_register_iommu_tlb_event_notifier: register a notifier for changes to
+ * IOMMU translation entries in a specific address space.
+ *
+ * Returns 0 on success, or a negative errno otherwise.
+ *
+ * @dev: the device that wants to get notified
+ * @pasid: the pasid of the address space to track
+ * @n: the notifier to register
+ */
+int pci_register_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n);
+
+/**
+ * pci_unregister_iommu_tlb_event_notifier: unregister a notifier that has been
+ * registered with pci_register_iommu_tlb_event_notifier
+ *
+ * Returns 0 on success, or a negative errno otherwise.
+ *
+ * @dev: the device that wants to unsubscribe
+ * @pasid: the pasid of the address space to be untracked
+ * @n: the notifier to unregister
+ */
+int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n);
+
/**
* pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
*
--
2.47.1
* [PATCH v2 17/19] intel_iommu: Set address mask when a translation fails and adjust W permission
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (15 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 16/19] pci: Add a pci-level API for ATS CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 18/19] intel_iommu: Return page walk level even when the translation fails CLEMENT MATHIEU--DRIF
` (3 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
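Implement the behavior defined in section 10.2.3.5 of the PCIe
specification, revision 5: a failed translation still reports the size
of the range it covers, and the W permission is dropped from the
returned entry when the request is not a write.
This is needed by devices that support ATS.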
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
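Note: the two adjustments made by this patch can be modelled outside
QEMU as follows. This is only a sketch: the DEMO_* constants and the
demo_* helpers are stand-ins, and the level-to-mask formula (12-bit
base shift, 9 bits per level) is an assumption about what
vtd_pt_level_page_mask computes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT_4K 12 /* assumed 4 KiB base page */
#define DEMO_LEVEL_BITS    9  /* assumed index bits per paging level */
#define DEMO_IOMMU_WO      2  /* stand-in for the write permission bit */

/* Page mask of a leaf at the given paging level (1 = 4K, 2 = 2M, ...). */
static uint64_t demo_level_page_mask(uint32_t level)
{
    assert(level >= 1 && level <= 6);
    return ~((1ULL << (DEMO_PAGE_SHIFT_4K +
                       (level - 1) * DEMO_LEVEL_BITS)) - 1);
}

/*
 * Address mask reported on failure: the size of the level at which the
 * walk stopped, or 4K when the walk never started (UINT32_MAX).
 */
static uint64_t demo_error_addr_mask(uint32_t level)
{
    return (level != UINT32_MAX) ? ~demo_level_page_mask(level)
                                 : ~demo_level_page_mask(1);
}

/* Permission reported on success: W is kept only for write requests. */
static int demo_success_perm(int access_flags, bool is_write)
{
    return is_write ? access_flags : (access_flags & ~DEMO_IOMMU_WO);
}
```

With these assumptions, a walk that fails at level 2 reports a mask of
0x1fffff, which lets the requester skip the whole 2 MiB range.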
hw/i386/intel_iommu.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 021834c41f..530b75a9a3 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2100,7 +2100,8 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
uint8_t bus_num = pci_bus_num(bus);
VTDContextCacheEntry *cc_entry;
uint64_t pte, page_mask;
- uint32_t level, pasid = vtd_as->pasid;
+ uint32_t level = UINT32_MAX;
+ uint32_t pasid = vtd_as->pasid;
uint16_t source_id = PCI_BUILD_BDF(bus_num, devfn);
int ret_fr;
bool is_fpd_set = false;
@@ -2259,14 +2260,19 @@ out:
entry->iova = addr & page_mask;
entry->translated_addr = vtd_get_pte_addr(pte, s->aw_bits) & page_mask;
entry->addr_mask = ~page_mask;
- entry->perm = access_flags;
+ entry->perm = (is_write ? access_flags : (access_flags & (~IOMMU_WO)));
return true;
error:
vtd_iommu_unlock(s);
entry->iova = 0;
entry->translated_addr = 0;
- entry->addr_mask = 0;
+ /*
+ * Set the mask for ATS (the range must be present even when the
+ * translation fails: PCIe rev 5, section 10.2.3.5)
+ */
+ entry->addr_mask = (level != UINT32_MAX) ?
+ (~vtd_pt_level_page_mask(level)) : (~VTD_PAGE_MASK_4K);
entry->perm = IOMMU_NONE;
return false;
}
--
2.47.1
* [PATCH v2 18/19] intel_iommu: Return page walk level even when the translation fails
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (16 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 17/19] intel_iommu: Set address mask when a translation fails and adjust W permission CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 19/19] intel_iommu: Add support for ATS CLEMENT MATHIEU--DRIF
` (2 subsequent siblings)
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
vtd_do_iommu_translate uses the page walk level to populate the
IOMMUTLBEntry with the correct page mask. This prevents ATS
devices from sending many useless translation requests when a megapage
or gigapage IOVA is not mapped to a physical address.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
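Note: to give an idea of the saving, the sketch below counts how many
translation requests are needed to cover a range when each response
advances the cursor by the size encoded in its address mask (the same
stepping rule as addr = (addr & ~mask) + (mask + 1)).
demo_count_requests is a hypothetical helper and the addresses used
with it are arbitrary examples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: number of requests needed to walk
 * [addr, addr + length) when every response covers one naturally
 * aligned block of size addr_mask + 1.
 */
static size_t demo_count_requests(uint64_t addr, uint64_t length,
                                  uint64_t addr_mask)
{
    uint64_t end = addr + length;
    size_t requests = 0;

    while (addr < end) {
        requests++;
        /* Jump to the first byte after the block that was reported. */
        addr = (addr & ~addr_mask) + (addr_mask + 1);
    }
    return requests;
}
```

In this model an unmapped 2 MiB range costs 512 requests when failures
are reported with a 4 KiB mask, and a single request when the mask
reflects the level at which the walk stopped.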
hw/i386/intel_iommu.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 530b75a9a3..3c31dc1047 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1995,9 +1995,9 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
uint32_t pasid)
{
dma_addr_t addr = vtd_get_iova_pgtbl_base(s, ce, pasid);
- uint32_t level = vtd_get_iova_level(s, ce, pasid);
uint32_t offset;
uint64_t flpte, flag_ad = VTD_FL_A;
+ *flpte_level = vtd_get_iova_level(s, ce, pasid);
if (!vtd_iova_fl_check_canonical(s, iova, ce, pasid)) {
error_report_once("%s: detected non canonical IOVA (iova=0x%" PRIx64 ","
@@ -2006,11 +2006,11 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
}
while (true) {
- offset = vtd_iova_level_offset(iova, level);
+ offset = vtd_iova_level_offset(iova, *flpte_level);
flpte = vtd_get_pte(addr, offset);
if (flpte == (uint64_t)-1) {
- if (level == vtd_get_iova_level(s, ce, pasid)) {
+ if (*flpte_level == vtd_get_iova_level(s, ce, pasid)) {
/* Invalid programming of pasid-entry */
return -VTD_FR_PASID_ENTRY_FSPTPTR_INV;
} else {
@@ -2036,15 +2036,15 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
if (is_write && !(flpte & VTD_FL_RW)) {
return -VTD_FR_SM_WRITE;
}
- if (vtd_flpte_nonzero_rsvd(flpte, level)) {
+ if (vtd_flpte_nonzero_rsvd(flpte, *flpte_level)) {
error_report_once("%s: detected flpte reserved non-zero "
"iova=0x%" PRIx64 ", level=0x%" PRIx32
"flpte=0x%" PRIx64 ", pasid=0x%" PRIX32 ")",
- __func__, iova, level, flpte, pasid);
+ __func__, iova, *flpte_level, flpte, pasid);
return -VTD_FR_FS_PAGING_ENTRY_RSVD;
}
- if (vtd_is_last_pte(flpte, level) && is_write) {
+ if (vtd_is_last_pte(flpte, *flpte_level) && is_write) {
flag_ad |= VTD_FL_D;
}
@@ -2052,14 +2052,13 @@ static int vtd_iova_to_flpte(IntelIOMMUState *s, VTDContextEntry *ce,
return -VTD_FR_FS_BIT_UPDATE_FAILED;
}
- if (vtd_is_last_pte(flpte, level)) {
+ if (vtd_is_last_pte(flpte, *flpte_level)) {
*flptep = flpte;
- *flpte_level = level;
return 0;
}
addr = vtd_get_pte_addr(flpte, aw_bits);
- level--;
+ (*flpte_level)--;
}
}
--
2.47.1
* [PATCH v2 19/19] intel_iommu: Add support for ATS
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (17 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 18/19] intel_iommu: Return page walk level even when the translation fails CLEMENT MATHIEU--DRIF
@ 2025-01-20 17:41 ` CLEMENT MATHIEU--DRIF
2025-02-19 6:10 ` [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
2025-02-20 21:13 ` Michael S. Tsirkin
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-01-20 17:41 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com,
CLEMENT MATHIEU--DRIF
From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
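Note: the request loop added by this patch can be modelled in isolation
as shown below. This is a simplified sketch, not the QEMU code:
DemoEntry, demo_ats_request and demo_translate_odd_unmapped are
hypothetical, and the fake translation function pretends that every
odd 4 KiB page is unmapped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Reduced stand-in for IOMMUTLBEntry. */
typedef struct DemoEntry {
    uint64_t iova;
    uint64_t addr_mask;
    bool ok; /* false models an entry with no R/W permission */
} DemoEntry;

typedef DemoEntry (*demo_translate_fn)(uint64_t addr);

/* Fake page table: 4 KiB pages, odd page numbers are not mapped. */
static DemoEntry demo_translate_odd_unmapped(uint64_t addr)
{
    DemoEntry e = {
        .iova = addr & ~0xfffULL,
        .addr_mask = 0xfff,
        .ok = ((addr >> 12) & 1) == 0,
    };
    return e;
}

/*
 * Translate [addr, addr + length) one block at a time. Failed blocks
 * are stored too and counted in *err_count. Returns the number of
 * entries stored, or -ENOMEM if the buffer filled up too early.
 */
static ssize_t demo_ats_request(demo_translate_fn translate, uint64_t addr,
                                size_t length, DemoEntry *result,
                                size_t result_length, uint32_t *err_count)
{
    uint64_t end = addr + length;
    size_t n = 0;

    *err_count = 0;
    while (addr < end && n < result_length) {
        DemoEntry e = translate(addr);

        if (!e.ok) {
            *err_count += 1;
        }
        result[n++] = e;
        addr = (addr & ~e.addr_mask) + (e.addr_mask + 1);
    }
    return (addr < end) ? -ENOMEM : (ssize_t)n;
}
```

Failed translations still consume a slot in the result buffer, which is
why the address mask of an error entry matters to the caller.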
hw/i386/intel_iommu.c | 71 ++++++++++++++++++++++++++++++++--
hw/i386/intel_iommu_internal.h | 1 +
2 files changed, 69 insertions(+), 3 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 3c31dc1047..698e1286da 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4159,12 +4159,10 @@ static void vtd_report_ir_illegal_access(VTDAddressSpace *vtd_as,
bool is_fpd_set = false;
VTDContextEntry ce;
- assert(vtd_as->pasid != PCI_NO_PASID);
-
/* Try out best to fetch FPD, we can't do anything more */
if (vtd_dev_to_context_entry(s, bus_n, vtd_as->devfn, &ce) == 0) {
is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD;
- if (!is_fpd_set && s->root_scalable) {
+ if (!is_fpd_set && s->root_scalable && vtd_as->pasid != PCI_NO_PASID) {
vtd_ce_get_pasid_fpd(s, &ce, &is_fpd_set, vtd_as->pasid);
}
}
@@ -4738,6 +4736,71 @@ static IOMMUMemoryRegion *vtd_get_memory_region_pasid(PCIBus *bus,
return &vtd_as->iommu;
}
+static IOMMUTLBEntry vtd_iommu_ats_do_translate(IOMMUMemoryRegion *iommu,
+ hwaddr addr,
+ IOMMUAccessFlags flags,
+ int iommu_idx)
+{
+ IOMMUTLBEntry entry;
+ VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
+
+ if (vtd_is_interrupt_addr(addr)) {
+ vtd_report_ir_illegal_access(vtd_as, addr, flags & IOMMU_WO);
+ entry.iova = 0;
+ entry.translated_addr = 0;
+ entry.addr_mask = ~VTD_PAGE_MASK_4K;
+ entry.perm = IOMMU_NONE;
+ entry.pasid = PCI_NO_PASID;
+ } else {
+ entry = vtd_iommu_translate(iommu, addr, flags, iommu_idx);
+ }
+ return entry;
+}
+
+static ssize_t vtd_iommu_ats_request_translation(IOMMUMemoryRegion *iommu,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write,
+ IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count)
+{
+ IOMMUAccessFlags flags = IOMMU_ACCESS_FLAG_FULL(true, !no_write, exec_req,
+ priv_req, false, false);
+ ssize_t res_index = 0;
+ hwaddr target_address = addr + length;
+ IOMMUTLBEntry entry;
+
+ *err_count = 0;
+
+ while ((addr < target_address) && (res_index < result_length)) {
+ entry = vtd_iommu_ats_do_translate(iommu, addr, flags, 0);
+ if (!IOMMU_TLB_ENTRY_TRANSLATION_ERROR(&entry)) { /* Translation done */
+ /*
+ * 4.1.2: Global Mapping (G): Remapping hardware provides a value
+ * of 0 in this field
+ */
+ entry.perm &= ~IOMMU_GLOBAL;
+ } else {
+ *err_count += 1;
+ }
+ result[res_index] = entry;
+ res_index += 1;
+ addr = (addr & (~entry.addr_mask)) + (entry.addr_mask + 1);
+ }
+
+ /* Buffer too small */
+ if (addr < target_address) {
+ return -ENOMEM;
+ }
+ return res_index;
+}
+
+static uint64_t vtd_get_min_page_size(IOMMUMemoryRegion *iommu)
+{
+ return VTD_PAGE_SIZE;
+}
+
static PCIIOMMUOps vtd_iommu_ops = {
.get_address_space = vtd_host_dma_iommu,
.get_memory_region_pasid = vtd_get_memory_region_pasid,
@@ -4915,6 +4978,8 @@ static void vtd_iommu_memory_region_class_init(ObjectClass *klass,
imrc->translate = vtd_iommu_translate;
imrc->notify_flag_changed = vtd_iommu_notify_flag_changed;
imrc->replay = vtd_iommu_replay;
+ imrc->iommu_ats_request_translation = vtd_iommu_ats_request_translation;
+ imrc->get_min_page_size = vtd_get_min_page_size;
}
static const TypeInfo vtd_iommu_memory_region_info = {
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 238f1f443f..7e2071cd4d 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -192,6 +192,7 @@
#define VTD_ECAP_SC (1ULL << 7)
#define VTD_ECAP_MHMV (15ULL << 20)
#define VTD_ECAP_SRS (1ULL << 31)
+#define VTD_ECAP_NWFS (1ULL << 33)
#define VTD_ECAP_PSS (19ULL << 35)
#define VTD_ECAP_PASID (1ULL << 40)
#define VTD_ECAP_SMTS (1ULL << 43)
--
2.47.1
* Re: [PATCH v2 00/19] intel_iommu: Add ATS support
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (18 preceding siblings ...)
2025-01-20 17:41 ` [PATCH v2 19/19] intel_iommu: Add support for ATS CLEMENT MATHIEU--DRIF
@ 2025-02-19 6:10 ` CLEMENT MATHIEU--DRIF
2025-02-20 21:13 ` Michael S. Tsirkin
20 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-02-19 6:10 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: jasowang@redhat.com, zhenzhong.duan@intel.com,
kevin.tian@intel.com, yi.l.liu@intel.com,
joao.m.martins@oracle.com, peterx@redhat.com, mst@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com
Kindly ping
Thanks everyone
>cmd
On 20/01/2025 18:41, CLEMENT MATHIEU--DRIF wrote:
> From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
>
> This patch set belongs to a list of series that add SVM support for VT-d.
>
> Here we focus on implementing ATS support in the IOMMU and adding a
> PCI-level API to be used by virtual devices.
>
> This work is based on the VT-d specification version 4.1 (March 2023).
>
> Here is a link to our GitHub repository where you can find the following elements:
> - Qemu with all the patches for SVM
> - ATS
> - PRI
> - Device IOTLB invalidations
> - Requests with already pre-translated addresses
> - A demo device
> - A simple driver for the demo device
> - A userspace program (for testing and demonstration purposes)
>
> https://github.com/BullSequana/Qemu-in-guest-SVM-demo
>
> ===============
>
> Context and design notes
> ''''''''''''''''''''''''
>
> The main purpose of this work is to enable vVT-d users to make
> translation requests to the vIOMMU as described in the PCIe Gen 5.0
> specification (section 10). Moreover, we aim to implement a
> PCI/Memory-level framework that could be used by other vIOMMUs
> to implement the same features.
>
> What is ATS?
> ''''''''''''
>
> ATS (Address Translation Service) is a PCIe-level protocol that
> enables PCIe devices to query an IOMMU for virtual to physical
> address translations in a specific address space (such as a userland
> process address space). When a device receives translation responses
> from an IOMMU, it may decide to store them in an internal cache,
> often known as "ATC" (Address Translation Cache) or "Device IOTLB".
> To keep page tables and caches consistent, the IOMMU is allowed to
> send asynchronous invalidation requests to its client devices.
>
> To avoid introducing an unnecessarily complex API, this series simply
> exposes 3 functions. The first 2 are a pair of setup functions that
> are called to install and remove the ATS invalidation callback during
> the initialization phase of a process. The third one will be
> used to request translations. The callback setup API introduced in
> this series calls the IOMMUNotifier API under the hood.
>
> API design
> ''''''''''
>
> - int pci_register_iommu_tlb_event_notifier(PCIDevice *dev,
> uint32_t pasid,
> IOMMUNotifier *n);
>
> - int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
> IOMMUNotifier *n);
>
> - ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
> bool priv_req, bool exec_req,
> hwaddr addr, size_t length,
> bool no_write,
> IOMMUTLBEntry *result,
> size_t result_length,
> uint32_t *err_count);
>
> Although device developers may want to implement custom ATC for
> testing or performance measurement purposes, we provide a generic
> implementation as a utility module.
>
> Overview
> ''''''''
>
> Here are the interactions between an ATS-capable PCIe device and the vVT-d:
>
>
>
> ┌───────────┐ ┌────────────┐
> │Device │ │PCI / Memory│
> │ │ pci_ats_request_│abstraction │ iommu_ats_
> │ │ translation_ │ │ request_
> │┌─────────┐│ pasid │ AS lookup │ translation
> ││Logic ││────────────────>│╶╶╶╶╶╶╶╶╶╶╶>│──────┐
> │└─────────┘│<────────────────│<╶╶╶╶╶╶╶╶╶╶╶│<──┐ │
> │┌─────────┐│ │ │ │ │
> ││inv func ││<───────┐ │ │ │ │
> │└─────────┘│ │ │ │ │ │
> │ │ │ │ │ │ │ │
> │ ∨ │ │ │ │ │ │
> │┌─────────┐│ │ │ │ │ │
> ││ATC ││ │ │ │ │ │
> │└─────────┘│ │ │ │ │ │
> └───────────┘ │ └────────────┘ │ │
> │ │ │
> │ │ │
> │ │ │
> │ │ │
> │ ┌────────────────────┼──┼─┐
> │ │vVT-d │ │ │
> │ │ │ │ │
> │ │ │ │ │
> │ │ │ │ │
> │ │ │ │ │
> │ │ │ ∨ │
> │ │┌───────────────────────┐│
> │ ││Translation logic ││
> │ │└───────────────────────┘│
> └────┼────────────┐ │
> │ │ │
> │┌───────────────────────┐│
> ││ Invalidation queue ││
> │└───────────∧───────────┘│
> └────────────┼────────────┘
> │
> │
> │
> ┌────────────────────────┐
> │Kernel driver │
> │ │
> └────────────────────────┘
>
> v2
> Rebase on master after merge of Zhenzhong's FLTS series
> Rename the series as it is now based on master.
>
> Changes after review by Michael:
> - Split long lines in memory.h
> - Change patch encoding (no UTF-8)
>
> Changes after review by Zhenzhong:
> - Rework "Fill the PASID field when creating an IOMMUTLBEntry"
>
>
>
> Clement Mathieu--Drif (19):
> memory: Add permissions in IOMMUAccessFlags
> intel_iommu: Declare supported PASID size
> memory: Allow to store the PASID in IOMMUTLBEntry
> intel_iommu: Fill the PASID field when creating an IOMMUTLBEntry
> pcie: Add helper to declare PASID capability for a pcie device
> pcie: Helper functions to check if PASID is enabled
> pcie: Helper function to check if ATS is enabled
> pci: Cache the bus mastering status in the device
> pci: Add IOMMU operations to get memory regions with PASID
> intel_iommu: Implement the get_memory_region_pasid iommu operation
> memory: Store user data pointer in the IOMMU notifiers
> pci: Add a pci-level initialization function for iommu notifiers
> atc: Generic ATC that can be used by PCIe devices that support SVM
> atc: Add unit tests
> memory: Add an API for ATS support
> pci: Add a pci-level API for ATS
> intel_iommu: Set address mask when a translation fails and adjust W
> permission
> intel_iommu: Return page walk level even when the translation fails
> intel_iommu: Add support for ATS
>
> hw/i386/intel_iommu.c | 122 ++++++--
> hw/i386/intel_iommu_internal.h | 2 +
> hw/pci/pci.c | 111 ++++++-
> hw/pci/pcie.c | 42 +++
> include/exec/memory.h | 51 +++-
> include/hw/i386/intel_iommu.h | 2 +-
> include/hw/pci/pci.h | 83 ++++++
> include/hw/pci/pci_device.h | 1 +
> include/hw/pci/pcie.h | 9 +-
> include/hw/pci/pcie_regs.h | 5 +
> system/memory.c | 21 ++
> tests/unit/meson.build | 1 +
> tests/unit/test-atc.c | 527 +++++++++++++++++++++++++++++++++
> util/atc.c | 211 +++++++++++++
> util/atc.h | 117 ++++++++
> util/meson.build | 1 +
> 16 files changed, 1275 insertions(+), 31 deletions(-)
> create mode 100644 tests/unit/test-atc.c
> create mode 100644 util/atc.c
> create mode 100644 util/atc.h
>
* Re: [PATCH v2 00/19] intel_iommu: Add ATS support
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
` (19 preceding siblings ...)
2025-02-19 6:10 ` [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
@ 2025-02-20 21:13 ` Michael S. Tsirkin
2025-02-21 7:54 ` CLEMENT MATHIEU--DRIF
20 siblings, 1 reply; 23+ messages in thread
From: Michael S. Tsirkin @ 2025-02-20 21:13 UTC (permalink / raw)
To: CLEMENT MATHIEU--DRIF
Cc: qemu-devel@nongnu.org, jasowang@redhat.com,
zhenzhong.duan@intel.com, kevin.tian@intel.com,
yi.l.liu@intel.com, joao.m.martins@oracle.com, peterx@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com
On Mon, Jan 20, 2025 at 05:41:32PM +0000, CLEMENT MATHIEU--DRIF wrote:
> From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
>
> This patch set belongs to a list of series that add SVM support for VT-d.
>
> Here we focus on implementing ATS support in the IOMMU and adding a
> PCI-level API to be used by virtual devices.
>
> This work is based on the VT-d specification version 4.1 (March 2023).
>
> Here is a link to our GitHub repository where you can find the following elements:
> - Qemu with all the patches for SVM
> - ATS
> - PRI
> - Device IOTLB invalidations
> - Requests with already pre-translated addresses
> - A demo device
> - A simple driver for the demo device
> - A userspace program (for testing and demonstration purposes)
>
> https://github.com/BullSequana/Qemu-in-guest-SVM-demo
Fails build:
https://gitlab.com/mstredhat/qemu/-/jobs/9200372388
In function ‘vtd_iommu_ats_do_translate’,
inlined from ‘vtd_iommu_ats_request_translation’ at ../hw/i386/intel_iommu.c:4778:17:
../hw/i386/intel_iommu.c:4758:12: error: ‘entry.target_as’ may be used uninitialized [-Werror=maybe-uninitialized]
4758 | return entry;
| ^~~~~
../hw/i386/intel_iommu.c: In function ‘vtd_iommu_ats_request_translation’:
../hw/i386/intel_iommu.c:4745:19: note: ‘entry’ declared here
4745 | IOMMUTLBEntry entry;
| ^~~~~
cc1: all warnings being treated as errors
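For context, a warning of this kind usually means that a struct
returned by value has a field that is never assigned on some code path.
In vtd_iommu_ats_do_translate the interrupt-address branch assigns
iova, translated_addr, addr_mask, perm and pasid, but not target_as.
A minimal sketch of one common remedy, using stand-in types (DemoTLBEntry,
DEMO_NO_PASID and demo_make_error_entry are hypothetical, and this is
not necessarily the fix adopted for the series):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NO_PASID UINT32_MAX /* stand-in for PCI_NO_PASID */

typedef struct DemoAddressSpace DemoAddressSpace;

/* Stand-in for IOMMUTLBEntry, including the target_as pointer. */
typedef struct DemoTLBEntry {
    DemoAddressSpace *target_as;
    uint64_t iova;
    uint64_t translated_addr;
    uint64_t addr_mask;
    int perm;
    uint32_t pasid;
} DemoTLBEntry;

/*
 * Build an error entry with a designated initializer: every member
 * that is not named (target_as, iova, translated_addr, perm) is
 * zero-initialized, so no field is left indeterminate.
 */
static DemoTLBEntry demo_make_error_entry(void)
{
    DemoTLBEntry entry = {
        .addr_mask = 0xfff,
        .pasid = DEMO_NO_PASID,
    };
    return entry;
}
```

Whether NULL is an acceptable target_as for an error entry depends on
how callers use the field, so that choice would need checking against
the real code.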
* Re: [PATCH v2 00/19] intel_iommu: Add ATS support
2025-02-20 21:13 ` Michael S. Tsirkin
@ 2025-02-21 7:54 ` CLEMENT MATHIEU--DRIF
0 siblings, 0 replies; 23+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2025-02-21 7:54 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: qemu-devel@nongnu.org, jasowang@redhat.com,
zhenzhong.duan@intel.com, kevin.tian@intel.com,
yi.l.liu@intel.com, joao.m.martins@oracle.com, peterx@redhat.com,
tjeznach@rivosinc.com, minwoo.im@samsung.com
On 20/02/2025 22:13, Michael S. Tsirkin wrote:
> On Mon, Jan 20, 2025 at 05:41:32PM +0000, CLEMENT MATHIEU--DRIF wrote:
>> From: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
>>
>> This patch set belongs to a list of series that add SVM support for VT-d.
>>
>> Here we focus on implementing ATS support in the IOMMU and adding a
>> PCI-level API to be used by virtual devices.
>>
>> This work is based on the VT-d specification version 4.1 (March 2023).
>>
>> Here is a link to our GitHub repository where you can find the following elements:
>> - Qemu with all the patches for SVM
>> - ATS
>> - PRI
>> - Device IOTLB invalidations
>> - Requests with already pre-translated addresses
>> - A demo device
>> - A simple driver for the demo device
>> - A userspace program (for testing and demonstration purposes)
>>
>> https://github.com/BullSequana/Qemu-in-guest-SVM-demo
>
> Fails build:
>
> https://gitlab.com/mstredhat/qemu/-/jobs/9200372388
>
> In function ‘vtd_iommu_ats_do_translate’,
> inlined from ‘vtd_iommu_ats_request_translation’ at ../hw/i386/intel_iommu.c:4778:17:
> ../hw/i386/intel_iommu.c:4758:12: error: ‘entry.target_as’ may be used uninitialized [-Werror=maybe-uninitialized]
> 4758 | return entry;
> | ^~~~~
> ../hw/i386/intel_iommu.c: In function ‘vtd_iommu_ats_request_translation’:
> ../hw/i386/intel_iommu.c:4745:19: note: ‘entry’ declared here
> 4745 | IOMMUTLBEntry entry;
> | ^~~~~
> cc1: all warnings being treated as errors
>
It looks like the error only shows up in non-debug builds.
I'll send a v3.
Thanks, Michael.
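
For readers following the thread: GCC generally needs optimization enabled to produce -Wmaybe-uninitialized diagnostics, which would be consistent with the error not appearing in a debug build. Below is a minimal sketch of the pattern and one common way to avoid it. It uses hypothetical stand-in types (DemoTLBEntry, DemoAddressSpace, demo_do_translate are illustrative names, not the QEMU definitions), and it is not necessarily the change that will go into v3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the QEMU definitions. */
typedef struct DemoAddressSpace {
    const char *name;
} DemoAddressSpace;

typedef struct DemoTLBEntry {
    DemoAddressSpace *target_as;
    uint64_t iova;
    uint64_t translated_addr;
    uint64_t addr_mask;
    unsigned perm;
} DemoTLBEntry;

enum { DEMO_PERM_NONE = 0, DEMO_PERM_RW = 3 };
#define DEMO_PAGE_MASK 0xfffULL

/*
 * Returns the entry by value. Starting from a designated initializer
 * means every field that is not named (target_as, translated_addr)
 * is zero/NULL, so the failure path never returns indeterminate data.
 * Declaring `DemoTLBEntry entry;` and filling it field by field is the
 * shape that can trigger "may be used uninitialized" once the function
 * is inlined into its caller.
 */
static DemoTLBEntry demo_do_translate(DemoAddressSpace *as, uint64_t iova,
                                      bool mapped)
{
    DemoTLBEntry entry = {
        .iova = iova & ~DEMO_PAGE_MASK,
        .addr_mask = DEMO_PAGE_MASK,
        .perm = DEMO_PERM_NONE,
    };

    assert(as != NULL);

    if (mapped) {
        entry.target_as = as;
        entry.translated_addr = entry.iova; /* identity mapping, demo only */
        entry.perm = DEMO_PERM_RW;
    }

    return entry;
}
```

With this shape, a caller that gets a failed translation sees target_as == NULL and perm == DEMO_PERM_NONE rather than whatever happened to be on the stack.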
Thread overview: 23+ messages
2025-01-20 17:41 [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 01/19] memory: Add permissions in IOMMUAccessFlags CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 02/19] intel_iommu: Declare supported PASID size CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 03/19] memory: Allow to store the PASID in IOMMUTLBEntry CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 04/19] intel_iommu: Fill the PASID field when creating an IOMMUTLBEntry CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 05/19] pcie: Add helper to declare PASID capability for a pcie device CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 06/19] pcie: Helper functions to check if PASID is enabled CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 07/19] pcie: Helper function to check if ATS " CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 08/19] pci: Cache the bus mastering status in the device CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 09/19] pci: Add IOMMU operations to get memory regions with PASID CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 10/19] intel_iommu: Implement the get_memory_region_pasid iommu operation CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 11/19] memory: Store user data pointer in the IOMMU notifiers CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 12/19] pci: Add a pci-level initialization function for iommu notifiers CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 13/19] atc: Generic ATC that can be used by PCIe devices that support SVM CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 15/19] memory: Add an API for ATS support CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 14/19] atc: Add unit tests CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 16/19] pci: Add a pci-level API for ATS CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 17/19] intel_iommu: Set address mask when a translation fails and adjust W permission CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 18/19] intel_iommu: Return page walk level even when the translation fails CLEMENT MATHIEU--DRIF
2025-01-20 17:41 ` [PATCH v2 19/19] intel_iommu: Add support for ATS CLEMENT MATHIEU--DRIF
2025-02-19 6:10 ` [PATCH v2 00/19] intel_iommu: Add ATS support CLEMENT MATHIEU--DRIF
2025-02-20 21:13 ` Michael S. Tsirkin
2025-02-21 7:54 ` CLEMENT MATHIEU--DRIF