* [PATCH 0/6] amd_iommu: Fixes to align with AMDVi specification
@ 2025-03-11 15:24 Alejandro Jimenez
2025-03-11 15:24 ` [PATCH 1/6] amd_iommu: Fix Miscellanous Information Register 0 offsets Alejandro Jimenez
` (5 more replies)
0 siblings, 6 replies; 22+ messages in thread
From: Alejandro Jimenez @ 2025-03-11 15:24 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, sarunkod, Wei.Huang2,
joao.m.martins, boris.ostrovsky, alejandro.j.jimenez
Correct mistakes in bitmasks, offsets, decoding of fields, and behavior that
do not match the latest AMD I/O Virtualization Technology (IOMMU)
Specification. These bugs do not trigger problems today in the limited mode
of operation supported by the AMD vIOMMU (passthrough), but upcoming
functionality and tests will require them (and additional fixes).
These are all minor and hopefully not controversial fixes, so I am sending
them out at this time to separate them from later series adding
functionality. It is unclear how relevant these changes will be to stable
releases considering the state of the AMD vIOMMU, but the fixes on this
series should be simple enough to apply, so I Cc'd stable for consideration.
Thank you,
Alejandro
Alejandro Jimenez (6):
amd_iommu: Fix Miscellanous Information Register 0 offsets
amd_iommu: Fix Device ID decoding for INVALIDATE_IOTLB_PAGES command
amd_iommu: Update bitmasks representing DTE reserved fields
amd_iommu: Fix masks for Device Table Address Register
amd_iommu: Fix the calculation for Device Table size
amd_iommu: Do not assume passthrough translation for devices with
DTE[TV]=0
hw/i386/amd_iommu.c | 103 ++++++++++++++++++++++++--------------------
hw/i386/amd_iommu.h | 25 ++++++-----
2 files changed, 71 insertions(+), 57 deletions(-)
base-commit: 5136598e2667f35ef3dc1d757616a266bd5eb3a2
--
2.43.5
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/6] amd_iommu: Fix Miscellanous Information Register 0 offsets
2025-03-11 15:24 [PATCH 0/6] amd_iommu: Fixes to align with AMDVi specification Alejandro Jimenez
@ 2025-03-11 15:24 ` Alejandro Jimenez
2025-03-17 12:37 ` Vasant Hegde
2025-03-11 15:24 ` [PATCH 2/6] amd_iommu: Fix Device ID decoding for INVALIDATE_IOTLB_PAGES command Alejandro Jimenez
` (4 subsequent siblings)
5 siblings, 1 reply; 22+ messages in thread
From: Alejandro Jimenez @ 2025-03-11 15:24 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, sarunkod, Wei.Huang2,
joao.m.martins, boris.ostrovsky, alejandro.j.jimenez
The definitions encoding the maximum Virtual, Physical, and Guest Virtual
Address sizes supported by the IOMMU are using incorrect offsets i.e. the
VASize and GVASize offsets are switched.
Cc: qemu-stable@nongnu.org
Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
---
hw/i386/amd_iommu.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
index 28125130c6..4c708f8d74 100644
--- a/hw/i386/amd_iommu.h
+++ b/hw/i386/amd_iommu.h
@@ -196,9 +196,9 @@
#define AMDVI_PAGE_SHIFT_4K 12
#define AMDVI_PAGE_MASK_4K (~((1ULL << AMDVI_PAGE_SHIFT_4K) - 1))
-#define AMDVI_MAX_VA_ADDR (48UL << 5)
-#define AMDVI_MAX_PH_ADDR (40UL << 8)
-#define AMDVI_MAX_GVA_ADDR (48UL << 15)
+#define AMDVI_MAX_GVA_ADDR (48UL << 5)
+#define AMDVI_MAX_PH_ADDR (40UL << 8)
+#define AMDVI_MAX_VA_ADDR (48UL << 15)
/* Completion Wait data size */
#define AMDVI_COMPLETION_DATA_SIZE 8
--
2.43.5
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/6] amd_iommu: Fix Device ID decoding for INVALIDATE_IOTLB_PAGES command
2025-03-11 15:24 [PATCH 0/6] amd_iommu: Fixes to align with AMDVi specification Alejandro Jimenez
2025-03-11 15:24 ` [PATCH 1/6] amd_iommu: Fix Miscellanous Information Register 0 offsets Alejandro Jimenez
@ 2025-03-11 15:24 ` Alejandro Jimenez
2025-03-17 12:40 ` Vasant Hegde
2025-03-11 15:24 ` [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields Alejandro Jimenez
` (3 subsequent siblings)
5 siblings, 1 reply; 22+ messages in thread
From: Alejandro Jimenez @ 2025-03-11 15:24 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, sarunkod, Wei.Huang2,
joao.m.martins, boris.ostrovsky, alejandro.j.jimenez
The DeviceID bits are extracted using an incorrect offset in the call to
amdvi_iotlb_remove_page(). This field is read (correctly) earlier, so use
the value already retrieved for devid.
Cc: qemu-stable@nongnu.org
Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
---
hw/i386/amd_iommu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 5b21cf134a..068eeb0cae 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -508,7 +508,7 @@ static void amdvi_inval_inttable(AMDVIState *s, uint64_t *cmd)
static void iommu_inval_iotlb(AMDVIState *s, uint64_t *cmd)
{
- uint16_t devid = extract64(cmd[0], 0, 16);
+ uint16_t devid = cpu_to_le16(extract64(cmd[0], 0, 16));
if (extract64(cmd[1], 1, 1) || extract64(cmd[1], 3, 1) ||
extract64(cmd[1], 6, 6)) {
amdvi_log_illegalcom_error(s, extract64(cmd[0], 60, 4),
@@ -521,7 +521,7 @@ static void iommu_inval_iotlb(AMDVIState *s, uint64_t *cmd)
&devid);
} else {
amdvi_iotlb_remove_page(s, cpu_to_le64(extract64(cmd[1], 12, 52)) << 12,
- cpu_to_le16(extract64(cmd[1], 0, 16)));
+ devid);
}
trace_amdvi_iotlb_inval();
}
--
2.43.5
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields
2025-03-11 15:24 [PATCH 0/6] amd_iommu: Fixes to align with AMDVi specification Alejandro Jimenez
2025-03-11 15:24 ` [PATCH 1/6] amd_iommu: Fix Miscellanous Information Register 0 offsets Alejandro Jimenez
2025-03-11 15:24 ` [PATCH 2/6] amd_iommu: Fix Device ID decoding for INVALIDATE_IOTLB_PAGES command Alejandro Jimenez
@ 2025-03-11 15:24 ` Alejandro Jimenez
2025-03-12 4:12 ` Arun Kodilkar, Sairaj
2025-03-17 12:34 ` Vasant Hegde
2025-03-11 15:24 ` [PATCH 4/6] amd_iommu: Fix masks for Device Table Address Register Alejandro Jimenez
` (2 subsequent siblings)
5 siblings, 2 replies; 22+ messages in thread
From: Alejandro Jimenez @ 2025-03-11 15:24 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, sarunkod, Wei.Huang2,
joao.m.martins, boris.ostrovsky, alejandro.j.jimenez
The DTE validation method verifies that all bits in reserved DTE fields are
unset. Update them according to the latest definition available in AMD I/O
Virtualization Technology (IOMMU) Specification - Section 2.2.2.1 Device
Table Entry Format. Remove the magic numbers and use a macro helper to
generate bitmasks covering the specified ranges for better legibility.
Note that some reserved fields specify that events are generated when they
contain non-zero bits, or checks are skipped under certain configurations.
This change only updates the reserved masks, checks for special conditions
are not yet implemented.
Cc: qemu-stable@nongnu.org
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
---
hw/i386/amd_iommu.c | 7 ++++---
hw/i386/amd_iommu.h | 9 ++++++---
2 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 068eeb0cae..8b97abe28c 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -848,9 +848,10 @@ static inline uint64_t amdvi_get_perms(uint64_t entry)
static bool amdvi_validate_dte(AMDVIState *s, uint16_t devid,
uint64_t *dte)
{
- if ((dte[0] & AMDVI_DTE_LOWER_QUAD_RESERVED)
- || (dte[1] & AMDVI_DTE_MIDDLE_QUAD_RESERVED)
- || (dte[2] & AMDVI_DTE_UPPER_QUAD_RESERVED) || dte[3]) {
+ if ((dte[0] & AMDVI_DTE_QUAD0_RESERVED) ||
+ (dte[1] & AMDVI_DTE_QUAD1_RESERVED) ||
+ (dte[2] & AMDVI_DTE_QUAD2_RESERVED) ||
+ (dte[3] & AMDVI_DTE_QUAD3_RESERVED)) {
amdvi_log_illegaldevtab_error(s, devid,
s->devtab +
devid * AMDVI_DEVTAB_ENTRY_SIZE, 0);
diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
index 4c708f8d74..8d5d882a06 100644
--- a/hw/i386/amd_iommu.h
+++ b/hw/i386/amd_iommu.h
@@ -25,6 +25,8 @@
#include "hw/i386/x86-iommu.h"
#include "qom/object.h"
+#define GENMASK64(h, l) (((~0ULL) >> (63 - (h) + (l))) << (l))
+
/* Capability registers */
#define AMDVI_CAPAB_BAR_LOW 0x04
#define AMDVI_CAPAB_BAR_HIGH 0x08
@@ -162,9 +164,10 @@
#define AMDVI_FEATURE_PC (1ULL << 9) /* Perf counters */
/* reserved DTE bits */
-#define AMDVI_DTE_LOWER_QUAD_RESERVED 0x80300000000000fc
-#define AMDVI_DTE_MIDDLE_QUAD_RESERVED 0x0000000000000100
-#define AMDVI_DTE_UPPER_QUAD_RESERVED 0x08f0000000000000
+#define AMDVI_DTE_QUAD0_RESERVED (GENMASK64(6, 2) | GENMASK64(63, 63))
+#define AMDVI_DTE_QUAD1_RESERVED 0
+#define AMDVI_DTE_QUAD2_RESERVED GENMASK64(53, 52)
+#define AMDVI_DTE_QUAD3_RESERVED (GENMASK64(14, 0) | GENMASK64(53, 48))
/* AMDVI paging mode */
#define AMDVI_GATS_MODE (2ULL << 12)
--
2.43.5
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 4/6] amd_iommu: Fix masks for Device Table Address Register
2025-03-11 15:24 [PATCH 0/6] amd_iommu: Fixes to align with AMDVi specification Alejandro Jimenez
` (2 preceding siblings ...)
2025-03-11 15:24 ` [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields Alejandro Jimenez
@ 2025-03-11 15:24 ` Alejandro Jimenez
2025-03-12 5:32 ` Arun Kodilkar, Sairaj
2025-03-17 15:07 ` Vasant Hegde
2025-03-11 15:24 ` [PATCH 5/6] amd_iommu: Fix the calculation for Device Table size Alejandro Jimenez
2025-03-11 15:24 ` [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0 Alejandro Jimenez
5 siblings, 2 replies; 22+ messages in thread
From: Alejandro Jimenez @ 2025-03-11 15:24 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, sarunkod, Wei.Huang2,
joao.m.martins, boris.ostrovsky, alejandro.j.jimenez
The size mask currently encompasses reserved bits [11:9]. Extract only the
corrects bits encoding size (i.e. [8:0]).
Cc: qemu-stable@nongnu.org
Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
---
hw/i386/amd_iommu.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
index 8d5d882a06..2c5c8c70f1 100644
--- a/hw/i386/amd_iommu.h
+++ b/hw/i386/amd_iommu.h
@@ -68,16 +68,16 @@
#define AMDVI_MMIO_SIZE 0x4000
-#define AMDVI_MMIO_DEVTAB_SIZE_MASK ((1ULL << 12) - 1)
-#define AMDVI_MMIO_DEVTAB_BASE_MASK (((1ULL << 52) - 1) & ~ \
- AMDVI_MMIO_DEVTAB_SIZE_MASK)
+#define AMDVI_MMIO_DEVTAB_SIZE_MASK GENMASK64(8, 0)
+#define AMDVI_MMIO_DEVTAB_BASE_MASK GENMASK64(51, 12)
+
#define AMDVI_MMIO_DEVTAB_ENTRY_SIZE 32
#define AMDVI_MMIO_DEVTAB_SIZE_UNIT 4096
/* some of this are similar but just for readability */
#define AMDVI_MMIO_CMDBUF_SIZE_BYTE (AMDVI_MMIO_COMMAND_BASE + 7)
#define AMDVI_MMIO_CMDBUF_SIZE_MASK 0x0f
-#define AMDVI_MMIO_CMDBUF_BASE_MASK AMDVI_MMIO_DEVTAB_BASE_MASK
+#define AMDVI_MMIO_CMDBUF_BASE_MASK GENMASK64(51, 12)
#define AMDVI_MMIO_CMDBUF_HEAD_MASK (((1ULL << 19) - 1) & ~0x0f)
#define AMDVI_MMIO_CMDBUF_TAIL_MASK AMDVI_MMIO_EVTLOG_HEAD_MASK
@@ -95,7 +95,7 @@
#define AMDVI_MMIO_EXCL_ENABLED_MASK (1ULL << 0)
#define AMDVI_MMIO_EXCL_ALLOW_MASK (1ULL << 1)
-#define AMDVI_MMIO_EXCL_LIMIT_MASK AMDVI_MMIO_DEVTAB_BASE_MASK
+#define AMDVI_MMIO_EXCL_LIMIT_MASK GENMASK64(51, 12)
#define AMDVI_MMIO_EXCL_LIMIT_LOW 0xfff
/* mmio control register flags */
--
2.43.5
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 5/6] amd_iommu: Fix the calculation for Device Table size
2025-03-11 15:24 [PATCH 0/6] amd_iommu: Fixes to align with AMDVi specification Alejandro Jimenez
` (3 preceding siblings ...)
2025-03-11 15:24 ` [PATCH 4/6] amd_iommu: Fix masks for Device Table Address Register Alejandro Jimenez
@ 2025-03-11 15:24 ` Alejandro Jimenez
2025-03-17 13:00 ` Vasant Hegde
2025-03-11 15:24 ` [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0 Alejandro Jimenez
5 siblings, 1 reply; 22+ messages in thread
From: Alejandro Jimenez @ 2025-03-11 15:24 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, sarunkod, Wei.Huang2,
joao.m.martins, boris.ostrovsky, alejandro.j.jimenez
Correctly calculate the Device Table size using the format encoded in the
Device Table Base Address Register (MMIO Offset 0000h).
Cc: qemu-stable@nongnu.org
Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
---
hw/i386/amd_iommu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 8b97abe28c..cf00450ebe 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -665,8 +665,8 @@ static inline void amdvi_handle_devtab_write(AMDVIState *s)
uint64_t val = amdvi_readq(s, AMDVI_MMIO_DEVICE_TABLE);
s->devtab = (val & AMDVI_MMIO_DEVTAB_BASE_MASK);
- /* set device table length */
- s->devtab_len = ((val & AMDVI_MMIO_DEVTAB_SIZE_MASK) + 1 *
+ /* set device table length (i.e. number of entries table can hold) */
+ s->devtab_len = (((val & AMDVI_MMIO_DEVTAB_SIZE_MASK) + 1) *
(AMDVI_MMIO_DEVTAB_SIZE_UNIT /
AMDVI_MMIO_DEVTAB_ENTRY_SIZE));
}
--
2.43.5
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0
2025-03-11 15:24 [PATCH 0/6] amd_iommu: Fixes to align with AMDVi specification Alejandro Jimenez
` (4 preceding siblings ...)
2025-03-11 15:24 ` [PATCH 5/6] amd_iommu: Fix the calculation for Device Table size Alejandro Jimenez
@ 2025-03-11 15:24 ` Alejandro Jimenez
2025-03-19 6:06 ` Vasant Hegde
2025-03-20 5:11 ` Arun Kodilkar, Sairaj
5 siblings, 2 replies; 22+ messages in thread
From: Alejandro Jimenez @ 2025-03-11 15:24 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, sarunkod, Wei.Huang2,
joao.m.martins, boris.ostrovsky, alejandro.j.jimenez
The AMD I/O Virtualization Technology (IOMMU) Specification (see Table 8: V,
TV, and GV Fields in Device Table Entry), specifies that a DTE with V=0,
TV=1 does not contain a valid address translation information. If a request
requires a table walk, the walk is terminated when this condition is
encountered.
Do not assume that addresses for a device with DTE[TV]=0 are passed through
(i.e. not remapped) and instead terminate the page table walk early.
Cc: qemu-stable@nongnu.org
Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
---
hw/i386/amd_iommu.c | 88 +++++++++++++++++++++++++--------------------
1 file changed, 49 insertions(+), 39 deletions(-)
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index cf00450ebe..31d5522a62 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -932,51 +932,61 @@ static void amdvi_page_walk(AMDVIAddressSpace *as, uint64_t *dte,
uint64_t pte = dte[0], pte_addr, page_mask;
/* make sure the DTE has TV = 1 */
- if (pte & AMDVI_DEV_TRANSLATION_VALID) {
- level = get_pte_translation_mode(pte);
- if (level >= 7) {
- trace_amdvi_mode_invalid(level, addr);
+ if (!(pte & AMDVI_DEV_TRANSLATION_VALID)) {
+ /*
+ * A DTE with V=1, TV=0 does not have a valid Page Table Root Pointer.
+ * An IOMMU processing a request that requires a table walk terminates
+ * the walk when it encounters this condition. Do the same and return
+ * instead of assuming that the address is forwarded without translation
+ * i.e. the passthrough case, as it is done for the case where DTE[V]=0.
+ */
+ return;
+ }
+
+ level = get_pte_translation_mode(pte);
+ if (level >= 7) {
+ trace_amdvi_mode_invalid(level, addr);
+ return;
+ }
+ if (level == 0) {
+ goto no_remap;
+ }
+
+ /* we are at the leaf page table or page table encodes a huge page */
+ do {
+ pte_perms = amdvi_get_perms(pte);
+ present = pte & 1;
+ if (!present || perms != (perms & pte_perms)) {
+ amdvi_page_fault(as->iommu_state, as->devfn, addr, perms);
+ trace_amdvi_page_fault(addr);
return;
}
- if (level == 0) {
- goto no_remap;
- }
- /* we are at the leaf page table or page table encodes a huge page */
- do {
- pte_perms = amdvi_get_perms(pte);
- present = pte & 1;
- if (!present || perms != (perms & pte_perms)) {
- amdvi_page_fault(as->iommu_state, as->devfn, addr, perms);
- trace_amdvi_page_fault(addr);
- return;
- }
-
- /* go to the next lower level */
- pte_addr = pte & AMDVI_DEV_PT_ROOT_MASK;
- /* add offset and load pte */
- pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
- pte = amdvi_get_pte_entry(as->iommu_state, pte_addr, as->devfn);
- if (!pte) {
- return;
- }
- oldlevel = level;
- level = get_pte_translation_mode(pte);
- } while (level > 0 && level < 7);
-
- if (level == 0x7) {
- page_mask = pte_override_page_mask(pte);
- } else {
- page_mask = pte_get_page_mask(oldlevel);
+ /* go to the next lower level */
+ pte_addr = pte & AMDVI_DEV_PT_ROOT_MASK;
+ /* add offset and load pte */
+ pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
+ pte = amdvi_get_pte_entry(as->iommu_state, pte_addr, as->devfn);
+ if (!pte) {
+ return;
}
+ oldlevel = level;
+ level = get_pte_translation_mode(pte);
+ } while (level > 0 && level < 7);
- /* get access permissions from pte */
- ret->iova = addr & page_mask;
- ret->translated_addr = (pte & AMDVI_DEV_PT_ROOT_MASK) & page_mask;
- ret->addr_mask = ~page_mask;
- ret->perm = amdvi_get_perms(pte);
- return;
+ if (level == 0x7) {
+ page_mask = pte_override_page_mask(pte);
+ } else {
+ page_mask = pte_get_page_mask(oldlevel);
}
+
+ /* get access permissions from pte */
+ ret->iova = addr & page_mask;
+ ret->translated_addr = (pte & AMDVI_DEV_PT_ROOT_MASK) & page_mask;
+ ret->addr_mask = ~page_mask;
+ ret->perm = amdvi_get_perms(pte);
+ return;
+
no_remap:
ret->iova = addr & AMDVI_PAGE_MASK_4K;
ret->translated_addr = addr & AMDVI_PAGE_MASK_4K;
--
2.43.5
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields
2025-03-11 15:24 ` [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields Alejandro Jimenez
@ 2025-03-12 4:12 ` Arun Kodilkar, Sairaj
2025-03-13 14:23 ` Alejandro Jimenez
2025-03-17 12:34 ` Vasant Hegde
1 sibling, 1 reply; 22+ messages in thread
From: Arun Kodilkar, Sairaj @ 2025-03-12 4:12 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, Wei.Huang2, joao.m.martins,
boris.ostrovsky
Hi Alejandro,
On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
> The DTE validation method verifies that all bits in reserved DTE fields are
> unset. Update them according to the latest definition available in AMD I/O
> Virtualization Technology (IOMMU) Specification - Section 2.2.2.1 Device
> Table Entry Format. Remove the magic numbers and use a macro helper to
> generate bitmasks covering the specified ranges for better legibility.
>
> Note that some reserved fields specify that events are generated when they
> contain non-zero bits, or checks are skipped under certain configurations.
> This change only updates the reserved masks, checks for special conditions
> are not yet implemented.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
> ---
> hw/i386/amd_iommu.c | 7 ++++---
> hw/i386/amd_iommu.h | 9 ++++++---
> 2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
> index 068eeb0cae..8b97abe28c 100644
> --- a/hw/i386/amd_iommu.c
> +++ b/hw/i386/amd_iommu.c
> @@ -848,9 +848,10 @@ static inline uint64_t amdvi_get_perms(uint64_t entry)
> static bool amdvi_validate_dte(AMDVIState *s, uint16_t devid,
> uint64_t *dte)
> {
> - if ((dte[0] & AMDVI_DTE_LOWER_QUAD_RESERVED)
> - || (dte[1] & AMDVI_DTE_MIDDLE_QUAD_RESERVED)
> - || (dte[2] & AMDVI_DTE_UPPER_QUAD_RESERVED) || dte[3]) {
> + if ((dte[0] & AMDVI_DTE_QUAD0_RESERVED) ||
> + (dte[1] & AMDVI_DTE_QUAD1_RESERVED) ||
> + (dte[2] & AMDVI_DTE_QUAD2_RESERVED) ||
> + (dte[3] & AMDVI_DTE_QUAD3_RESERVED)) {
> amdvi_log_illegaldevtab_error(s, devid,
> s->devtab +
> devid * AMDVI_DEVTAB_ENTRY_SIZE, 0);
> diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
> index 4c708f8d74..8d5d882a06 100644
> --- a/hw/i386/amd_iommu.h
> +++ b/hw/i386/amd_iommu.h
> @@ -25,6 +25,8 @@
> #include "hw/i386/x86-iommu.h"
> #include "qom/object.h"
>
> +#define GENMASK64(h, l) (((~0ULL) >> (63 - (h) + (l))) << (l))
> +
qemu provides the similar macro 'MAKE_64BIT_MASK' in file
'include/qemu/bitops.h', you can use this existing macro
instead of redefining.
> /* Capability registers */
> #define AMDVI_CAPAB_BAR_LOW 0x04
> #define AMDVI_CAPAB_BAR_HIGH 0x08
> @@ -162,9 +164,10 @@
> #define AMDVI_FEATURE_PC (1ULL << 9) /* Perf counters */
>
> /* reserved DTE bits */
> -#define AMDVI_DTE_LOWER_QUAD_RESERVED 0x80300000000000fc
> -#define AMDVI_DTE_MIDDLE_QUAD_RESERVED 0x0000000000000100
> -#define AMDVI_DTE_UPPER_QUAD_RESERVED 0x08f0000000000000
> +#define AMDVI_DTE_QUAD0_RESERVED (GENMASK64(6, 2) | GENMASK64(63, 63))
> +#define AMDVI_DTE_QUAD1_RESERVED 0
> +#define AMDVI_DTE_QUAD2_RESERVED GENMASK64(53, 52)
> +#define AMDVI_DTE_QUAD3_RESERVED (GENMASK64(14, 0) | GENMASK64(53, 48))
>
> /* AMDVI paging mode */
> #define AMDVI_GATS_MODE (2ULL << 12)
Regards
Sairaj Kodilkar
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 4/6] amd_iommu: Fix masks for Device Table Address Register
2025-03-11 15:24 ` [PATCH 4/6] amd_iommu: Fix masks for Device Table Address Register Alejandro Jimenez
@ 2025-03-12 5:32 ` Arun Kodilkar, Sairaj
2025-03-17 15:07 ` Vasant Hegde
1 sibling, 0 replies; 22+ messages in thread
From: Arun Kodilkar, Sairaj @ 2025-03-12 5:32 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, Wei.Huang2, joao.m.martins,
boris.ostrovsky, sarunkod
On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
> The size mask currently encompasses reserved bits [11:9]. Extract only the
> corrects bits encoding size (i.e. [8:0]).
>
> Cc: qemu-stable@nongnu.org
> Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
> ---
> hw/i386/amd_iommu.h | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
> index 8d5d882a06..2c5c8c70f1 100644
> --- a/hw/i386/amd_iommu.h
> +++ b/hw/i386/amd_iommu.h
> @@ -68,16 +68,16 @@
>
> #define AMDVI_MMIO_SIZE 0x4000
>
> -#define AMDVI_MMIO_DEVTAB_SIZE_MASK ((1ULL << 12) - 1)
> -#define AMDVI_MMIO_DEVTAB_BASE_MASK (((1ULL << 52) - 1) & ~ \
> - AMDVI_MMIO_DEVTAB_SIZE_MASK)
> +#define AMDVI_MMIO_DEVTAB_SIZE_MASK GENMASK64(8, 0)
> +#define AMDVI_MMIO_DEVTAB_BASE_MASK GENMASK64(51, 12)
> +
Use MAKE_64BIT_MASK here as well.
> #define AMDVI_MMIO_DEVTAB_ENTRY_SIZE 32
> #define AMDVI_MMIO_DEVTAB_SIZE_UNIT 4096
>
> /* some of this are similar but just for readability */
> #define AMDVI_MMIO_CMDBUF_SIZE_BYTE (AMDVI_MMIO_COMMAND_BASE + 7)
> #define AMDVI_MMIO_CMDBUF_SIZE_MASK 0x0f
> -#define AMDVI_MMIO_CMDBUF_BASE_MASK AMDVI_MMIO_DEVTAB_BASE_MASK
> +#define AMDVI_MMIO_CMDBUF_BASE_MASK GENMASK64(51, 12)
> #define AMDVI_MMIO_CMDBUF_HEAD_MASK (((1ULL << 19) - 1) & ~0x0f)
I think its better to modify remaining mask fields as well (*HEAD_MASK
and *TAIL_MASK). That way everything is consistent.
> #define AMDVI_MMIO_CMDBUF_TAIL_MASK AMDVI_MMIO_EVTLOG_HEAD_MASK
>
> @@ -95,7 +95,7 @@
>
> #define AMDVI_MMIO_EXCL_ENABLED_MASK (1ULL << 0)
> #define AMDVI_MMIO_EXCL_ALLOW_MASK (1ULL << 1)
> -#define AMDVI_MMIO_EXCL_LIMIT_MASK AMDVI_MMIO_DEVTAB_BASE_MASK
> +#define AMDVI_MMIO_EXCL_LIMIT_MASK GENMASK64(51, 12)
> #define AMDVI_MMIO_EXCL_LIMIT_LOW 0xfff
>
> /* mmio control register flags */
Regards
Sairaj Kodilkar
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields
2025-03-12 4:12 ` Arun Kodilkar, Sairaj
@ 2025-03-13 14:23 ` Alejandro Jimenez
2025-03-16 9:34 ` Arun Kodilkar, Sairaj
2025-03-17 12:36 ` Vasant Hegde
0 siblings, 2 replies; 22+ messages in thread
From: Alejandro Jimenez @ 2025-03-13 14:23 UTC (permalink / raw)
To: Arun Kodilkar, Sairaj, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, Wei.Huang2, joao.m.martins,
boris.ostrovsky
On 3/12/25 12:12 AM, Arun Kodilkar, Sairaj wrote:
> Hi Alejandro,
>
> On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
[...]
>> --- a/hw/i386/amd_iommu.h
>> +++ b/hw/i386/amd_iommu.h
>> @@ -25,6 +25,8 @@
>> #include "hw/i386/x86-iommu.h"
>> #include "qom/object.h"
>> +#define GENMASK64(h, l) (((~0ULL) >> (63 - (h) + (l))) << (l))
>> +
>
> qemu provides the similar macro 'MAKE_64BIT_MASK' in file
> 'include/qemu/bitops.h', you can use this existing macro
> instead of redefining.
Hi Sairaj,
I became aware of MAKE_64BIT_MASK() because you used it in your recent
patch, but as you mentioned they are similar but not the same. I
personally find that using bit indexes is less prone to errors since
that is the same format the spec uses to define the fields.
So translating a RSVD[6:2] field from the spec becomes:
GENMASK64(6, 2);
vs
MAKE_64BIT_MASK(6, 5);
The latter is more prone to off-by-one errors in my opinion, specially
when you are defining lots of masks. Perhaps more importantly, I'd like
to progressively convert the amd_iommu definitions to use GENMASK() and
the code that retrieves bit fields to use FIELD_GET().
I am planning on later porting the generic GENMASK* definitions (from
the kernel into "qemu/bitops.h", since the RISC-V IOMMU is also a
consumer of GENMASK, but I am trying to keep the focus on the AMD vIOMMU
for this series.
Thank you,
Alejandro
>
>> /* Capability registers */
>> #define AMDVI_CAPAB_BAR_LOW 0x04
>> #define AMDVI_CAPAB_BAR_HIGH 0x08
>> @@ -162,9 +164,10 @@
>> #define AMDVI_FEATURE_PC (1ULL << 9) /* Perf
>> counters */
>> /* reserved DTE bits */
>> -#define AMDVI_DTE_LOWER_QUAD_RESERVED 0x80300000000000fc
>> -#define AMDVI_DTE_MIDDLE_QUAD_RESERVED 0x0000000000000100
>> -#define AMDVI_DTE_UPPER_QUAD_RESERVED 0x08f0000000000000
>> +#define AMDVI_DTE_QUAD0_RESERVED (GENMASK64(6, 2) |
>> GENMASK64(63, 63))
>> +#define AMDVI_DTE_QUAD1_RESERVED 0
>> +#define AMDVI_DTE_QUAD2_RESERVED GENMASK64(53, 52)
>> +#define AMDVI_DTE_QUAD3_RESERVED (GENMASK64(14, 0) |
>> GENMASK64(53, 48))
>> /* AMDVI paging mode */
>> #define AMDVI_GATS_MODE (2ULL << 12)
>
> Regards
> Sairaj Kodilkar
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields
2025-03-13 14:23 ` Alejandro Jimenez
@ 2025-03-16 9:34 ` Arun Kodilkar, Sairaj
2025-03-17 12:36 ` Vasant Hegde
1 sibling, 0 replies; 22+ messages in thread
From: Arun Kodilkar, Sairaj @ 2025-03-16 9:34 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, Wei.Huang2, joao.m.martins,
boris.ostrovsky
On 3/13/2025 7:53 PM, Alejandro Jimenez wrote:
>
>
> On 3/12/25 12:12 AM, Arun Kodilkar, Sairaj wrote:
>> Hi Alejandro,
>>
>> On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
>
> [...]
>
>>> --- a/hw/i386/amd_iommu.h
>>> +++ b/hw/i386/amd_iommu.h
>>> @@ -25,6 +25,8 @@
>>> #include "hw/i386/x86-iommu.h"
>>> #include "qom/object.h"
>>> +#define GENMASK64(h, l) (((~0ULL) >> (63 - (h) + (l))) << (l))
>>> +
>>
>> qemu provides the similar macro 'MAKE_64BIT_MASK' in file
>> 'include/qemu/bitops.h', you can use this existing macro
>> instead of redefining.
>
> Hi Sairaj,
>
> I became aware of MAKE_64BIT_MASK() because you used it in your recent
> patch, but as you mentioned they are similar but not the same. I
> personally find that using bit indexes is less prone to errors since
> that is the same format the spec uses to define the fields.
> So translating a RSVD[6:2] field from the spec becomes:
>
> GENMASK64(6, 2);
> vs
> MAKE_64BIT_MASK(6, 5);
>
> The latter is more prone to off-by-one errors in my opinion, specially
> when you are defining lots of masks. Perhaps more importantly, I'd like
> to progressively convert the amd_iommu definitions to use GENMASK() and
> the code that retrieves bit fields to use FIELD_GET().
>
> I am planning on later porting the generic GENMASK* definitions (from
> the kernel into "qemu/bitops.h", since the RISC-V IOMMU is also a
> consumer of GENMASK, but I am trying to keep the focus on the AMD vIOMMU
> for this series.
>
> Thank you,
> Alejandro
>
Hi Alejandro,
Yes indeed, GENMASK64() is more readable than the MAKE_64BIT_MASK().
and Since you are going to port it to generic layer, its better
to use GENMASK64().
Feel free to replace existing MAKE_64BIT_MASK() in my recent patches.
otherwise, I can modify it during next patch series.
Regards
Sairaj Kodilkar
>>
>>> /* Capability registers */
>>> #define AMDVI_CAPAB_BAR_LOW 0x04
>>> #define AMDVI_CAPAB_BAR_HIGH 0x08
>>> @@ -162,9 +164,10 @@
>>> #define AMDVI_FEATURE_PC (1ULL << 9) /* Perf
>>> counters */
>>> /* reserved DTE bits */
>>> -#define AMDVI_DTE_LOWER_QUAD_RESERVED 0x80300000000000fc
>>> -#define AMDVI_DTE_MIDDLE_QUAD_RESERVED 0x0000000000000100
>>> -#define AMDVI_DTE_UPPER_QUAD_RESERVED 0x08f0000000000000
>>> +#define AMDVI_DTE_QUAD0_RESERVED (GENMASK64(6, 2) |
>>> GENMASK64(63, 63))
>>> +#define AMDVI_DTE_QUAD1_RESERVED 0
>>> +#define AMDVI_DTE_QUAD2_RESERVED GENMASK64(53, 52)
>>> +#define AMDVI_DTE_QUAD3_RESERVED (GENMASK64(14, 0) |
>>> GENMASK64(53, 48))
>>> /* AMDVI paging mode */
>>> #define AMDVI_GATS_MODE (2ULL << 12)
>>
>> Regards
>> Sairaj Kodilkar
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields
2025-03-11 15:24 ` [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields Alejandro Jimenez
2025-03-12 4:12 ` Arun Kodilkar, Sairaj
@ 2025-03-17 12:34 ` Vasant Hegde
1 sibling, 0 replies; 22+ messages in thread
From: Vasant Hegde @ 2025-03-17 12:34 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, suravee.suthikulpanit,
santosh.shukla, sarunkod, Wei.Huang2, joao.m.martins,
boris.ostrovsky
Hi Alejandro,
On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
> The DTE validation method verifies that all bits in reserved DTE fields are
> unset. Update them according to the latest definition available in AMD I/O
> Virtualization Technology (IOMMU) Specification - Section 2.2.2.1 Device
> Table Entry Format. Remove the magic numbers and use a macro helper to
> generate bitmasks covering the specified ranges for better legibility.
>
> Note that some reserved fields specify that events are generated when they
> contain non-zero bits, or checks are skipped under certain configurations.
> This change only updates the reserved masks, checks for special conditions
> are not yet implemented.
Thanks! Eventually we should add some check in amdvi_get_dte(). We can do it later.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
> ---
> hw/i386/amd_iommu.c | 7 ++++---
> hw/i386/amd_iommu.h | 9 ++++++---
> 2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
> index 068eeb0cae..8b97abe28c 100644
> --- a/hw/i386/amd_iommu.c
> +++ b/hw/i386/amd_iommu.c
> @@ -848,9 +848,10 @@ static inline uint64_t amdvi_get_perms(uint64_t entry)
> static bool amdvi_validate_dte(AMDVIState *s, uint16_t devid,
> uint64_t *dte)
> {
> - if ((dte[0] & AMDVI_DTE_LOWER_QUAD_RESERVED)
> - || (dte[1] & AMDVI_DTE_MIDDLE_QUAD_RESERVED)
> - || (dte[2] & AMDVI_DTE_UPPER_QUAD_RESERVED) || dte[3]) {
> + if ((dte[0] & AMDVI_DTE_QUAD0_RESERVED) ||
> + (dte[1] & AMDVI_DTE_QUAD1_RESERVED) ||
> + (dte[2] & AMDVI_DTE_QUAD2_RESERVED) ||
> + (dte[3] & AMDVI_DTE_QUAD3_RESERVED)) {
> amdvi_log_illegaldevtab_error(s, devid,
> s->devtab +
> devid * AMDVI_DEVTAB_ENTRY_SIZE, 0);
> diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
> index 4c708f8d74..8d5d882a06 100644
> --- a/hw/i386/amd_iommu.h
> +++ b/hw/i386/amd_iommu.h
> @@ -25,6 +25,8 @@
> #include "hw/i386/x86-iommu.h"
> #include "qom/object.h"
>
> +#define GENMASK64(h, l) (((~0ULL) >> (63 - (h) + (l))) << (l))
> +
> /* Capability registers */
> #define AMDVI_CAPAB_BAR_LOW 0x04
> #define AMDVI_CAPAB_BAR_HIGH 0x08
> @@ -162,9 +164,10 @@
> #define AMDVI_FEATURE_PC (1ULL << 9) /* Perf counters */
>
> /* reserved DTE bits */
> -#define AMDVI_DTE_LOWER_QUAD_RESERVED 0x80300000000000fc
> -#define AMDVI_DTE_MIDDLE_QUAD_RESERVED 0x0000000000000100
> -#define AMDVI_DTE_UPPER_QUAD_RESERVED 0x08f0000000000000
> +#define AMDVI_DTE_QUAD0_RESERVED (GENMASK64(6, 2) | GENMASK64(63, 63))
> +#define AMDVI_DTE_QUAD1_RESERVED 0
> +#define AMDVI_DTE_QUAD2_RESERVED GENMASK64(53, 52)
> +#define AMDVI_DTE_QUAD3_RESERVED (GENMASK64(14, 0) | GENMASK64(53, 48))
In vIOMMU case guest is not expected to set any value in DTE[3]. So I am
wondering whether to match the spec -OR- what is expected in vIOMMU context. If
we want to go with later then it complicates as there are many other fields like
GV, etc is not expected to be used.
So since this matches the spec I think we are good for now.
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
-Vasant
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields
2025-03-13 14:23 ` Alejandro Jimenez
2025-03-16 9:34 ` Arun Kodilkar, Sairaj
@ 2025-03-17 12:36 ` Vasant Hegde
1 sibling, 0 replies; 22+ messages in thread
From: Vasant Hegde @ 2025-03-17 12:36 UTC (permalink / raw)
To: Alejandro Jimenez, Arun Kodilkar, Sairaj, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, suravee.suthikulpanit,
santosh.shukla, Wei.Huang2, joao.m.martins, boris.ostrovsky
Hi ,
On 3/13/2025 7:53 PM, Alejandro Jimenez wrote:
>
>
> On 3/12/25 12:12 AM, Arun Kodilkar, Sairaj wrote:
>> Hi Alejandro,
>>
>> On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
>
> [...]
>
>>> --- a/hw/i386/amd_iommu.h
>>> +++ b/hw/i386/amd_iommu.h
>>> @@ -25,6 +25,8 @@
>>> #include "hw/i386/x86-iommu.h"
>>> #include "qom/object.h"
>>> +#define GENMASK64(h, l) (((~0ULL) >> (63 - (h) + (l))) << (l))
>>> +
>>
>> qemu provides the similar macro 'MAKE_64BIT_MASK' in file
>> 'include/qemu/bitops.h', you can use this existing macro
>> instead of redefining.
>
> Hi Sairaj,
>
> I became aware of MAKE_64BIT_MASK() because you used it in your recent patch,
> but as you mentioned they are similar but not the same. I personally find that
> using bit indexes is less prone to errors since that is the same format the spec
> uses to define the fields.
> So translating a RSVD[6:2] field from the spec becomes:
>
> GENMASK64(6, 2);
> vs
> MAKE_64BIT_MASK(6, 5);
>
> The latter is more prone to off-by-one errors in my opinion, specially when you
> are defining lots of masks. Perhaps more importantly, I'd like to progressively
> convert the amd_iommu definitions to use GENMASK() and the code that retrieves
> bit fields to use FIELD_GET().
>
> I am planning on later porting the generic GENMASK* definitions (from the kernel
> into "qemu/bitops.h", since the RISC-V IOMMU is also a consumer of GENMASK, but
> I am trying to keep the focus on the AMD vIOMMU for this series.
I like GENMASK. Its easy to read.
RISCV has GENMASK_ULL. As you said, we can consider consolidating them later.
-Vasant
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/6] amd_iommu: Fix Miscellanous Information Register 0 offsets
2025-03-11 15:24 ` [PATCH 1/6] amd_iommu: Fix Miscellanous Information Register 0 offsets Alejandro Jimenez
@ 2025-03-17 12:37 ` Vasant Hegde
0 siblings, 0 replies; 22+ messages in thread
From: Vasant Hegde @ 2025-03-17 12:37 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, suravee.suthikulpanit,
santosh.shukla, sarunkod, Wei.Huang2, joao.m.martins,
boris.ostrovsky
On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
> The definitions encoding the maximum Virtual, Physical, and Guest Virtual
> Address sizes supported by the IOMMU are using incorrect offsets i.e. the
> VASize and GVASize offsets are switched.
>
> Cc: qemu-stable@nongnu.org
> Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Good catch. I had read this code but missed to catch this!
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
-Vasant
> ---
> hw/i386/amd_iommu.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
> index 28125130c6..4c708f8d74 100644
> --- a/hw/i386/amd_iommu.h
> +++ b/hw/i386/amd_iommu.h
> @@ -196,9 +196,9 @@
> #define AMDVI_PAGE_SHIFT_4K 12
> #define AMDVI_PAGE_MASK_4K (~((1ULL << AMDVI_PAGE_SHIFT_4K) - 1))
>
> -#define AMDVI_MAX_VA_ADDR (48UL << 5)
> -#define AMDVI_MAX_PH_ADDR (40UL << 8)
> -#define AMDVI_MAX_GVA_ADDR (48UL << 15)
> +#define AMDVI_MAX_GVA_ADDR (48UL << 5)
> +#define AMDVI_MAX_PH_ADDR (40UL << 8)
> +#define AMDVI_MAX_VA_ADDR (48UL << 15)
>
> /* Completion Wait data size */
> #define AMDVI_COMPLETION_DATA_SIZE 8
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/6] amd_iommu: Fix Device ID decoding for INVALIDATE_IOTLB_PAGES command
2025-03-11 15:24 ` [PATCH 2/6] amd_iommu: Fix Device ID decoding for INVALIDATE_IOTLB_PAGES command Alejandro Jimenez
@ 2025-03-17 12:40 ` Vasant Hegde
0 siblings, 0 replies; 22+ messages in thread
From: Vasant Hegde @ 2025-03-17 12:40 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, suravee.suthikulpanit,
santosh.shukla, sarunkod, Wei.Huang2, joao.m.martins,
boris.ostrovsky
Alejandro,
On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
> The DeviceID bits are extracted using an incorrect offset in the call to
> amdvi_iotlb_remove_page(). This field is read (correctly) earlier, so use
> the value already retrieved for devid.
>
> Cc: qemu-stable@nongnu.org
> Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Fix looks good.
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
FYI. I think we need to reconsider IOTLB invalidation stuff. Its suppose to
invalidate the device side TLB, not IOMMU one. Before that we need to fix the
way we generate hash it. Ideally we should switch to domain ID and fix the size
as well. We are looking into this.
-Vasant
> ---
> hw/i386/amd_iommu.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
> index 5b21cf134a..068eeb0cae 100644
> --- a/hw/i386/amd_iommu.c
> +++ b/hw/i386/amd_iommu.c
> @@ -508,7 +508,7 @@ static void amdvi_inval_inttable(AMDVIState *s, uint64_t *cmd)
> static void iommu_inval_iotlb(AMDVIState *s, uint64_t *cmd)
> {
>
> - uint16_t devid = extract64(cmd[0], 0, 16);
> + uint16_t devid = cpu_to_le16(extract64(cmd[0], 0, 16));
> if (extract64(cmd[1], 1, 1) || extract64(cmd[1], 3, 1) ||
> extract64(cmd[1], 6, 6)) {
> amdvi_log_illegalcom_error(s, extract64(cmd[0], 60, 4),
> @@ -521,7 +521,7 @@ static void iommu_inval_iotlb(AMDVIState *s, uint64_t *cmd)
> &devid);
> } else {
> amdvi_iotlb_remove_page(s, cpu_to_le64(extract64(cmd[1], 12, 52)) << 12,
> - cpu_to_le16(extract64(cmd[1], 0, 16)));
> + devid);
> }
> trace_amdvi_iotlb_inval();
> }
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 5/6] amd_iommu: Fix the calculation for Device Table size
2025-03-11 15:24 ` [PATCH 5/6] amd_iommu: Fix the calculation for Device Table size Alejandro Jimenez
@ 2025-03-17 13:00 ` Vasant Hegde
0 siblings, 0 replies; 22+ messages in thread
From: Vasant Hegde @ 2025-03-17 13:00 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, suravee.suthikulpanit,
santosh.shukla, sarunkod, Wei.Huang2, joao.m.martins,
boris.ostrovsky
Alejandro,
On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
> Correctly calculate the Device Table size using the format encoded in the
> Device Table Base Address Register (MMIO Offset 0000h).
>
> Cc: qemu-stable@nongnu.org
> Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
I understand its a fix. But its not breaking anything. So not sure whether its
worth to backport.
> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
-Vasant
> ---
> hw/i386/amd_iommu.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
> index 8b97abe28c..cf00450ebe 100644
> --- a/hw/i386/amd_iommu.c
> +++ b/hw/i386/amd_iommu.c
> @@ -665,8 +665,8 @@ static inline void amdvi_handle_devtab_write(AMDVIState *s)
> uint64_t val = amdvi_readq(s, AMDVI_MMIO_DEVICE_TABLE);
> s->devtab = (val & AMDVI_MMIO_DEVTAB_BASE_MASK);
>
> - /* set device table length */
> - s->devtab_len = ((val & AMDVI_MMIO_DEVTAB_SIZE_MASK) + 1 *
> + /* set device table length (i.e. number of entries table can hold) */
> + s->devtab_len = (((val & AMDVI_MMIO_DEVTAB_SIZE_MASK) + 1) *
> (AMDVI_MMIO_DEVTAB_SIZE_UNIT /
> AMDVI_MMIO_DEVTAB_ENTRY_SIZE));
> }
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 4/6] amd_iommu: Fix masks for Device Table Address Register
2025-03-11 15:24 ` [PATCH 4/6] amd_iommu: Fix masks for Device Table Address Register Alejandro Jimenez
2025-03-12 5:32 ` Arun Kodilkar, Sairaj
@ 2025-03-17 15:07 ` Vasant Hegde
1 sibling, 0 replies; 22+ messages in thread
From: Vasant Hegde @ 2025-03-17 15:07 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, suravee.suthikulpanit,
santosh.shukla, sarunkod, Wei.Huang2, joao.m.martins,
boris.ostrovsky
Hi Alejandro,
On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
> The size mask currently encompasses reserved bits [11:9]. Extract only the
> corrects bits encoding size (i.e. [8:0]).
>
> Cc: qemu-stable@nongnu.org
> Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
> ---
> hw/i386/amd_iommu.h | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
> index 8d5d882a06..2c5c8c70f1 100644
> --- a/hw/i386/amd_iommu.h
> +++ b/hw/i386/amd_iommu.h
> @@ -68,16 +68,16 @@
>
> #define AMDVI_MMIO_SIZE 0x4000
>
> -#define AMDVI_MMIO_DEVTAB_SIZE_MASK ((1ULL << 12) - 1)
> -#define AMDVI_MMIO_DEVTAB_BASE_MASK (((1ULL << 52) - 1) & ~ \
> - AMDVI_MMIO_DEVTAB_SIZE_MASK)
> +#define AMDVI_MMIO_DEVTAB_SIZE_MASK GENMASK64(8, 0)
> +#define AMDVI_MMIO_DEVTAB_BASE_MASK GENMASK64(51, 12)
> +
> #define AMDVI_MMIO_DEVTAB_ENTRY_SIZE 32
> #define AMDVI_MMIO_DEVTAB_SIZE_UNIT 4096
>
> /* some of this are similar but just for readability */
> #define AMDVI_MMIO_CMDBUF_SIZE_BYTE (AMDVI_MMIO_COMMAND_BASE + 7)
> #define AMDVI_MMIO_CMDBUF_SIZE_MASK 0x0f
> -#define AMDVI_MMIO_CMDBUF_BASE_MASK AMDVI_MMIO_DEVTAB_BASE_MASK
> +#define AMDVI_MMIO_CMDBUF_BASE_MASK GENMASK64(51, 12)
May be update AMDVI_MMIO_EVTLOG_BASE_MASK / AMDVI_MMIO_PPRLOG_BASE_MASK as well?
(I mean use GENMASK64 for these macros instead of they referring to some other
macros).
-Vasant
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0
2025-03-11 15:24 ` [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0 Alejandro Jimenez
@ 2025-03-19 6:06 ` Vasant Hegde
2025-03-19 14:10 ` Alejandro Jimenez
2025-03-20 5:11 ` Arun Kodilkar, Sairaj
1 sibling, 1 reply; 22+ messages in thread
From: Vasant Hegde @ 2025-03-19 6:06 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, suravee.suthikulpanit,
santosh.shukla, sarunkod, Wei.Huang2, joao.m.martins,
boris.ostrovsky
Alejandro,
On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
> The AMD I/O Virtualization Technology (IOMMU) Specification (see Table 8: V,
> TV, and GV Fields in Device Table Entry), specifies that a DTE with V=0,
> TV=1 does not contain a valid address translation information. If a request
> requires a table walk, the walk is terminated when this condition is
> encountered.
>
> Do not assume that addresses for a device with DTE[TV]=0 are passed through
> (i.e. not remapped) and instead terminate the page table walk early.
>
> Cc: qemu-stable@nongnu.org
> Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
I did double check and I think this patch matches HW behaviour. I did run few
tests w/ this series. It seems to work fine for me.
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
> hw/i386/amd_iommu.c | 88 +++++++++++++++++++++++++--------------------
> 1 file changed, 49 insertions(+), 39 deletions(-)
>
> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
> index cf00450ebe..31d5522a62 100644
> --- a/hw/i386/amd_iommu.c
> +++ b/hw/i386/amd_iommu.c
> @@ -932,51 +932,61 @@ static void amdvi_page_walk(AMDVIAddressSpace *as, uint64_t *dte,
> uint64_t pte = dte[0], pte_addr, page_mask;
>
> /* make sure the DTE has TV = 1 */
> - if (pte & AMDVI_DEV_TRANSLATION_VALID) {
> - level = get_pte_translation_mode(pte);
> - if (level >= 7) {
> - trace_amdvi_mode_invalid(level, addr);
> + if (!(pte & AMDVI_DEV_TRANSLATION_VALID)) {
> + /*
> + * A DTE with V=1, TV=0 does not have a valid Page Table Root Pointer.
> + * An IOMMU processing a request that requires a table walk terminates
> + * the walk when it encounters this condition. Do the same and return
> + * instead of assuming that the address is forwarded without translation
> + * i.e. the passthrough case, as it is done for the case where DTE[V]=0.
> + */
> + return;
> + }
> +
> + level = get_pte_translation_mode(pte);
> + if (level >= 7) {
> + trace_amdvi_mode_invalid(level, addr);
> + return;
> + }
> + if (level == 0) {
> + goto no_remap;
> + }
> +
> + /* we are at the leaf page table or page table encodes a huge page */
> + do {
> + pte_perms = amdvi_get_perms(pte);
> + present = pte & 1;
> + if (!present || perms != (perms & pte_perms)) {
> + amdvi_page_fault(as->iommu_state, as->devfn, addr, perms);
> + trace_amdvi_page_fault(addr);
> return;
> }
> - if (level == 0) {
> - goto no_remap;
> - }
>
> - /* we are at the leaf page table or page table encodes a huge page */
> - do {
> - pte_perms = amdvi_get_perms(pte);
> - present = pte & 1;
> - if (!present || perms != (perms & pte_perms)) {
> - amdvi_page_fault(as->iommu_state, as->devfn, addr, perms);
> - trace_amdvi_page_fault(addr);
> - return;
> - }
> -
> - /* go to the next lower level */
> - pte_addr = pte & AMDVI_DEV_PT_ROOT_MASK;
> - /* add offset and load pte */
> - pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
> - pte = amdvi_get_pte_entry(as->iommu_state, pte_addr, as->devfn);
> - if (!pte) {
> - return;
> - }
> - oldlevel = level;
> - level = get_pte_translation_mode(pte);
> - } while (level > 0 && level < 7);
> -
> - if (level == 0x7) {
> - page_mask = pte_override_page_mask(pte);
> - } else {
> - page_mask = pte_get_page_mask(oldlevel);
> + /* go to the next lower level */
> + pte_addr = pte & AMDVI_DEV_PT_ROOT_MASK;
> + /* add offset and load pte */
> + pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
> + pte = amdvi_get_pte_entry(as->iommu_state, pte_addr, as->devfn);
> + if (!pte) {
> + return;
> }
> + oldlevel = level;
> + level = get_pte_translation_mode(pte);
Unrelated to this patch.. We may want to add check to make sure level is
decreasing. Something like
if (oldlevel <= level)
error out
Otherwise bad page table can cause the issue.
-Vasant
> + } while (level > 0 && level < 7);
>
> - /* get access permissions from pte */
> - ret->iova = addr & page_mask;
> - ret->translated_addr = (pte & AMDVI_DEV_PT_ROOT_MASK) & page_mask;
> - ret->addr_mask = ~page_mask;
> - ret->perm = amdvi_get_perms(pte);
> - return;
> + if (level == 0x7) {
> + page_mask = pte_override_page_mask(pte);
> + } else {
> + page_mask = pte_get_page_mask(oldlevel);
> }
> +
> + /* get access permissions from pte */
> + ret->iova = addr & page_mask;
> + ret->translated_addr = (pte & AMDVI_DEV_PT_ROOT_MASK) & page_mask;
> + ret->addr_mask = ~page_mask;
> + ret->perm = amdvi_get_perms(pte);
> + return;
> +
> no_remap:
> ret->iova = addr & AMDVI_PAGE_MASK_4K;
> ret->translated_addr = addr & AMDVI_PAGE_MASK_4K;
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0
2025-03-19 6:06 ` Vasant Hegde
@ 2025-03-19 14:10 ` Alejandro Jimenez
0 siblings, 0 replies; 22+ messages in thread
From: Alejandro Jimenez @ 2025-03-19 14:10 UTC (permalink / raw)
To: Vasant Hegde, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, suravee.suthikulpanit,
santosh.shukla, sarunkod, Wei.Huang2, joao.m.martins,
boris.ostrovsky
On 3/19/25 2:06 AM, Vasant Hegde wrote:
> Alejandro,
>
>
> On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
>> The AMD I/O Virtualization Technology (IOMMU) Specification (see Table 8: V,
>> TV, and GV Fields in Device Table Entry), specifies that a DTE with V=0,
>> TV=1 does not contain a valid address translation information. If a request
>> requires a table walk, the walk is terminated when this condition is
>> encountered.
>>
>> Do not assume that addresses for a device with DTE[TV]=0 are passed through
>> (i.e. not remapped) and instead terminate the page table walk early.
>>
>> Cc: qemu-stable@nongnu.org
>> Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
>> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
>
> I did double check and I think this patch matches HW behaviour. I did run few
> tests w/ this series. It seems to work fine for me.
Thank you for confirming and reviewing.
My understanding from the Linux driver code is that the TV field is not
really used to control behavior and DTE[V] ends up doing that job
(though I need to take a closer look at the more involved scenarios with
SNP enabled). So I have not seen the case with V=1,TV=0 in the basic
scenarios I have run, but my intention is to adhere to the spec as
closely as possible. So I appreciate the confirmation.
>
>
> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
>
>
>
[...]
>> + oldlevel = level;
>> + level = get_pte_translation_mode(pte);
>
> Unrelated to this patch.. We may want to add check to make sure level is
> decreasing. Something like
> if (oldlevel <= level)
> error out
>
> Otherwise bad page table can cause the issue.
ACK. I have considered replacing this (amdvi_page_walk()) with code that
more closely mimics the fetch_pte() algorithm in the Linux driver. I've
been using it on my (coming soon) RFC for DMA remapping, and comparing
against amdvi_page_walk() for verification. I put in assertions like the
one you proposed in that new code. But if that approach is not taken,
I'll add the safety checks you suggested here.
Thank you,
Alejandro
>
> -Vasant
>
>
>> + } while (level > 0 && level < 7);
>>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0
2025-03-11 15:24 ` [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0 Alejandro Jimenez
2025-03-19 6:06 ` Vasant Hegde
@ 2025-03-20 5:11 ` Arun Kodilkar, Sairaj
2025-03-20 16:56 ` Alejandro Jimenez
1 sibling, 1 reply; 22+ messages in thread
From: Arun Kodilkar, Sairaj @ 2025-03-20 5:11 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, Wei.Huang2, joao.m.martins,
boris.ostrovsky
On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
> The AMD I/O Virtualization Technology (IOMMU) Specification (see Table 8: V,
> TV, and GV Fields in Device Table Entry), specifies that a DTE with V=0,
> TV=1 does not contain a valid address translation information. If a request
> requires a table walk, the walk is terminated when this condition is
> encountered.
>
> Do not assume that addresses for a device with DTE[TV]=0 are passed through
> (i.e. not remapped) and instead terminate the page table walk early.
>
> Cc: qemu-stable@nongnu.org
> Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
> ---
> hw/i386/amd_iommu.c | 88 +++++++++++++++++++++++++--------------------
> 1 file changed, 49 insertions(+), 39 deletions(-)
>
> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
> index cf00450ebe..31d5522a62 100644
> --- a/hw/i386/amd_iommu.c
> +++ b/hw/i386/amd_iommu.c
> @@ -932,51 +932,61 @@ static void amdvi_page_walk(AMDVIAddressSpace *as, uint64_t *dte,
> uint64_t pte = dte[0], pte_addr, page_mask;
>
> /* make sure the DTE has TV = 1 */
> - if (pte & AMDVI_DEV_TRANSLATION_VALID) {
> - level = get_pte_translation_mode(pte);
> - if (level >= 7) {
> - trace_amdvi_mode_invalid(level, addr);
> + if (!(pte & AMDVI_DEV_TRANSLATION_VALID)) {
> + /*
> + * A DTE with V=1, TV=0 does not have a valid Page Table Root Pointer.
> + * An IOMMU processing a request that requires a table walk terminates
> + * the walk when it encounters this condition. Do the same and return
> + * instead of assuming that the address is forwarded without translation
> + * i.e. the passthrough case, as it is done for the case where DTE[V]=0.
> + */
Hi Alejandro,
According to AMD IOMMU specs TABLE 44 (IO_PAGE_FAULT Event Types), IOMMU
reports IO_PAGE_FAULT event when TV bit is not set in the DTE.
Hence you should use amdvi_page_fault to report the IO_PAGE_FAULT
event before returning.
Regards
Sairaj Kodilkar
> + return;
> + }
> +
> + level = get_pte_translation_mode(pte);
> + if (level >= 7) {
> + trace_amdvi_mode_invalid(level, addr);
> + return;
> + }
> + if (level == 0) {
> + goto no_remap;
> + }
> +
> + /* we are at the leaf page table or page table encodes a huge page */
> + do {
> + pte_perms = amdvi_get_perms(pte);
> + present = pte & 1;
> + if (!present || perms != (perms & pte_perms)) {
> + amdvi_page_fault(as->iommu_state, as->devfn, addr, perms);
> + trace_amdvi_page_fault(addr);
> return;
> }
> - if (level == 0) {
> - goto no_remap;
> - }
>
> - /* we are at the leaf page table or page table encodes a huge page */
> - do {
> - pte_perms = amdvi_get_perms(pte);
> - present = pte & 1;
> - if (!present || perms != (perms & pte_perms)) {
> - amdvi_page_fault(as->iommu_state, as->devfn, addr, perms);
> - trace_amdvi_page_fault(addr);
> - return;
> - }
> -
> - /* go to the next lower level */
> - pte_addr = pte & AMDVI_DEV_PT_ROOT_MASK;
> - /* add offset and load pte */
> - pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
> - pte = amdvi_get_pte_entry(as->iommu_state, pte_addr, as->devfn);
> - if (!pte) {
> - return;
> - }
> - oldlevel = level;
> - level = get_pte_translation_mode(pte);
> - } while (level > 0 && level < 7);
> -
> - if (level == 0x7) {
> - page_mask = pte_override_page_mask(pte);
> - } else {
> - page_mask = pte_get_page_mask(oldlevel);
> + /* go to the next lower level */
> + pte_addr = pte & AMDVI_DEV_PT_ROOT_MASK;
> + /* add offset and load pte */
> + pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3;
> + pte = amdvi_get_pte_entry(as->iommu_state, pte_addr, as->devfn);
> + if (!pte) {
> + return;
> }
> + oldlevel = level;
> + level = get_pte_translation_mode(pte);
> + } while (level > 0 && level < 7);
>
> - /* get access permissions from pte */
> - ret->iova = addr & page_mask;
> - ret->translated_addr = (pte & AMDVI_DEV_PT_ROOT_MASK) & page_mask;
> - ret->addr_mask = ~page_mask;
> - ret->perm = amdvi_get_perms(pte);
> - return;
> + if (level == 0x7) {
> + page_mask = pte_override_page_mask(pte);
> + } else {
> + page_mask = pte_get_page_mask(oldlevel);
> }
> +
> + /* get access permissions from pte */
> + ret->iova = addr & page_mask;
> + ret->translated_addr = (pte & AMDVI_DEV_PT_ROOT_MASK) & page_mask;
> + ret->addr_mask = ~page_mask;
> + ret->perm = amdvi_get_perms(pte);
> + return;
> +
> no_remap:
> ret->iova = addr & AMDVI_PAGE_MASK_4K;
> ret->translated_addr = addr & AMDVI_PAGE_MASK_4K;
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0
2025-03-20 5:11 ` Arun Kodilkar, Sairaj
@ 2025-03-20 16:56 ` Alejandro Jimenez
2025-03-21 8:37 ` Arun Kodilkar, Sairaj
0 siblings, 1 reply; 22+ messages in thread
From: Alejandro Jimenez @ 2025-03-20 16:56 UTC (permalink / raw)
To: Arun Kodilkar, Sairaj, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, Wei.Huang2, joao.m.martins,
boris.ostrovsky
Hi Sairaj Kodilkar,
On 3/20/25 1:11 AM, Arun Kodilkar, Sairaj wrote:
>
>
> On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
>> The AMD I/O Virtualization Technology (IOMMU) Specification (see Table
>> 8: V,
>> TV, and GV Fields in Device Table Entry), specifies that a DTE with V=0,
>> TV=1 does not contain a valid address translation information. If a
>> request
>> requires a table walk, the walk is terminated when this condition is
>> encountered.
>>
>> Do not assume that addresses for a device with DTE[TV]=0 are passed
>> through
>> (i.e. not remapped) and instead terminate the page table walk early.
>>
>> Cc: qemu-stable@nongnu.org
>> Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
>> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
>> ---
>> hw/i386/amd_iommu.c | 88 +++++++++++++++++++++++++--------------------
>> 1 file changed, 49 insertions(+), 39 deletions(-)
>>
>> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
>> index cf00450ebe..31d5522a62 100644
>> --- a/hw/i386/amd_iommu.c
>> +++ b/hw/i386/amd_iommu.c
>> @@ -932,51 +932,61 @@ static void amdvi_page_walk(AMDVIAddressSpace
>> *as, uint64_t *dte,
>> uint64_t pte = dte[0], pte_addr, page_mask;
>> /* make sure the DTE has TV = 1 */
>> - if (pte & AMDVI_DEV_TRANSLATION_VALID) {
>> - level = get_pte_translation_mode(pte);
>> - if (level >= 7) {
>> - trace_amdvi_mode_invalid(level, addr);
>> + if (!(pte & AMDVI_DEV_TRANSLATION_VALID)) {
>> + /*
>> + * A DTE with V=1, TV=0 does not have a valid Page Table Root
>> Pointer.
>> + * An IOMMU processing a request that requires a table walk
>> terminates
>> + * the walk when it encounters this condition. Do the same
>> and return
>> + * instead of assuming that the address is forwarded without
>> translation
>> + * i.e. the passthrough case, as it is done for the case
>> where DTE[V]=0.
>> + */
> Hi Alejandro,
> According to AMD IOMMU specs TABLE 44 (IO_PAGE_FAULT Event Types), IOMMU
> reports IO_PAGE_FAULT event when TV bit is not set in the DTE.
>
> Hence you should use amdvi_page_fault to report the IO_PAGE_FAULT
> event before returning.
Good point, I had not considered this (and really haven't paid attention
to the page fault reporting until now). On first impression, I tend to
agree with your observation and will include in on v2. That being said...
The current role of amdvi_page_walk() is to be a helper for the
translate() method and ultimately the IOMMU replay() functionality.
These are needed for emulation but do not necessarily correspond 1:1
with guest-initiated operations. e.g. VFIO code will call
memory_region_iommu_replay() (which ends up calling amdvi_walk_page())
when syncing the dirty bitmap or after registering a new notifier. The
guest would get IO_PAGE_FAULT events for all the regions where a mapping
doesn't currently exist, which doesn't seem correct.
IOW, I think even this existing call to amdvi_page_fault() is not
necessary/correct:
do {
pte_perms = amdvi_get_perms(pte);
present = pte & 1;
if (!present || perms != (perms & pte_perms)) {
amdvi_page_fault(as->iommu_state, as->devfn, addr, perms);
trace_amdvi_page_fault(addr);
return;
}
[...]
I am aware I am jumping ahead a bit too much, but the point is that the
whole event reporting likely needs a comprehensive review.
I need to study the area a lot more since even the simplest/only current
use of amdvi_page_fault() right now is not very clear to me.
Alejandro
P.S.
Also confusing is this assertion in 2.5.3 IO_PAGE_FAULT Event:
"I/O page faults detected for translation requests return a
translation-not-present response (R=W=0) to the device and are not
logged in the event log."
so this suggests we should not emit a page fault i.e. don't log it in
the event log.
>
> Regards
> Sairaj Kodilkar
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0
2025-03-20 16:56 ` Alejandro Jimenez
@ 2025-03-21 8:37 ` Arun Kodilkar, Sairaj
0 siblings, 0 replies; 22+ messages in thread
From: Arun Kodilkar, Sairaj @ 2025-03-21 8:37 UTC (permalink / raw)
To: Alejandro Jimenez, qemu-devel
Cc: pbonzini, mst, mjt, marcel.apfelbaum, vasant.hegde,
suravee.suthikulpanit, santosh.shukla, Wei.Huang2, joao.m.martins,
boris.ostrovsky
On 3/20/2025 10:26 PM, Alejandro Jimenez wrote:
> Hi Sairaj Kodilkar,
>
> On 3/20/25 1:11 AM, Arun Kodilkar, Sairaj wrote:
>>
>>
>> On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
>>> The AMD I/O Virtualization Technology (IOMMU) Specification (see
>>> Table 8: V,
>>> TV, and GV Fields in Device Table Entry), specifies that a DTE with V=0,
>>> TV=1 does not contain a valid address translation information. If a
>>> request
>>> requires a table walk, the walk is terminated when this condition is
>>> encountered.
>>>
>>> Do not assume that addresses for a device with DTE[TV]=0 are passed
>>> through
>>> (i.e. not remapped) and instead terminate the page table walk early.
>>>
>>> Cc: qemu-stable@nongnu.org
>>> Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
>>> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
>>> ---
>>> hw/i386/amd_iommu.c | 88 +++++++++++++++++++++++++--------------------
>>> 1 file changed, 49 insertions(+), 39 deletions(-)
>>>
>>> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
>>> index cf00450ebe..31d5522a62 100644
>>> --- a/hw/i386/amd_iommu.c
>>> +++ b/hw/i386/amd_iommu.c
>>> @@ -932,51 +932,61 @@ static void amdvi_page_walk(AMDVIAddressSpace
>>> *as, uint64_t *dte,
>>> uint64_t pte = dte[0], pte_addr, page_mask;
>>> /* make sure the DTE has TV = 1 */
>>> - if (pte & AMDVI_DEV_TRANSLATION_VALID) {
>>> - level = get_pte_translation_mode(pte);
>>> - if (level >= 7) {
>>> - trace_amdvi_mode_invalid(level, addr);
>>> + if (!(pte & AMDVI_DEV_TRANSLATION_VALID)) {
>>> + /*
>>> + * A DTE with V=1, TV=0 does not have a valid Page Table
>>> Root Pointer.
>>> + * An IOMMU processing a request that requires a table walk
>>> terminates
>>> + * the walk when it encounters this condition. Do the same
>>> and return
>>> + * instead of assuming that the address is forwarded without
>>> translation
>>> + * i.e. the passthrough case, as it is done for the case
>>> where DTE[V]=0.
>>> + */
>> Hi Alejandro,
>> According to AMD IOMMU specs TABLE 44 (IO_PAGE_FAULT Event Types), IOMMU
>> reports IO_PAGE_FAULT event when TV bit is not set in the DTE.
>>
>> Hence you should use amdvi_page_fault to report the IO_PAGE_FAULT
>> event before returning.
>
> Good point, I had not considered this (and really haven't paid attention
> to the page fault reporting until now). On first impression, I tend to
> agree with your observation and will include in on v2. That being said...
>
> The current role of amdvi_page_walk() is to be a helper for the
> translate() method and ultimately the IOMMU replay() functionality.
> These are needed for emulation but do not necessarily correspond 1:1
> with guest-initiated operations. e.g. VFIO code will call
> memory_region_iommu_replay() (which ends up calling amdvi_walk_page())
> when syncing the dirty bitmap or after registering a new notifier. The
> guest would get IO_PAGE_FAULT events for all the regions where a mapping
> doesn't currently exist, which doesn't seem correct.
>
> IOW, I think even this existing call to amdvi_page_fault() is not
> necessary/correct:
>
> do {
> pte_perms = amdvi_get_perms(pte);
> present = pte & 1;
> if (!present || perms != (perms & pte_perms)) {
> amdvi_page_fault(as->iommu_state, as->devfn, addr, perms);
> trace_amdvi_page_fault(addr);
> return;
> }
> [...]
>
> I am aware I am jumping ahead a bit too much, but the point is that the
> whole event reporting likely needs a comprehensive review.
>
Hi Alejandro,
I agree with this and we can do a comprehensive review before reporting
page fault in this path.
I think in future we should separate these two paths (replay and normal
DMA translation). That way we can report page faults in DMA translation
path and keep the replay path clean.
> I need to study the area a lot more since even the simplest/only current
> use of amdvi_page_fault() right now is not very clear to me.
>
> Alejandro
>
> P.S.
> Also confusing is this assertion in 2.5.3 IO_PAGE_FAULT Event:
>
> "I/O page faults detected for translation requests return a translation-
> not-present response (R=W=0) to the device and are not logged in the
> event log."
>
> so this suggests we should not emit a page fault i.e. don't log it in
> the event log.
>
Here "translation requests" means ATS translation requests and not the
DMA read/write. Basically IOMMU emits the IO_PAGE_FAULT when it fails to
translate the DMA request but returns "translation-not-present" response
to the device for ATS request
Regards
Sairaj Kodilkar
>
>>
>> Regards
>> Sairaj Kodilkar
>
>
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2025-03-21 8:43 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-11 15:24 [PATCH 0/6] amd_iommu: Fixes to align with AMDVi specification Alejandro Jimenez
2025-03-11 15:24 ` [PATCH 1/6] amd_iommu: Fix Miscellanous Information Register 0 offsets Alejandro Jimenez
2025-03-17 12:37 ` Vasant Hegde
2025-03-11 15:24 ` [PATCH 2/6] amd_iommu: Fix Device ID decoding for INVALIDATE_IOTLB_PAGES command Alejandro Jimenez
2025-03-17 12:40 ` Vasant Hegde
2025-03-11 15:24 ` [PATCH 3/6] amd_iommu: Update bitmasks representing DTE reserved fields Alejandro Jimenez
2025-03-12 4:12 ` Arun Kodilkar, Sairaj
2025-03-13 14:23 ` Alejandro Jimenez
2025-03-16 9:34 ` Arun Kodilkar, Sairaj
2025-03-17 12:36 ` Vasant Hegde
2025-03-17 12:34 ` Vasant Hegde
2025-03-11 15:24 ` [PATCH 4/6] amd_iommu: Fix masks for Device Table Address Register Alejandro Jimenez
2025-03-12 5:32 ` Arun Kodilkar, Sairaj
2025-03-17 15:07 ` Vasant Hegde
2025-03-11 15:24 ` [PATCH 5/6] amd_iommu: Fix the calculation for Device Table size Alejandro Jimenez
2025-03-17 13:00 ` Vasant Hegde
2025-03-11 15:24 ` [PATCH 6/6] amd_iommu: Do not assume passthrough translation for devices with DTE[TV]=0 Alejandro Jimenez
2025-03-19 6:06 ` Vasant Hegde
2025-03-19 14:10 ` Alejandro Jimenez
2025-03-20 5:11 ` Arun Kodilkar, Sairaj
2025-03-20 16:56 ` Alejandro Jimenez
2025-03-21 8:37 ` Arun Kodilkar, Sairaj
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).