* [PATCH v3 1/5] iommu/amd: Simplify build_inv_address()
2026-05-14 0:11 [PATCH v3 0/5] Improve the invalidation path in AMD Jason Gunthorpe
@ 2026-05-14 0:11 ` Jason Gunthorpe
2026-05-14 0:11 ` [PATCH v3 2/5] iommu/amd: Pass last in through to build_inv_address() Jason Gunthorpe
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Jason Gunthorpe @ 2026-05-14 0:11 UTC (permalink / raw)
To: iommu, Joerg Roedel, Robin Murphy, Suravee Suthikulpanit,
Will Deacon
Cc: patches, Wei Wang
This function is doing more work than it needs to:
- iommu_num_pages() is pointless, the fls() is going to compute the
required page size already.
- It is easier to understand as sz_lg2, which is 12 if size is 4K,
than msb_diff which is 11 if size is 4K.
- Simplify the control flow to early exit on the out of range cases.
- Use the usual last instead of end to signify an inclusive last
address.
- Use GENMASK to compute the 1's mask.
- Use GENMASK to compute the address mask for the command layout,
not PAGE_MASK.
- Directly reference the spec language that defines the 52 bit
limit.
No functional change intended.
Reviewed-by: Wei Wang <wei.w.wang@hotmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/amd/iommu.c | 49 ++++++++++++++++++---------------------
1 file changed, 23 insertions(+), 26 deletions(-)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 01171361f9bc22..24374c8aa853af 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1268,39 +1268,36 @@ static void build_inv_dte(struct iommu_cmd *cmd, u16 devid)
*/
static inline u64 build_inv_address(u64 address, size_t size)
{
- u64 pages, end, msb_diff;
+ u64 last = address + size - 1;
+ unsigned int sz_lg2;
- pages = iommu_num_pages(address, size, PAGE_SIZE);
-
- if (pages == 1)
- return address & PAGE_MASK;
-
- end = address + size - 1;
+ address &= GENMASK_U64(63, 12);
+ sz_lg2 = fls64(address ^ last);
+ if (sz_lg2 <= 12)
+ return address;
/*
- * msb_diff would hold the index of the most significant bit that
- * flipped between the start and end.
+ * Encode sz_lg2 according to Table 14: Example Page Size Encodings
+ *
+ * See "Note *":
+ * Address bits 51:32 can be used to encode page sizes greater
+ * that 4 Gbytes.
+ * Which we take to mean that the highest page size has bit
+ * [51]=0, [50:12]=1
+ * and that coding happens when sz_lg2 is 52. Fall back to full
+ * invalidation if the size is too big.
*/
- msb_diff = fls64(end ^ address) - 1;
+ if (unlikely(sz_lg2 > 52))
+ return (CMD_INV_IOMMU_ALL_PAGES_ADDRESS & PAGE_MASK) |
+ CMD_INV_IOMMU_PAGES_SIZE_MASK;
/*
- * Bits 63:52 are sign extended. If for some reason bit 51 is different
- * between the start and the end, invalidate everything.
+ * The sz_lg2 calculation with fls() ensures that:
+ * address & BIT(sz_lg2 - 1) == 0
+ * Therefore only the 1's need to be added. 8KB requires no 1's
*/
- if (unlikely(msb_diff > 51)) {
- address = CMD_INV_IOMMU_ALL_PAGES_ADDRESS;
- } else {
- /*
- * The msb-bit must be clear on the address. Just set all the
- * lower bits.
- */
- address |= (1ull << msb_diff) - 1;
- }
-
- /* Clear bits 11:0 */
- address &= PAGE_MASK;
-
- /* Set the size bit - we flush more than one 4kb page */
+ if (sz_lg2 > 13)
+ address |= GENMASK_U64(sz_lg2 - 2, 12);
return address | CMD_INV_IOMMU_PAGES_SIZE_MASK;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH v3 2/5] iommu/amd: Pass last in through to build_inv_address()
2026-05-14 0:11 [PATCH v3 0/5] Improve the invalidation path in AMD Jason Gunthorpe
2026-05-14 0:11 ` [PATCH v3 1/5] iommu/amd: Simplify build_inv_address() Jason Gunthorpe
@ 2026-05-14 0:11 ` Jason Gunthorpe
2026-05-14 0:11 ` [PATCH v3 3/5] iommu/amd: Have amd_iommu_domain_flush_pages() use last Jason Gunthorpe
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Jason Gunthorpe @ 2026-05-14 0:11 UTC (permalink / raw)
To: iommu, Joerg Roedel, Robin Murphy, Suravee Suthikulpanit,
Will Deacon
Cc: patches, Wei Wang
This is the trivial call chain below amd_iommu_domain_flush_pages().
Cases that are doing a full invalidate will pass a last of U64_MAX.
This avoids converting between size and last, and type confusion with
size_t, unsigned long and u64 all being used in different places along
the driver's invalidation path. Consistently use u64 in the internals.
Reviewed-by: Wei Wang <wei.w.wang@hotmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/amd/amd_iommu.h | 2 +-
drivers/iommu/amd/iommu.c | 71 +++++++++++++++++------------------
drivers/iommu/amd/pasid.c | 2 +-
3 files changed, 36 insertions(+), 39 deletions(-)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 1342e764a5486d..831d457c8dad13 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -90,7 +90,7 @@ void amd_iommu_flush_all_caches(struct amd_iommu *iommu);
void amd_iommu_domain_flush_pages(struct protection_domain *domain,
u64 address, size_t size);
void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
- ioasid_t pasid, u64 address, size_t size);
+ ioasid_t pasid, u64 start, u64 last);
#ifdef CONFIG_IRQ_REMAP
int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 24374c8aa853af..52b91831c73e67 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1266,9 +1266,8 @@ static void build_inv_dte(struct iommu_cmd *cmd, u16 devid)
* Builds an invalidation address which is suitable for one page or multiple
* pages. Sets the size bit (S) as needed is more than one page is flushed.
*/
-static inline u64 build_inv_address(u64 address, size_t size)
+static inline u64 build_inv_address(u64 address, u64 last)
{
- u64 last = address + size - 1;
unsigned int sz_lg2;
address &= GENMASK_U64(63, 12);
@@ -1301,11 +1300,10 @@ static inline u64 build_inv_address(u64 address, size_t size)
return address | CMD_INV_IOMMU_PAGES_SIZE_MASK;
}
-static void build_inv_iommu_pages(struct iommu_cmd *cmd, u64 address,
- size_t size, u16 domid,
- ioasid_t pasid, bool gn)
+static void build_inv_iommu_pages(struct iommu_cmd *cmd, u64 start, u64 last,
+ u16 domid, ioasid_t pasid, bool gn)
{
- u64 inv_address = build_inv_address(address, size);
+ u64 inv_address = build_inv_address(start, last);
memset(cmd, 0, sizeof(*cmd));
@@ -1322,10 +1320,9 @@ static void build_inv_iommu_pages(struct iommu_cmd *cmd, u64 address,
}
static void build_inv_iotlb_pages(struct iommu_cmd *cmd, u16 devid, int qdep,
- u64 address, size_t size,
- ioasid_t pasid, bool gn)
+ u64 start, u64 last, ioasid_t pasid, bool gn)
{
- u64 inv_address = build_inv_address(address, size);
+ u64 inv_address = build_inv_address(start, last);
memset(cmd, 0, sizeof(*cmd));
@@ -1523,7 +1520,7 @@ static void amd_iommu_flush_tlb_all(struct amd_iommu *iommu)
for (dom_id = 0; dom_id <= last_bdf; ++dom_id) {
struct iommu_cmd cmd;
- build_inv_iommu_pages(&cmd, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS,
+ build_inv_iommu_pages(&cmd, 0, U64_MAX,
dom_id, IOMMU_NO_PASID, false);
iommu_queue_command(iommu, &cmd);
}
@@ -1535,14 +1532,15 @@ static void amd_iommu_flush_tlb_domid(struct amd_iommu *iommu, u32 dom_id)
{
struct iommu_cmd cmd;
- build_inv_iommu_pages(&cmd, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS,
+ build_inv_iommu_pages(&cmd, 0, U64_MAX,
dom_id, IOMMU_NO_PASID, false);
iommu_queue_command(iommu, &cmd);
iommu_completion_wait(iommu);
}
-static int iommu_flush_pages_v1_hdom_ids(struct protection_domain *pdom, u64 address, size_t size)
+static int iommu_flush_pages_v1_hdom_ids(struct protection_domain *pdom,
+ u64 start, u64 last)
{
int ret = 0;
struct amd_iommu_viommu *aviommu;
@@ -1559,7 +1557,7 @@ static int iommu_flush_pages_v1_hdom_ids(struct protection_domain *pdom, u64 add
pr_debug("%s: iommu=%#x, hdom_id=%#x\n", __func__,
iommu->devid, gdom_info->hdom_id);
- build_inv_iommu_pages(&cmd, address, size, gdom_info->hdom_id,
+ build_inv_iommu_pages(&cmd, start, last, gdom_info->hdom_id,
IOMMU_NO_PASID, false);
ret |= iommu_queue_command(iommu, &cmd);
}
@@ -1615,15 +1613,15 @@ void amd_iommu_flush_all_caches(struct amd_iommu *iommu)
/*
* Command send function for flushing on-device TLB
*/
-static int device_flush_iotlb(struct iommu_dev_data *dev_data, u64 address,
- size_t size, ioasid_t pasid, bool gn)
+static int device_flush_iotlb(struct iommu_dev_data *dev_data, u64 start,
+ u64 last, ioasid_t pasid, bool gn)
{
struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
struct iommu_cmd cmd;
int qdep = dev_data->ats_qdep;
- build_inv_iotlb_pages(&cmd, dev_data->devid, qdep, address,
- size, pasid, gn);
+ build_inv_iotlb_pages(&cmd, dev_data->devid, qdep, start,
+ last, pasid, gn);
return iommu_queue_command(iommu, &cmd);
}
@@ -1667,15 +1665,15 @@ static int device_flush_dte(struct iommu_dev_data *dev_data)
if (dev_data->ats_enabled) {
/* Invalidate the entire contents of an IOTLB */
- ret = device_flush_iotlb(dev_data, 0, ~0UL,
+ ret = device_flush_iotlb(dev_data, 0, U64_MAX,
IOMMU_NO_PASID, false);
}
return ret;
}
-static int domain_flush_pages_v2(struct protection_domain *pdom,
- u64 address, size_t size)
+static int domain_flush_pages_v2(struct protection_domain *pdom, u64 start,
+ u64 last)
{
struct iommu_dev_data *dev_data;
struct iommu_cmd cmd;
@@ -1686,7 +1684,7 @@ static int domain_flush_pages_v2(struct protection_domain *pdom,
struct amd_iommu *iommu = get_amd_iommu_from_dev(dev_data->dev);
u16 domid = dev_data->gcr3_info.domid;
- build_inv_iommu_pages(&cmd, address, size,
+ build_inv_iommu_pages(&cmd, start, last,
domid, IOMMU_NO_PASID, true);
ret |= iommu_queue_command(iommu, &cmd);
@@ -1695,8 +1693,8 @@ static int domain_flush_pages_v2(struct protection_domain *pdom,
return ret;
}
-static int domain_flush_pages_v1(struct protection_domain *pdom,
- u64 address, size_t size)
+static int domain_flush_pages_v1(struct protection_domain *pdom, u64 start,
+ u64 last)
{
struct pdom_iommu_info *pdom_iommu_info;
struct iommu_cmd cmd;
@@ -1705,7 +1703,7 @@ static int domain_flush_pages_v1(struct protection_domain *pdom,
lockdep_assert_held(&pdom->lock);
- build_inv_iommu_pages(&cmd, address, size,
+ build_inv_iommu_pages(&cmd, start, last,
pdom->id, IOMMU_NO_PASID, false);
xa_for_each(&pdom->iommu_array, i, pdom_iommu_info) {
@@ -1725,7 +1723,7 @@ static int domain_flush_pages_v1(struct protection_domain *pdom,
* See drivers/iommu/amd/nested.c: amd_iommu_alloc_domain_nested()
*/
if (!list_empty(&pdom->viommu_list))
- ret |= iommu_flush_pages_v1_hdom_ids(pdom, address, size);
+ ret |= iommu_flush_pages_v1_hdom_ids(pdom, start, last);
return ret;
}
@@ -1734,8 +1732,8 @@ static int domain_flush_pages_v1(struct protection_domain *pdom,
* TLB invalidation function which is called from the mapping functions.
* It flushes range of PTEs of the domain.
*/
-static void __domain_flush_pages(struct protection_domain *domain,
- u64 address, size_t size)
+static void __domain_flush_pages(struct protection_domain *domain, u64 start,
+ u64 last)
{
struct iommu_dev_data *dev_data;
int ret = 0;
@@ -1746,9 +1744,9 @@ static void __domain_flush_pages(struct protection_domain *domain,
if (pdom_is_v2_pgtbl_mode(domain)) {
gn = true;
- ret = domain_flush_pages_v2(domain, address, size);
+ ret = domain_flush_pages_v2(domain, start, last);
} else {
- ret = domain_flush_pages_v1(domain, address, size);
+ ret = domain_flush_pages_v1(domain, start, last);
}
list_for_each_entry(dev_data, &domain->dev_list, list) {
@@ -1756,7 +1754,7 @@ static void __domain_flush_pages(struct protection_domain *domain,
if (!dev_data->ats_enabled)
continue;
- ret |= device_flush_iotlb(dev_data, address, size, pasid, gn);
+ ret |= device_flush_iotlb(dev_data, start, last, pasid, gn);
}
WARN_ON(ret);
@@ -1768,7 +1766,7 @@ void amd_iommu_domain_flush_pages(struct protection_domain *domain,
lockdep_assert_held(&domain->lock);
if (likely(!amd_iommu_np_cache)) {
- __domain_flush_pages(domain, address, size);
+ __domain_flush_pages(domain, address, address + size - 1);
/* Wait until IOMMU TLB and all device IOTLB flushes are complete */
domain_flush_complete(domain);
@@ -1805,7 +1803,7 @@ void amd_iommu_domain_flush_pages(struct protection_domain *domain,
flush_size = 1ul << min_alignment;
- __domain_flush_pages(domain, address, flush_size);
+ __domain_flush_pages(domain, address, address + flush_size - 1);
address += flush_size;
size -= flush_size;
}
@@ -1822,17 +1820,17 @@ static void amd_iommu_domain_flush_all(struct protection_domain *domain)
}
void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
- ioasid_t pasid, u64 address, size_t size)
+ ioasid_t pasid, u64 start, u64 last)
{
struct iommu_cmd cmd;
struct amd_iommu *iommu = get_amd_iommu_from_dev(dev_data->dev);
- build_inv_iommu_pages(&cmd, address, size,
+ build_inv_iommu_pages(&cmd, start, last,
dev_data->gcr3_info.domid, pasid, true);
iommu_queue_command(iommu, &cmd);
if (dev_data->ats_enabled)
- device_flush_iotlb(dev_data, address, size, pasid, true);
+ device_flush_iotlb(dev_data, start, last, pasid, true);
iommu_completion_wait(iommu);
}
@@ -1840,8 +1838,7 @@ void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
static void dev_flush_pasid_all(struct iommu_dev_data *dev_data,
ioasid_t pasid)
{
- amd_iommu_dev_flush_pasid_pages(dev_data, pasid, 0,
- CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
+ amd_iommu_dev_flush_pasid_pages(dev_data, pasid, 0, U64_MAX);
}
int amd_iommu_complete_ppr(struct device *dev, u32 pasid, int status, int tag)
diff --git a/drivers/iommu/amd/pasid.c b/drivers/iommu/amd/pasid.c
index 67eace9809f16a..d708c6532480a4 100644
--- a/drivers/iommu/amd/pasid.c
+++ b/drivers/iommu/amd/pasid.c
@@ -71,7 +71,7 @@ static void sva_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
for_each_pdom_dev_data(pdom_dev_data, sva_pdom) {
amd_iommu_dev_flush_pasid_pages(pdom_dev_data->dev_data,
pdom_dev_data->pasid,
- start, end - start);
+ start, end - 1);
}
spin_unlock_irqrestore(&sva_pdom->lock, flags);
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH v3 3/5] iommu/amd: Have amd_iommu_domain_flush_pages() use last
2026-05-14 0:11 [PATCH v3 0/5] Improve the invalidation path in AMD Jason Gunthorpe
2026-05-14 0:11 ` [PATCH v3 1/5] iommu/amd: Simplify build_inv_address() Jason Gunthorpe
2026-05-14 0:11 ` [PATCH v3 2/5] iommu/amd: Pass last in through to build_inv_address() Jason Gunthorpe
@ 2026-05-14 0:11 ` Jason Gunthorpe
2026-05-14 0:11 ` [PATCH v3 4/5] iommu/amd: Make CMD_INV_IOMMU_ALL_PAGES_ADDRESS match the spec Jason Gunthorpe
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Jason Gunthorpe @ 2026-05-14 0:11 UTC (permalink / raw)
To: iommu, Joerg Roedel, Robin Murphy, Suravee Suthikulpanit,
Will Deacon
Cc: patches, Wei Wang
Finish clearing out the size/last/end switching by converting
amd_iommu_domain_flush_pages() to use last-based logic.
This algorithm is simpler than the previous. Ultimately all this wants
to do is select powers of two that are aligned to address and not
longer than the distance to last.
The new version is fully safe for size = U64_MAX and last = U64_MAX.
Finally, the gather can be passed through natively without risking an
overflow in (gather->end - gather->start + 1).
Reviewed-by: Wei Wang <wei.w.wang@hotmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/amd/amd_iommu.h | 2 +-
drivers/iommu/amd/iommu.c | 44 +++++++++++++----------------------
2 files changed, 17 insertions(+), 29 deletions(-)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 831d457c8dad13..92410b9cf544a9 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -88,7 +88,7 @@ int amd_iommu_complete_ppr(struct device *dev, u32 pasid, int status, int tag);
*/
void amd_iommu_flush_all_caches(struct amd_iommu *iommu);
void amd_iommu_domain_flush_pages(struct protection_domain *domain,
- u64 address, size_t size);
+ u64 address, u64 last);
void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
ioasid_t pasid, u64 start, u64 last);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 52b91831c73e67..9a541330f599ed 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1761,12 +1761,13 @@ static void __domain_flush_pages(struct protection_domain *domain, u64 start,
}
void amd_iommu_domain_flush_pages(struct protection_domain *domain,
- u64 address, size_t size)
+ u64 address, u64 last)
{
lockdep_assert_held(&domain->lock);
- if (likely(!amd_iommu_np_cache)) {
- __domain_flush_pages(domain, address, address + size - 1);
+ if (likely(!amd_iommu_np_cache) ||
+ unlikely(address == 0 && last == U64_MAX)) {
+ __domain_flush_pages(domain, address, last);
/* Wait until IOMMU TLB and all device IOTLB flushes are complete */
domain_flush_complete(domain);
@@ -1784,28 +1785,17 @@ void amd_iommu_domain_flush_pages(struct protection_domain *domain,
* between the natural alignment of the address that we flush and the
* greatest naturally aligned region that fits in the range.
*/
- while (size != 0) {
- int addr_alignment = __ffs(address);
- int size_alignment = __fls(size);
- int min_alignment;
- size_t flush_size;
+ while (address <= last) {
+ unsigned int sz_lg2 = ilog2(last - address + 1);
+ u64 flush_last;
- /*
- * size is always non-zero, but address might be zero, causing
- * addr_alignment to be negative. As the casting of the
- * argument in __ffs(address) to long might trim the high bits
- * of the address on x86-32, cast to long when doing the check.
- */
- if (likely((unsigned long)address != 0))
- min_alignment = min(addr_alignment, size_alignment);
- else
- min_alignment = size_alignment;
+ if (likely(address))
+ sz_lg2 = min_t(unsigned int, sz_lg2, __ffs64(address));
- flush_size = 1ul << min_alignment;
-
- __domain_flush_pages(domain, address, address + flush_size - 1);
- address += flush_size;
- size -= flush_size;
+ flush_last = address + (1ULL << sz_lg2) - 1;
+ __domain_flush_pages(domain, address, flush_last);
+ if (check_add_overflow(flush_last, 1, &address))
+ break;
}
/* Wait until IOMMU TLB and all device IOTLB flushes are complete */
@@ -1815,8 +1805,7 @@ void amd_iommu_domain_flush_pages(struct protection_domain *domain,
/* Flush the whole IO/TLB for a given protection domain - including PDE */
static void amd_iommu_domain_flush_all(struct protection_domain *domain)
{
- amd_iommu_domain_flush_pages(domain, 0,
- CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
+ amd_iommu_domain_flush_pages(domain, 0, U64_MAX);
}
void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
@@ -2616,7 +2605,7 @@ static int amd_iommu_iotlb_sync_map(struct iommu_domain *dom,
return 0;
spin_lock_irqsave(&domain->lock, flags);
- amd_iommu_domain_flush_pages(domain, iova, size);
+ amd_iommu_domain_flush_pages(domain, iova, iova + size - 1);
spin_unlock_irqrestore(&domain->lock, flags);
return 0;
}
@@ -2638,8 +2627,7 @@ static void amd_iommu_iotlb_sync(struct iommu_domain *domain,
unsigned long flags;
spin_lock_irqsave(&dom->lock, flags);
- amd_iommu_domain_flush_pages(dom, gather->start,
- gather->end - gather->start + 1);
+ amd_iommu_domain_flush_pages(dom, gather->start, gather->end);
spin_unlock_irqrestore(&dom->lock, flags);
iommu_put_pages_list(&gather->freelist);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH v3 4/5] iommu/amd: Make CMD_INV_IOMMU_ALL_PAGES_ADDRESS match the spec
2026-05-14 0:11 [PATCH v3 0/5] Improve the invalidation path in AMD Jason Gunthorpe
` (2 preceding siblings ...)
2026-05-14 0:11 ` [PATCH v3 3/5] iommu/amd: Have amd_iommu_domain_flush_pages() use last Jason Gunthorpe
@ 2026-05-14 0:11 ` Jason Gunthorpe
2026-05-14 0:11 ` [PATCH v3 5/5] iommu/amd: Control INVALIDATE_IOMMU_PAGES PDE from the gather Jason Gunthorpe
2026-05-28 7:42 ` [PATCH v3 0/5] Improve the invalidation path in AMD Joerg Roedel
5 siblings, 0 replies; 7+ messages in thread
From: Jason Gunthorpe @ 2026-05-14 0:11 UTC (permalink / raw)
To: iommu, Joerg Roedel, Robin Murphy, Suravee Suthikulpanit,
Will Deacon
Cc: patches, Wei Wang
The spec in Table 14 defines the "Entire Cache" case as having the low
12 bits as zero. Indeed the command format doesn't even have the low
12 bits. Since there is only one user now, fix the constant to have 0
in the low 12 bits instead of 1 and remove the masking.
Reviewed-by: Wei Wang <wei.w.wang@hotmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/amd/amd_iommu_types.h | 2 +-
drivers/iommu/amd/iommu.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index c685d3771436a2..c239c5fbe47461 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -221,7 +221,7 @@
#define PPR_STATUS_MASK 0xf
#define PPR_STATUS_SHIFT 12
-#define CMD_INV_IOMMU_ALL_PAGES_ADDRESS 0x7fffffffffffffffULL
+#define CMD_INV_IOMMU_ALL_PAGES_ADDRESS 0x7ffffffffffff000ULL
/* macros and definitions for device table entries */
#define DEV_ENTRY_VALID 0x00
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 9a541330f599ed..38cfeba8313b43 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1287,7 +1287,7 @@ static inline u64 build_inv_address(u64 address, u64 last)
* invalidation if the size is too big.
*/
if (unlikely(sz_lg2 > 52))
- return (CMD_INV_IOMMU_ALL_PAGES_ADDRESS & PAGE_MASK) |
+ return CMD_INV_IOMMU_ALL_PAGES_ADDRESS |
CMD_INV_IOMMU_PAGES_SIZE_MASK;
/*
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH v3 5/5] iommu/amd: Control INVALIDATE_IOMMU_PAGES PDE from the gather
2026-05-14 0:11 [PATCH v3 0/5] Improve the invalidation path in AMD Jason Gunthorpe
` (3 preceding siblings ...)
2026-05-14 0:11 ` [PATCH v3 4/5] iommu/amd: Make CMD_INV_IOMMU_ALL_PAGES_ADDRESS match the spec Jason Gunthorpe
@ 2026-05-14 0:11 ` Jason Gunthorpe
2026-05-28 7:42 ` [PATCH v3 0/5] Improve the invalidation path in AMD Joerg Roedel
5 siblings, 0 replies; 7+ messages in thread
From: Jason Gunthorpe @ 2026-05-14 0:11 UTC (permalink / raw)
To: iommu, Joerg Roedel, Robin Murphy, Suravee Suthikulpanit,
Will Deacon
Cc: patches, Wei Wang
Now that AMD uses iommupt, it is easy to make use of the PDE bit. If
the gather has no free list then no page directory entries were
changed.
Pass GN/PDE through the invalidation call chain in a u32 flags field
that is OR'd into data[2] and set it properly from the gather.
Reviewed-by: Wei Wang <wei.w.wang@hotmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/amd/amd_iommu.h | 2 +-
drivers/iommu/amd/iommu.c | 78 +++++++++++++++++++----------------
2 files changed, 44 insertions(+), 36 deletions(-)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 92410b9cf544a9..1a6d9cae7b23b9 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -88,7 +88,7 @@ int amd_iommu_complete_ppr(struct device *dev, u32 pasid, int status, int tag);
*/
void amd_iommu_flush_all_caches(struct amd_iommu *iommu);
void amd_iommu_domain_flush_pages(struct protection_domain *domain,
- u64 address, u64 last);
+ u64 address, u64 last, u32 flags);
void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
ioasid_t pasid, u64 start, u64 last);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 38cfeba8313b43..c546fb6dd42eda 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1300,8 +1300,9 @@ static inline u64 build_inv_address(u64 address, u64 last)
return address | CMD_INV_IOMMU_PAGES_SIZE_MASK;
}
-static void build_inv_iommu_pages(struct iommu_cmd *cmd, u64 start, u64 last,
- u16 domid, ioasid_t pasid, bool gn)
+static void build_inv_iommu_pages(struct iommu_cmd *cmd, u64 start,
+ u64 last, u16 domid, ioasid_t pasid,
+ u32 flags)
{
u64 inv_address = build_inv_address(start, last);
@@ -1310,19 +1311,17 @@ static void build_inv_iommu_pages(struct iommu_cmd *cmd, u64 start, u64 last,
cmd->data[1] |= domid;
cmd->data[2] = lower_32_bits(inv_address);
cmd->data[3] = upper_32_bits(inv_address);
- /* PDE bit - we want to flush everything, not only the PTEs */
- cmd->data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK;
- if (gn) {
+ cmd->data[2] |= flags;
+ if (flags & CMD_INV_IOMMU_PAGES_GN_MASK)
cmd->data[0] |= pasid;
- cmd->data[2] |= CMD_INV_IOMMU_PAGES_GN_MASK;
- }
CMD_SET_TYPE(cmd, CMD_INV_IOMMU_PAGES);
}
static void build_inv_iotlb_pages(struct iommu_cmd *cmd, u16 devid, int qdep,
- u64 start, u64 last, ioasid_t pasid, bool gn)
+ u64 address, u64 last,
+ ioasid_t pasid, bool gn)
{
- u64 inv_address = build_inv_address(start, last);
+ u64 inv_address = build_inv_address(address, last);
memset(cmd, 0, sizeof(*cmd));
@@ -1521,7 +1520,8 @@ static void amd_iommu_flush_tlb_all(struct amd_iommu *iommu)
for (dom_id = 0; dom_id <= last_bdf; ++dom_id) {
struct iommu_cmd cmd;
build_inv_iommu_pages(&cmd, 0, U64_MAX,
- dom_id, IOMMU_NO_PASID, false);
+ dom_id, IOMMU_NO_PASID,
+ CMD_INV_IOMMU_PAGES_PDE_MASK);
iommu_queue_command(iommu, &cmd);
}
@@ -1533,14 +1533,15 @@ static void amd_iommu_flush_tlb_domid(struct amd_iommu *iommu, u32 dom_id)
struct iommu_cmd cmd;
build_inv_iommu_pages(&cmd, 0, U64_MAX,
- dom_id, IOMMU_NO_PASID, false);
+ dom_id, IOMMU_NO_PASID,
+ CMD_INV_IOMMU_PAGES_PDE_MASK);
iommu_queue_command(iommu, &cmd);
iommu_completion_wait(iommu);
}
static int iommu_flush_pages_v1_hdom_ids(struct protection_domain *pdom,
- u64 start, u64 last)
+ u64 start, u64 last, u32 flags)
{
int ret = 0;
struct amd_iommu_viommu *aviommu;
@@ -1558,7 +1559,7 @@ static int iommu_flush_pages_v1_hdom_ids(struct protection_domain *pdom,
pr_debug("%s: iommu=%#x, hdom_id=%#x\n", __func__,
iommu->devid, gdom_info->hdom_id);
build_inv_iommu_pages(&cmd, start, last, gdom_info->hdom_id,
- IOMMU_NO_PASID, false);
+ IOMMU_NO_PASID, flags);
ret |= iommu_queue_command(iommu, &cmd);
}
xa_unlock(&aviommu->gdomid_array);
@@ -1613,14 +1614,14 @@ void amd_iommu_flush_all_caches(struct amd_iommu *iommu)
/*
* Command send function for flushing on-device TLB
*/
-static int device_flush_iotlb(struct iommu_dev_data *dev_data, u64 start,
+static int device_flush_iotlb(struct iommu_dev_data *dev_data, u64 address,
u64 last, ioasid_t pasid, bool gn)
{
struct amd_iommu *iommu = get_amd_iommu_from_dev_data(dev_data);
struct iommu_cmd cmd;
int qdep = dev_data->ats_qdep;
- build_inv_iotlb_pages(&cmd, dev_data->devid, qdep, start,
+ build_inv_iotlb_pages(&cmd, dev_data->devid, qdep, address,
last, pasid, gn);
return iommu_queue_command(iommu, &cmd);
@@ -1672,8 +1673,8 @@ static int device_flush_dte(struct iommu_dev_data *dev_data)
return ret;
}
-static int domain_flush_pages_v2(struct protection_domain *pdom, u64 start,
- u64 last)
+static int domain_flush_pages_v2(struct protection_domain *pdom,
+ u64 start, u64 last, u32 flags)
{
struct iommu_dev_data *dev_data;
struct iommu_cmd cmd;
@@ -1684,8 +1685,9 @@ static int domain_flush_pages_v2(struct protection_domain *pdom, u64 start,
struct amd_iommu *iommu = get_amd_iommu_from_dev(dev_data->dev);
u16 domid = dev_data->gcr3_info.domid;
- build_inv_iommu_pages(&cmd, start, last,
- domid, IOMMU_NO_PASID, true);
+ build_inv_iommu_pages(&cmd, start, last, domid,
+ IOMMU_NO_PASID,
+ flags | CMD_INV_IOMMU_PAGES_GN_MASK);
ret |= iommu_queue_command(iommu, &cmd);
}
@@ -1693,8 +1695,8 @@ static int domain_flush_pages_v2(struct protection_domain *pdom, u64 start,
return ret;
}
-static int domain_flush_pages_v1(struct protection_domain *pdom, u64 start,
- u64 last)
+static int domain_flush_pages_v1(struct protection_domain *pdom,
+ u64 start, u64 last, u32 flags)
{
struct pdom_iommu_info *pdom_iommu_info;
struct iommu_cmd cmd;
@@ -1704,7 +1706,7 @@ static int domain_flush_pages_v1(struct protection_domain *pdom, u64 start,
lockdep_assert_held(&pdom->lock);
build_inv_iommu_pages(&cmd, start, last,
- pdom->id, IOMMU_NO_PASID, false);
+ pdom->id, IOMMU_NO_PASID, flags);
xa_for_each(&pdom->iommu_array, i, pdom_iommu_info) {
/*
@@ -1723,7 +1725,7 @@ static int domain_flush_pages_v1(struct protection_domain *pdom, u64 start,
* See drivers/iommu/amd/nested.c: amd_iommu_alloc_domain_nested()
*/
if (!list_empty(&pdom->viommu_list))
- ret |= iommu_flush_pages_v1_hdom_ids(pdom, start, last);
+ ret |= iommu_flush_pages_v1_hdom_ids(pdom, start, last, flags);
return ret;
}
@@ -1732,8 +1734,8 @@ static int domain_flush_pages_v1(struct protection_domain *pdom, u64 start,
* TLB invalidation function which is called from the mapping functions.
* It flushes range of PTEs of the domain.
*/
-static void __domain_flush_pages(struct protection_domain *domain, u64 start,
- u64 last)
+static void __domain_flush_pages(struct protection_domain *domain,
+ u64 address, u64 last, u32 flags)
{
struct iommu_dev_data *dev_data;
int ret = 0;
@@ -1744,9 +1746,9 @@ static void __domain_flush_pages(struct protection_domain *domain, u64 start,
if (pdom_is_v2_pgtbl_mode(domain)) {
gn = true;
- ret = domain_flush_pages_v2(domain, start, last);
+ ret = domain_flush_pages_v2(domain, address, last, flags);
} else {
- ret = domain_flush_pages_v1(domain, start, last);
+ ret = domain_flush_pages_v1(domain, address, last, flags);
}
list_for_each_entry(dev_data, &domain->dev_list, list) {
@@ -1754,20 +1756,20 @@ static void __domain_flush_pages(struct protection_domain *domain, u64 start,
if (!dev_data->ats_enabled)
continue;
- ret |= device_flush_iotlb(dev_data, start, last, pasid, gn);
+ ret |= device_flush_iotlb(dev_data, address, last, pasid, gn);
}
WARN_ON(ret);
}
void amd_iommu_domain_flush_pages(struct protection_domain *domain,
- u64 address, u64 last)
+ u64 address, u64 last, u32 flags)
{
lockdep_assert_held(&domain->lock);
if (likely(!amd_iommu_np_cache) ||
unlikely(address == 0 && last == U64_MAX)) {
- __domain_flush_pages(domain, address, last);
+ __domain_flush_pages(domain, address, last, flags);
/* Wait until IOMMU TLB and all device IOTLB flushes are complete */
domain_flush_complete(domain);
@@ -1793,7 +1795,7 @@ void amd_iommu_domain_flush_pages(struct protection_domain *domain,
sz_lg2 = min_t(unsigned int, sz_lg2, __ffs64(address));
flush_last = address + (1ULL << sz_lg2) - 1;
- __domain_flush_pages(domain, address, flush_last);
+ __domain_flush_pages(domain, address, flush_last, flags);
if (check_add_overflow(flush_last, 1, &address))
break;
}
@@ -1805,7 +1807,8 @@ void amd_iommu_domain_flush_pages(struct protection_domain *domain,
/* Flush the whole IO/TLB for a given protection domain - including PDE */
static void amd_iommu_domain_flush_all(struct protection_domain *domain)
{
- amd_iommu_domain_flush_pages(domain, 0, U64_MAX);
+ amd_iommu_domain_flush_pages(domain, 0, U64_MAX,
+ CMD_INV_IOMMU_PAGES_PDE_MASK);
}
void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
@@ -1815,7 +1818,9 @@ void amd_iommu_dev_flush_pasid_pages(struct iommu_dev_data *dev_data,
struct amd_iommu *iommu = get_amd_iommu_from_dev(dev_data->dev);
build_inv_iommu_pages(&cmd, start, last,
- dev_data->gcr3_info.domid, pasid, true);
+ dev_data->gcr3_info.domid, pasid,
+ CMD_INV_IOMMU_PAGES_GN_MASK |
+ CMD_INV_IOMMU_PAGES_PDE_MASK);
iommu_queue_command(iommu, &cmd);
if (dev_data->ats_enabled)
@@ -2605,7 +2610,8 @@ static int amd_iommu_iotlb_sync_map(struct iommu_domain *dom,
return 0;
spin_lock_irqsave(&domain->lock, flags);
- amd_iommu_domain_flush_pages(domain, iova, iova + size - 1);
+ amd_iommu_domain_flush_pages(domain, iova, iova + size - 1,
+ CMD_INV_IOMMU_PAGES_PDE_MASK);
spin_unlock_irqrestore(&domain->lock, flags);
return 0;
}
@@ -2627,7 +2633,9 @@ static void amd_iommu_iotlb_sync(struct iommu_domain *domain,
unsigned long flags;
spin_lock_irqsave(&dom->lock, flags);
- amd_iommu_domain_flush_pages(dom, gather->start, gather->end);
+ amd_iommu_domain_flush_pages(dom, gather->start, gather->end,
+ iommu_pages_list_empty(&gather->freelist) ?
+ 0 : CMD_INV_IOMMU_PAGES_PDE_MASK);
spin_unlock_irqrestore(&dom->lock, flags);
iommu_put_pages_list(&gather->freelist);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH v3 0/5] Improve the invalidation path in AMD
2026-05-14 0:11 [PATCH v3 0/5] Improve the invalidation path in AMD Jason Gunthorpe
` (4 preceding siblings ...)
2026-05-14 0:11 ` [PATCH v3 5/5] iommu/amd: Control INVALIDATE_IOMMU_PAGES PDE from the gather Jason Gunthorpe
@ 2026-05-28 7:42 ` Joerg Roedel
5 siblings, 0 replies; 7+ messages in thread
From: Joerg Roedel @ 2026-05-28 7:42 UTC (permalink / raw)
To: Jason Gunthorpe, Vasant Hegde
Cc: iommu, Robin Murphy, Suravee Suthikulpanit, Will Deacon, patches,
Wei Wang
On Wed, May 13, 2026 at 09:11:15PM -0300, Jason Gunthorpe wrote:
> Jason Gunthorpe (5):
> iommu/amd: Simplify build_inv_address()
> iommu/amd: Pass last in through to build_inv_address()
> iommu/amd: Have amd_iommu_domain_flush_pages() use last
> iommu/amd: Make CMD_INV_IOMMU_ALL_PAGES_ADDRESS match the spec
> iommu/amd: Control INVALIDATE_IOMMU_PAGES PDE from the gather
>
> drivers/iommu/amd/amd_iommu.h | 4 +-
> drivers/iommu/amd/amd_iommu_types.h | 2 +-
> drivers/iommu/amd/iommu.c | 176 +++++++++++++---------------
> drivers/iommu/amd/pasid.c | 2 +-
> 4 files changed, 87 insertions(+), 97 deletions(-)
Vasant, can you please help reviewing these changes?
Thanks,
Joerg
^ permalink raw reply [flat|nested] 7+ messages in thread