* [PATCH v2 0/9] Improve TLB invalidation logic
@ 2023-11-22  9:02 Vasant Hegde
  2023-11-22  9:02 ` [PATCH v2 1/9] iommu/amd: Rename iommu_flush_all_caches() -> amd_iommu_flush_all_caches() Vasant Hegde
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: Vasant Hegde @ 2023-11-22  9:02 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, jgg, Vasant Hegde

The current code invalidates either a single page or the entire range for
a given domain ID. The IOMMU hardware supports invalidating multiple pages.
This series adds support for invalidating a range of pages and also
consolidates the various invalidation functions.

Note that this series doesn't add support for PASID-based invalidation.
We will include the necessary changes in the SVA series.

Testing:
  - This series was tested with both v1 and v2 page tables and works fine.

This patch series is based on top of upstream v6.7-rc2.

This is also available on GitHub:
   https://github.com/AMDESE/linux/tree/iommu_flush_improvement_v2_v6.7_rc2

Thanks Jason for reviewing previous versions and providing valuable feedback.

Changes from v1 -> v2:
  - Rebased on top of v6.7-rc2
  - Removed PASID-related helper functions based on review comments.
  - Reworked the domain_flush_pages() code path to detect and handle v2 page
    table flushes.
  - The reorganized code made it difficult to fix the __set_gcr3() code path,
    which still uses __amd_iommu_flush_tlb(). Since this code path is being
    reworked as part of SVA part3, the __flush_pasid() removal patch is
    dropped from this series and will be added to the SVA part3 series.
  - Dropped the code rearrangement patch, as I think it should be done after
    removing the flush_pasid() related code. It will be added to SVA part3.


v1: https://lore.kernel.org/linux-iommu/20231006101624.5912-1-vasant.hegde@amd.com/T/#t

Thank you,
Vasant

Vasant Hegde (9):
  iommu/amd: Rename iommu_flush_all_caches() -> amd_iommu_flush_all_caches()
  iommu/amd: Remove redundant domain flush from attach_device()
  iommu/amd: Remove redundant passing of PDE bit
  iommu/amd: Add support to invalidate multiple guest pages
  iommu/amd: Refactor IOMMU tlb invalidation code
  iommu/amd: Refactor device iotlb invalidation code
  iommu/amd: Consolidate amd_iommu_domain_flush_complete() call
  iommu/amd: Make domain_flush_pages as global function
  iommu/amd/pgtbl_v2: Invalidate updated page ranges only

 drivers/iommu/amd/amd_iommu.h       |   8 +-
 drivers/iommu/amd/amd_iommu_types.h |   6 -
 drivers/iommu/amd/init.c            |   8 +-
 drivers/iommu/amd/io_pgtable.c      |   5 +-
 drivers/iommu/amd/io_pgtable_v2.c   |  10 +-
 drivers/iommu/amd/iommu.c           | 163 ++++++++++++----------------
 6 files changed, 88 insertions(+), 112 deletions(-)

-- 
2.31.1



* [PATCH v2 1/9] iommu/amd: Rename iommu_flush_all_caches() -> amd_iommu_flush_all_caches()
  2023-11-22  9:02 [PATCH v2 0/9] Improve TLB invalidation logic Vasant Hegde
@ 2023-11-22  9:02 ` Vasant Hegde
  2023-11-22  9:02 ` [PATCH v2 2/9] iommu/amd: Remove redundant domain flush from attach_device() Vasant Hegde
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Vasant Hegde @ 2023-11-22  9:02 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, jgg, Vasant Hegde, Jason Gunthorpe

Rename the function in line with the driver's naming convention.

No functional changes.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/amd_iommu.h       | 5 +++++
 drivers/iommu/amd/amd_iommu_types.h | 6 ------
 drivers/iommu/amd/init.c            | 8 ++++----
 drivers/iommu/amd/iommu.c           | 2 +-
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 86be1edd50ee..234db57cd320 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -53,6 +53,11 @@ int amd_iommu_pdev_enable_cap_pri(struct pci_dev *pdev);
 void amd_iommu_pdev_disable_cap_pri(struct pci_dev *pdev);
 
 int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid, u64 address);
+/*
+ * This function flushes all internal caches of
+ * the IOMMU used by this driver.
+ */
+void amd_iommu_flush_all_caches(struct amd_iommu *iommu);
 void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
 void amd_iommu_domain_update(struct protection_domain *domain);
 void amd_iommu_domain_flush_complete(struct protection_domain *domain);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 90b7d7950a9e..809d74faa1a5 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -902,12 +902,6 @@ extern int amd_iommu_max_glx_val;
 extern u64 amd_iommu_efr;
 extern u64 amd_iommu_efr2;
 
-/*
- * This function flushes all internal caches of
- * the IOMMU used by this driver.
- */
-void iommu_flush_all_caches(struct amd_iommu *iommu);
-
 static inline int get_ioapic_devid(int id)
 {
 	struct devid_map *entry;
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 64bcf3df37ee..c83bd0c2a1c9 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2223,7 +2223,7 @@ static int __init amd_iommu_init_pci(void)
 		init_device_table_dma(pci_seg);
 
 	for_each_iommu(iommu)
-		iommu_flush_all_caches(iommu);
+		amd_iommu_flush_all_caches(iommu);
 
 	print_iommu_info();
 
@@ -2773,7 +2773,7 @@ static void early_enable_iommu(struct amd_iommu *iommu)
 	iommu_enable_xt(iommu);
 	iommu_enable_irtcachedis(iommu);
 	iommu_enable(iommu);
-	iommu_flush_all_caches(iommu);
+	amd_iommu_flush_all_caches(iommu);
 }
 
 /*
@@ -2829,7 +2829,7 @@ static void early_enable_iommus(void)
 			iommu_enable_xt(iommu);
 			iommu_enable_irtcachedis(iommu);
 			iommu_set_device_table(iommu);
-			iommu_flush_all_caches(iommu);
+			amd_iommu_flush_all_caches(iommu);
 		}
 	}
 }
@@ -3293,7 +3293,7 @@ static int __init state_next(void)
 				uninit_device_table_dma(pci_seg);
 
 			for_each_iommu(iommu)
-				iommu_flush_all_caches(iommu);
+				amd_iommu_flush_all_caches(iommu);
 		}
 	}
 	return ret;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index fcc987f5d4ed..6a43ebddaf87 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1392,7 +1392,7 @@ static void amd_iommu_flush_irt_all(struct amd_iommu *iommu)
 	iommu_completion_wait(iommu);
 }
 
-void iommu_flush_all_caches(struct amd_iommu *iommu)
+void amd_iommu_flush_all_caches(struct amd_iommu *iommu)
 {
 	if (check_feature(FEATURE_IA)) {
 		amd_iommu_flush_all(iommu);
-- 
2.31.1



* [PATCH v2 2/9] iommu/amd: Remove redundant domain flush from attach_device()
  2023-11-22  9:02 [PATCH v2 0/9] Improve TLB invalidation logic Vasant Hegde
  2023-11-22  9:02 ` [PATCH v2 1/9] iommu/amd: Rename iommu_flush_all_caches() -> amd_iommu_flush_all_caches() Vasant Hegde
@ 2023-11-22  9:02 ` Vasant Hegde
  2023-11-22  9:02 ` [PATCH v2 3/9] iommu/amd: Remove redundant passing of PDE bit Vasant Hegde
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Vasant Hegde @ 2023-11-22  9:02 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, jgg, Vasant Hegde, Jason Gunthorpe

The domain flush was introduced in the attach_device() path to handle the
kdump scenario. Later, the init code was enhanced to handle the kdump
scenario, where it also takes care of flushing everything, including the
TLB (see early_enable_iommus()).

Hence remove the redundant flush from the attach_device() function.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6a43ebddaf87..2d40bf5f406e 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1896,15 +1896,6 @@ static int attach_device(struct device *dev,
 
 	do_attach(dev_data, domain);
 
-	/*
-	 * We might boot into a crash-kernel here. The crashed kernel
-	 * left the caches in the IOMMU dirty. So we have to flush
-	 * here to evict all dirty stuff.
-	 */
-	amd_iommu_domain_flush_tlb_pde(domain);
-
-	amd_iommu_domain_flush_complete(domain);
-
 out:
 	spin_unlock(&dev_data->lock);
 
-- 
2.31.1



* [PATCH v2 3/9] iommu/amd: Remove redundant passing of PDE bit
  2023-11-22  9:02 [PATCH v2 0/9] Improve TLB invalidation logic Vasant Hegde
  2023-11-22  9:02 ` [PATCH v2 1/9] iommu/amd: Rename iommu_flush_all_caches() -> amd_iommu_flush_all_caches() Vasant Hegde
  2023-11-22  9:02 ` [PATCH v2 2/9] iommu/amd: Remove redundant domain flush from attach_device() Vasant Hegde
@ 2023-11-22  9:02 ` Vasant Hegde
  2023-11-22  9:02 ` [PATCH v2 4/9] iommu/amd: Add support to invalidate multiple guest pages Vasant Hegde
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Vasant Hegde @ 2023-11-22  9:02 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, jgg, Vasant Hegde, Jason Gunthorpe

The current code always sets the PDE bit in the INVALIDATE_IOMMU_PAGES
command. Hence get rid of the 'pde' variable across functions.

We can re-introduce this bit whenever it's needed.

Suggested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2d40bf5f406e..1ad889acf8cb 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1124,7 +1124,7 @@ static inline u64 build_inv_address(u64 address, size_t size)
 }
 
 static void build_inv_iommu_pages(struct iommu_cmd *cmd, u64 address,
-				  size_t size, u16 domid, int pde)
+				  size_t size, u16 domid)
 {
 	u64 inv_address = build_inv_address(address, size);
 
@@ -1133,8 +1133,8 @@ static void build_inv_iommu_pages(struct iommu_cmd *cmd, u64 address,
 	cmd->data[2]  = lower_32_bits(inv_address);
 	cmd->data[3]  = upper_32_bits(inv_address);
 	CMD_SET_TYPE(cmd, CMD_INV_IOMMU_PAGES);
-	if (pde) /* PDE bit - we want to flush everything, not only the PTEs */
-		cmd->data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK;
+	/* PDE bit - we want to flush everything, not only the PTEs */
+	cmd->data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK;
 }
 
 static void build_inv_iotlb_pages(struct iommu_cmd *cmd, u16 devid, int qdep,
@@ -1341,7 +1341,7 @@ static void amd_iommu_flush_tlb_all(struct amd_iommu *iommu)
 	for (dom_id = 0; dom_id <= last_bdf; ++dom_id) {
 		struct iommu_cmd cmd;
 		build_inv_iommu_pages(&cmd, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS,
-				      dom_id, 1);
+				      dom_id);
 		iommu_queue_command(iommu, &cmd);
 	}
 
@@ -1352,8 +1352,7 @@ static void amd_iommu_flush_tlb_domid(struct amd_iommu *iommu, u32 dom_id)
 {
 	struct iommu_cmd cmd;
 
-	build_inv_iommu_pages(&cmd, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS,
-			      dom_id, 1);
+	build_inv_iommu_pages(&cmd, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS, dom_id);
 	iommu_queue_command(iommu, &cmd);
 
 	iommu_completion_wait(iommu);
@@ -1476,13 +1475,13 @@ static int device_flush_dte(struct iommu_dev_data *dev_data)
  * page. Otherwise it flushes the whole TLB of the IOMMU.
  */
 static void __domain_flush_pages(struct protection_domain *domain,
-				 u64 address, size_t size, int pde)
+				 u64 address, size_t size)
 {
 	struct iommu_dev_data *dev_data;
 	struct iommu_cmd cmd;
 	int ret = 0, i;
 
-	build_inv_iommu_pages(&cmd, address, size, domain->id, pde);
+	build_inv_iommu_pages(&cmd, address, size, domain->id);
 
 	for (i = 0; i < amd_iommu_get_num_iommus(); ++i) {
 		if (!domain->dev_iommu[i])
@@ -1507,10 +1506,10 @@ static void __domain_flush_pages(struct protection_domain *domain,
 }
 
 static void domain_flush_pages(struct protection_domain *domain,
-			       u64 address, size_t size, int pde)
+			       u64 address, size_t size)
 {
 	if (likely(!amd_iommu_np_cache)) {
-		__domain_flush_pages(domain, address, size, pde);
+		__domain_flush_pages(domain, address, size);
 		return;
 	}
 
@@ -1543,7 +1542,7 @@ static void domain_flush_pages(struct protection_domain *domain,
 
 		flush_size = 1ul << min_alignment;
 
-		__domain_flush_pages(domain, address, flush_size, pde);
+		__domain_flush_pages(domain, address, flush_size);
 		address += flush_size;
 		size -= flush_size;
 	}
@@ -1552,7 +1551,7 @@ static void domain_flush_pages(struct protection_domain *domain,
 /* Flush the whole IO/TLB for a given protection domain - including PDE */
 void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain)
 {
-	domain_flush_pages(domain, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS, 1);
+	domain_flush_pages(domain, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
 }
 
 void amd_iommu_domain_flush_complete(struct protection_domain *domain)
@@ -1579,7 +1578,7 @@ static void domain_flush_np_cache(struct protection_domain *domain,
 		unsigned long flags;
 
 		spin_lock_irqsave(&domain->lock, flags);
-		domain_flush_pages(domain, iova, size, 1);
+		domain_flush_pages(domain, iova, size);
 		amd_iommu_domain_flush_complete(domain);
 		spin_unlock_irqrestore(&domain->lock, flags);
 	}
@@ -2591,7 +2590,7 @@ static void amd_iommu_iotlb_sync(struct iommu_domain *domain,
 	unsigned long flags;
 
 	spin_lock_irqsave(&dom->lock, flags);
-	domain_flush_pages(dom, gather->start, gather->end - gather->start + 1, 1);
+	domain_flush_pages(dom, gather->start, gather->end - gather->start + 1);
 	amd_iommu_domain_flush_complete(dom);
 	spin_unlock_irqrestore(&dom->lock, flags);
 }
-- 
2.31.1



* [PATCH v2 4/9] iommu/amd: Add support to invalidate multiple guest pages
  2023-11-22  9:02 [PATCH v2 0/9] Improve TLB invalidation logic Vasant Hegde
                   ` (2 preceding siblings ...)
  2023-11-22  9:02 ` [PATCH v2 3/9] iommu/amd: Remove redundant passing of PDE bit Vasant Hegde
@ 2023-11-22  9:02 ` Vasant Hegde
  2023-11-22  9:02 ` [PATCH v2 5/9] iommu/amd: Refactor IOMMU tlb invalidation code Vasant Hegde
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Vasant Hegde @ 2023-11-22  9:02 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, jgg, Vasant Hegde, Jason Gunthorpe

The current interface supports invalidating either a single page or the
entire guest translation information for a single process address space.

The IOMMU CMD_INV_IOMMU_PAGES and CMD_INV_IOTLB_PAGES commands support
invalidating a range of pages. Add support to invalidate multiple pages.

This is a preparatory patch before consolidating the host and guest
invalidation code into a single function. The following patches will
consolidate the TLB invalidation code.
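
For readers unfamiliar with how a range ends up in a single command, here
is a rough userspace sketch of the encoding that build_inv_address() is
relied on to provide. It is not the driver code: the sketch_* names are
hypothetical, and the constant values (4K pages, size bit in bit 0, the
"all pages" address) are assumptions that should be checked against
amd_iommu_types.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SKETCH_PAGE_SHIFT      12
#define SKETCH_PAGE_SIZE       (1ULL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK       (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_INV_ALL_ADDRESS 0x7fffffffffffffffULL
#define SKETCH_INV_SIZE_BIT    0x1ULL

/* Index of the most significant set bit; x must be non-zero. */
static unsigned int sketch_msb_index(uint64_t x)
{
	unsigned int idx = 0;

	assert(x != 0);
	while (x >>= 1)
		idx++;
	return idx;
}

/* Number of 4K pages touched by [address, address + size). */
static uint64_t sketch_num_pages(uint64_t address, uint64_t size)
{
	uint64_t offset = address & (SKETCH_PAGE_SIZE - 1);

	return (offset + size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Encode a range as a single invalidation address. */
static uint64_t sketch_inv_address(uint64_t address, uint64_t size)
{
	uint64_t end;
	unsigned int msb_diff;

	/* A single page needs no size bit. */
	if (sketch_num_pages(address, size) == 1)
		return address & SKETCH_PAGE_MASK;

	end = address + size - 1;

	/* Highest bit that differs between the start and end address. */
	msb_diff = sketch_msb_index(end ^ address);

	if (msb_diff > 51)
		address = SKETCH_INV_ALL_ADDRESS;	/* too wide: flush all */
	else
		address |= (1ULL << msb_diff) - 1;	/* set all lower bits */

	/* Clear bits 11:0, then mark the command as covering a range. */
	address &= SKETCH_PAGE_MASK;
	return address | SKETCH_INV_SIZE_BIT;
}
```

With this encoding, passing PAGE_SIZE as the size yields a plain
single-page address, while passing CMD_INV_IOMMU_ALL_PAGES_ADDRESS as the
size with address 0 selects the "flush everything" form. That is why the
callers in this patch can switch from a bool to a size_t argument.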

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 1ad889acf8cb..68dc19784f4f 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1152,40 +1152,36 @@ static void build_inv_iotlb_pages(struct iommu_cmd *cmd, u16 devid, int qdep,
 }
 
 static void build_inv_iommu_pasid(struct iommu_cmd *cmd, u16 domid, u32 pasid,
-				  u64 address, bool size)
+				  u64 address, size_t size)
 {
-	memset(cmd, 0, sizeof(*cmd));
+	u64 inv_address = build_inv_address(address, size);
 
-	address &= ~(0xfffULL);
+	memset(cmd, 0, sizeof(*cmd));
 
 	cmd->data[0]  = pasid;
 	cmd->data[1]  = domid;
-	cmd->data[2]  = lower_32_bits(address);
-	cmd->data[3]  = upper_32_bits(address);
+	cmd->data[2]  = lower_32_bits(inv_address);
+	cmd->data[3]  = upper_32_bits(inv_address);
 	cmd->data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK;
 	cmd->data[2] |= CMD_INV_IOMMU_PAGES_GN_MASK;
-	if (size)
-		cmd->data[2] |= CMD_INV_IOMMU_PAGES_SIZE_MASK;
 	CMD_SET_TYPE(cmd, CMD_INV_IOMMU_PAGES);
 }
 
 static void build_inv_iotlb_pasid(struct iommu_cmd *cmd, u16 devid, u32 pasid,
-				  int qdep, u64 address, bool size)
+				  int qdep, u64 address, size_t size)
 {
-	memset(cmd, 0, sizeof(*cmd));
+	u64 inv_address = build_inv_address(address, size);
 
-	address &= ~(0xfffULL);
+	memset(cmd, 0, sizeof(*cmd));
 
 	cmd->data[0]  = devid;
 	cmd->data[0] |= ((pasid >> 8) & 0xff) << 16;
 	cmd->data[0] |= (qdep  & 0xff) << 24;
 	cmd->data[1]  = devid;
 	cmd->data[1] |= (pasid & 0xff) << 16;
-	cmd->data[2]  = lower_32_bits(address);
+	cmd->data[2]  = lower_32_bits(inv_address);
 	cmd->data[2] |= CMD_INV_IOMMU_PAGES_GN_MASK;
-	cmd->data[3]  = upper_32_bits(address);
-	if (size)
-		cmd->data[2] |= CMD_INV_IOMMU_PAGES_SIZE_MASK;
+	cmd->data[3]  = upper_32_bits(inv_address);
 	CMD_SET_TYPE(cmd, CMD_INV_IOTLB_PAGES);
 }
 
@@ -2656,7 +2652,7 @@ const struct iommu_ops amd_iommu_ops = {
 };
 
 static int __flush_pasid(struct protection_domain *domain, u32 pasid,
-			 u64 address, bool size)
+			 u64 address, size_t size)
 {
 	struct iommu_dev_data *dev_data;
 	struct iommu_cmd cmd;
@@ -2720,7 +2716,7 @@ static int __flush_pasid(struct protection_domain *domain, u32 pasid,
 static int __amd_iommu_flush_page(struct protection_domain *domain, u32 pasid,
 				  u64 address)
 {
-	return __flush_pasid(domain, pasid, address, false);
+	return __flush_pasid(domain, pasid, address, PAGE_SIZE);
 }
 
 int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid,
@@ -2739,8 +2735,7 @@ int amd_iommu_flush_page(struct iommu_domain *dom, u32 pasid,
 
 static int __amd_iommu_flush_tlb(struct protection_domain *domain, u32 pasid)
 {
-	return __flush_pasid(domain, pasid, CMD_INV_IOMMU_ALL_PAGES_ADDRESS,
-			     true);
+	return __flush_pasid(domain, pasid, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
 }
 
 int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid)
-- 
2.31.1



* [PATCH v2 5/9] iommu/amd: Refactor IOMMU tlb invalidation code
  2023-11-22  9:02 [PATCH v2 0/9] Improve TLB invalidation logic Vasant Hegde
                   ` (3 preceding siblings ...)
  2023-11-22  9:02 ` [PATCH v2 4/9] iommu/amd: Add support to invalidate multiple guest pages Vasant Hegde
@ 2023-11-22  9:02 ` Vasant Hegde
  2023-11-22  9:02 ` [PATCH v2 6/9] iommu/amd: Refactor device iotlb " Vasant Hegde
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Vasant Hegde @ 2023-11-22  9:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, jgg, Vasant Hegde, Kishon Vijay Abraham I,
	Jason Gunthorpe

build_inv_iommu_pages() and build_inv_iommu_pasid() largely duplicate
each other. Hence enhance build_inv_iommu_pages() to invalidate guest
pages as well, and remove build_inv_iommu_pasid().
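
As a rough illustration of the merged builder, the userspace sketch below
mirrors the field layout visible in the diff. The struct, the sketch_*
names and the numeric values for the PDE/GN masks and the command type are
assumptions made for this example; the function also takes an already
encoded invalidation address so that it stays self-contained.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct iommu_cmd: four 32-bit command words. */
struct sketch_cmd {
	uint32_t data[4];
};

/* Assumed values, for illustration only. */
#define SKETCH_PDE_MASK            0x02u
#define SKETCH_GN_MASK             0x04u
#define SKETCH_CMD_INV_IOMMU_PAGES 0x03u

static void sketch_build_inv_iommu_pages(struct sketch_cmd *cmd,
					 uint64_t inv_address, uint16_t domid,
					 uint32_t pasid, bool gn)
{
	memset(cmd, 0, sizeof(*cmd));

	cmd->data[1] |= domid;
	cmd->data[2]  = (uint32_t)inv_address;
	cmd->data[3]  = (uint32_t)(inv_address >> 32);
	/* PDE bit is always set: flush PDEs as well as PTEs. */
	cmd->data[2] |= SKETCH_PDE_MASK;
	if (gn) {
		/* Guest invalidation: add the PASID and the GN bit. */
		cmd->data[0] |= pasid;
		cmd->data[2] |= SKETCH_GN_MASK;
	}
	/* Command type lives in the top nibble of word 1 (assumed). */
	cmd->data[1] |= SKETCH_CMD_INV_IOMMU_PAGES << 28;
}
```

Host callers pass IOMMU_NO_PASID and gn = false, so the only difference
between the host and guest forms is the extra PASID word and the GN bit.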

Suggested-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 36 ++++++++++++++----------------------
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 68dc19784f4f..913875dc8730 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1124,17 +1124,23 @@ static inline u64 build_inv_address(u64 address, size_t size)
 }
 
 static void build_inv_iommu_pages(struct iommu_cmd *cmd, u64 address,
-				  size_t size, u16 domid)
+				  size_t size, u16 domid,
+				  ioasid_t pasid, bool gn)
 {
 	u64 inv_address = build_inv_address(address, size);
 
 	memset(cmd, 0, sizeof(*cmd));
+
 	cmd->data[1] |= domid;
 	cmd->data[2]  = lower_32_bits(inv_address);
 	cmd->data[3]  = upper_32_bits(inv_address);
-	CMD_SET_TYPE(cmd, CMD_INV_IOMMU_PAGES);
 	/* PDE bit - we want to flush everything, not only the PTEs */
 	cmd->data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK;
+	if (gn) {
+		cmd->data[0] |= pasid;
+		cmd->data[2] |= CMD_INV_IOMMU_PAGES_GN_MASK;
+	}
+	CMD_SET_TYPE(cmd, CMD_INV_IOMMU_PAGES);
 }
 
 static void build_inv_iotlb_pages(struct iommu_cmd *cmd, u16 devid, int qdep,
@@ -1151,22 +1157,6 @@ static void build_inv_iotlb_pages(struct iommu_cmd *cmd, u16 devid, int qdep,
 	CMD_SET_TYPE(cmd, CMD_INV_IOTLB_PAGES);
 }
 
-static void build_inv_iommu_pasid(struct iommu_cmd *cmd, u16 domid, u32 pasid,
-				  u64 address, size_t size)
-{
-	u64 inv_address = build_inv_address(address, size);
-
-	memset(cmd, 0, sizeof(*cmd));
-
-	cmd->data[0]  = pasid;
-	cmd->data[1]  = domid;
-	cmd->data[2]  = lower_32_bits(inv_address);
-	cmd->data[3]  = upper_32_bits(inv_address);
-	cmd->data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK;
-	cmd->data[2] |= CMD_INV_IOMMU_PAGES_GN_MASK;
-	CMD_SET_TYPE(cmd, CMD_INV_IOMMU_PAGES);
-}
-
 static void build_inv_iotlb_pasid(struct iommu_cmd *cmd, u16 devid, u32 pasid,
 				  int qdep, u64 address, size_t size)
 {
@@ -1337,7 +1327,7 @@ static void amd_iommu_flush_tlb_all(struct amd_iommu *iommu)
 	for (dom_id = 0; dom_id <= last_bdf; ++dom_id) {
 		struct iommu_cmd cmd;
 		build_inv_iommu_pages(&cmd, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS,
-				      dom_id);
+				      dom_id, IOMMU_NO_PASID, false);
 		iommu_queue_command(iommu, &cmd);
 	}
 
@@ -1348,7 +1338,8 @@ static void amd_iommu_flush_tlb_domid(struct amd_iommu *iommu, u32 dom_id)
 {
 	struct iommu_cmd cmd;
 
-	build_inv_iommu_pages(&cmd, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS, dom_id);
+	build_inv_iommu_pages(&cmd, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS,
+			      dom_id, IOMMU_NO_PASID, false);
 	iommu_queue_command(iommu, &cmd);
 
 	iommu_completion_wait(iommu);
@@ -1477,7 +1468,8 @@ static void __domain_flush_pages(struct protection_domain *domain,
 	struct iommu_cmd cmd;
 	int ret = 0, i;
 
-	build_inv_iommu_pages(&cmd, address, size, domain->id);
+	build_inv_iommu_pages(&cmd, address, size, domain->id,
+			      IOMMU_NO_PASID, false);
 
 	for (i = 0; i < amd_iommu_get_num_iommus(); ++i) {
 		if (!domain->dev_iommu[i])
@@ -2661,7 +2653,7 @@ static int __flush_pasid(struct protection_domain *domain, u32 pasid,
 	if (!(domain->flags & PD_IOMMUV2_MASK))
 		return -EINVAL;
 
-	build_inv_iommu_pasid(&cmd, domain->id, pasid, address, size);
+	build_inv_iommu_pages(&cmd, address, size, domain->id, pasid, true);
 
 	/*
 	 * IOMMU TLB needs to be flushed before Device TLB to
-- 
2.31.1



* [PATCH v2 6/9] iommu/amd: Refactor device iotlb invalidation code
  2023-11-22  9:02 [PATCH v2 0/9] Improve TLB invalidation logic Vasant Hegde
                   ` (4 preceding siblings ...)
  2023-11-22  9:02 ` [PATCH v2 5/9] iommu/amd: Refactor IOMMU tlb invalidation code Vasant Hegde
@ 2023-11-22  9:02 ` Vasant Hegde
  2023-11-22  9:02 ` [PATCH v2 7/9] iommu/amd: Consolidate amd_iommu_domain_flush_complete() call Vasant Hegde
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Vasant Hegde @ 2023-11-22  9:02 UTC (permalink / raw)
  To: iommu, joro
  Cc: suravee.suthikulpanit, jgg, Vasant Hegde, Kishon Vijay Abraham I,
	Jason Gunthorpe

build_inv_iotlb_pages() and build_inv_iotlb_pasid() largely duplicate
each other. Enhance build_inv_iotlb_pages() to invalidate the guest IOTLB
as well, and remove the build_inv_iotlb_pasid() function.
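
The part that is easy to misread in the merged function is how the PASID is
split across two command words. The userspace sketch below follows the
layout shown in the diff; the struct, the sketch_* names and the numeric
values for the GN mask and command type are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct iommu_cmd: four 32-bit command words. */
struct sketch_iotlb_cmd {
	uint32_t data[4];
};

/* Assumed values, for illustration only. */
#define SKETCH_IOTLB_GN_MASK       0x04u
#define SKETCH_CMD_INV_IOTLB_PAGES 0x04u

static void sketch_build_inv_iotlb_pages(struct sketch_iotlb_cmd *cmd,
					 uint16_t devid, int qdep,
					 uint64_t inv_address,
					 uint32_t pasid, bool gn)
{
	memset(cmd, 0, sizeof(*cmd));

	cmd->data[0]  = devid;
	cmd->data[0] |= ((uint32_t)qdep & 0xff) << 24;
	cmd->data[1]  = devid;
	cmd->data[2]  = (uint32_t)inv_address;
	cmd->data[3]  = (uint32_t)(inv_address >> 32);
	if (gn) {
		/* PASID bits 15:8 go into word 0, bits 7:0 into word 1. */
		cmd->data[0] |= ((pasid >> 8) & 0xff) << 16;
		cmd->data[1] |= (pasid & 0xff) << 16;
		cmd->data[2] |= SKETCH_IOTLB_GN_MASK;
	}
	/* Command type lives in the top nibble of word 1 (assumed). */
	cmd->data[1] |= SKETCH_CMD_INV_IOTLB_PAGES << 28;
}
```

With gn = false the PASID fields stay zero, which matches what the old
build_inv_iotlb_pages() produced for host invalidations.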

Suggested-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 913875dc8730..11bc222be528 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1144,34 +1144,24 @@ static void build_inv_iommu_pages(struct iommu_cmd *cmd, u64 address,
 }
 
 static void build_inv_iotlb_pages(struct iommu_cmd *cmd, u16 devid, int qdep,
-				  u64 address, size_t size)
+				  u64 address, size_t size,
+				  ioasid_t pasid, bool gn)
 {
 	u64 inv_address = build_inv_address(address, size);
 
 	memset(cmd, 0, sizeof(*cmd));
+
 	cmd->data[0]  = devid;
 	cmd->data[0] |= (qdep & 0xff) << 24;
 	cmd->data[1]  = devid;
 	cmd->data[2]  = lower_32_bits(inv_address);
 	cmd->data[3]  = upper_32_bits(inv_address);
-	CMD_SET_TYPE(cmd, CMD_INV_IOTLB_PAGES);
-}
-
-static void build_inv_iotlb_pasid(struct iommu_cmd *cmd, u16 devid, u32 pasid,
-				  int qdep, u64 address, size_t size)
-{
-	u64 inv_address = build_inv_address(address, size);
-
-	memset(cmd, 0, sizeof(*cmd));
+	if (gn) {
+		cmd->data[0] |= ((pasid >> 8) & 0xff) << 16;
+		cmd->data[1] |= (pasid & 0xff) << 16;
+		cmd->data[2] |= CMD_INV_IOMMU_PAGES_GN_MASK;
+	}
 
-	cmd->data[0]  = devid;
-	cmd->data[0] |= ((pasid >> 8) & 0xff) << 16;
-	cmd->data[0] |= (qdep  & 0xff) << 24;
-	cmd->data[1]  = devid;
-	cmd->data[1] |= (pasid & 0xff) << 16;
-	cmd->data[2]  = lower_32_bits(inv_address);
-	cmd->data[2] |= CMD_INV_IOMMU_PAGES_GN_MASK;
-	cmd->data[3]  = upper_32_bits(inv_address);
 	CMD_SET_TYPE(cmd, CMD_INV_IOTLB_PAGES);
 }
 
@@ -1404,7 +1394,8 @@ static int device_flush_iotlb(struct iommu_dev_data *dev_data,
 	if (!iommu)
 		return -EINVAL;
 
-	build_inv_iotlb_pages(&cmd, dev_data->devid, qdep, address, size);
+	build_inv_iotlb_pages(&cmd, dev_data->devid, qdep, address,
+			      size, IOMMU_NO_PASID, false);
 
 	return iommu_queue_command(iommu, &cmd);
 }
@@ -2687,8 +2678,8 @@ static int __flush_pasid(struct protection_domain *domain, u32 pasid,
 		iommu = rlookup_amd_iommu(dev_data->dev);
 		if (!iommu)
 			continue;
-		build_inv_iotlb_pasid(&cmd, dev_data->devid, pasid,
-				      qdep, address, size);
+		build_inv_iotlb_pages(&cmd, dev_data->devid, qdep,
+				      address, size, pasid, true);
 
 		ret = iommu_queue_command(iommu, &cmd);
 		if (ret != 0)
-- 
2.31.1



* [PATCH v2 7/9] iommu/amd: Consolidate amd_iommu_domain_flush_complete() call
  2023-11-22  9:02 [PATCH v2 0/9] Improve TLB invalidation logic Vasant Hegde
                   ` (5 preceding siblings ...)
  2023-11-22  9:02 ` [PATCH v2 6/9] iommu/amd: Refactor device iotlb " Vasant Hegde
@ 2023-11-22  9:02 ` Vasant Hegde
  2023-11-30 17:51   ` Jason Gunthorpe
  2023-11-22  9:02 ` [PATCH v2 8/9] iommu/amd: Make domain_flush_pages as global function Vasant Hegde
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 14+ messages in thread
From: Vasant Hegde @ 2023-11-22  9:02 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, jgg, Vasant Hegde

Call amd_iommu_domain_flush_complete() from domain_flush_pages().
That way we can remove the explicit calls to
amd_iommu_domain_flush_complete() from various places.
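
The resulting call pattern can be sketched with a small userspace mock.
Everything here is hypothetical (the struct, the counters and the sketch_*
names); it only illustrates that the completion wait moves into the flush
helper, so callers make one call instead of two.

```c
#include <stdint.h>

/* Mock domain that only counts what was issued. */
struct sketch_domain {
	unsigned int flushes_queued;
	unsigned int completion_waits;
};

/* Stand-in for queuing the invalidation commands. */
static void sketch_queue_flush(struct sketch_domain *d,
			       uint64_t address, uint64_t size)
{
	(void)address;
	(void)size;
	d->flushes_queued++;
}

/* Stand-in for amd_iommu_domain_flush_complete(). */
static void sketch_flush_complete(struct sketch_domain *d)
{
	d->completion_waits++;
}

/* The flush helper now waits for completion itself. */
static void sketch_domain_flush_pages(struct sketch_domain *d,
				      uint64_t address, uint64_t size)
{
	sketch_queue_flush(d, address, size);
	sketch_flush_complete(d);
}

/* A caller such as the iotlb sync path needs only one call. */
static void sketch_iotlb_sync(struct sketch_domain *d,
			      uint64_t start, uint64_t end)
{
	sketch_domain_flush_pages(d, start, end - start + 1);
}
```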

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/io_pgtable.c |  1 -
 drivers/iommu/amd/iommu.c      | 21 ++++++++++-----------
 2 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 6c0621f6f572..ca22546e4d1a 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -425,7 +425,6 @@ static int iommu_v1_map_pages(struct io_pgtable_ops *ops, unsigned long iova,
 		 * increase_address_space().
 		 */
 		amd_iommu_domain_flush_tlb_pde(dom);
-		amd_iommu_domain_flush_complete(dom);
 		spin_unlock_irqrestore(&dom->lock, flags);
 	}
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 11bc222be528..279f0da896d0 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1489,6 +1489,10 @@ static void domain_flush_pages(struct protection_domain *domain,
 {
 	if (likely(!amd_iommu_np_cache)) {
 		__domain_flush_pages(domain, address, size);
+
+		/* Wait until IOMMU TLB and all device IOTLB flushes are complete */
+		amd_iommu_domain_flush_complete(domain);
+
 		return;
 	}
 
@@ -1525,6 +1529,9 @@ static void domain_flush_pages(struct protection_domain *domain,
 		address += flush_size;
 		size -= flush_size;
 	}
+
+	/* Wait until IOMMU TLB and all device IOTLB flushes are complete */
+	amd_iommu_domain_flush_complete(domain);
 }
 
 /* Flush the whole IO/TLB for a given protection domain - including PDE */
@@ -1558,7 +1565,6 @@ static void domain_flush_np_cache(struct protection_domain *domain,
 
 		spin_lock_irqsave(&domain->lock, flags);
 		domain_flush_pages(domain, iova, size);
-		amd_iommu_domain_flush_complete(domain);
 		spin_unlock_irqrestore(&domain->lock, flags);
 	}
 }
@@ -1836,12 +1842,9 @@ static void do_detach(struct iommu_dev_data *dev_data)
 	/* Flush the DTE entry */
 	device_flush_dte(dev_data);
 
-	/* Flush IOTLB */
+	/* Flush IOTLB and wait for the flushes to finish */
 	amd_iommu_domain_flush_tlb_pde(domain);
 
-	/* Wait for the flushes to finish */
-	amd_iommu_domain_flush_complete(domain);
-
 	/* decrease reference counters - needs to happen after the flushes */
 	domain->dev_iommu[iommu->index] -= 1;
 	domain->dev_cnt                 -= 1;
@@ -2018,7 +2021,6 @@ void amd_iommu_domain_update(struct protection_domain *domain)
 
 	/* Flush domain TLB(s) and wait for completion */
 	amd_iommu_domain_flush_tlb_pde(domain);
-	amd_iommu_domain_flush_complete(domain);
 }
 
 /*****************************************************************************
@@ -2451,10 +2453,9 @@ static int amd_iommu_set_dirty_tracking(struct iommu_domain *domain,
 	}
 
 	/* Flush IOTLB to mark IOPTE dirty on the next translation(s) */
-	if (domain_flush) {
+	if (domain_flush)
 		amd_iommu_domain_flush_tlb_pde(pdomain);
-		amd_iommu_domain_flush_complete(pdomain);
-	}
+
 	pdomain->dirty_tracking = enable;
 	spin_unlock_irqrestore(&pdomain->lock, flags);
 
@@ -2558,7 +2559,6 @@ static void amd_iommu_flush_iotlb_all(struct iommu_domain *domain)
 
 	spin_lock_irqsave(&dom->lock, flags);
 	amd_iommu_domain_flush_tlb_pde(dom);
-	amd_iommu_domain_flush_complete(dom);
 	spin_unlock_irqrestore(&dom->lock, flags);
 }
 
@@ -2570,7 +2570,6 @@ static void amd_iommu_iotlb_sync(struct iommu_domain *domain,
 
 	spin_lock_irqsave(&dom->lock, flags);
 	domain_flush_pages(dom, gather->start, gather->end - gather->start + 1);
-	amd_iommu_domain_flush_complete(dom);
 	spin_unlock_irqrestore(&dom->lock, flags);
 }
 
-- 
2.31.1



* [PATCH v2 8/9] iommu/amd: Make domain_flush_pages as global function
  2023-11-22  9:02 [PATCH v2 0/9] Improve TLB invalidation logic Vasant Hegde
                   ` (6 preceding siblings ...)
  2023-11-22  9:02 ` [PATCH v2 7/9] iommu/amd: Consolidate amd_iommu_domain_flush_complete() call Vasant Hegde
@ 2023-11-22  9:02 ` Vasant Hegde
  2023-11-30 17:52   ` Jason Gunthorpe
  2023-11-22  9:02 ` [PATCH v2 9/9] iommu/amd/pgtbl_v2: Invalidate updated page ranges only Vasant Hegde
  2023-12-11 14:26 ` [PATCH v2 0/9] Improve TLB invalidation logic Joerg Roedel
  9 siblings, 1 reply; 14+ messages in thread
From: Vasant Hegde @ 2023-11-22  9:02 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, jgg, Vasant Hegde

- Rename domain_flush_pages() -> amd_iommu_domain_flush_pages() and make
  it a global function.

- Rename amd_iommu_domain_flush_tlb_pde() -> amd_iommu_domain_flush_all()
  and make it static.

- Convert the v1 page table (io_pgtable.c) to use
  amd_iommu_domain_flush_pages().
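
For the v1 conversion, the flushed range is derived from the map_pages
arguments as pgcount << __ffs(pgsize). The userspace sketch below shows
that arithmetic with a hand-written stand-in for __ffs(); the sketch_*
names are hypothetical and pgsize is assumed to be a non-zero power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the least significant set bit; x must be non-zero. */
static unsigned int sketch_ffs(uint64_t x)
{
	unsigned int idx = 0;

	assert(x != 0);
	while (!(x & 1)) {
		x >>= 1;
		idx++;
	}
	return idx;
}

/* Total bytes covered by pgcount pages of size pgsize. */
static uint64_t sketch_map_size(uint64_t pgsize, uint64_t pgcount)
{
	return pgcount << sketch_ffs(pgsize);
}
```

For example, eight 4K pages give a 32K range, and that range (starting at
the original iova) is what gets passed to amd_iommu_domain_flush_pages()
instead of flushing the whole domain.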

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/amd_iommu.h  |  3 ++-
 drivers/iommu/amd/io_pgtable.c |  4 +++-
 drivers/iommu/amd/iommu.c      | 22 ++++++++++++----------
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 234db57cd320..8b3601f285fd 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -61,7 +61,8 @@ void amd_iommu_flush_all_caches(struct amd_iommu *iommu);
 void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
 void amd_iommu_domain_update(struct protection_domain *domain);
 void amd_iommu_domain_flush_complete(struct protection_domain *domain);
-void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain);
+void amd_iommu_domain_flush_pages(struct protection_domain *domain,
+				  u64 address, size_t size);
 int amd_iommu_flush_tlb(struct iommu_domain *dom, u32 pasid);
 int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, u32 pasid,
 			      unsigned long cr3);
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index ca22546e4d1a..2a0d1e97e52f 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -369,6 +369,8 @@ static int iommu_v1_map_pages(struct io_pgtable_ops *ops, unsigned long iova,
 	bool updated = false;
 	u64 __pte, *pte;
 	int ret, i, count;
+	size_t size = pgcount << __ffs(pgsize);
+	unsigned long o_iova = iova;
 
 	BUG_ON(!IS_ALIGNED(iova, pgsize));
 	BUG_ON(!IS_ALIGNED(paddr, pgsize));
@@ -424,7 +426,7 @@ static int iommu_v1_map_pages(struct io_pgtable_ops *ops, unsigned long iova,
 		 * Updates and flushing already happened in
 		 * increase_address_space().
 		 */
-		amd_iommu_domain_flush_tlb_pde(dom);
+		amd_iommu_domain_flush_pages(dom, o_iova, size);
 		spin_unlock_irqrestore(&dom->lock, flags);
 	}
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 279f0da896d0..a52e795c4cfa 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1484,8 +1484,8 @@ static void __domain_flush_pages(struct protection_domain *domain,
 	WARN_ON(ret);
 }
 
-static void domain_flush_pages(struct protection_domain *domain,
-			       u64 address, size_t size)
+void amd_iommu_domain_flush_pages(struct protection_domain *domain,
+				  u64 address, size_t size)
 {
 	if (likely(!amd_iommu_np_cache)) {
 		__domain_flush_pages(domain, address, size);
@@ -1535,9 +1535,10 @@ static void domain_flush_pages(struct protection_domain *domain,
 }
 
 /* Flush the whole IO/TLB for a given protection domain - including PDE */
-void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain)
+static void amd_iommu_domain_flush_all(struct protection_domain *domain)
 {
-	domain_flush_pages(domain, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
+	amd_iommu_domain_flush_pages(domain, 0,
+				     CMD_INV_IOMMU_ALL_PAGES_ADDRESS);
 }
 
 void amd_iommu_domain_flush_complete(struct protection_domain *domain)
@@ -1564,7 +1565,7 @@ static void domain_flush_np_cache(struct protection_domain *domain,
 		unsigned long flags;
 
 		spin_lock_irqsave(&domain->lock, flags);
-		domain_flush_pages(domain, iova, size);
+		amd_iommu_domain_flush_pages(domain, iova, size);
 		spin_unlock_irqrestore(&domain->lock, flags);
 	}
 }
@@ -1843,7 +1844,7 @@ static void do_detach(struct iommu_dev_data *dev_data)
 	device_flush_dte(dev_data);
 
 	/* Flush IOTLB and wait for the flushes to finish */
-	amd_iommu_domain_flush_tlb_pde(domain);
+	amd_iommu_domain_flush_all(domain);
 
 	/* decrease reference counters - needs to happen after the flushes */
 	domain->dev_iommu[iommu->index] -= 1;
@@ -2020,7 +2021,7 @@ void amd_iommu_domain_update(struct protection_domain *domain)
 	amd_iommu_update_and_flush_device_table(domain);
 
 	/* Flush domain TLB(s) and wait for completion */
-	amd_iommu_domain_flush_tlb_pde(domain);
+	amd_iommu_domain_flush_all(domain);
 }
 
 /*****************************************************************************
@@ -2454,7 +2455,7 @@ static int amd_iommu_set_dirty_tracking(struct iommu_domain *domain,
 
 	/* Flush IOTLB to mark IOPTE dirty on the next translation(s) */
 	if (domain_flush)
-		amd_iommu_domain_flush_tlb_pde(pdomain);
+		amd_iommu_domain_flush_all(pdomain);
 
 	pdomain->dirty_tracking = enable;
 	spin_unlock_irqrestore(&pdomain->lock, flags);
@@ -2558,7 +2559,7 @@ static void amd_iommu_flush_iotlb_all(struct iommu_domain *domain)
 	unsigned long flags;
 
 	spin_lock_irqsave(&dom->lock, flags);
-	amd_iommu_domain_flush_tlb_pde(dom);
+	amd_iommu_domain_flush_all(dom);
 	spin_unlock_irqrestore(&dom->lock, flags);
 }
 
@@ -2569,7 +2570,8 @@ static void amd_iommu_iotlb_sync(struct iommu_domain *domain,
 	unsigned long flags;
 
 	spin_lock_irqsave(&dom->lock, flags);
-	domain_flush_pages(dom, gather->start, gather->end - gather->start + 1);
+	amd_iommu_domain_flush_pages(dom, gather->start,
+				     gather->end - gather->start + 1);
 	spin_unlock_irqrestore(&dom->lock, flags);
 }
 
-- 
2.31.1



* [PATCH v2 9/9] iommu/amd/pgtbl_v2: Invalidate updated page ranges only
  2023-11-22  9:02 [PATCH v2 0/9] Improve TLB invalidation logic Vasant Hegde
                   ` (7 preceding siblings ...)
  2023-11-22  9:02 ` [PATCH v2 8/9] iommu/amd: Make domain_flush_pages as global function Vasant Hegde
@ 2023-11-22  9:02 ` Vasant Hegde
  2023-11-30 17:53   ` Jason Gunthorpe
  2023-12-11 14:26 ` [PATCH v2 0/9] Improve TLB invalidation logic Joerg Roedel
  9 siblings, 1 reply; 14+ messages in thread
From: Vasant Hegde @ 2023-11-22  9:02 UTC (permalink / raw)
  To: iommu, joro; +Cc: suravee.suthikulpanit, jgg, Vasant Hegde

Enhance __domain_flush_pages() to detect the domain's page table mode and
use that information to build invalidation commands, so that
amd_iommu_domain_flush_pages() can also invalidate the v2 page table.

Also pass the PASID and gn values to device_flush_iotlb() so that it can
build IOTLB invalidation commands for both v1 and v2 page tables.
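
Editor's note: the mode detection described above can be sketched in
user space as follows. This is an illustration under stated
assumptions, not the driver code: the sketch_ names, the struct
layouts, and the flag bit value are all hypothetical stand-ins (the
real PD_IOMMUV2_MASK and IOMMU_NO_PASID come from the kernel headers).
It only shows that the PASID stays at the "no PASID" value while gn is
set when the domain uses the v2 page table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag bit, for illustration only. */
#define SKETCH_PD_IOMMUV2_MASK	(1u << 3)
/* Assumed stand-in for IOMMU_NO_PASID. */
#define SKETCH_NO_PASID		0u

/* Hypothetical, reduced view of a protection domain. */
struct sketch_domain {
	unsigned int flags;
};

/* Parameters that feed the invalidation command builders. */
struct sketch_inv_params {
	uint32_t pasid;
	bool gn;
};

/* True if the domain exists and is flagged as using the v2 page table. */
static bool sketch_is_v2_pgtbl_mode(const struct sketch_domain *pdom)
{
	return pdom && (pdom->flags & SKETCH_PD_IOMMUV2_MASK);
}

/*
 * Pick the PASID and gn values for a domain flush: PASID is always
 * "no PASID" here, and gn is set only in v2 page table mode.
 */
static struct sketch_inv_params
sketch_inv_params_for(const struct sketch_domain *pdom)
{
	struct sketch_inv_params params = {
		.pasid = SKETCH_NO_PASID,
		.gn = false,
	};

	if (sketch_is_v2_pgtbl_mode(pdom))
		params.gn = true;

	return params;
}
```

Computing the pair once and passing it to both the IOMMU TLB and the
device IOTLB command builders keeps the two invalidations consistent
for a given domain.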

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
 drivers/iommu/amd/io_pgtable_v2.c | 10 ++--------
 drivers/iommu/amd/iommu.c         | 28 ++++++++++++++++++++--------
 2 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable_v2.c b/drivers/iommu/amd/io_pgtable_v2.c
index f818a7e254d4..6d69ba60744f 100644
--- a/drivers/iommu/amd/io_pgtable_v2.c
+++ b/drivers/iommu/amd/io_pgtable_v2.c
@@ -244,7 +244,6 @@ static int iommu_v2_map_pages(struct io_pgtable_ops *ops, unsigned long iova,
 	unsigned long mapped_size = 0;
 	unsigned long o_iova = iova;
 	size_t size = pgcount << __ffs(pgsize);
-	int count = 0;
 	int ret = 0;
 	bool updated = false;
 
@@ -265,19 +264,14 @@ static int iommu_v2_map_pages(struct io_pgtable_ops *ops, unsigned long iova,
 
 		*pte = set_pte_attr(paddr, map_size, prot);
 
-		count++;
 		iova += map_size;
 		paddr += map_size;
 		mapped_size += map_size;
 	}
 
 out:
-	if (updated) {
-		if (count > 1)
-			amd_iommu_flush_tlb(&pdom->domain, 0);
-		else
-			amd_iommu_flush_page(&pdom->domain, 0, o_iova);
-	}
+	if (updated)
+		amd_iommu_domain_flush_pages(pdom, o_iova, size);
 
 	if (mapped)
 		*mapped += mapped_size;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a52e795c4cfa..849935ac9372 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -85,6 +85,11 @@ static void detach_device(struct device *dev);
  *
  ****************************************************************************/
 
+static inline bool pdom_is_v2_pgtbl_mode(struct protection_domain *pdom)
+{
+	return (pdom && (pdom->flags & PD_IOMMUV2_MASK));
+}
+
 static inline int get_acpihid_device_id(struct device *dev,
 					struct acpihid_map_entry **entry)
 {
@@ -1382,8 +1387,8 @@ void amd_iommu_flush_all_caches(struct amd_iommu *iommu)
 /*
  * Command send function for flushing on-device TLB
  */
-static int device_flush_iotlb(struct iommu_dev_data *dev_data,
-			      u64 address, size_t size)
+static int device_flush_iotlb(struct iommu_dev_data *dev_data, u64 address,
+			      size_t size, ioasid_t pasid, bool gn)
 {
 	struct amd_iommu *iommu;
 	struct iommu_cmd cmd;
@@ -1395,7 +1400,7 @@ static int device_flush_iotlb(struct iommu_dev_data *dev_data,
 		return -EINVAL;
 
 	build_inv_iotlb_pages(&cmd, dev_data->devid, qdep, address,
-			      size, IOMMU_NO_PASID, false);
+			      size, pasid, gn);
 
 	return iommu_queue_command(iommu, &cmd);
 }
@@ -1441,8 +1446,11 @@ static int device_flush_dte(struct iommu_dev_data *dev_data)
 			return ret;
 	}
 
-	if (dev_data->ats_enabled)
-		ret = device_flush_iotlb(dev_data, 0, ~0UL);
+	if (dev_data->ats_enabled) {
+		/* Invalidate the entire contents of an IOTLB */
+		ret = device_flush_iotlb(dev_data, 0, ~0UL,
+					 IOMMU_NO_PASID, false);
+	}
 
 	return ret;
 }
@@ -1458,9 +1466,13 @@ static void __domain_flush_pages(struct protection_domain *domain,
 	struct iommu_dev_data *dev_data;
 	struct iommu_cmd cmd;
 	int ret = 0, i;
+	ioasid_t pasid = IOMMU_NO_PASID;
+	bool gn = false;
+
+	if (pdom_is_v2_pgtbl_mode(domain))
+		gn = true;
 
-	build_inv_iommu_pages(&cmd, address, size, domain->id,
-			      IOMMU_NO_PASID, false);
+	build_inv_iommu_pages(&cmd, address, size, domain->id, pasid, gn);
 
 	for (i = 0; i < amd_iommu_get_num_iommus(); ++i) {
 		if (!domain->dev_iommu[i])
@@ -1478,7 +1490,7 @@ static void __domain_flush_pages(struct protection_domain *domain,
 		if (!dev_data->ats_enabled)
 			continue;
 
-		ret |= device_flush_iotlb(dev_data, address, size);
+		ret |= device_flush_iotlb(dev_data, address, size, pasid, gn);
 	}
 
 	WARN_ON(ret);
-- 
2.31.1



* Re: [PATCH v2 7/9] iommu/amd: Consolidate amd_iommu_domain_flush_complete() call
  2023-11-22  9:02 ` [PATCH v2 7/9] iommu/amd: Consolidate amd_iommu_domain_flush_complete() call Vasant Hegde
@ 2023-11-30 17:51   ` Jason Gunthorpe
  0 siblings, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2023-11-30 17:51 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit

On Wed, Nov 22, 2023 at 09:02:13AM +0000, Vasant Hegde wrote:
> Call amd_iommu_domain_flush_complete() from domain_flush_pages().
> That way we can remove the explicit calls to
> amd_iommu_domain_flush_complete() from various places.
> 
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/io_pgtable.c |  1 -
>  drivers/iommu/amd/iommu.c      | 21 ++++++++++-----------
>  2 files changed, 10 insertions(+), 12 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


* Re: [PATCH v2 8/9] iommu/amd: Make domain_flush_pages as global function
  2023-11-22  9:02 ` [PATCH v2 8/9] iommu/amd: Make domain_flush_pages as global function Vasant Hegde
@ 2023-11-30 17:52   ` Jason Gunthorpe
  0 siblings, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2023-11-30 17:52 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit

On Wed, Nov 22, 2023 at 09:02:14AM +0000, Vasant Hegde wrote:
> - Rename domain_flush_pages() -> amd_iommu_domain_flush_pages() and make
>   it a global function.
> 
> - Rename amd_iommu_domain_flush_tlb_pde() -> amd_iommu_domain_flush_all()
>   and make it static.
> 
> - Convert the v1 page table (io_pgtable.c) to use
>   amd_iommu_domain_flush_pages().
> 
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/amd_iommu.h  |  3 ++-
>  drivers/iommu/amd/io_pgtable.c |  4 +++-
>  drivers/iommu/amd/iommu.c      | 22 ++++++++++++----------
>  3 files changed, 17 insertions(+), 12 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


* Re: [PATCH v2 9/9] iommu/amd/pgtbl_v2: Invalidate updated page ranges only
  2023-11-22  9:02 ` [PATCH v2 9/9] iommu/amd/pgtbl_v2: Invalidate updated page ranges only Vasant Hegde
@ 2023-11-30 17:53   ` Jason Gunthorpe
  0 siblings, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2023-11-30 17:53 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit

On Wed, Nov 22, 2023 at 09:02:15AM +0000, Vasant Hegde wrote:
> Enhance __domain_flush_pages() to detect the domain's page table mode and
> use that information to build invalidation commands, so that
> amd_iommu_domain_flush_pages() can also invalidate the v2 page table.
> 
> Also pass the PASID and gn values to device_flush_iotlb() so that it can
> build IOTLB invalidation commands for both v1 and v2 page tables.
> 
> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
> ---
>  drivers/iommu/amd/io_pgtable_v2.c | 10 ++--------
>  drivers/iommu/amd/iommu.c         | 28 ++++++++++++++++++++--------
>  2 files changed, 22 insertions(+), 16 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


* Re: [PATCH v2 0/9] Improve TLB invalidation logic
  2023-11-22  9:02 [PATCH v2 0/9] Improve TLB invalidation logic Vasant Hegde
                   ` (8 preceding siblings ...)
  2023-11-22  9:02 ` [PATCH v2 9/9] iommu/amd/pgtbl_v2: Invalidate updated page ranges only Vasant Hegde
@ 2023-12-11 14:26 ` Joerg Roedel
  9 siblings, 0 replies; 14+ messages in thread
From: Joerg Roedel @ 2023-12-11 14:26 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: iommu, suravee.suthikulpanit, jgg

On Wed, Nov 22, 2023 at 09:02:06AM +0000, Vasant Hegde wrote:
> Vasant Hegde (9):
>   iommu/amd: Rename iommu_flush_all_caches() -> amd_iommu_flush_all_caches()
>   iommu/amd: Remove redundant domain flush from attach_device()
>   iommu/amd: Remove redundant passing of PDE bit
>   iommu/amd: Add support to invalidate multiple guest pages
>   iommu/amd: Refactor IOMMU tlb invalidation code
>   iommu/amd: Refactor device iotlb invalidation code
>   iommu/amd: Consolidate amd_iommu_domain_flush_complete() call
>   iommu/amd: Make domain_flush_pages as global function
>   iommu/amd/pgtbl_v2: Invalidate updated page ranges only
> 
>  drivers/iommu/amd/amd_iommu.h       |   8 +-
>  drivers/iommu/amd/amd_iommu_types.h |   6 -
>  drivers/iommu/amd/init.c            |   8 +-
>  drivers/iommu/amd/io_pgtable.c      |   5 +-
>  drivers/iommu/amd/io_pgtable_v2.c   |  10 +-
>  drivers/iommu/amd/iommu.c           | 163 ++++++++++++----------------
>  6 files changed, 88 insertions(+), 112 deletions(-)

Applied, thanks.

