public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11
@ 2024-07-02 13:08 Lu Baolu
  2024-07-02 13:08 ` [PATCH 1/7] iommu/vt-d: Handle volatile descriptor status read Lu Baolu
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Lu Baolu @ 2024-07-02 13:08 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon; +Cc: Jacob Pan, iommu, linux-kernel

Hi Will,

The following changes have been queued for v6.11-rc1. They are all about
small refactoring, including:

 - Use READ_ONCE to read volatile descriptor status
 - Remove support for handling Execute-Requested requests
 - Downgrade warning for pre-enabled interrupt remapping
 - Remove calling iommu_domain_alloc()
 - Refactor the PRI enable/disable flows
 - Cleanups

These patches are based on v6.10-rc5. The complete patches are also
available at:

https://github.com/LuBaolu/intel-iommu/commits/vtd-update-for-v6.11

Please consider them for iommu next.

Best regards,
baolu

Jacob Pan (1):
  iommu/vt-d: Handle volatile descriptor status read

Lu Baolu (6):
  iommu/vt-d: Remove comment for def_domain_type
  iommu/vt-d: Remove control over Execute-Requested requests
  iommu/vt-d: Downgrade warning for pre-enabled IR
  iommu/vt-d: Add helper to allocate paging domain
  iommu/vt-d: Add helper to flush caches for context change
  iommu/vt-d: Refactor PCI PRI enabling/disabling callbacks

 drivers/iommu/intel/iommu.h         |  19 ++-
 drivers/iommu/intel/pasid.h         |  10 --
 drivers/iommu/intel/dmar.c          |   2 +-
 drivers/iommu/intel/iommu.c         | 194 +++++++++++++++++++---------
 drivers/iommu/intel/irq_remapping.c |   4 +-
 drivers/iommu/intel/pasid.c         | 109 ++++++++++++----
 6 files changed, 241 insertions(+), 97 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/7] iommu/vt-d: Handle volatile descriptor status read
  2024-07-02 13:08 [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Lu Baolu
@ 2024-07-02 13:08 ` Lu Baolu
  2024-07-02 13:08 ` [PATCH 2/7] iommu/vt-d: Remove comment for def_domain_type Lu Baolu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Lu Baolu @ 2024-07-02 13:08 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon; +Cc: Jacob Pan, iommu, linux-kernel

From: Jacob Pan <jacob.jun.pan@linux.intel.com>

Queued invalidation wait descriptor status is volatile in that IOMMU
hardware writes the data upon completion.

Use READ_ONCE() to prevent compiler optimizations which ensures memory
reads every time. As a side effect, READ_ONCE() also enforces strict
types and may add an extra instruction. But it should not have negative
performance impact since we use cpu_relax anyway and the extra time(by
adding an instruction) may allow IOMMU HW request cacheline ownership
easier.

e.g. gcc 12.3
BEFORE:
	81 38 ad de 00 00       cmpl   $0x2,(%rax)

AFTER (with READ_ONCE())
    772f:       8b 00                   mov    (%rax),%eax
    7731:       3d ad de 00 00          cmp    $0x2,%eax
                                        //status data is 32 bit

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20240607173817.3914600-1-jacob.jun.pan@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/dmar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 304e84949ca7..1c8d3141cb55 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1446,7 +1446,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
 	 */
 	writel(qi->free_head << shift, iommu->reg + DMAR_IQT_REG);
 
-	while (qi->desc_status[wait_index] != QI_DONE) {
+	while (READ_ONCE(qi->desc_status[wait_index]) != QI_DONE) {
 		/*
 		 * We will leave the interrupts disabled, to prevent interrupt
 		 * context to queue another cmd while a cmd is already submitted
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/7] iommu/vt-d: Remove comment for def_domain_type
  2024-07-02 13:08 [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Lu Baolu
  2024-07-02 13:08 ` [PATCH 1/7] iommu/vt-d: Handle volatile descriptor status read Lu Baolu
@ 2024-07-02 13:08 ` Lu Baolu
  2024-07-02 13:08 ` [PATCH 3/7] iommu/vt-d: Remove control over Execute-Requested requests Lu Baolu
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Lu Baolu @ 2024-07-02 13:08 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon; +Cc: Jacob Pan, iommu, linux-kernel

The comment for def_domain_type is outdated. Part of it is irrelevant.
Furthermore, it could just be deleted since the iommu_ops::def_domain_type
callback is properly documented in iommu.h, so individual implementations
shouldn't need to repeat that. Remove it to avoid confusion.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240624024327.234979-1-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 2e9811bf2a4e..abf0097f899d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2177,17 +2177,6 @@ static bool device_rmrr_is_relaxable(struct device *dev)
 		return false;
 }
 
-/*
- * Return the required default domain type for a specific device.
- *
- * @dev: the device in query
- * @startup: true if this is during early boot
- *
- * Returns:
- *  - IOMMU_DOMAIN_DMA: device requires a dynamic mapping domain
- *  - IOMMU_DOMAIN_IDENTITY: device requires an identical mapping domain
- *  - 0: both identity and dynamic domains work for this device
- */
 static int device_def_domain_type(struct device *dev)
 {
 	if (dev_is_pci(dev)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/7] iommu/vt-d: Remove control over Execute-Requested requests
  2024-07-02 13:08 [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Lu Baolu
  2024-07-02 13:08 ` [PATCH 1/7] iommu/vt-d: Handle volatile descriptor status read Lu Baolu
  2024-07-02 13:08 ` [PATCH 2/7] iommu/vt-d: Remove comment for def_domain_type Lu Baolu
@ 2024-07-02 13:08 ` Lu Baolu
  2024-07-02 13:08 ` [PATCH 4/7] iommu/vt-d: Downgrade warning for pre-enabled IR Lu Baolu
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Lu Baolu @ 2024-07-02 13:08 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon; +Cc: Jacob Pan, iommu, linux-kernel

The VT-d specification has removed architectural support of the requests
with pasid with a value of 1 for Execute-Requested (ER). And the NXE bit
in the pasid table entry and XD bit in the first-stage paging Entries are
deprecated accordingly.

Remove the programming of these bits to make it consistent with the spec.

Suggested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240624032351.249858-1-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.h |  6 ++----
 drivers/iommu/intel/pasid.h | 10 ----------
 drivers/iommu/intel/iommu.c |  4 ++--
 drivers/iommu/intel/pasid.c |  1 -
 4 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index eaf015b4353b..9a3b064126de 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -49,7 +49,6 @@
 #define DMA_FL_PTE_US		BIT_ULL(2)
 #define DMA_FL_PTE_ACCESS	BIT_ULL(5)
 #define DMA_FL_PTE_DIRTY	BIT_ULL(6)
-#define DMA_FL_PTE_XD		BIT_ULL(63)
 
 #define DMA_SL_PTE_DIRTY_BIT	9
 #define DMA_SL_PTE_DIRTY	BIT_ULL(DMA_SL_PTE_DIRTY_BIT)
@@ -831,11 +830,10 @@ static inline void dma_clear_pte(struct dma_pte *pte)
 static inline u64 dma_pte_addr(struct dma_pte *pte)
 {
 #ifdef CONFIG_64BIT
-	return pte->val & VTD_PAGE_MASK & (~DMA_FL_PTE_XD);
+	return pte->val & VTD_PAGE_MASK;
 #else
 	/* Must have a full atomic 64-bit read */
-	return  __cmpxchg64(&pte->val, 0ULL, 0ULL) &
-			VTD_PAGE_MASK & (~DMA_FL_PTE_XD);
+	return  __cmpxchg64(&pte->val, 0ULL, 0ULL) & VTD_PAGE_MASK;
 #endif
 }
 
diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index da9978fef7ac..dde6d3ba5ae0 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -247,16 +247,6 @@ static inline void pasid_set_page_snoop(struct pasid_entry *pe, bool value)
 	pasid_set_bits(&pe->val[1], 1 << 23, value << 23);
 }
 
-/*
- * Setup No Execute Enable bit (Bit 133) of a scalable mode PASID
- * entry. It is required when XD bit of the first level page table
- * entry is about to be set.
- */
-static inline void pasid_set_nxe(struct pasid_entry *pe)
-{
-	pasid_set_bits(&pe->val[2], 1 << 5, 1 << 5);
-}
-
 /*
  * Setup the Page Snoop (PGSNP) field (Bit 88) of a scalable mode
  * PASID entry.
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index abf0097f899d..1b5519dfa085 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -854,7 +854,7 @@ static struct dma_pte *pfn_to_dma_pte(struct dmar_domain *domain,
 			domain_flush_cache(domain, tmp_page, VTD_PAGE_SIZE);
 			pteval = ((uint64_t)virt_to_dma_pfn(tmp_page) << VTD_PAGE_SHIFT) | DMA_PTE_READ | DMA_PTE_WRITE;
 			if (domain->use_first_level)
-				pteval |= DMA_FL_PTE_XD | DMA_FL_PTE_US | DMA_FL_PTE_ACCESS;
+				pteval |= DMA_FL_PTE_US | DMA_FL_PTE_ACCESS;
 
 			tmp = 0ULL;
 			if (!try_cmpxchg64(&pte->val, &tmp, pteval))
@@ -1872,7 +1872,7 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 	attr = prot & (DMA_PTE_READ | DMA_PTE_WRITE | DMA_PTE_SNP);
 	attr |= DMA_FL_PTE_PRESENT;
 	if (domain->use_first_level) {
-		attr |= DMA_FL_PTE_XD | DMA_FL_PTE_US | DMA_FL_PTE_ACCESS;
+		attr |= DMA_FL_PTE_US | DMA_FL_PTE_ACCESS;
 		if (prot & DMA_PTE_WRITE)
 			attr |= DMA_FL_PTE_DIRTY;
 	}
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index abce19e2ad6f..aabcdf756581 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -333,7 +333,6 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 	pasid_set_domain_id(pte, did);
 	pasid_set_address_width(pte, iommu->agaw);
 	pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
-	pasid_set_nxe(pte);
 
 	/* Setup Present and PASID Granular Transfer Type: */
 	pasid_set_translation_type(pte, PASID_ENTRY_PGTT_FL_ONLY);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/7] iommu/vt-d: Downgrade warning for pre-enabled IR
  2024-07-02 13:08 [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Lu Baolu
                   ` (2 preceding siblings ...)
  2024-07-02 13:08 ` [PATCH 3/7] iommu/vt-d: Remove control over Execute-Requested requests Lu Baolu
@ 2024-07-02 13:08 ` Lu Baolu
  2024-07-02 13:08 ` [PATCH 5/7] iommu/vt-d: Add helper to allocate paging domain Lu Baolu
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Lu Baolu @ 2024-07-02 13:08 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon; +Cc: Jacob Pan, iommu, linux-kernel

Emitting a warning is overkill in intel_setup_irq_remapping() since the
interrupt remapping is pre-enabled. For example, there's no guarantee
that kexec will explicitly disable interrupt remapping before booting a
new kernel. As a result, users are seeing warning messages like below
when they kexec boot a kernel, though there is nothing wrong:

 DMAR-IR: IRQ remapping was enabled on dmar18 but we are not in kdump mode
 DMAR-IR: IRQ remapping was enabled on dmar17 but we are not in kdump mode
 DMAR-IR: IRQ remapping was enabled on dmar16 but we are not in kdump mode
 ... ...

Downgrade the severity of this message to avoid user confusion.

CC: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/linux-iommu/5517f76a-94ad-452c-bae6-34ecc0ec4831@molgen.mpg.de/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240625043912.258036-1-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/irq_remapping.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index e4a70886678c..e090ca07364b 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -597,8 +597,8 @@ static int intel_setup_irq_remapping(struct intel_iommu *iommu)
 
 	if (ir_pre_enabled(iommu)) {
 		if (!is_kdump_kernel()) {
-			pr_warn("IRQ remapping was enabled on %s but we are not in kdump mode\n",
-				iommu->name);
+			pr_info_once("IRQ remapping was enabled on %s but we are not in kdump mode\n",
+				     iommu->name);
 			clear_ir_pre_enabled(iommu);
 			iommu_disable_irq_remapping(iommu);
 		} else if (iommu_load_old_irte(iommu))
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/7] iommu/vt-d: Add helper to allocate paging domain
  2024-07-02 13:08 [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Lu Baolu
                   ` (3 preceding siblings ...)
  2024-07-02 13:08 ` [PATCH 4/7] iommu/vt-d: Downgrade warning for pre-enabled IR Lu Baolu
@ 2024-07-02 13:08 ` Lu Baolu
  2024-07-02 13:25   ` Yi Liu
  2024-07-02 13:08 ` [PATCH 6/7] iommu/vt-d: Add helper to flush caches for context change Lu Baolu
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Lu Baolu @ 2024-07-02 13:08 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon; +Cc: Jacob Pan, iommu, linux-kernel

The domain_alloc_user operation is currently implemented by allocating a
paging domain using iommu_domain_alloc(). This is because it needs to fully
initialize the domain before return. Add a helper to do this to avoid using
iommu_domain_alloc().

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240610085555.88197-16-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.c | 90 +++++++++++++++++++++++++++++++++----
 1 file changed, 81 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 1b5519dfa085..1f0d6892a0b6 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3622,6 +3622,79 @@ static struct iommu_domain blocking_domain = {
 	}
 };
 
+static int iommu_superpage_capability(struct intel_iommu *iommu, bool first_stage)
+{
+	if (!intel_iommu_superpage)
+		return 0;
+
+	if (first_stage)
+		return cap_fl1gp_support(iommu->cap) ? 2 : 1;
+
+	return fls(cap_super_page_val(iommu->cap));
+}
+
+static struct dmar_domain *paging_domain_alloc(struct device *dev, bool first_stage)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct intel_iommu *iommu = info->iommu;
+	struct dmar_domain *domain;
+	int addr_width;
+
+	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+	if (!domain)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&domain->devices);
+	INIT_LIST_HEAD(&domain->dev_pasids);
+	INIT_LIST_HEAD(&domain->cache_tags);
+	spin_lock_init(&domain->lock);
+	spin_lock_init(&domain->cache_lock);
+	xa_init(&domain->iommu_array);
+
+	domain->nid = dev_to_node(dev);
+	domain->has_iotlb_device = info->ats_enabled;
+	domain->use_first_level = first_stage;
+
+	/* calculate the address width */
+	addr_width = agaw_to_width(iommu->agaw);
+	if (addr_width > cap_mgaw(iommu->cap))
+		addr_width = cap_mgaw(iommu->cap);
+	domain->gaw = addr_width;
+	domain->agaw = iommu->agaw;
+	domain->max_addr = __DOMAIN_MAX_ADDR(addr_width);
+
+	/* iommu memory access coherency */
+	domain->iommu_coherency = iommu_paging_structure_coherency(iommu);
+
+	/* pagesize bitmap */
+	domain->domain.pgsize_bitmap = SZ_4K;
+	domain->iommu_superpage = iommu_superpage_capability(iommu, first_stage);
+	domain->domain.pgsize_bitmap |= domain_super_pgsize_bitmap(domain);
+
+	/*
+	 * IOVA aperture: First-level translation restricts the input-address
+	 * to a canonical address (i.e., address bits 63:N have the same value
+	 * as address bit [N-1], where N is 48-bits with 4-level paging and
+	 * 57-bits with 5-level paging). Hence, skip bit [N-1].
+	 */
+	domain->domain.geometry.force_aperture = true;
+	domain->domain.geometry.aperture_start = 0;
+	if (first_stage)
+		domain->domain.geometry.aperture_end = __DOMAIN_MAX_ADDR(domain->gaw - 1);
+	else
+		domain->domain.geometry.aperture_end = __DOMAIN_MAX_ADDR(domain->gaw);
+
+	/* always allocate the top pgd */
+	domain->pgd = iommu_alloc_page_node(domain->nid, GFP_KERNEL);
+	if (!domain->pgd) {
+		kfree(domain);
+		return ERR_PTR(-ENOMEM);
+	}
+	domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
+
+	return domain;
+}
+
 static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 {
 	struct dmar_domain *dmar_domain;
@@ -3684,15 +3757,14 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
 	if (user_data || (dirty_tracking && !ssads_supported(iommu)))
 		return ERR_PTR(-EOPNOTSUPP);
 
-	/*
-	 * domain_alloc_user op needs to fully initialize a domain before
-	 * return, so uses iommu_domain_alloc() here for simple.
-	 */
-	domain = iommu_domain_alloc(dev->bus);
-	if (!domain)
-		return ERR_PTR(-ENOMEM);
-
-	dmar_domain = to_dmar_domain(domain);
+	/* Do not use first stage for user domain translation. */
+	dmar_domain = paging_domain_alloc(dev, false);
+	if (IS_ERR(dmar_domain))
+		return ERR_CAST(dmar_domain);
+	domain = &dmar_domain->domain;
+	domain->type = IOMMU_DOMAIN_UNMANAGED;
+	domain->owner = &intel_iommu_ops;
+	domain->ops = intel_iommu_ops.default_domain_ops;
 
 	if (nested_parent) {
 		dmar_domain->nested_parent = true;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/7] iommu/vt-d: Add helper to flush caches for context change
  2024-07-02 13:08 [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Lu Baolu
                   ` (4 preceding siblings ...)
  2024-07-02 13:08 ` [PATCH 5/7] iommu/vt-d: Add helper to allocate paging domain Lu Baolu
@ 2024-07-02 13:08 ` Lu Baolu
  2024-08-14 22:27   ` Alex Williamson
  2024-07-02 13:08 ` [PATCH 7/7] iommu/vt-d: Refactor PCI PRI enabling/disabling callbacks Lu Baolu
  2024-07-03 16:13 ` [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Will Deacon
  7 siblings, 1 reply; 14+ messages in thread
From: Lu Baolu @ 2024-07-02 13:08 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon; +Cc: Jacob Pan, iommu, linux-kernel

This helper is used to flush the related caches following a change in a
context table entry that was previously present. The VT-d specification
provides guidance for such invalidations in section 6.5.3.3.

This helper replaces the existing open code in the code paths where a
present context entry is being torn down.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240701112317.94022-2-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.h |   4 ++
 drivers/iommu/intel/iommu.c |  32 +----------
 drivers/iommu/intel/pasid.c | 106 +++++++++++++++++++++++++++++-------
 3 files changed, 92 insertions(+), 50 deletions(-)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 9a3b064126de..63eb3306c025 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1143,6 +1143,10 @@ void cache_tag_flush_all(struct dmar_domain *domain);
 void cache_tag_flush_range_np(struct dmar_domain *domain, unsigned long start,
 			      unsigned long end);
 
+void intel_context_flush_present(struct device_domain_info *info,
+				 struct context_entry *context,
+				 bool affect_domains);
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 void intel_svm_check(struct intel_iommu *iommu);
 int intel_svm_enable_prq(struct intel_iommu *iommu);
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 1f0d6892a0b6..e84b0fdca107 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1359,21 +1359,6 @@ static void iommu_disable_pci_caps(struct device_domain_info *info)
 	}
 }
 
-static void __iommu_flush_dev_iotlb(struct device_domain_info *info,
-				    u64 addr, unsigned int mask)
-{
-	u16 sid, qdep;
-
-	if (!info || !info->ats_enabled)
-		return;
-
-	sid = info->bus << 8 | info->devfn;
-	qdep = info->ats_qdep;
-	qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
-			   qdep, addr, mask);
-	quirk_extra_dev_tlb_flush(info, addr, mask, IOMMU_NO_PASID, qdep);
-}
-
 static void intel_flush_iotlb_all(struct iommu_domain *domain)
 {
 	cache_tag_flush_all(to_dmar_domain(domain));
@@ -1959,7 +1944,6 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
 {
 	struct intel_iommu *iommu = info->iommu;
 	struct context_entry *context;
-	u16 did_old;
 
 	spin_lock(&iommu->lock);
 	context = iommu_context_addr(iommu, bus, devfn, 0);
@@ -1968,24 +1952,10 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
 		return;
 	}
 
-	did_old = context_domain_id(context);
-
 	context_clear_entry(context);
 	__iommu_flush_cache(iommu, context, sizeof(*context));
 	spin_unlock(&iommu->lock);
-	iommu->flush.flush_context(iommu,
-				   did_old,
-				   (((u16)bus) << 8) | devfn,
-				   DMA_CCMD_MASK_NOBIT,
-				   DMA_CCMD_DEVICE_INVL);
-
-	iommu->flush.flush_iotlb(iommu,
-				 did_old,
-				 0,
-				 0,
-				 DMA_TLB_DSI_FLUSH);
-
-	__iommu_flush_dev_iotlb(info, 0, MAX_AGAW_PFN_WIDTH);
+	intel_context_flush_present(info, context, true);
 }
 
 static int domain_setup_first_level(struct intel_iommu *iommu,
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index aabcdf756581..d6623d2c2050 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -691,25 +691,7 @@ static void device_pasid_table_teardown(struct device *dev, u8 bus, u8 devfn)
 	context_clear_entry(context);
 	__iommu_flush_cache(iommu, context, sizeof(*context));
 	spin_unlock(&iommu->lock);
-
-	/*
-	 * Cache invalidation for changes to a scalable-mode context table
-	 * entry.
-	 *
-	 * Section 6.5.3.3 of the VT-d spec:
-	 * - Device-selective context-cache invalidation;
-	 * - Domain-selective PASID-cache invalidation to affected domains
-	 *   (can be skipped if all PASID entries were not-present);
-	 * - Domain-selective IOTLB invalidation to affected domains;
-	 * - Global Device-TLB invalidation to affected functions.
-	 *
-	 * The iommu has been parked in the blocking state. All domains have
-	 * been detached from the device or PASID. The PASID and IOTLB caches
-	 * have been invalidated during the domain detach path.
-	 */
-	iommu->flush.flush_context(iommu, 0, PCI_DEVID(bus, devfn),
-				   DMA_CCMD_MASK_NOBIT, DMA_CCMD_DEVICE_INVL);
-	devtlb_invalidation_with_pasid(iommu, dev, IOMMU_NO_PASID);
+	intel_context_flush_present(info, context, false);
 }
 
 static int pci_pasid_table_teardown(struct pci_dev *pdev, u16 alias, void *data)
@@ -871,3 +853,89 @@ int intel_pasid_setup_sm_context(struct device *dev)
 
 	return pci_for_each_dma_alias(to_pci_dev(dev), pci_pasid_table_setup, dev);
 }
+
+/*
+ * Global Device-TLB invalidation following changes in a context entry which
+ * was present.
+ */
+static void __context_flush_dev_iotlb(struct device_domain_info *info)
+{
+	if (!info->ats_enabled)
+		return;
+
+	qi_flush_dev_iotlb(info->iommu, PCI_DEVID(info->bus, info->devfn),
+			   info->pfsid, info->ats_qdep, 0, MAX_AGAW_PFN_WIDTH);
+
+	/*
+	 * There is no guarantee that the device DMA is stopped when it reaches
+	 * here. Therefore, always attempt the extra device TLB invalidation
+	 * quirk. The impact on performance is acceptable since this is not a
+	 * performance-critical path.
+	 */
+	quirk_extra_dev_tlb_flush(info, 0, MAX_AGAW_PFN_WIDTH, IOMMU_NO_PASID,
+				  info->ats_qdep);
+}
+
+/*
+ * Cache invalidations after change in a context table entry that was present
+ * according to the Spec 6.5.3.3 (Guidance to Software for Invalidations). If
+ * IOMMU is in scalable mode and all PASID table entries of the device were
+ * non-present, set flush_domains to false. Otherwise, true.
+ */
+void intel_context_flush_present(struct device_domain_info *info,
+				 struct context_entry *context,
+				 bool flush_domains)
+{
+	struct intel_iommu *iommu = info->iommu;
+	u16 did = context_domain_id(context);
+	struct pasid_entry *pte;
+	int i;
+
+	/*
+	 * Device-selective context-cache invalidation. The Domain-ID field
+	 * of the Context-cache Invalidate Descriptor is ignored by hardware
+	 * when operating in scalable mode. Therefore the @did value doesn't
+	 * matter in scalable mode.
+	 */
+	iommu->flush.flush_context(iommu, did, PCI_DEVID(info->bus, info->devfn),
+				   DMA_CCMD_MASK_NOBIT, DMA_CCMD_DEVICE_INVL);
+
+	/*
+	 * For legacy mode:
+	 * - Domain-selective IOTLB invalidation
+	 * - Global Device-TLB invalidation to all affected functions
+	 */
+	if (!sm_supported(iommu)) {
+		iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
+		__context_flush_dev_iotlb(info);
+
+		return;
+	}
+
+	/*
+	 * For scalable mode:
+	 * - Domain-selective PASID-cache invalidation to affected domains
+	 * - Domain-selective IOTLB invalidation to affected domains
+	 * - Global Device-TLB invalidation to affected functions
+	 */
+	if (flush_domains) {
+		/*
+		 * If the IOMMU is running in scalable mode and there might
+		 * be potential PASID translations, the caller should hold
+		 * the lock to ensure that context changes and cache flushes
+		 * are atomic.
+		 */
+		assert_spin_locked(&iommu->lock);
+		for (i = 0; i < info->pasid_table->max_pasid; i++) {
+			pte = intel_pasid_get_entry(info->dev, i);
+			if (!pte || !pasid_pte_is_present(pte))
+				continue;
+
+			did = pasid_get_domain_id(pte);
+			qi_flush_pasid_cache(iommu, did, QI_PC_ALL_PASIDS, 0);
+			iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
+		}
+	}
+
+	__context_flush_dev_iotlb(info);
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 7/7] iommu/vt-d: Refactor PCI PRI enabling/disabling callbacks
  2024-07-02 13:08 [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Lu Baolu
                   ` (5 preceding siblings ...)
  2024-07-02 13:08 ` [PATCH 6/7] iommu/vt-d: Add helper to flush caches for context change Lu Baolu
@ 2024-07-02 13:08 ` Lu Baolu
  2024-07-03 16:13 ` [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Will Deacon
  7 siblings, 0 replies; 14+ messages in thread
From: Lu Baolu @ 2024-07-02 13:08 UTC (permalink / raw)
  To: Joerg Roedel, Will Deacon; +Cc: Jacob Pan, iommu, linux-kernel

Commit 0095bf83554f8 ("iommu: Improve iopf_queue_remove_device()")
specified the flow for disabling the PRI on a device. Refactor the
PRI callbacks in the intel iommu driver to better manage PRI
enabling and disabling and align it with the device queue interfaces
in the iommu core.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240701112317.94022-3-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.h |  9 ++++++
 drivers/iommu/intel/iommu.c | 57 +++++++++++++++++++++++++++++++++----
 drivers/iommu/intel/pasid.c |  2 --
 3 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 63eb3306c025..b67c14da1240 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1045,6 +1045,15 @@ static inline void context_set_sm_pre(struct context_entry *context)
 	context->lo |= BIT_ULL(4);
 }
 
+/*
+ * Clear the PRE(Page Request Enable) field of a scalable mode context
+ * entry.
+ */
+static inline void context_clear_sm_pre(struct context_entry *context)
+{
+	context->lo &= ~BIT_ULL(4);
+}
+
 /* Returns a number of VTD pages, but aligned to MM page size */
 static inline unsigned long aligned_nrpages(unsigned long host_addr, size_t size)
 {
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e84b0fdca107..523407f6f6b2 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4244,6 +4244,37 @@ static int intel_iommu_enable_sva(struct device *dev)
 	return 0;
 }
 
+static int context_flip_pri(struct device_domain_info *info, bool enable)
+{
+	struct intel_iommu *iommu = info->iommu;
+	u8 bus = info->bus, devfn = info->devfn;
+	struct context_entry *context;
+
+	spin_lock(&iommu->lock);
+	if (context_copied(iommu, bus, devfn)) {
+		spin_unlock(&iommu->lock);
+		return -EINVAL;
+	}
+
+	context = iommu_context_addr(iommu, bus, devfn, false);
+	if (!context || !context_present(context)) {
+		spin_unlock(&iommu->lock);
+		return -ENODEV;
+	}
+
+	if (enable)
+		context_set_sm_pre(context);
+	else
+		context_clear_sm_pre(context);
+
+	if (!ecap_coherent(iommu->ecap))
+		clflush_cache_range(context, sizeof(*context));
+	intel_context_flush_present(info, context, true);
+	spin_unlock(&iommu->lock);
+
+	return 0;
+}
+
 static int intel_iommu_enable_iopf(struct device *dev)
 {
 	struct pci_dev *pdev = dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
@@ -4273,15 +4304,23 @@ static int intel_iommu_enable_iopf(struct device *dev)
 	if (ret)
 		return ret;
 
+	ret = context_flip_pri(info, true);
+	if (ret)
+		goto err_remove_device;
+
 	ret = pci_enable_pri(pdev, PRQ_DEPTH);
-	if (ret) {
-		iopf_queue_remove_device(iommu->iopf_queue, dev);
-		return ret;
-	}
+	if (ret)
+		goto err_clear_pri;
 
 	info->pri_enabled = 1;
 
 	return 0;
+err_clear_pri:
+	context_flip_pri(info, false);
+err_remove_device:
+	iopf_queue_remove_device(iommu->iopf_queue, dev);
+
+	return ret;
 }
 
 static int intel_iommu_disable_iopf(struct device *dev)
@@ -4292,6 +4331,15 @@ static int intel_iommu_disable_iopf(struct device *dev)
 	if (!info->pri_enabled)
 		return -EINVAL;
 
+	/* Disable new PRI reception: */
+	context_flip_pri(info, false);
+
+	/*
+	 * Remove device from fault queue and acknowledge all outstanding
+	 * PRQs to the device:
+	 */
+	iopf_queue_remove_device(iommu->iopf_queue, dev);
+
 	/*
 	 * PCIe spec states that by clearing PRI enable bit, the Page
 	 * Request Interface will not issue new page requests, but has
@@ -4302,7 +4350,6 @@ static int intel_iommu_disable_iopf(struct device *dev)
 	 */
 	pci_disable_pri(to_pci_dev(dev));
 	info->pri_enabled = 0;
-	iopf_queue_remove_device(iommu->iopf_queue, dev);
 
 	return 0;
 }
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index d6623d2c2050..9a7b5668c723 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -749,8 +749,6 @@ static int context_entry_set_pasid_table(struct context_entry *context,
 
 	if (info->ats_supported)
 		context_set_sm_dte(context);
-	if (info->pri_supported)
-		context_set_sm_pre(context);
 	if (info->pasid_supported)
 		context_set_pasid(context);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/7] iommu/vt-d: Add helper to allocate paging domain
  2024-07-02 13:08 ` [PATCH 5/7] iommu/vt-d: Add helper to allocate paging domain Lu Baolu
@ 2024-07-02 13:25   ` Yi Liu
  0 siblings, 0 replies; 14+ messages in thread
From: Yi Liu @ 2024-07-02 13:25 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel, Will Deacon; +Cc: Jacob Pan, iommu, linux-kernel

On 2024/7/2 21:08, Lu Baolu wrote:
> The domain_alloc_user operation is currently implemented by allocating a
> paging domain using iommu_domain_alloc(). This is because it needs to fully
> initialize the domain before return. Add a helper to do this to avoid using
> iommu_domain_alloc().
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Link: https://lore.kernel.org/r/20240610085555.88197-16-baolu.lu@linux.intel.com
> ---
>   drivers/iommu/intel/iommu.c | 90 +++++++++++++++++++++++++++++++++----
>   1 file changed, 81 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 1b5519dfa085..1f0d6892a0b6 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -3622,6 +3622,79 @@ static struct iommu_domain blocking_domain = {
>   	}
>   };
>   
> +static int iommu_superpage_capability(struct intel_iommu *iommu, bool first_stage)
> +{
> +	if (!intel_iommu_superpage)
> +		return 0;
> +
> +	if (first_stage)
> +		return cap_fl1gp_support(iommu->cap) ? 2 : 1;
> +
> +	return fls(cap_super_page_val(iommu->cap));
> +}
> +
> +static struct dmar_domain *paging_domain_alloc(struct device *dev, bool first_stage)
> +{
> +	struct device_domain_info *info = dev_iommu_priv_get(dev);
> +	struct intel_iommu *iommu = info->iommu;
> +	struct dmar_domain *domain;
> +	int addr_width;
> +
> +	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> +	if (!domain)
> +		return ERR_PTR(-ENOMEM);
> +
> +	INIT_LIST_HEAD(&domain->devices);
> +	INIT_LIST_HEAD(&domain->dev_pasids);
> +	INIT_LIST_HEAD(&domain->cache_tags);
> +	spin_lock_init(&domain->lock);
> +	spin_lock_init(&domain->cache_lock);
> +	xa_init(&domain->iommu_array);
> +
> +	domain->nid = dev_to_node(dev);
> +	domain->has_iotlb_device = info->ats_enabled;
> +	domain->use_first_level = first_stage;
> +
> +	/* calculate the address width */
> +	addr_width = agaw_to_width(iommu->agaw);
> +	if (addr_width > cap_mgaw(iommu->cap))
> +		addr_width = cap_mgaw(iommu->cap);
> +	domain->gaw = addr_width;
> +	domain->agaw = iommu->agaw;
> +	domain->max_addr = __DOMAIN_MAX_ADDR(addr_width);
> +
> +	/* iommu memory access coherency */
> +	domain->iommu_coherency = iommu_paging_structure_coherency(iommu);
> +
> +	/* pagesize bitmap */
> +	domain->domain.pgsize_bitmap = SZ_4K;
> +	domain->iommu_superpage = iommu_superpage_capability(iommu, first_stage);
> +	domain->domain.pgsize_bitmap |= domain_super_pgsize_bitmap(domain);
> +
> +	/*
> +	 * IOVA aperture: First-level translation restricts the input-address
> +	 * to a canonical address (i.e., address bits 63:N have the same value
> +	 * as address bit [N-1], where N is 48-bits with 4-level paging and
> +	 * 57-bits with 5-level paging). Hence, skip bit [N-1].
> +	 */
> +	domain->domain.geometry.force_aperture = true;
> +	domain->domain.geometry.aperture_start = 0;
> +	if (first_stage)
> +		domain->domain.geometry.aperture_end = __DOMAIN_MAX_ADDR(domain->gaw - 1);
> +	else
> +		domain->domain.geometry.aperture_end = __DOMAIN_MAX_ADDR(domain->gaw);
> +
> +	/* always allocate the top pgd */
> +	domain->pgd = iommu_alloc_page_node(domain->nid, GFP_KERNEL);
> +	if (!domain->pgd) {
> +		kfree(domain);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +	domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
> +
> +	return domain;
> +}
> +
>   static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
>   {
>   	struct dmar_domain *dmar_domain;
> @@ -3684,15 +3757,14 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
>   	if (user_data || (dirty_tracking && !ssads_supported(iommu)))
>   		return ERR_PTR(-EOPNOTSUPP);
>   
> -	/*
> -	 * domain_alloc_user op needs to fully initialize a domain before
> -	 * return, so uses iommu_domain_alloc() here for simple.

I think it is still valuable to note the requirement of full initialize
a domain. is it? Otherwise, looks good to me.

Reviewed-by: Yi Liu <yi.l.liu@intel.com>

> -	 */
> -	domain = iommu_domain_alloc(dev->bus);
> -	if (!domain)
> -		return ERR_PTR(-ENOMEM);
> -
> -	dmar_domain = to_dmar_domain(domain);
> +	/* Do not use first stage for user domain translation. */
> +	dmar_domain = paging_domain_alloc(dev, false);
> +	if (IS_ERR(dmar_domain))
> +		return ERR_CAST(dmar_domain);
> +	domain = &dmar_domain->domain;
> +	domain->type = IOMMU_DOMAIN_UNMANAGED;
> +	domain->owner = &intel_iommu_ops;
> +	domain->ops = intel_iommu_ops.default_domain_ops;
>   
>   	if (nested_parent) {
>   		dmar_domain->nested_parent = true;

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11
  2024-07-02 13:08 [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Lu Baolu
                   ` (6 preceding siblings ...)
  2024-07-02 13:08 ` [PATCH 7/7] iommu/vt-d: Refactor PCI PRI enabling/disabling callbacks Lu Baolu
@ 2024-07-03 16:13 ` Will Deacon
  7 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2024-07-03 16:13 UTC (permalink / raw)
  To: Joerg Roedel, Lu Baolu
  Cc: catalin.marinas, kernel-team, Will Deacon, Jacob Pan, iommu,
	linux-kernel

On Tue, 02 Jul 2024 21:08:32 +0800, Lu Baolu wrote:
> The following changes have been queued for v6.11-rc1. They are all about
> small refactoring, including:
> 
>  - Use READ_ONCE to read volatile descriptor status
>  - Remove support for handling Execute-Requested requests
>  - Downgrade warning for pre-enabled interrupt remapping
>  - Remove calling iommu_domain_alloc()
>  - Refactor the PRI enable/disable flows
>  - Cleanups
> 
> [...]

Applied to iommu (intel/vt-d), thanks!

[1/7] iommu/vt-d: Handle volatile descriptor status read
      https://git.kernel.org/iommu/c/b5e86a95541c
[2/7] iommu/vt-d: Remove comment for def_domain_type
      https://git.kernel.org/iommu/c/5fbf97371dc0
[3/7] iommu/vt-d: Remove control over Execute-Requested requests
      https://git.kernel.org/iommu/c/e995fcde6070
[4/7] iommu/vt-d: Downgrade warning for pre-enabled IR
      https://git.kernel.org/iommu/c/804f98e224e4
[5/7] iommu/vt-d: Add helper to allocate paging domain
      https://git.kernel.org/iommu/c/2b989ab9bc89
[6/7] iommu/vt-d: Add helper to flush caches for context change
      https://git.kernel.org/iommu/c/f90584f4beb8
[7/7] iommu/vt-d: Refactor PCI PRI enabling/disabling callbacks
      https://git.kernel.org/iommu/c/3753311c9190

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 6/7] iommu/vt-d: Add helper to flush caches for context change
  2024-07-02 13:08 ` [PATCH 6/7] iommu/vt-d: Add helper to flush caches for context change Lu Baolu
@ 2024-08-14 22:27   ` Alex Williamson
  2024-08-15  4:47     ` Baolu Lu
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Williamson @ 2024-08-14 22:27 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Joerg Roedel, Will Deacon, Jacob Pan, iommu, linux-kernel


Hi Baolu,

This appears to be non-functional and breaks device assignment...

On Tue,  2 Jul 2024 21:08:38 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 1f0d6892a0b6..e84b0fdca107 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
...
> @@ -1959,7 +1944,6 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
>  {
>  	struct intel_iommu *iommu = info->iommu;
>  	struct context_entry *context;
> -	u16 did_old;
>  
>  	spin_lock(&iommu->lock);
>  	context = iommu_context_addr(iommu, bus, devfn, 0);
> @@ -1968,24 +1952,10 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
>  		return;
>  	}
>  
> -	did_old = context_domain_id(context);
> -

    static inline int context_domain_id(struct context_entry *c)
    {
            return((c->hi >> 8) & 0xffff);
    }

>  	context_clear_entry(context);

    static inline void context_clear_entry(struct context_entry *context)
    {       
            context->lo = 0;
            context->hi = 0;
    }

>  	__iommu_flush_cache(iommu, context, sizeof(*context));
>  	spin_unlock(&iommu->lock);
> -	iommu->flush.flush_context(iommu,
> -				   did_old,
> -				   (((u16)bus) << 8) | devfn,
> -				   DMA_CCMD_MASK_NOBIT,
> -				   DMA_CCMD_DEVICE_INVL);
> -
> -	iommu->flush.flush_iotlb(iommu,
> -				 did_old,
> -				 0,
> -				 0,
> -				 DMA_TLB_DSI_FLUSH);
> -
> -	__iommu_flush_dev_iotlb(info, 0, MAX_AGAW_PFN_WIDTH);
> +	intel_context_flush_present(info, context, true);
>  }
>  
>  static int domain_setup_first_level(struct intel_iommu *iommu,
> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> index aabcdf756581..d6623d2c2050 100644
> --- a/drivers/iommu/intel/pasid.c
> +++ b/drivers/iommu/intel/pasid.c
...
> +void intel_context_flush_present(struct device_domain_info *info,
> +				 struct context_entry *context,
> +				 bool flush_domains)
> +{
> +	struct intel_iommu *iommu = info->iommu;
> +	u16 did = context_domain_id(context);

Whoops, did is always zero here, intel_context_flush_present() precedes
all callers.

> +	struct pasid_entry *pte;
> +	int i;
> +
> +	/*
> +	 * Device-selective context-cache invalidation. The Domain-ID field
> +	 * of the Context-cache Invalidate Descriptor is ignored by hardware
> +	 * when operating in scalable mode. Therefore the @did value doesn't
> +	 * matter in scalable mode.
> +	 */
> +	iommu->flush.flush_context(iommu, did, PCI_DEVID(info->bus, info->devfn),
                                          ^^^ Bogus
> +				   DMA_CCMD_MASK_NOBIT, DMA_CCMD_DEVICE_INVL);
> +
> +	/*
> +	 * For legacy mode:
> +	 * - Domain-selective IOTLB invalidation
> +	 * - Global Device-TLB invalidation to all affected functions
> +	 */
> +	if (!sm_supported(iommu)) {
> +		iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
                                                ^^^ etc...
> +		__context_flush_dev_iotlb(info);
> +
> +		return;
> +	}

Thanks,
Alex


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 6/7] iommu/vt-d: Add helper to flush caches for context change
  2024-08-14 22:27   ` Alex Williamson
@ 2024-08-15  4:47     ` Baolu Lu
  2024-08-15 12:34       ` Alex Williamson
  0 siblings, 1 reply; 14+ messages in thread
From: Baolu Lu @ 2024-08-15  4:47 UTC (permalink / raw)
  To: Alex Williamson
  Cc: baolu.lu, Joerg Roedel, Will Deacon, Jacob Pan, iommu,
	linux-kernel

On 2024/8/15 6:27, Alex Williamson wrote:
> Hi Baolu,

Hi Alex,

> This appears to be non-functional and breaks device assignment...

Yes. This is broken. Thanks for pointing it out.

Perhaps I can fix it by passing domain id to the helper? Something like
below:

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 01002ae2a091..b3b295e60626 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1152,7 +1152,7 @@ void cache_tag_flush_range_np(struct dmar_domain 
*domain, unsigned long start,

  void intel_context_flush_present(struct device_domain_info *info,
  				 struct context_entry *context,
-				 bool affect_domains);
+				 u16 did, bool affect_domains);

  #ifdef CONFIG_INTEL_IOMMU_SVM
  void intel_svm_check(struct intel_iommu *iommu);
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 159da629349c..34006b6e89eb 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1903,6 +1903,7 @@ static void domain_context_clear_one(struct 
device_domain_info *info, u8 bus, u8
  {
  	struct intel_iommu *iommu = info->iommu;
  	struct context_entry *context;
+	u16 did;

  	spin_lock(&iommu->lock);
  	context = iommu_context_addr(iommu, bus, devfn, 0);
@@ -1911,10 +1912,11 @@ static void domain_context_clear_one(struct 
device_domain_info *info, u8 bus, u8
  		return;
  	}

+	did = context_domain_id(context);
  	context_clear_entry(context);
  	__iommu_flush_cache(iommu, context, sizeof(*context));
  	spin_unlock(&iommu->lock);
-	intel_context_flush_present(info, context, true);
+	intel_context_flush_present(info, context, did, true);
  }

  static int domain_setup_first_level(struct intel_iommu *iommu,
@@ -4077,6 +4079,7 @@ static int context_flip_pri(struct 
device_domain_info *info, bool enable)
  	struct intel_iommu *iommu = info->iommu;
  	u8 bus = info->bus, devfn = info->devfn;
  	struct context_entry *context;
+	u16 did;

  	spin_lock(&iommu->lock);
  	if (context_copied(iommu, bus, devfn)) {
@@ -4089,6 +4092,7 @@ static int context_flip_pri(struct 
device_domain_info *info, bool enable)
  		spin_unlock(&iommu->lock);
  		return -ENODEV;
  	}
+	did = context_domain_id(context);

  	if (enable)
  		context_set_sm_pre(context);
@@ -4097,7 +4101,7 @@ static int context_flip_pri(struct 
device_domain_info *info, bool enable)

  	if (!ecap_coherent(iommu->ecap))
  		clflush_cache_range(context, sizeof(*context));
-	intel_context_flush_present(info, context, true);
+	intel_context_flush_present(info, context, did, true);
  	spin_unlock(&iommu->lock);

  	return 0;
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 5792c817cefa..b51fc268dc84 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -683,6 +683,7 @@ static void device_pasid_table_teardown(struct 
device *dev, u8 bus, u8 devfn)
  	struct device_domain_info *info = dev_iommu_priv_get(dev);
  	struct intel_iommu *iommu = info->iommu;
  	struct context_entry *context;
+	u16 did;

  	spin_lock(&iommu->lock);
  	context = iommu_context_addr(iommu, bus, devfn, false);
@@ -691,10 +692,11 @@ static void device_pasid_table_teardown(struct 
device *dev, u8 bus, u8 devfn)
  		return;
  	}

+	did = context_domain_id(context);
  	context_clear_entry(context);
  	__iommu_flush_cache(iommu, context, sizeof(*context));
  	spin_unlock(&iommu->lock);
-	intel_context_flush_present(info, context, false);
+	intel_context_flush_present(info, context, did, false);
  }

  static int pci_pasid_table_teardown(struct pci_dev *pdev, u16 alias, 
void *data)
@@ -885,10 +887,9 @@ static void __context_flush_dev_iotlb(struct 
device_domain_info *info)
   */
  void intel_context_flush_present(struct device_domain_info *info,
  				 struct context_entry *context,
-				 bool flush_domains)
+				 u16 did, bool flush_domains)
  {
  	struct intel_iommu *iommu = info->iommu;
-	u16 did = context_domain_id(context);
  	struct pasid_entry *pte;
  	int i;

Thanks,
baolu

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 6/7] iommu/vt-d: Add helper to flush caches for context change
  2024-08-15  4:47     ` Baolu Lu
@ 2024-08-15 12:34       ` Alex Williamson
  2024-08-15 12:41         ` Baolu Lu
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Williamson @ 2024-08-15 12:34 UTC (permalink / raw)
  To: Baolu Lu; +Cc: Joerg Roedel, Will Deacon, Jacob Pan, iommu, linux-kernel

On Thu, 15 Aug 2024 12:47:58 +0800
Baolu Lu <baolu.lu@linux.intel.com> wrote:

> On 2024/8/15 6:27, Alex Williamson wrote:
> > Hi Baolu,  
> 
> Hi Alex,
> 
> > This appears to be non-functional and breaks device assignment...  
> 
> Yes. This is broken. Thanks for pointing it out.
> 
> Perhaps I can fix it by passing domain id to the helper? Something like
> below:

Hi Baolu,

Yes, I did a similar quick fix to prove this was the issue.  Thanks,

Alex

> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
> index 01002ae2a091..b3b295e60626 100644
> --- a/drivers/iommu/intel/iommu.h
> +++ b/drivers/iommu/intel/iommu.h
> @@ -1152,7 +1152,7 @@ void cache_tag_flush_range_np(struct dmar_domain 
> *domain, unsigned long start,
> 
>   void intel_context_flush_present(struct device_domain_info *info,
>   				 struct context_entry *context,
> -				 bool affect_domains);
> +				 u16 did, bool affect_domains);
> 
>   #ifdef CONFIG_INTEL_IOMMU_SVM
>   void intel_svm_check(struct intel_iommu *iommu);
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 159da629349c..34006b6e89eb 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1903,6 +1903,7 @@ static void domain_context_clear_one(struct 
> device_domain_info *info, u8 bus, u8
>   {
>   	struct intel_iommu *iommu = info->iommu;
>   	struct context_entry *context;
> +	u16 did;
> 
>   	spin_lock(&iommu->lock);
>   	context = iommu_context_addr(iommu, bus, devfn, 0);
> @@ -1911,10 +1912,11 @@ static void domain_context_clear_one(struct 
> device_domain_info *info, u8 bus, u8
>   		return;
>   	}
> 
> +	did = context_domain_id(context);
>   	context_clear_entry(context);
>   	__iommu_flush_cache(iommu, context, sizeof(*context));
>   	spin_unlock(&iommu->lock);
> -	intel_context_flush_present(info, context, true);
> +	intel_context_flush_present(info, context, did, true);
>   }
> 
>   static int domain_setup_first_level(struct intel_iommu *iommu,
> @@ -4077,6 +4079,7 @@ static int context_flip_pri(struct 
> device_domain_info *info, bool enable)
>   	struct intel_iommu *iommu = info->iommu;
>   	u8 bus = info->bus, devfn = info->devfn;
>   	struct context_entry *context;
> +	u16 did;
> 
>   	spin_lock(&iommu->lock);
>   	if (context_copied(iommu, bus, devfn)) {
> @@ -4089,6 +4092,7 @@ static int context_flip_pri(struct 
> device_domain_info *info, bool enable)
>   		spin_unlock(&iommu->lock);
>   		return -ENODEV;
>   	}
> +	did = context_domain_id(context);
> 
>   	if (enable)
>   		context_set_sm_pre(context);
> @@ -4097,7 +4101,7 @@ static int context_flip_pri(struct 
> device_domain_info *info, bool enable)
> 
>   	if (!ecap_coherent(iommu->ecap))
>   		clflush_cache_range(context, sizeof(*context));
> -	intel_context_flush_present(info, context, true);
> +	intel_context_flush_present(info, context, did, true);
>   	spin_unlock(&iommu->lock);
> 
>   	return 0;
> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> index 5792c817cefa..b51fc268dc84 100644
> --- a/drivers/iommu/intel/pasid.c
> +++ b/drivers/iommu/intel/pasid.c
> @@ -683,6 +683,7 @@ static void device_pasid_table_teardown(struct 
> device *dev, u8 bus, u8 devfn)
>   	struct device_domain_info *info = dev_iommu_priv_get(dev);
>   	struct intel_iommu *iommu = info->iommu;
>   	struct context_entry *context;
> +	u16 did;
> 
>   	spin_lock(&iommu->lock);
>   	context = iommu_context_addr(iommu, bus, devfn, false);
> @@ -691,10 +692,11 @@ static void device_pasid_table_teardown(struct 
> device *dev, u8 bus, u8 devfn)
>   		return;
>   	}
> 
> +	did = context_domain_id(context);
>   	context_clear_entry(context);
>   	__iommu_flush_cache(iommu, context, sizeof(*context));
>   	spin_unlock(&iommu->lock);
> -	intel_context_flush_present(info, context, false);
> +	intel_context_flush_present(info, context, did, false);
>   }
> 
>   static int pci_pasid_table_teardown(struct pci_dev *pdev, u16 alias, 
> void *data)
> @@ -885,10 +887,9 @@ static void __context_flush_dev_iotlb(struct 
> device_domain_info *info)
>    */
>   void intel_context_flush_present(struct device_domain_info *info,
>   				 struct context_entry *context,
> -				 bool flush_domains)
> +				 u16 did, bool flush_domains)
>   {
>   	struct intel_iommu *iommu = info->iommu;
> -	u16 did = context_domain_id(context);
>   	struct pasid_entry *pte;
>   	int i;
> 
> Thanks,
> baolu
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 6/7] iommu/vt-d: Add helper to flush caches for context change
  2024-08-15 12:34       ` Alex Williamson
@ 2024-08-15 12:41         ` Baolu Lu
  0 siblings, 0 replies; 14+ messages in thread
From: Baolu Lu @ 2024-08-15 12:41 UTC (permalink / raw)
  To: Alex Williamson
  Cc: baolu.lu, Joerg Roedel, Will Deacon, Jacob Pan, iommu,
	linux-kernel

On 2024/8/15 20:34, Alex Williamson wrote:
> On Thu, 15 Aug 2024 12:47:58 +0800
> Baolu Lu<baolu.lu@linux.intel.com>  wrote:
> 
>> On 2024/8/15 6:27, Alex Williamson wrote:
>>> Hi Baolu,
>> Hi Alex,
>>
>>> This appears to be non-functional and breaks device assignment...
>> Yes. This is broken. Thanks for pointing it out.
>>
>> Perhaps I can fix it by passing domain id to the helper? Something like
>> below:
> Hi Baolu,
> 
> Yes, I did a similar quick fix to prove this was the issue.  Thanks,

Thank you! Then, I will make it a real patch.

Thanks,
baolu

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-08-15 12:41 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-02 13:08 [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Lu Baolu
2024-07-02 13:08 ` [PATCH 1/7] iommu/vt-d: Handle volatile descriptor status read Lu Baolu
2024-07-02 13:08 ` [PATCH 2/7] iommu/vt-d: Remove comment for def_domain_type Lu Baolu
2024-07-02 13:08 ` [PATCH 3/7] iommu/vt-d: Remove control over Execute-Requested requests Lu Baolu
2024-07-02 13:08 ` [PATCH 4/7] iommu/vt-d: Downgrade warning for pre-enabled IR Lu Baolu
2024-07-02 13:08 ` [PATCH 5/7] iommu/vt-d: Add helper to allocate paging domain Lu Baolu
2024-07-02 13:25   ` Yi Liu
2024-07-02 13:08 ` [PATCH 6/7] iommu/vt-d: Add helper to flush caches for context change Lu Baolu
2024-08-14 22:27   ` Alex Williamson
2024-08-15  4:47     ` Baolu Lu
2024-08-15 12:34       ` Alex Williamson
2024-08-15 12:41         ` Baolu Lu
2024-07-02 13:08 ` [PATCH 7/7] iommu/vt-d: Refactor PCI PRI enabling/disabling callbacks Lu Baolu
2024-07-03 16:13 ` [PATCH 0/7] [PULL REQUEST] Intel IOMMU updates for v6.11 Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox