public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.13-rc
@ 2024-12-13  1:17 Lu Baolu
  2024-12-13  1:17 ` [PATCH 1/3] iommu/vt-d: Remove cache tags before disabling ATS Lu Baolu
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Lu Baolu @ 2024-12-13  1:17 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Dan Carpenter, Yi Liu, iommu, linux-kernel

Hi Joerg,

The following fixes have been queued for v6.13-rc:

 - Remove cache tags before disabling ATS
 - Avoid draining PRQ in sva mm release path
 - Fix qi_batch NULL pointer with nested parent domain

They have been reviewed and tested. Please can you consider them for
v6.13-rc?

Best regards,
baolu

Lu Baolu (2):
  iommu/vt-d: Remove cache tags before disabling ATS
  iommu/vt-d: Avoid draining PRQ in sva mm release path

Yi Liu (1):
  iommu/vt-d: Fix qi_batch NULL pointer with nested parent domain

 drivers/iommu/intel/cache.c | 34 +++++++++++++++++++++++++++-------
 drivers/iommu/intel/iommu.c |  4 +++-
 drivers/iommu/intel/pasid.c |  3 ++-
 3 files changed, 32 insertions(+), 9 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] iommu/vt-d: Remove cache tags before disabling ATS
  2024-12-13  1:17 [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.13-rc Lu Baolu
@ 2024-12-13  1:17 ` Lu Baolu
  2024-12-13  1:17 ` [PATCH 2/3] iommu/vt-d: Fix qi_batch NULL pointer with nested parent domain Lu Baolu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Lu Baolu @ 2024-12-13  1:17 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Dan Carpenter, Yi Liu, iommu, linux-kernel

The current implementation removes cache tags after disabling ATS,
leading to potential memory leaks and kernel crashes. Specifically,
CACHE_TAG_DEVTLB type cache tags may still remain in the list even
after the domain is freed, causing a use-after-free condition.

This issue really shows up when multiple VFs from different PFs
passed through to a single user-space process via vfio-pci. In such
cases, the kernel may crash with kernel messages like:

 BUG: kernel NULL pointer dereference, address: 0000000000000014
 PGD 19036a067 P4D 1940a3067 PUD 136c9b067 PMD 0
 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 74 UID: 0 PID: 3183 Comm: testCli Not tainted 6.11.9 #2
 RIP: 0010:cache_tag_flush_range+0x9b/0x250
 Call Trace:
  <TASK>
  ? __die+0x1f/0x60
  ? page_fault_oops+0x163/0x590
  ? exc_page_fault+0x72/0x190
  ? asm_exc_page_fault+0x22/0x30
  ? cache_tag_flush_range+0x9b/0x250
  ? cache_tag_flush_range+0x5d/0x250
  intel_iommu_tlb_sync+0x29/0x40
  intel_iommu_unmap_pages+0xfe/0x160
  __iommu_unmap+0xd8/0x1a0
  vfio_unmap_unpin+0x182/0x340 [vfio_iommu_type1]
  vfio_remove_dma+0x2a/0xb0 [vfio_iommu_type1]
  vfio_iommu_type1_ioctl+0xafa/0x18e0 [vfio_iommu_type1]

Move cache_tag_unassign_domain() before iommu_disable_pci_caps() to fix
it.

Fixes: 3b1d9e2b2d68 ("iommu/vt-d: Add cache tag assignment interface")
Cc: stable@vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20241129020506.576413-1-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7d0acb74d5a5..79e0da9eb626 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3220,6 +3220,9 @@ void device_block_translation(struct device *dev)
 	struct intel_iommu *iommu = info->iommu;
 	unsigned long flags;
 
+	if (info->domain)
+		cache_tag_unassign_domain(info->domain, dev, IOMMU_NO_PASID);
+
 	iommu_disable_pci_caps(info);
 	if (!dev_is_real_dma_subdevice(dev)) {
 		if (sm_supported(iommu))
@@ -3236,7 +3239,6 @@ void device_block_translation(struct device *dev)
 	list_del(&info->link);
 	spin_unlock_irqrestore(&info->domain->lock, flags);
 
-	cache_tag_unassign_domain(info->domain, dev, IOMMU_NO_PASID);
 	domain_detach_iommu(info->domain, iommu);
 	info->domain = NULL;
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] iommu/vt-d: Fix qi_batch NULL pointer with nested parent domain
  2024-12-13  1:17 [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.13-rc Lu Baolu
  2024-12-13  1:17 ` [PATCH 1/3] iommu/vt-d: Remove cache tags before disabling ATS Lu Baolu
@ 2024-12-13  1:17 ` Lu Baolu
  2024-12-13  1:17 ` [PATCH 3/3] iommu/vt-d: Avoid draining PRQ in sva mm release path Lu Baolu
  2024-12-13 14:55 ` [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.13-rc Joerg Roedel
  3 siblings, 0 replies; 5+ messages in thread
From: Lu Baolu @ 2024-12-13  1:17 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Dan Carpenter, Yi Liu, iommu, linux-kernel

From: Yi Liu <yi.l.liu@intel.com>

The qi_batch is allocated when assigning cache tag for a domain. While
for nested parent domain, it is missed. Hence, when trying to map pages
to the nested parent, NULL dereference occurred. Also, there is potential
memleak since there is no lock around domain->qi_batch allocation.

To solve it, add a helper for qi_batch allocation, and call it in both
the __cache_tag_assign_domain() and __cache_tag_assign_parent_domain().

  BUG: kernel NULL pointer dereference, address: 0000000000000200
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 8104795067 P4D 0
  Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 223 UID: 0 PID: 4357 Comm: qemu-system-x86 Not tainted 6.13.0-rc1-00028-g4b50c3c3b998-dirty #2632
  Call Trace:
   ? __die+0x24/0x70
   ? page_fault_oops+0x80/0x150
   ? do_user_addr_fault+0x63/0x7b0
   ? exc_page_fault+0x7c/0x220
   ? asm_exc_page_fault+0x26/0x30
   ? cache_tag_flush_range_np+0x13c/0x260
   intel_iommu_iotlb_sync_map+0x1a/0x30
   iommu_map+0x61/0xf0
   batch_to_domain+0x188/0x250
   iopt_area_fill_domains+0x125/0x320
   ? rcu_is_watching+0x11/0x50
   iopt_map_pages+0x63/0x100
   iopt_map_common.isra.0+0xa7/0x190
   iopt_map_user_pages+0x6a/0x80
   iommufd_ioas_map+0xcd/0x1d0
   iommufd_fops_ioctl+0x118/0x1c0
   __x64_sys_ioctl+0x93/0xc0
   do_syscall_64+0x71/0x140
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: 705c1cdf1e73 ("iommu/vt-d: Introduce batched cache invalidation")
Cc: stable@vger.kernel.org
Co-developed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20241210130322.17175-1-yi.l.liu@intel.com
---
 drivers/iommu/intel/cache.c | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
index e5b89f728ad3..09694cca8752 100644
--- a/drivers/iommu/intel/cache.c
+++ b/drivers/iommu/intel/cache.c
@@ -105,12 +105,35 @@ static void cache_tag_unassign(struct dmar_domain *domain, u16 did,
 	spin_unlock_irqrestore(&domain->cache_lock, flags);
 }
 
+/* domain->qi_batch will be freed in iommu_free_domain() path. */
+static int domain_qi_batch_alloc(struct dmar_domain *domain)
+{
+	unsigned long flags;
+	int ret = 0;
+
+	spin_lock_irqsave(&domain->cache_lock, flags);
+	if (domain->qi_batch)
+		goto out_unlock;
+
+	domain->qi_batch = kzalloc(sizeof(*domain->qi_batch), GFP_ATOMIC);
+	if (!domain->qi_batch)
+		ret = -ENOMEM;
+out_unlock:
+	spin_unlock_irqrestore(&domain->cache_lock, flags);
+
+	return ret;
+}
+
 static int __cache_tag_assign_domain(struct dmar_domain *domain, u16 did,
 				     struct device *dev, ioasid_t pasid)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	int ret;
 
+	ret = domain_qi_batch_alloc(domain);
+	if (ret)
+		return ret;
+
 	ret = cache_tag_assign(domain, did, dev, pasid, CACHE_TAG_IOTLB);
 	if (ret || !info->ats_enabled)
 		return ret;
@@ -139,6 +162,10 @@ static int __cache_tag_assign_parent_domain(struct dmar_domain *domain, u16 did,
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	int ret;
 
+	ret = domain_qi_batch_alloc(domain);
+	if (ret)
+		return ret;
+
 	ret = cache_tag_assign(domain, did, dev, pasid, CACHE_TAG_NESTING_IOTLB);
 	if (ret || !info->ats_enabled)
 		return ret;
@@ -190,13 +217,6 @@ int cache_tag_assign_domain(struct dmar_domain *domain,
 	u16 did = domain_get_id_for_dev(domain, dev);
 	int ret;
 
-	/* domain->qi_bach will be freed in iommu_free_domain() path. */
-	if (!domain->qi_batch) {
-		domain->qi_batch = kzalloc(sizeof(*domain->qi_batch), GFP_KERNEL);
-		if (!domain->qi_batch)
-			return -ENOMEM;
-	}
-
 	ret = __cache_tag_assign_domain(domain, did, dev, pasid);
 	if (ret || domain->domain.type != IOMMU_DOMAIN_NESTED)
 		return ret;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] iommu/vt-d: Avoid draining PRQ in sva mm release path
  2024-12-13  1:17 [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.13-rc Lu Baolu
  2024-12-13  1:17 ` [PATCH 1/3] iommu/vt-d: Remove cache tags before disabling ATS Lu Baolu
  2024-12-13  1:17 ` [PATCH 2/3] iommu/vt-d: Fix qi_batch NULL pointer with nested parent domain Lu Baolu
@ 2024-12-13  1:17 ` Lu Baolu
  2024-12-13 14:55 ` [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.13-rc Joerg Roedel
  3 siblings, 0 replies; 5+ messages in thread
From: Lu Baolu @ 2024-12-13  1:17 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Dan Carpenter, Yi Liu, iommu, linux-kernel

When a PASID is used for SVA by a device, it's possible that the PASID
entry is cleared before the device flushes all ongoing DMA requests and
removes the SVA domain. This can occur when an exception happens and the
process terminates before the device driver stops DMA and calls the
iommu driver to unbind the PASID.

There's no need to drain the PRQ in the mm release path. Instead, the PRQ
will be drained in the SVA unbind path.

Unfortunately, commit c43e1ccdebf2 ("iommu/vt-d: Drain PRQs when domain
removed from RID") changed this behavior by unconditionally draining the
PRQ in intel_pasid_tear_down_entry(). This can lead to a potential
sleeping-in-atomic-context issue.

Smatch static checker warning:

	drivers/iommu/intel/prq.c:95 intel_iommu_drain_pasid_prq()
	warn: sleeping in atomic context

To avoid this issue, prevent draining the PRQ in the SVA mm release path
and restore the previous behavior.

Fixes: c43e1ccdebf2 ("iommu/vt-d: Drain PRQs when domain removed from RID")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-iommu/c5187676-2fa2-4e29-94e0-4a279dc88b49@stanley.mountain/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20241212021529.1104745-1-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/pasid.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 0f2a926d3bd5..5b7d85f1e143 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -265,7 +265,8 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu, struct device *dev,
 		iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
 
 	devtlb_invalidation_with_pasid(iommu, dev, pasid);
-	intel_iommu_drain_pasid_prq(dev, pasid);
+	if (!fault_ignore)
+		intel_iommu_drain_pasid_prq(dev, pasid);
 }
 
 /*
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.13-rc
  2024-12-13  1:17 [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.13-rc Lu Baolu
                   ` (2 preceding siblings ...)
  2024-12-13  1:17 ` [PATCH 3/3] iommu/vt-d: Avoid draining PRQ in sva mm release path Lu Baolu
@ 2024-12-13 14:55 ` Joerg Roedel
  3 siblings, 0 replies; 5+ messages in thread
From: Joerg Roedel @ 2024-12-13 14:55 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Dan Carpenter, Yi Liu, iommu, linux-kernel

On Fri, Dec 13, 2024 at 09:17:49AM +0800, Lu Baolu wrote:
> Lu Baolu (2):
>   iommu/vt-d: Remove cache tags before disabling ATS
>   iommu/vt-d: Avoid draining PRQ in sva mm release path
> 
> Yi Liu (1):
>   iommu/vt-d: Fix qi_batch NULL pointer with nested parent domain

Applied, thanks Baolu.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-12-13 14:55 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-13  1:17 [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.13-rc Lu Baolu
2024-12-13  1:17 ` [PATCH 1/3] iommu/vt-d: Remove cache tags before disabling ATS Lu Baolu
2024-12-13  1:17 ` [PATCH 2/3] iommu/vt-d: Fix qi_batch NULL pointer with nested parent domain Lu Baolu
2024-12-13  1:17 ` [PATCH 3/3] iommu/vt-d: Avoid draining PRQ in sva mm release path Lu Baolu
2024-12-13 14:55 ` [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.13-rc Joerg Roedel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox