public inbox for linux-kernel@vger.kernel.org
* [PATCH] iommu/vt-d: Remove cache tags before disabling ATS
@ 2024-11-29  2:05 Lu Baolu
  2024-12-11  7:21 ` Tian, Kevin
  0 siblings, 1 reply; 4+ messages in thread
From: Lu Baolu @ 2024-11-29  2:05 UTC
  To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian, Yi Liu
  Cc: iommu, linux-kernel, Lu Baolu, stable

The current implementation removes cache tags after disabling ATS,
leading to potential memory leaks and kernel crashes. Specifically,
CACHE_TAG_DEVTLB type cache tags may still remain in the list even
after the domain is freed, causing a use-after-free condition.

This issue shows up when multiple VFs from different PFs are passed
through to a single user-space process via vfio-pci. In such cases,
the kernel may crash with messages like:

 BUG: kernel NULL pointer dereference, address: 0000000000000014
 PGD 19036a067 P4D 1940a3067 PUD 136c9b067 PMD 0
 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 74 UID: 0 PID: 3183 Comm: testCli Not tainted 6.11.9 #2
 RIP: 0010:cache_tag_flush_range+0x9b/0x250
 Call Trace:
  <TASK>
  ? __die+0x1f/0x60
  ? page_fault_oops+0x163/0x590
  ? exc_page_fault+0x72/0x190
  ? asm_exc_page_fault+0x22/0x30
  ? cache_tag_flush_range+0x9b/0x250
  ? cache_tag_flush_range+0x5d/0x250
  intel_iommu_tlb_sync+0x29/0x40
  intel_iommu_unmap_pages+0xfe/0x160
  __iommu_unmap+0xd8/0x1a0
  vfio_unmap_unpin+0x182/0x340 [vfio_iommu_type1]
  vfio_remove_dma+0x2a/0xb0 [vfio_iommu_type1]
  vfio_iommu_type1_ioctl+0xafa/0x18e0 [vfio_iommu_type1]

Move cache_tag_unassign_domain() before iommu_disable_pci_caps() to fix
it.

Fixes: 3b1d9e2b2d68 ("iommu/vt-d: Add cache tag assignment interface")
Cc: stable@vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7d0acb74d5a5..79e0da9eb626 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3220,6 +3220,9 @@ void device_block_translation(struct device *dev)
 	struct intel_iommu *iommu = info->iommu;
 	unsigned long flags;
 
+	if (info->domain)
+		cache_tag_unassign_domain(info->domain, dev, IOMMU_NO_PASID);
+
 	iommu_disable_pci_caps(info);
 	if (!dev_is_real_dma_subdevice(dev)) {
 		if (sm_supported(iommu))
@@ -3236,7 +3239,6 @@ void device_block_translation(struct device *dev)
 	list_del(&info->link);
 	spin_unlock_irqrestore(&info->domain->lock, flags);
 
-	cache_tag_unassign_domain(info->domain, dev, IOMMU_NO_PASID);
 	domain_detach_iommu(info->domain, iommu);
 	info->domain = NULL;
 }
-- 
2.43.0



* RE: [PATCH] iommu/vt-d: Remove cache tags before disabling ATS
  2024-11-29  2:05 [PATCH] iommu/vt-d: Remove cache tags before disabling ATS Lu Baolu
@ 2024-12-11  7:21 ` Tian, Kevin
  2024-12-11  7:35   ` Baolu Lu
  0 siblings, 1 reply; 4+ messages in thread
From: Tian, Kevin @ 2024-12-11  7:21 UTC
  To: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy, Liu, Yi L
  Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Friday, November 29, 2024 10:05 AM
> 
> The current implementation removes cache tags after disabling ATS,
> leading to potential memory leaks and kernel crashes. Specifically,
> CACHE_TAG_DEVTLB type cache tags may still remain in the list even
> after the domain is freed, causing a use-after-free condition.
> 
> This issue really shows up when multiple VFs from different PFs
> passed through to a single user-space process via vfio-pci. In such
> cases, the kernel may crash with kernel messages like:

Is "multiple VFs from different PFs" the key to trigger the problem?

What about multiple VFs from the same PF, or just assigning multiple
devices to a single process/VM?

My understanding from the fix below is that this issue will be triggered
as long as the domain is still being actively used after one device with
ATS is detached from it, i.e. it sounds like a problem in the
multi-device assignment scenario.
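
To make the ordering concrete, here is a minimal user-space sketch of that
scenario. It is a toy model, not the driver code: every type and function
name in it (toy_domain, toy_dev, toy_unassign_domain, and so on) is
hypothetical, and it assumes that the unassign path only looks for a
device-TLB tag when the device still reports ATS as enabled. Under that
assumption, disabling ATS first leaves one tag behind on a domain that
another device keeps using.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these are the kernel's real structures. */
enum tag_type { TAG_IOTLB, TAG_DEVTLB };

struct toy_dev {
	int id;
	bool ats_enabled;
};

struct toy_tag {
	enum tag_type type;
	struct toy_dev *dev;
	bool in_use;
};

#define MAX_TAGS 8

struct toy_domain {
	struct toy_tag tags[MAX_TAGS];
};

static void tag_assign(struct toy_domain *d, struct toy_dev *dev,
		       enum tag_type type)
{
	for (size_t i = 0; i < MAX_TAGS; i++) {
		if (!d->tags[i].in_use) {
			d->tags[i] = (struct toy_tag){ type, dev, true };
			return;
		}
	}
	assert(0 && "toy domain tag table is full");
}

static void tag_unassign(struct toy_domain *d, struct toy_dev *dev,
			 enum tag_type type)
{
	for (size_t i = 0; i < MAX_TAGS; i++) {
		struct toy_tag *t = &d->tags[i];

		if (t->in_use && t->dev == dev && t->type == type)
			t->in_use = false;
	}
}

static void toy_assign_domain(struct toy_domain *d, struct toy_dev *dev)
{
	tag_assign(d, dev, TAG_IOTLB);
	if (dev->ats_enabled)
		tag_assign(d, dev, TAG_DEVTLB);
}

/* Assumption: the DEVTLB tag is only searched for while ATS is enabled. */
static void toy_unassign_domain(struct toy_domain *d, struct toy_dev *dev)
{
	tag_unassign(d, dev, TAG_IOTLB);
	if (dev->ats_enabled)
		tag_unassign(d, dev, TAG_DEVTLB);
}

static size_t tags_for_dev(const struct toy_domain *d,
			   const struct toy_dev *dev)
{
	size_t n = 0;

	for (size_t i = 0; i < MAX_TAGS; i++)
		if (d->tags[i].in_use && d->tags[i].dev == dev)
			n++;
	return n;
}

/*
 * Attach two ATS-capable devices to one domain, detach the first one in
 * either order, and return how many tags still reference it.
 */
static size_t detach_and_count_stale(bool unassign_before_disabling_ats)
{
	struct toy_domain dom = { 0 };
	struct toy_dev a = { 1, true }, b = { 2, true };

	toy_assign_domain(&dom, &a);
	toy_assign_domain(&dom, &b);

	if (unassign_before_disabling_ats) {
		toy_unassign_domain(&dom, &a);	/* order after the patch */
		a.ats_enabled = false;
	} else {
		a.ats_enabled = false;		/* order before the patch */
		toy_unassign_domain(&dom, &a);
	}
	return tags_for_dev(&dom, &a);
}
```

In this model, detach_and_count_stale(false) leaves one DEVTLB tag
pointing at the detached device, while detach_and_count_stale(true)
leaves none. A later flush that walks the domain's tag list on behalf of
the remaining device would then touch that leftover entry.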

> 
>  BUG: kernel NULL pointer dereference, address: 0000000000000014
>  PGD 19036a067 P4D 1940a3067 PUD 136c9b067 PMD 0
>  Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
>  CPU: 74 UID: 0 PID: 3183 Comm: testCli Not tainted 6.11.9 #2
>  RIP: 0010:cache_tag_flush_range+0x9b/0x250
>  Call Trace:
>   <TASK>
>   ? __die+0x1f/0x60
>   ? page_fault_oops+0x163/0x590
>   ? exc_page_fault+0x72/0x190
>   ? asm_exc_page_fault+0x22/0x30
>   ? cache_tag_flush_range+0x9b/0x250
>   ? cache_tag_flush_range+0x5d/0x250
>   intel_iommu_tlb_sync+0x29/0x40
>   intel_iommu_unmap_pages+0xfe/0x160
>   __iommu_unmap+0xd8/0x1a0
>   vfio_unmap_unpin+0x182/0x340 [vfio_iommu_type1]
>   vfio_remove_dma+0x2a/0xb0 [vfio_iommu_type1]
>   vfio_iommu_type1_ioctl+0xafa/0x18e0 [vfio_iommu_type1]
> 
> Move cache_tag_unassign_domain() before iommu_disable_pci_caps() to fix
> it.
> 
> Fixes: 3b1d9e2b2d68 ("iommu/vt-d: Add cache tag assignment interface")
> Cc: stable@vger.kernel.org
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>


* Re: [PATCH] iommu/vt-d: Remove cache tags before disabling ATS
  2024-12-11  7:21 ` Tian, Kevin
@ 2024-12-11  7:35   ` Baolu Lu
  2024-12-11  7:42     ` Tian, Kevin
  0 siblings, 1 reply; 4+ messages in thread
From: Baolu Lu @ 2024-12-11  7:35 UTC
  To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy, Liu, Yi L
  Cc: baolu.lu, iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org

On 2024/12/11 15:21, Tian, Kevin wrote:
>> From: Lu Baolu<baolu.lu@linux.intel.com>
>> Sent: Friday, November 29, 2024 10:05 AM
>>
>> The current implementation removes cache tags after disabling ATS,
>> leading to potential memory leaks and kernel crashes. Specifically,
>> CACHE_TAG_DEVTLB type cache tags may still remain in the list even
>> after the domain is freed, causing a use-after-free condition.
>>
>> This issue really shows up when multiple VFs from different PFs
>> passed through to a single user-space process via vfio-pci. In such
>> cases, the kernel may crash with kernel messages like:
> Is "multiple VFs from different PFs" the key to trigger the problem?

This is the real test case that triggered this issue. It's definitely
not the only case that could trigger this issue.

> 
> what about multiple VFs from the same PF or just assigning multiple
> devices to a single process/vm?

I think it's possible.

> My understanding from the below fix is that this issue will be triggered
> as long as the domain is still being actively used after one device with
> ATS is detached from it, i.e. sounds like a problem in multi-device
> assignment scenario.

Yes.

Thanks,
baolu


* RE: [PATCH] iommu/vt-d: Remove cache tags before disabling ATS
  2024-12-11  7:35   ` Baolu Lu
@ 2024-12-11  7:42     ` Tian, Kevin
  0 siblings, 0 replies; 4+ messages in thread
From: Tian, Kevin @ 2024-12-11  7:42 UTC
  To: Baolu Lu, Joerg Roedel, Will Deacon, Robin Murphy, Liu, Yi L
  Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org

> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Wednesday, December 11, 2024 3:35 PM
> 
> On 2024/12/11 15:21, Tian, Kevin wrote:
> >> From: Lu Baolu<baolu.lu@linux.intel.com>
> >> Sent: Friday, November 29, 2024 10:05 AM
> >>
> >> The current implementation removes cache tags after disabling ATS,
> >> leading to potential memory leaks and kernel crashes. Specifically,
> >> CACHE_TAG_DEVTLB type cache tags may still remain in the list even
> >> after the domain is freed, causing a use-after-free condition.
> >>
> >> This issue really shows up when multiple VFs from different PFs
> >> passed through to a single user-space process via vfio-pci. In such
> >> cases, the kernel may crash with kernel messages like:
> > Is "multiple VFs from different PFs" the key to trigger the problem?
> 
> This is the real test case that triggered this issue. It's definitely
> not the only case that could trigger this issue.
> 

It's the real test case, but it is a bit misleading when connected to
the patch. 😊

