iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add
@ 2022-01-13 13:23 Jacob Pan
  2022-01-14  0:58 ` Lu Baolu
  0 siblings, 1 reply; 10+ messages in thread
From: Jacob Pan @ 2022-01-13 13:23 UTC (permalink / raw)
  To: iommu, LKML, Joerg Roedel, Lu Baolu; +Cc: Kumar, Sanjay K, Jacob Pan, Raj Ashok

During PCI bus rescan, adding new devices involve two notifiers.
1. dmar_pci_bus_notifier()
2. iommu_bus_notifier()
The current code sets #1 as low priority (INT_MIN) which resulted in #2
being invoked first. The result is that struct device pointer cannot be
found in DRHD search for the new device's DMAR/IOMMU. Subsequently, the
device is put under the "catch-all" IOMMU instead of the correct one.

This could cause system hang when device TLB invalidation is sent to the
wrong IOMMU. Invalidation timeout error or hard lockup can be observed.

This patch fixes the issue by setting a higher priority for
dmar_pci_bus_notifier. DRHD search for a new device will find the
correct IOMMU.

Fixes: 59ce0515cdaf ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope")
Reported-by: Zhang, Bernice <bernice.zhang@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 drivers/iommu/intel/dmar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 915bff76fe96..5d07e5b89c2e 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -385,7 +385,7 @@ static int dmar_pci_bus_notifier(struct notifier_block *nb,
 
 static struct notifier_block dmar_pci_bus_nb = {
 	.notifier_call = dmar_pci_bus_notifier,
-	.priority = INT_MIN,
+	.priority = INT_MAX,
 };
 
 static struct dmar_drhd_unit *
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add
@ 2022-05-21  0:21 Yian Chen
  2022-05-25  1:40 ` Baolu Lu
  0 siblings, 1 reply; 10+ messages in thread
From: Yian Chen @ 2022-05-21  0:21 UTC (permalink / raw)
  To: iommu, Joerg Roedel, David Woodhouse, Lu Baolu, Jacob jun Pan,
	Sohil Mehta, Ravi Shankar, Yian Chen

Notifier calling chain uses priority to determine the execution
order of the notifiers or listeners registered to the chain.
PCI bus device hot add utilizes the notification mechanism.

The current code sets low priority (INT_MIN) to Intel
dmar_pci_bus_notifier and postpones DMAR decoding after adding
new device into IOMMU. The result is that struct device pointer
cannot be found in DRHD search for the new device's DMAR/IOMMU.
Subsequently, the device is put under the "catch-all" IOMMU
instead of the correct one. This could cause system hang when
device TLB invalidation is sent to the wrong IOMMU. Invalidation
timeout error and hard lockup have been observed and data
inconsistency/crush may occur as well.

This patch fixes the issue by setting a positive priority(1) for
dmar_pci_bus_notifier while the priority of IOMMU bus notifier
uses the default value(0), therefore DMAR decoding will be in
advance of DRHD search for a new device to find the correct IOMMU.

Following is a 2-step example that triggers the bug by simulating
PCI device hot add behavior in Intel Sapphire Rapids server.

echo 1 > /sys/bus/pci/devices/0000:6a:01.0/remove
echo 1 > /sys/bus/pci/rescan

Fixes: 59ce0515cdaf ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope")
Cc: stable@vger.kernel.org # v3.15+
Reported-by: Zhang, Bernice <bernice.zhang@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Yian Chen <yian.chen@intel.com>
---
This is a quick fix for the bug reported. Intel internally evaluated
another redesigned solution that eliminates dmar pci bus notifier to
simplify the workflow of pci hotplug and improve its runtime efficiency.

While considering the fix could apply to downstream and the complexity
of pci hotplug workflow change may significantly increase the
engineering effort to downstream the patch, the choice is to submit this
simple patch to help the deployment of this bug fix.
---

 drivers/iommu/intel/dmar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 4de960834a1b..497c5bd95caf 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -383,7 +383,7 @@ static int dmar_pci_bus_notifier(struct notifier_block *nb,
 
 static struct notifier_block dmar_pci_bus_nb = {
 	.notifier_call = dmar_pci_bus_notifier,
-	.priority = INT_MIN,
+	.priority = 1,
 };
 
 static struct dmar_drhd_unit *
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-07-06 10:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-01-13 13:23 [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add Jacob Pan
2022-01-14  0:58 ` Lu Baolu
2022-01-14  3:11   ` Jacob Pan
2022-01-14  3:12     ` Lu Baolu
2022-01-14 15:24       ` Jacob Pan
  -- strict thread matches above, loose matches on Subject: below --
2022-05-21  0:21 Yian Chen
2022-05-25  1:40 ` Baolu Lu
2022-06-24  5:45   ` Joerg Roedel
2022-06-24  6:12     ` Baolu Lu
2022-07-06 10:46       ` Joerg Roedel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).