Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Nicolin Chen <nicolinc@nvidia.com>
To: Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>,
	Len Brown <lenb@kernel.org>,
	Pranjal Shrivastava <praan@google.com>,
	Mostafa Saleh <smostafa@google.com>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Kevin Tian <kevin.tian@intel.com>,
	<linux-arm-kernel@lists.infradead.org>, <iommu@lists.linux.dev>,
	<linux-kernel@vger.kernel.org>, <linux-acpi@vger.kernel.org>,
	<linux-pci@vger.kernel.org>, <vsethi@nvidia.com>,
	Shuai Xue <xueshuai@linux.alibaba.com>
Subject: [PATCH v5 01/18] PCI: Don't suspend IOMMU when probing reset capability
Date: Thu, 2 Jul 2026 21:06:26 -0700	[thread overview]
Message-ID: <6dbbbbe30cdc08a467de893fb9ecdc07d9cc1fec.1783044582.git.nicolinc@nvidia.com> (raw)
In-Reply-To: <cover.1783044582.git.nicolinc@nvidia.com>

reset_method_store() in drivers/pci/pci-sysfs.c discovers supported reset
methods by calling reset_fn(pdev, PCI_RESET_PROBE, ...) without holding a
device_lock, since the probe path is expected to query the device's reset
capability without changing device state.

However, pci_reset_bus_function() and __pci_dev_specific_reset() violate
that contract after pci_dev_reset_iommu_prepare/done() were added, which
moves the device into a blocking domain and abruptly aborts any in-flight
DMA. Doing this for a probe -- a state-query call that does not even hold
device_lock -- can cause driver timeouts and data loss on a DMAing device.

The peer reset helpers all handle this correctly: they short-circuit on a
probe input before touching the IOMMU.

Skip pci_dev_reset_iommu_prepare()/_done() entirely when probe is set. The
inner reset routines already implement their own probe semantics, and they
perform the capability checks and return without changing device state.

Fixes: f5b16b802174 ("PCI: Suspend iommu function prior to resetting a device")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/pci/pci.c    | 13 ++++++++-----
 drivers/pci/quirks.c | 13 ++++++++-----
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 77b17b13ee615..01cf310540561 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4946,10 +4946,12 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
 	if (bridge && pcie_is_cxl(bridge) && cxl_sbr_masked(bridge))
 		return -ENOTTY;
 
-	rc = pci_dev_reset_iommu_prepare(dev);
-	if (rc) {
-		pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", rc);
-		return rc;
+	if (!probe) {
+		rc = pci_dev_reset_iommu_prepare(dev);
+		if (rc) {
+			pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", rc);
+			return rc;
+		}
 	}
 
 	rc = pci_dev_reset_slot_function(dev, probe);
@@ -4958,7 +4960,8 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
 
 	rc = pci_parent_bus_reset(dev, probe);
 done:
-	pci_dev_reset_iommu_done(dev);
+	if (!probe)
+		pci_dev_reset_iommu_done(dev);
 	return rc;
 }
 
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b09f27f7846fc..8ecd1bc561d28 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4250,14 +4250,17 @@ static int __pci_dev_specific_reset(struct pci_dev *dev, bool probe,
 {
 	int ret;
 
-	ret = pci_dev_reset_iommu_prepare(dev);
-	if (ret) {
-		pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", ret);
-		return ret;
+	if (!probe) {
+		ret = pci_dev_reset_iommu_prepare(dev);
+		if (ret) {
+			pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", ret);
+			return ret;
+		}
 	}
 
 	ret = i->reset(dev, probe);
-	pci_dev_reset_iommu_done(dev);
+	if (!probe)
+		pci_dev_reset_iommu_done(dev);
 	return ret;
 }
 
-- 
2.43.0



  reply	other threads:[~2026-07-03  4:08 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-03  4:06 [PATCH v5 00/18] iommu/arm-smmu-v3: Quarantine device upon ATC invalidation timeout Nicolin Chen
2026-07-03  4:06 ` Nicolin Chen [this message]
2026-07-03  4:06 ` [PATCH v5 02/18] PCI/CXL: Probe the underlying bus reset in cxl_reset_bus_function() Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 03/18] PCI: Propagate FLR return values to callers Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 04/18] iommu: Convert gdev->blocked from bool to enum gdev_blocked Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 05/18] iommu: Pass in reset result to pci_dev_reset_iommu_done() Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 06/18] iommu/arm-smmu-v3: Don't rb_erase() a never-inserted stream node Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 07/18] iommu/arm-smmu-v3: Mark ATC invalidate timeouts via lockless bitmap Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 08/18] iommu/arm-smmu-v3: Skip remaining GERROR causes on SFM Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 09/18] iommu/arm-smmu-v3: Introduce per-cmdq cmdq_err_handler callback Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 10/18] iommu/arm-smmu-v3: Recheck CMDQ_ERR in tegra241_vintf0_handle_error() Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 11/18] iommu/arm-smmu-v3: Co-clear pending CMDQ_ERR when CMD_SYNC times out Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 12/18] iommu/arm-smmu-v3: Introduce arm_smmu_cmdq_batch_issue() wrapper Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 13/18] iommu/arm-smmu-v3: Add streams_lock for atomic-context SID->master lookup Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 14/18] iommu/arm-smmu-v3: Add has_ats to struct arm_smmu_cmdq_batch Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 15/18] iommu/arm-smmu-v3: Add INV_TYPE_ATS_BROKEN to skip quarantined ATS masters Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 16/18] iommu/arm-smmu-v3: Factor out CMDQ batch force-sync conditions Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 17/18] iommu/arm-smmu-v3: Thread arm_smmu_master_domain on a per-master list Nicolin Chen
2026-07-03  4:06 ` [PATCH v5 18/18] iommu/arm-smmu-v3: Block ATS for a master upon an ATC invalidation timeout Nicolin Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6dbbbbe30cdc08a467de893fb9ecdc07d9cc1fec.1783044582.git.nicolinc@nvidia.com \
    --to=nicolinc@nvidia.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=praan@google.com \
    --cc=rafael@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=vsethi@nvidia.com \
    --cc=will@kernel.org \
    --cc=xueshuai@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox