Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH v4 00/24] iommu/arm-smmu-v3: Quarantine device upon ATC invalidation timeout
@ 2026-05-19  3:38 Nicolin Chen
From: Nicolin Chen @ 2026-05-19  3:38 UTC
  To: Will Deacon, Robin Murphy, Joerg Roedel, Bjorn Helgaas,
	Jason Gunthorpe
  Cc: Rafael J . Wysocki, Len Brown, Pranjal Shrivastava, Mostafa Saleh,
	Lu Baolu, Kevin Tian, linux-arm-kernel, iommu, linux-kernel,
	linux-acpi, linux-pci, vsethi, Shuai Xue

Hi all,

This series addresses a critical vulnerability and stability issue where an
unresponsive PCIe device failing to process ATC (Address Translation Cache)
invalidation requests leads to silent data corruption and continuous SMMU
CMDQ error spam.

[ As Jason pointed out, because this series fundamentally introduces a new
  RAS feature to quarantine and recover from hardware faults and relies on
  a recently accepted SMMU driver rework, it is not treated as a standard
  bug fix. Thus, most of the patches here don't carry a "Fixes" tag. ]

Currently, when an ATC invalidation times out, the SMMUv3 driver skips the
CMDQ_ERR_CERROR_ATC_INV_IDX error. This leaves the device's ATS cache state
desynchronized from the SMMU: the device cache may retain stale ATC entries
for memory pages that the OS has already reclaimed and reassigned, creating
a direct vector for data corruption. Furthermore, the driver might continue
issuing ATC_INV commands, resulting in constant CMDQ errors:
    unexpected global error reported (0x00000001), this could be serious
    CMDQ error (cons 0x0302bb84): ATC invalidate timeout
    unexpected global error reported (0x00000001), this could be serious
    CMDQ error (cons 0x0302bb88): ATC invalidate timeout
    unexpected global error reported (0x00000001), this could be serious
    CMDQ error (cons 0x0302bb8c): ATC invalidate timeout
    ...

To resolve this, introduce a mechanism to quarantine a broken device in the
SMMUv3 driver and the IOMMU core. This requires a few preparatory changes:
 - Pass in PCI reset result to pci_dev_reset_iommu_done()
 - Co-clear pending CMDQ_ERR from the cmdq issuer under a raw_spinlock_t,
   so an ATC_INV timeout flagged in cmdq->atc_sync_timeouts is definitive
   when the issuer reads its bit after CMD_SYNC poll
 - Introduce a reset_device_done op, allowing the core to signal the driver
   when the physical hardware has been cleanly recovered (e.g., via AER or
   a manual reset) so the quarantine can be lifted
 - Utilize a per-group_device WQ via an iommu_report_device_broken() helper
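
As a rough illustration of the quarantine lifecycle described above (not code
from the series), the following user-space model shows how a reset result
passed to a done() step might gate lifting the quarantine. Only the enum name
gdev_blocked and the BLOCKED_BROKEN state come from this cover letter; the
other enumerators, the model_* names, and the choice to unblock a device that
was not marked broken even after a failed reset are assumptions made for the
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enumerators; only BLOCKED_BROKEN is named in the cover letter. */
enum gdev_blocked {
	GDEV_NOT_BLOCKED,
	GDEV_BLOCKED_RESET,	/* blocked only while a reset is in flight */
	GDEV_BLOCKED_BROKEN,	/* quarantined after a reported fault */
};

/* Hypothetical stand-in for the per-device state kept by the core. */
struct model_gdev {
	enum gdev_blocked blocked;
};

/* Models the effect of reporting a device as broken: quarantine it. */
void model_report_device_broken(struct model_gdev *g)
{
	assert(g != NULL);
	g->blocked = GDEV_BLOCKED_BROKEN;
}

/* Models a reset starting: block the device, but never downgrade BROKEN. */
void model_reset_prepare(struct model_gdev *g)
{
	assert(g != NULL);
	if (g->blocked == GDEV_NOT_BLOCKED)
		g->blocked = GDEV_BLOCKED_RESET;
}

/*
 * Models a reset finishing. A broken device stays quarantined unless the
 * reset succeeded (reset_result == 0). Returns true if the device is
 * unblocked afterwards.
 */
bool model_reset_done(struct model_gdev *g, int reset_result)
{
	assert(g != NULL);
	if (g->blocked == GDEV_BLOCKED_BROKEN && reset_result != 0)
		return false;	/* assumption: failed reset keeps the quarantine */
	g->blocked = GDEV_NOT_BLOCKED;
	return true;
}
```

In this model, calling model_reset_done() with a non-zero result on a broken
device leaves it in GDEV_BLOCKED_BROKEN, and a later call with 0 returns it to
GDEV_NOT_BLOCKED.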

On the SMMUv3 driver side, retry the timed-out ATC_INV batch to identify the
faulty device(s). Perform a surgical STE update, and flag the ATS as broken
to reject further ATS/ATC requests at HW level and suppress timeout spam.
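
The retry-and-isolate flow above could be sketched in user space roughly as
follows. This is a simplified model under stated assumptions, not the driver
code: every model_* name, the simulated "responsive" flag, and the EATS bit
position are made up for illustration, and the per-device retry shown here is
only one possible way to narrow down a failed batch. The compare-exchange
loop stands in for the "surgical" update, in that it clears only the EATS
field and preserves the other bits of the word.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Field position chosen for this model only. */
#define MODEL_STE_EATS_MASK (UINT64_C(3) << 28)

/* Hypothetical per-device state. */
struct model_master {
	bool responsive;		/* simulated HW: answers ATC_INV? */
	bool ats_broken;		/* set once identified as faulty */
	_Atomic uint64_t ste_word;	/* simulated STE word holding EATS */
};

/* Simulate one synced batch: -EIO if any targeted device times out. */
int model_issue_atc_inv(struct model_master **batch, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!batch[i]->ats_broken && !batch[i]->responsive)
			return -EIO;
	return 0;
}

/* Clear only the EATS field, then mark the device as ATS-broken. */
void model_disable_ats(struct model_master *m)
{
	uint64_t old = atomic_load(&m->ste_word);

	while (!atomic_compare_exchange_weak(&m->ste_word, &old,
					     old & ~MODEL_STE_EATS_MASK))
		;	/* 'old' was refreshed by the failed exchange; retry */
	m->ats_broken = true;
}

/*
 * Issue a batch; if it times out, retry each entry on its own to find the
 * culprit(s). Returns the number of devices newly flagged as broken.
 */
size_t model_flush_atc(struct model_master **batch, size_t n)
{
	size_t broken = 0;

	assert(batch != NULL || n == 0);
	if (model_issue_atc_inv(batch, n) == 0)
		return 0;
	for (size_t i = 0; i < n; i++) {
		if (model_issue_atc_inv(&batch[i], 1) == -EIO) {
			model_disable_ats(batch[i]);
			broken++;
		}
	}
	return broken;
}
```

With one responsive and one unresponsive device in a batch, the first
model_flush_atc() call flags only the unresponsive one, and later calls skip
it, so the timeout is not reported again.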

The series is available on GitHub:
https://github.com/nicolinc/iommufd/commits/smmuv3_atc_timeout-v4

Changelog
v4:
 * Rebase on Joerg's IOMMU "fixes" branch
 * Rebase on Jason's SMMUv3 cmd_ent series
   https://lore.kernel.org/all/0-v2-47b2bf710ad5+716ac-smmu_no_cmdq_ent_jgg@nvidia.com/
 * [PCI] Don't suspend IOMMU in probe mode
 * [iommu] kfree_rcu() iommu_group
 * [iommu] Convert gdev->blocked to enum gdev_blocked
 * [iommu] Use disable_work_sync() to fix UAF and ref leak
 * [iommu] Gate done() transitions to preserve BLOCKED_BROKEN
 * [iommu] Decrement recovery_cnt when unplugging a blocked gdev
 * [iommu] Drop racy dev_has_iommu() in iommu_report_device_broken()
 * [iommu] Add gdev->broken_pending to skip worker after racing recovery
 * [smmuv3] Add master->ats_invs scratch
 * [smmuv3] Add arm_smmu_cmdq_batch_issue() wrapper
 * [smmuv3] Force per-flush sync for has_ats batches
 * [smmuv3] Serialize STE.EATS and ats_broken updates
 * [smmuv3] Co-clear pending CMDQ_ERR from cmdq issuer
 * [smmuv3] Add invs and has_ats to arm_smmu_cmdq_batch
 * [smmuv3] Move arm_smmu_invs_for_each_entry to header
 * [smmuv3] Set master->ats_broken after clearing STE.EATS
 * [smmuv3] Issue CFGI_STE via arm_smmu_cmdq_issue_cmd_with_sync()
 * [smmuv3] Keep "smmu" pointer in arm_smmu_inv but add "master" for ATS
v3:
 https://lore.kernel.org/all/cover.1776381841.git.nicolinc@nvidia.com/
 * Rebase on arm/smmu/updates branch + bug fix
 * Update commit messages and inline comments
 * [iommu] Drop unnecessary ops validation
 * [iommu] Add missed function stub when !CONFIG_IOMMU_API
 * [iommu] Change iommu_report_device_broken() to per gdev
 * [iommu] Separate quarantine from pci_dev_reset_prepare()
 * [iommu] Check reset failure in pci_dev_reset_iommu_done()
 * [smmuv3] Fix STE update with try_cmpxchg64()
 * [smmuv3] Fix "continue" bug when skipping ATC commands
 * [smmuv3] Replace atomic_t prod_err with a lockless bitmap
 * [smmuv3] Drop master->invs_domain; disable ATS per-master directly
 * [smmuv3] Return -EIO for ATC timeout vs. -ETIMEDOUT for poll timeout
 * [smmuv3] Replace INV_TYPE_ATS_DISABLED with per-master ats_broken flag
v2:
 https://lore.kernel.org/all/cover.1773774441.git.nicolinc@nvidia.com/
 * Rebase on arm_smmu_invs-v13 series
 * Bisect batched ATC invalidation commands
 * Drop the direct pci_reset_function() call
 * Move the work queue from SMMUv3 to the core
 * Perform a surgical STE update to disable EATS
 * Wait for pci_dev_reset_iommu_done() to signal a recovery
v1:
 https://lore.kernel.org/all/cover.1772686998.git.nicolinc@nvidia.com/

Thanks
Nicolin

Nicolin Chen (24):
  PCI: Don't suspend IOMMU when probing reset capability
  PCI: Propagate FLR return values to callers
  iommu: Convert gdev->blocked from bool to enum gdev_blocked
  iommu: Pass in reset result to pci_dev_reset_iommu_done()
  iommu: Add reset_device_done callback for hardware fault recovery
  iommu: Defer iommu_group free via kfree_rcu()
  iommu: Defer __iommu_group_free_device() to be outside group->mutex
  iommu: Change group->devices to RCU-protected list
  iommu: Add group pointer to struct group_device
  iommu: Add __iommu_group_block_device helper
  iommu: Add iommu_report_device_broken() to quarantine a broken device
  iommu/arm-smmu-v3: Mark ATC invalidate timeouts via lockless bitmap
  iommu/arm-smmu-v3: Skip remaining GERROR causes on SFM
  iommu/arm-smmu-v3: Introduce per-cmdq cmdq_err_handler callback
  iommu/arm-smmu-v3: Co-clear pending CMDQ_ERR when CMD_SYNC times out
  iommu/arm-smmu-v3: Co-clear pending CMDQ_ERR when queue_has_space()
    fails
  iommu/arm-smmu-v3: Add master in arm_smmu_inv for ATS entries
  iommu/arm-smmu-v3: Introduce master->ats_broken flag
  iommu/arm-smmu-v3: Add invs and has_ats to struct arm_smmu_cmdq_batch
  iommu/arm-smmu-v3: Introduce arm_smmu_cmdq_batch_issue() wrapper
  iommu/arm-smmu-v3: Move arm_smmu_invs_for_each_entry to header
  iommu/arm-smmu-v3: Introduce master->ats_invs
  iommu/arm-smmu-v3: Serialize STE.EATS and ats_broken updates
  iommu/arm-smmu-v3: Block ATS upon an ATC invalidation timeout

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  72 +++-
 include/linux/iommu.h                         |  18 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 387 ++++++++++++++---
 .../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c    |  36 +-
 drivers/iommu/iommu.c                         | 406 ++++++++++++++----
 drivers/pci/pci-acpi.c                        |   2 +-
 drivers/pci/pci.c                             |  21 +-
 drivers/pci/quirks.c                          |  43 +-
 8 files changed, 820 insertions(+), 165 deletions(-)

-- 
2.43.0



Thread overview: 37+ messages
2026-05-19  3:38 [PATCH v4 00/24] iommu/arm-smmu-v3: Quarantine device upon ATC invalidation timeout Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 01/24] PCI: Don't suspend IOMMU when probing reset capability Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 02/24] PCI: Propagate FLR return values to callers Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 03/24] iommu: Convert gdev->blocked from bool to enum gdev_blocked Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 04/24] iommu: Pass in reset result to pci_dev_reset_iommu_done() Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 05/24] iommu: Add reset_device_done callback for hardware fault recovery Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 06/24] iommu: Defer iommu_group free via kfree_rcu() Nicolin Chen
2026-05-19 11:39   ` Jason Gunthorpe
2026-05-19 18:54     ` Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 07/24] iommu: Defer __iommu_group_free_device() to be outside group->mutex Nicolin Chen
2026-05-19 11:47   ` Jason Gunthorpe
2026-05-19  3:38 ` [PATCH v4 08/24] iommu: Change group->devices to RCU-protected list Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 09/24] iommu: Add group pointer to struct group_device Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 10/24] iommu: Add __iommu_group_block_device helper Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 11/24] iommu: Add iommu_report_device_broken() to quarantine a broken device Nicolin Chen
2026-05-19 12:07   ` Jason Gunthorpe
2026-05-19 18:29     ` Nicolin Chen
2026-05-19 19:16       ` Jason Gunthorpe
2026-05-19 22:30         ` Nicolin Chen
2026-05-19 23:02           ` Jason Gunthorpe
2026-05-19  3:38 ` [PATCH v4 12/24] iommu/arm-smmu-v3: Mark ATC invalidate timeouts via lockless bitmap Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 13/24] iommu/arm-smmu-v3: Skip remaining GERROR causes on SFM Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 14/24] iommu/arm-smmu-v3: Introduce per-cmdq cmdq_err_handler callback Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 15/24] iommu/arm-smmu-v3: Co-clear pending CMDQ_ERR when CMD_SYNC times out Nicolin Chen
2026-05-19  3:38 ` [PATCH v4 16/24] iommu/arm-smmu-v3: Co-clear pending CMDQ_ERR when queue_has_space() fails Nicolin Chen
2026-05-19  3:39 ` [PATCH v4 17/24] iommu/arm-smmu-v3: Add master in arm_smmu_inv for ATS entries Nicolin Chen
2026-05-19 12:01   ` Jason Gunthorpe
2026-05-19  3:39 ` [PATCH v4 18/24] iommu/arm-smmu-v3: Introduce master->ats_broken flag Nicolin Chen
2026-05-19 12:06   ` Jason Gunthorpe
2026-05-19  3:39 ` [PATCH v4 19/24] iommu/arm-smmu-v3: Add invs and has_ats to struct arm_smmu_cmdq_batch Nicolin Chen
2026-05-19 12:09   ` Jason Gunthorpe
2026-05-19  3:39 ` [PATCH v4 20/24] iommu/arm-smmu-v3: Introduce arm_smmu_cmdq_batch_issue() wrapper Nicolin Chen
2026-05-19  3:39 ` [PATCH v4 21/24] iommu/arm-smmu-v3: Move arm_smmu_invs_for_each_entry to header Nicolin Chen
2026-05-19  3:39 ` [PATCH v4 22/24] iommu/arm-smmu-v3: Introduce master->ats_invs Nicolin Chen
2026-05-19 12:12   ` Jason Gunthorpe
2026-05-19  3:39 ` [PATCH v4 23/24] iommu/arm-smmu-v3: Serialize STE.EATS and ats_broken updates Nicolin Chen
2026-05-19  3:39 ` [PATCH v4 24/24] iommu/arm-smmu-v3: Block ATS upon an ATC invalidation timeout Nicolin Chen
