public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/11] iommu/arm-smmu-v3: Quarantine device upon ATC invalidation timeout
@ 2026-04-16 23:28 Nicolin Chen
  2026-04-16 23:28 ` [PATCH v3 01/11] PCI: Propagate FLR return values to callers Nicolin Chen
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Nicolin Chen @ 2026-04-16 23:28 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Bjorn Helgaas,
	Jason Gunthorpe
  Cc: Rafael J . Wysocki, Len Brown, Pranjal Shrivastava, Mostafa Saleh,
	Lu Baolu, Kevin Tian, linux-arm-kernel, iommu, linux-kernel,
	linux-acpi, linux-pci, vsethi, Shuai Xue

Hi all,

This series addresses a critical vulnerability and stability issue where an
unresponsive PCIe device failing to process ATC (Address Translation Cache)
invalidation requests leads to silent data corruption and continuous SMMU
CMDQ error spam.

[ As Jason pointed out, because this series fundamentally introduces a new
  RAS feature to quarantine and recover from hardware faults and relies on
  a recently accepted SMMU driver rework, it is not treated as a standard
  bug fix. Thus, none of the patches here carries a "Fixes" tag. ]

Currently, when an ATC invalidation times out, the SMMUv3 driver skips the
CMDQ_ERR_CERROR_ATC_INV_IDX error. This leaves the device's ATS cache state
desynchronized from the SMMU: the device cache may retain stale ATC entries
for memory pages that the OS has already reclaimed and reassigned, creating
a direct vector for data corruption. Furthermore, the driver might continue
issuing ATC_INV commands, resulting in constant CMDQ errors:
    unexpected global error reported (0x00000001), this could be serious
    CMDQ error (cons 0x0302bb84): ATC invalidate timeout
    unexpected global error reported (0x00000001), this could be serious
    CMDQ error (cons 0x0302bb88): ATC invalidate timeout
    unexpected global error reported (0x00000001), this could be serious
    CMDQ error (cons 0x0302bb8c): ATC invalidate timeout
    ...

To resolve this, introduce a mechanism to quarantine a broken device in the
SMMUv3 driver and the IOMMU core. To achieve this, add preparatory changes:
 - Tighten the semantics of pci_dev_reset_iommu_done() that is now strictly
   called only upon a successful hardware reset
 - Introduce a reset_device_done op, allowing the core to signal the driver
   when the physical hardware has been cleanly recovered (e.g., via AER or
   a manual reset) so the quarantine can be lifted
 - Utilize a per-group_device WQ via an iommu_report_device_broken() helper

On the SMMUv3 driver side, retry the timedout ATC_INV batch to identify the
faulty device(s) via an atc_sync_timeouts tracker. Perform a surgical STE
update and flag the ATS as broken to reject further ATS/ATC requests at the
hardware level and suppress further timeout spam.

This is on Github:
https://github.com/nicolinc/iommufd/commits/smmuv3_atc_timeout-v3

Note that patches are rebased on bug-fix under review:
https://lore.kernel.org/all/20260407194644.171304-1-nicolinc@nvidia.com/

Changelog
v3:
 * Rebase on arm/smmu/updates branch + bug fix
 * Update commit messages and inline comments
 * [iommu] Drop unnecessary ops validation
 * [iommu] Add missed function stub when !CONFIG_IOMMU_API
 * [iommu] Change iommu_report_device_broken() to per gdev
 * [iommu] Separate quarantine from pci_dev_reset_prepare()
 * [iommu] Check reset failure in pci_dev_reset_iommu_done()
 * [smmuv3] Fix STE update with try_cmpxchg64()
 * [smmuv3] Fix "continue" bug when skipping ATC commands
 * [smmuv3] Replace atomic_t prod_err with a lockless bitmap
 * [smmuv3] Drop master->invs_domain; disable ATS per-master directly
 * [smmuv3] Return -EIO for ATC timeout v.s. -ETIMEDOUT for poll timeout
 * [smmuv3] Replace INV_TYPE_ATS_DISABLED with per-master ats_broken flag
v2:
 https://lore.kernel.org/all/cover.1773774441.git.nicolinc@nvidia.com/
 * Rebase on arm_smmu_invs-v13 series [0]
 * Bisect batched atc invalidation commands
 * Drop the direct pci_reset_function() call
 * Move the work queue from SMMUv3 to the core
 * Proceed a surgical STE update to disable EATS
 * Wait for pci_dev_reset_iommu_done() to signal a recovery
v1:
 https://lore.kernel.org/all/cover.1772686998.git.nicolinc@nvidia.com/

[0] https://lore.kernel.org/all/cover.1773733797.git.nicolinc@nvidia.com/

Thanks
Nicolin

Nicolin Chen (11):
  PCI: Propagate FLR return values to callers
  iommu: Pass in reset result to pci_dev_reset_iommu_done()
  iommu: Add reset_device_done callback for hardware fault recovery
  iommu: Add __iommu_group_block_device helper
  iommu: Change group->devices to RCU-protected list
  iommu: Defer __iommu_group_free_device() to be outside group->mutex
  iommu: Add iommu_report_device_broken() to quarantine a broken device
  iommu/arm-smmu-v3: Mark ATC invalidate timeouts via lockless bitmap
  iommu/arm-smmu-v3: Replace smmu with master in arm_smmu_inv
  iommu/arm-smmu-v3: Introduce master->ats_broken flag
  iommu/arm-smmu-v3: Block ATS upon an ATC invalidation timeout

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   4 +-
 include/linux/iommu.h                         |  15 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c  |  34 ++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 193 +++++++++++-
 drivers/iommu/iommu.c                         | 284 ++++++++++++++----
 drivers/pci/pci-acpi.c                        |   2 +-
 drivers/pci/pci.c                             |  10 +-
 drivers/pci/quirks.c                          |  24 +-
 8 files changed, 454 insertions(+), 112 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-04-16 23:30 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-16 23:28 [PATCH v3 00/11] iommu/arm-smmu-v3: Quarantine device upon ATC invalidation timeout Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 01/11] PCI: Propagate FLR return values to callers Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 02/11] iommu: Pass in reset result to pci_dev_reset_iommu_done() Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 03/11] iommu: Add reset_device_done callback for hardware fault recovery Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 04/11] iommu: Add __iommu_group_block_device helper Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 05/11] iommu: Change group->devices to RCU-protected list Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 06/11] iommu: Defer __iommu_group_free_device() to be outside group->mutex Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 07/11] iommu: Add iommu_report_device_broken() to quarantine a broken device Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 08/11] iommu/arm-smmu-v3: Mark ATC invalidate timeouts via lockless bitmap Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 09/11] iommu/arm-smmu-v3: Replace smmu with master in arm_smmu_inv Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 10/11] iommu/arm-smmu-v3: Introduce master->ats_broken flag Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 11/11] iommu/arm-smmu-v3: Block ATS upon an ATC invalidation timeout Nicolin Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox