public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Nicolin Chen <nicolinc@nvidia.com>
To: Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>,
	Len Brown <lenb@kernel.org>,
	Pranjal Shrivastava <praan@google.com>,
	Mostafa Saleh <smostafa@google.com>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Kevin Tian <kevin.tian@intel.com>,
	<linux-arm-kernel@lists.infradead.org>, <iommu@lists.linux.dev>,
	<linux-kernel@vger.kernel.org>, <linux-acpi@vger.kernel.org>,
	<linux-pci@vger.kernel.org>, <vsethi@nvidia.com>,
	Shuai Xue <xueshuai@linux.alibaba.com>
Subject: [PATCH v3 00/11] iommu/arm-smmu-v3: Quarantine device upon ATC invalidation timeout
Date: Thu, 16 Apr 2026 16:28:29 -0700	[thread overview]
Message-ID: <cover.1776381841.git.nicolinc@nvidia.com> (raw)

Hi all,

This series addresses a critical vulnerability and stability issue where an
unresponsive PCIe device failing to process ATC (Address Translation Cache)
invalidation requests leads to silent data corruption and continuous SMMU
CMDQ error spam.

[ As Jason pointed out, because this series fundamentally introduces a new
  RAS feature to quarantine and recover from hardware faults and relies on
  a recently accepted SMMU driver rework, it is not treated as a standard
  bug fix. Thus, none of the patches here carries a "Fixes" tag. ]

Currently, when an ATC invalidation times out, the SMMUv3 driver skips the
CMDQ_ERR_CERROR_ATC_INV_IDX error. This leaves the device's ATS cache state
desynchronized from the SMMU: the device cache may retain stale ATC entries
for memory pages that the OS has already reclaimed and reassigned, creating
a direct vector for data corruption. Furthermore, the driver might continue
issuing ATC_INV commands, resulting in constant CMDQ errors:
    unexpected global error reported (0x00000001), this could be serious
    CMDQ error (cons 0x0302bb84): ATC invalidate timeout
    unexpected global error reported (0x00000001), this could be serious
    CMDQ error (cons 0x0302bb88): ATC invalidate timeout
    unexpected global error reported (0x00000001), this could be serious
    CMDQ error (cons 0x0302bb8c): ATC invalidate timeout
    ...

To resolve this, introduce a mechanism to quarantine a broken device in the
SMMUv3 driver and the IOMMU core. To achieve this, add preparatory changes:
 - Tighten the semantics of pci_dev_reset_iommu_done() that is now strictly
   called only upon a successful hardware reset
 - Introduce a reset_device_done op, allowing the core to signal the driver
   when the physical hardware has been cleanly recovered (e.g., via AER or
   a manual reset) so the quarantine can be lifted
 - Utilize a per-group_device WQ via an iommu_report_device_broken() helper

On the SMMUv3 driver side, retry the timedout ATC_INV batch to identify the
faulty device(s) via an atc_sync_timeouts tracker. Perform a surgical STE
update and flag the ATS as broken to reject further ATS/ATC requests at the
hardware level and suppress further timeout spam.

This is on Github:
https://github.com/nicolinc/iommufd/commits/smmuv3_atc_timeout-v3

Note that patches are rebased on bug-fix under review:
https://lore.kernel.org/all/20260407194644.171304-1-nicolinc@nvidia.com/

Changelog
v3:
 * Rebase on arm/smmu/updates branch + bug fix
 * Update commit messages and inline comments
 * [iommu] Drop unnecessary ops validation
 * [iommu] Add missed function stub when !CONFIG_IOMMU_API
 * [iommu] Change iommu_report_device_broken() to per gdev
 * [iommu] Separate quarantine from pci_dev_reset_prepare()
 * [iommu] Check reset failure in pci_dev_reset_iommu_done()
 * [smmuv3] Fix STE update with try_cmpxchg64()
 * [smmuv3] Fix "continue" bug when skipping ATC commands
 * [smmuv3] Replace atomic_t prod_err with a lockless bitmap
 * [smmuv3] Drop master->invs_domain; disable ATS per-master directly
 * [smmuv3] Return -EIO for ATC timeout v.s. -ETIMEDOUT for poll timeout
 * [smmuv3] Replace INV_TYPE_ATS_DISABLED with per-master ats_broken flag
v2:
 https://lore.kernel.org/all/cover.1773774441.git.nicolinc@nvidia.com/
 * Rebase on arm_smmu_invs-v13 series [0]
 * Bisect batched atc invalidation commands
 * Drop the direct pci_reset_function() call
 * Move the work queue from SMMUv3 to the core
 * Proceed a surgical STE update to disable EATS
 * Wait for pci_dev_reset_iommu_done() to signal a recovery
v1:
 https://lore.kernel.org/all/cover.1772686998.git.nicolinc@nvidia.com/

[0] https://lore.kernel.org/all/cover.1773733797.git.nicolinc@nvidia.com/

Thanks
Nicolin

Nicolin Chen (11):
  PCI: Propagate FLR return values to callers
  iommu: Pass in reset result to pci_dev_reset_iommu_done()
  iommu: Add reset_device_done callback for hardware fault recovery
  iommu: Add __iommu_group_block_device helper
  iommu: Change group->devices to RCU-protected list
  iommu: Defer __iommu_group_free_device() to be outside group->mutex
  iommu: Add iommu_report_device_broken() to quarantine a broken device
  iommu/arm-smmu-v3: Mark ATC invalidate timeouts via lockless bitmap
  iommu/arm-smmu-v3: Replace smmu with master in arm_smmu_inv
  iommu/arm-smmu-v3: Introduce master->ats_broken flag
  iommu/arm-smmu-v3: Block ATS upon an ATC invalidation timeout

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   4 +-
 include/linux/iommu.h                         |  15 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c  |  34 ++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 193 +++++++++++-
 drivers/iommu/iommu.c                         | 284 ++++++++++++++----
 drivers/pci/pci-acpi.c                        |   2 +-
 drivers/pci/pci.c                             |  10 +-
 drivers/pci/quirks.c                          |  24 +-
 8 files changed, 454 insertions(+), 112 deletions(-)

-- 
2.43.0


             reply	other threads:[~2026-04-16 23:29 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-16 23:28 Nicolin Chen [this message]
2026-04-16 23:28 ` [PATCH v3 01/11] PCI: Propagate FLR return values to callers Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 02/11] iommu: Pass in reset result to pci_dev_reset_iommu_done() Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 03/11] iommu: Add reset_device_done callback for hardware fault recovery Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 04/11] iommu: Add __iommu_group_block_device helper Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 05/11] iommu: Change group->devices to RCU-protected list Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 06/11] iommu: Defer __iommu_group_free_device() to be outside group->mutex Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 07/11] iommu: Add iommu_report_device_broken() to quarantine a broken device Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 08/11] iommu/arm-smmu-v3: Mark ATC invalidate timeouts via lockless bitmap Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 09/11] iommu/arm-smmu-v3: Replace smmu with master in arm_smmu_inv Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 10/11] iommu/arm-smmu-v3: Introduce master->ats_broken flag Nicolin Chen
2026-04-16 23:28 ` [PATCH v3 11/11] iommu/arm-smmu-v3: Block ATS upon an ATC invalidation timeout Nicolin Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1776381841.git.nicolinc@nvidia.com \
    --to=nicolinc@nvidia.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=praan@google.com \
    --cc=rafael@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=vsethi@nvidia.com \
    --cc=will@kernel.org \
    --cc=xueshuai@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox