From: Nicolin Chen <nicolinc@nvidia.com>
To: <will@kernel.org>, <robin.murphy@arm.com>, <jgg@nvidia.com>,
<kevin.tian@intel.com>
Cc: <joro@8bytes.org>, <praan@google.com>, <baolu.lu@linux.intel.com>,
<miko.lenczewski@arm.com>, <smostafa@google.com>,
<linux-arm-kernel@lists.infradead.org>, <iommu@lists.linux.dev>,
<linux-kernel@vger.kernel.org>, <stable@vger.kernel.org>,
<jamien@nvidia.com>
Subject: [PATCH rc v3 0/5] iommu/arm-smmu-v3: Fix device crash on kdump kernel
Date: Sat, 25 Apr 2026 14:30:45 -0700
Message-ID: <cover.1777150307.git.nicolinc@nvidia.com>
When transitioning to a kdump kernel, the primary kernel might have crashed
while endpoint devices were actively bus-mastering DMA. Currently, the SMMU
driver aggressively resets the hardware during probe by clearing CR0_SMMUEN
and setting the Global Bypass Attribute (GBPA) to ABORT.
In a kdump scenario, this aggressive reset is highly destructive:
a) If GBPA is set to ABORT, in-flight DMA will be aborted, generating fatal
PCIe AER or SErrors that may panic the kdump kernel.
b) If GBPA is set to BYPASS, in-flight DMA will bypass the SMMU, so each
IOVA is used as a physical address and the memory there gets corrupted.
To safely absorb in-flight DMA, the kdump kernel must leave SMMUEN=1 intact
and avoid modifying STRTAB_BASE. This allows HW to continue translating in-
flight DMA using the crashed kernel's page tables until the endpoint device
drivers probe and quiesce their respective hardware.
However, the ARM SMMUv3 architecture specification states that updating the
SMMU_STRTAB_BASE register while SMMUEN == 1 is UNPREDICTABLE or ignored.
This leaves a kdump kernel no choice but to adopt the stream table from the
crashed kernel.
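To illustrate what adopting the stream table involves, here is a minimal
userspace sketch that decodes STRTAB_BASE and STRTAB_BASE_CFG values left
behind by the crashed kernel. The field positions follow the SMMUv3 register
layout as mirrored in arm-smmu-v3.h. The struct adopted_strtab and the
decode_strtab() helper are hypothetical names for illustration only and are
not the code in this series:

```c
#include <stdbool.h>
#include <stdint.h>

/* Register fields (SMMUv3 layout; macro names local to this sketch) */
#define STRTAB_BASE_ADDR_MASK      0x000FFFFFFFFFFFC0ULL /* bits [51:6] */
#define STRTAB_CFG_LOG2SIZE(cfg)   ((cfg) & 0x3fu)          /* bits [5:0]  */
#define STRTAB_CFG_SPLIT(cfg)      (((cfg) >> 6) & 0x1fu)   /* bits [10:6] */
#define STRTAB_CFG_FMT(cfg)        (((cfg) >> 16) & 0x3u)   /* bits [17:16] */
#define STRTAB_FMT_LINEAR          0u
#define STRTAB_FMT_2LVL            1u
#define STE_BYTES                  64u /* one stream table entry */
#define L1_DESC_BYTES              8u  /* one level-1 descriptor */

/* Hypothetical container for what a kdump kernel would need to remap */
struct adopted_strtab {
	uint64_t phys;       /* physical base of the (level-1) table */
	bool two_level;
	unsigned int split;
	unsigned int log2size;
	uint64_t bytes;      /* size of the region at phys */
};

/* Returns false when the inherited values look unusable (assumed policy) */
static bool decode_strtab(uint64_t base, uint32_t cfg,
			  struct adopted_strtab *out)
{
	out->phys = base & STRTAB_BASE_ADDR_MASK;
	out->log2size = STRTAB_CFG_LOG2SIZE(cfg);
	out->split = STRTAB_CFG_SPLIT(cfg);

	/* StreamIDs are at most 32 bits wide; a zero base is suspicious */
	if (!out->phys || out->log2size > 32)
		return false;

	switch (STRTAB_CFG_FMT(cfg)) {
	case STRTAB_FMT_LINEAR:
		out->two_level = false;
		out->bytes = (1ULL << out->log2size) * STE_BYTES;
		return true;
	case STRTAB_FMT_2LVL:
		/* Conservative: refuse a split wider than the StreamID space */
		if (out->split > out->log2size)
			return false;
		out->two_level = true;
		out->bytes = (1ULL << (out->log2size - out->split)) *
			     L1_DESC_BYTES;
		return true;
	default:
		return false; /* reserved format encoding */
	}
}
```

A real implementation would additionally have to check the decoded format
against what the hardware reports and walk the level-2 tables; the sketch
only covers the register decode.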
In this series:
- Introduce an ARM_SMMU_OPT_KDUMP_ADOPT
- Skip SMMUEN and STRTAB_BASE resets in arm_smmu_device_reset()
- Skip EVENTQ and PRIQ setups including interrupts and their handlers
- Memremap the crashed kernel's stream tables into the kdump kernel [*]
- Defer any default domain attachment to retain STEs until device drivers
explicitly request it.
[*] This only works on a coherent SMMU.
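The conditions under which adoption makes sense can be sketched as below.
This is a simplified userspace model, not the driver code: the struct
smmu_snapshot and both helpers are hypothetical, and the exact set of checks
in the patches may differ. The bit positions (CR0_SMMUEN, CR0_ATSCHK,
GERROR_SFM_ERR) match the definitions in arm-smmu-v3.h:

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_SMMUEN      (1u << 0)
#define CR0_ATSCHK      (1u << 4)
#define GERROR_SFM_ERR  (1u << 8)

/* Hypothetical snapshot of the state found at probe time */
struct smmu_snapshot {
	uint32_t cr0;
	uint32_t gerror;
	uint32_t gerrorn;
	bool coherent;   /* SMMU table walks are cache coherent */
	bool is_kdump;   /* running as a crash-capture kernel */
};

/* Assumed policy: adopt only when translation is live and trustworthy */
static bool should_adopt(const struct smmu_snapshot *s)
{
	/* An error is active when its GERROR and GERRORN bits differ */
	uint32_t active = s->gerror ^ s->gerrorn;

	if (!s->is_kdump)
		return false;
	if (!(s->cr0 & CR0_SMMUEN))
		return false; /* nothing is being translated */
	if (!s->coherent)
		return false; /* remapped tables could be stale */
	if (active & GERROR_SFM_ERR)
		return false; /* SMMU is in Service Failure Mode */
	return true;
}

/* CR0 bits carried over from the crashed kernel; queues are set up anew */
static uint32_t cr0_to_retain(uint32_t cr0)
{
	return cr0 & (CR0_SMMUEN | CR0_ATSCHK);
}
```

When should_adopt() returns false in this model, the fallback is the
existing full-reset path.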
For non-ARM_SMMU_OPT_KDUMP_ADOPT cases, keep the status quo established by
commit 3f54c447df34f ("iommu/arm-smmu-v3: Don't disable SMMU in kdump kernel"):
a full reset followed by driver-initiated reattach, which may reject any
in-flight DMA.
Note that the series requires Jason's work that was merged in v6.12: commit
85196f54743d ("iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg").
I have a backported version that is verified with a v6.8 kernel. I can send
it if there is a strong need once this version is accepted.
This series is on GitHub:
https://github.com/nicolinc/iommufd/commits/smmuv3_kdump-v3
Changelog
v3
* s/OPT_KDUMP/OPT_KDUMP_ADOPT
* Do not adopt if GERROR_SFM_ERR
* Retain CR0_ATSCHK beside CR0_SMMUEN
* Clear latched GERROR bits (e.g. CMDQ_ERR)
* Assert ARM_SMMU_FEAT_COHERENCY in adopt functions
* Add STE.Cfg check in arm_smmu_is_attach_deferred()
* Fix validations on return codes from devm_memremap()
* Sanitize crashed kernel register values in adopt functions
* Drop unnecessary l2ptrs guard in arm_smmu_is_attach_deferred()
* Don't enable PRIQ/EVTQ irqs and guard the irq functions for combined
irq cases
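As a rough illustration of the STE.Cfg check mentioned above, the sketch
below decodes word 0 of an adopted STE. The V bit and the Config field
position and encodings follow the SMMUv3 STE format. The decision rule in
ste_attach_deferred() is an assumption made for this example (defer only
when the crashed kernel left a valid translating entry); the actual
condition used by arm_smmu_is_attach_deferred() is not described in this
cover letter:

```c
#include <stdbool.h>
#include <stdint.h>

#define STE_0_V             (1ULL << 0)
#define STE_0_CFG(w)        (((w) >> 1) & 0x7u) /* Config, bits [3:1] */
#define STE_CFG_ABORT       0u
#define STE_CFG_BYPASS      4u
#define STE_CFG_S1_TRANS    5u
#define STE_CFG_S2_TRANS    6u
#define STE_CFG_NESTED      7u

/* Hypothetical helper: does this STE translate through page tables? */
static bool ste_is_translating(uint64_t ste0)
{
	unsigned int cfg = STE_0_CFG(ste0);

	if (!(ste0 & STE_0_V))
		return false;
	return cfg == STE_CFG_S1_TRANS || cfg == STE_CFG_S2_TRANS ||
	       cfg == STE_CFG_NESTED;
}

/*
 * Assumed rule for this sketch: keep the inherited STE (defer the default
 * domain attach) only if in-flight DMA may still depend on it.
 */
static bool ste_attach_deferred(bool adopted, uint64_t ste0)
{
	return adopted && ste_is_translating(ste0);
}
```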
v2
https://lore.kernel.org/all/cover.1776286352.git.nicolinc@nvidia.com/
* Add warning in non-coherent SMMU cases
* Keep eventq/priq disabled vs. enabling-and-disabling-later
* Check KDUMP option at the beginning of arm_smmu_device_reset()
* Validate STRTAB format matches HW capability instead of forcing flags
v1
https://lore.kernel.org/all/cover.1775763475.git.nicolinc@nvidia.com/
Nicolin Chen (5):
iommu/arm-smmu-v3: Add arm_smmu_adopt_strtab() for kdump
iommu/arm-smmu-v3: Implement is_attach_deferred() for kdump
iommu/arm-smmu-v3: Retain CR0_SMMUEN during kdump device reset
iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup in kdump kernel
iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP_ADOPT in probe()
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 358 ++++++++++++++++++--
2 files changed, 338 insertions(+), 21 deletions(-)
--
2.43.0