From: Nicolin Chen <nicolinc@nvidia.com>
To: <will@kernel.org>, <robin.murphy@arm.com>, <jgg@nvidia.com>,
<kevin.tian@intel.com>
Cc: <joro@8bytes.org>, <praan@google.com>, <baolu.lu@linux.intel.com>,
<miko.lenczewski@arm.com>, <smostafa@google.com>,
<linux-arm-kernel@lists.infradead.org>, <iommu@lists.linux.dev>,
<linux-kernel@vger.kernel.org>, <stable@vger.kernel.org>,
<jamien@nvidia.com>
Subject: [PATCH rc v3 1/5] iommu/arm-smmu-v3: Add arm_smmu_adopt_strtab() for kdump
Date: Sat, 25 Apr 2026 14:30:46 -0700
Message-ID: <9ac81da2ad2fb5795565795759b3e1dd94f0b4bb.1777150307.git.nicolinc@nvidia.com>
In-Reply-To: <cover.1777150307.git.nicolinc@nvidia.com>
When transitioning to a kdump kernel, the primary kernel might have crashed
while endpoint devices were actively bus-mastering DMA. Currently, the SMMU
driver aggressively resets the hardware during probe by clearing CR0_SMMUEN
and setting the Global Bypass Attribute (GBPA) to ABORT.

In a kdump scenario, such a reset is highly destructive whichever way GBPA
is set:
 a) If GBPA is set to ABORT, in-flight DMA will be aborted, generating fatal
    PCIe AER errors or SErrors that may panic the kdump kernel.
 b) If GBPA is set to BYPASS, in-flight DMA targeting some IOVAs will bypass
    the SMMU and corrupt the physical memory at the addresses matching
    those IOVAs.

To safely absorb in-flight DMAs, a kdump kernel has to leave SMMUEN=1
intact and avoid modifying STRTAB_BASE, allowing HW to continue translating
in-flight DMAs through the crashed kernel's page tables until the endpoint
device drivers probe and quiesce their respective hardware.

Nor can the kdump kernel install its own stream table while translation
stays enabled: the ARM SMMUv3 architecture specification states that
updating the SMMU_STRTAB_BASE register while SMMUEN == 1 is UNPREDICTABLE
or ignored. This leaves a kdump kernel no choice but to adopt the stream
table from the crashed kernel.

Introduce ARM_SMMU_OPT_KDUMP_ADOPT and the accompanying
arm_smmu_adopt_strtab(), which memremaps every stream table located via
STRTAB_BASE and STRTAB_BASE_CFG. This new option will be set in
arm_smmu_device_hw_probe() in a following change.

Note that the crashed kernel's stream table might be compromised, so its
adoption follows strict rules: every value read from the registers and
from the L1 descriptors is validated first. If any address or size fails
validation, the stream table cannot be trusted, so discard it completely
and fall back to a full reset.
Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
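Reviewer note: as a reading aid for the checks in
arm_smmu_adopt_strtab_2lvl(), the userspace sketch below decodes a
STRTAB_BASE_CFG value and applies the same range checks. It is not driver
code. The field positions, the split value of 8 and the 1 << 17 cap on L1
entries are assumptions mirrored from arm-smmu-v3.h, and
sketch_2lvl_l1_ents() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout assumed from arm-smmu-v3.h; not authoritative. */
#define CFG_LOG2SIZE(r)	((uint32_t)(r) & 0x3fu)		/* bits [5:0] */
#define CFG_SPLIT(r)	(((uint32_t)(r) >> 6) & 0x1fu)	/* bits [10:6] */
#define CFG_FMT(r)	(((uint32_t)(r) >> 16) & 0x3u)	/* bits [17:16] */
#define FMT_2LVL	1u
#define SPLIT_EXPECTED	8u	/* assumed value of STRTAB_SPLIT */
#define MAX_L1_LOG2	17u	/* assumed log2 of STRTAB_MAX_L1_ENTRIES */

/*
 * Hypothetical helper: return the number of L1 descriptors described by a
 * 2-level STRTAB_BASE_CFG value, or 0 if the value fails any check.
 */
static uint32_t sketch_2lvl_l1_ents(uint32_t cfg_reg, uint32_t sid_bits)
{
	uint32_t log2size = CFG_LOG2SIZE(cfg_reg);
	uint32_t split = CFG_SPLIT(cfg_reg);

	/* The architecture limits stream IDs to 32 bits. */
	assert(sid_bits <= 32);

	if (CFG_FMT(cfg_reg) != FMT_2LVL)
		return 0;
	if (log2size < split || log2size > sid_bits)
		return 0;
	if (split != SPLIT_EXPECTED)
		return 0;
	/* Same bound as the L1 entry cap, checked before shifting. */
	if (log2size - split > MAX_L1_LOG2)
		return 0;

	return 1u << (log2size - split);
}
```

Under those assumptions, a value with FMT = 2-level, SPLIT = 8 and
LOG2SIZE = 16 passes the checks and yields 256 L1 descriptors, while the
same value on an SMMU with only 12 SID bits is rejected.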
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 187 ++++++++++++++++++++
2 files changed, 188 insertions(+)
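Reviewer note: the per-descriptor handling in the L1 walk can be
illustrated with a small userspace sketch. It classifies one L1 descriptor
the way the loop in arm_smmu_adopt_strtab_2lvl() does: skip empty entries,
reject an unexpected span, otherwise report the L2 table address and size.
The bit positions, the 64-byte STE size and the name
sketch_check_l1_desc() are assumptions for illustration, and the DMA
address is treated as a physical address (the driver goes through
dma_to_phys() instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed from arm-smmu-v3.h; not authoritative. */
#define L1_SPAN(d)	((uint32_t)((d) & 0x1fULL))	/* bits [4:0] */
#define L1_L2PTR(d)	((d) & 0x000fffffffffffc0ULL)	/* bits [51:6] */
#define STE_BYTES	64u	/* assumed sizeof(struct arm_smmu_ste) */
#define SPLIT_EXPECTED	8u	/* assumed value of STRTAB_SPLIT */

enum l1_verdict { L1_SKIP, L1_OK, L1_BAD };

/*
 * Hypothetical helper: classify one L1 descriptor. On L1_OK, *l2_addr and
 * *l2_bytes describe the L2 table that would be remapped.
 */
static enum l1_verdict sketch_check_l1_desc(uint64_t desc, uint64_t *l2_addr,
					    size_t *l2_bytes)
{
	uint32_t span = L1_SPAN(desc);
	uint64_t ptr = L1_L2PTR(desc);

	assert(l2_addr && l2_bytes);

	/* An entry without a span or a pointer has no L2 table. */
	if (!span || !ptr)
		return L1_SKIP;
	/* Only the span that the driver itself programs is accepted. */
	if (span != SPLIT_EXPECTED + 1)
		return L1_BAD;

	*l2_addr = ptr;
	*l2_bytes = ((size_t)1 << (span - 1)) * STE_BYTES;
	return L1_OK;
}
```

With span = 9, each L2 table covers 256 STEs, which is 16 KiB under the
assumed 64-byte STE size.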
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index ef42df4753ec4..cd60b692c3901 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -861,6 +861,7 @@ struct arm_smmu_device {
#define ARM_SMMU_OPT_MSIPOLL (1 << 2)
#define ARM_SMMU_OPT_CMDQ_FORCE_SYNC (1 << 3)
#define ARM_SMMU_OPT_TEGRA241_CMDQV (1 << 4)
+#define ARM_SMMU_OPT_KDUMP_ADOPT (1 << 5)
u32 options;
struct arm_smmu_cmdq cmdq;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f6901c5437edc..bf292e1e0c323 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -14,6 +14,7 @@
#include <linux/bitops.h>
#include <linux/crash_dump.h>
#include <linux/delay.h>
+#include <linux/dma-direct.h>
#include <linux/err.h>
#include <linux/interrupt.h>
#include <linux/io-pgtable.h>
@@ -4553,10 +4554,195 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
return 0;
}
+/*
+ * Adopting the crashed kernel's stream table has risks: the physical addresses
+ * read from ARM_SMMU_STRTAB_BASE / L1 descriptors may be corrupted. Reject any
+ * range that overlaps the kdump kernel's critical regions.
+ *
+ * Note that we cannot reject an overlap on IORESOURCE_MEM, as reserved regions
+ * of the crashed kernel might reside there.
+ */
+static bool arm_smmu_kdump_phys_is_corrupted(phys_addr_t base, size_t size)
+{
+ /* Must NOT overlap kdump kernel's own RAM */
+ return region_intersects(base, size, IORESOURCE_SYSTEM_RAM,
+ IORES_DESC_NONE) != REGION_DISJOINT;
+}
+
+static int arm_smmu_adopt_strtab_2lvl(struct arm_smmu_device *smmu, u32 cfg_reg,
+ dma_addr_t dma)
+{
+ u32 log2size = FIELD_GET(STRTAB_BASE_CFG_LOG2SIZE, cfg_reg);
+ u32 split = FIELD_GET(STRTAB_BASE_CFG_SPLIT, cfg_reg);
+ struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+ phys_addr_t base;
+ u32 num_l1_ents;
+ size_t size;
+ unsigned int i;
+
+ /*
+ * Only a coherent SMMU is supported at this moment. For a non-coherent
+ * SMMU that wants to support ARM_SMMU_OPT_KDUMP_ADOPT, try MEMREMAP_WC.
+ */
+ if (WARN_ON(!(smmu->features & ARM_SMMU_FEAT_COHERENCY)))
+ return -EOPNOTSUPP;
+
+ if (log2size < split || log2size > smmu->sid_bits) {
+ dev_err(smmu->dev, "kdump: log2size %u out of range [%u, %u]\n",
+ log2size, split, smmu->sid_bits);
+ return -EINVAL;
+ }
+ if (split != STRTAB_SPLIT) {
+ dev_err(smmu->dev,
+ "kdump: unsupported STRTAB_SPLIT %u (expected %u)\n",
+ split, STRTAB_SPLIT);
+ return -EINVAL;
+ }
+
+ num_l1_ents = 1U << (log2size - split);
+ if (num_l1_ents > STRTAB_MAX_L1_ENTRIES) {
+ dev_err(smmu->dev, "kdump: l1 entries %u exceeds max %u\n",
+ num_l1_ents, STRTAB_MAX_L1_ENTRIES);
+ return -EINVAL;
+ }
+
+ cfg->l2.l1_dma = dma;
+ cfg->l2.num_l1_ents = num_l1_ents;
+
+ base = dma_to_phys(smmu->dev, dma);
+ size = num_l1_ents * sizeof(struct arm_smmu_strtab_l1);
+ if (arm_smmu_kdump_phys_is_corrupted(base, size)) {
+ dev_err(smmu->dev, "kdump: l1 stream table is corrupted\n");
+ return -EINVAL;
+ }
+
+ cfg->l2.l1tab = devm_memremap(smmu->dev, base, size, MEMREMAP_WB);
+ if (IS_ERR(cfg->l2.l1tab))
+ return PTR_ERR(cfg->l2.l1tab);
+
+ cfg->l2.l2ptrs = devm_kcalloc(smmu->dev, num_l1_ents,
+ sizeof(*cfg->l2.l2ptrs), GFP_KERNEL);
+ if (!cfg->l2.l2ptrs)
+ return -ENOMEM;
+
+ for (i = 0; i < num_l1_ents; i++) {
+ u64 l2ptr = le64_to_cpu(cfg->l2.l1tab[i].l2ptr);
+ dma_addr_t l2_dma = l2ptr & STRTAB_L1_DESC_L2PTR_MASK;
+ u32 span = FIELD_GET(STRTAB_L1_DESC_SPAN, l2ptr);
+
+ if (!span || !l2_dma)
+ continue;
+
+ if (span != STRTAB_SPLIT + 1) {
+ dev_err(smmu->dev,
+ "kdump: L1[%u] unsupported span %u (vs %u)\n",
+ i, span, STRTAB_SPLIT + 1);
+ return -EINVAL;
+ }
+
+ base = dma_to_phys(smmu->dev, l2_dma);
+ size = (1UL << (span - 1)) * sizeof(struct arm_smmu_ste);
+ if (arm_smmu_kdump_phys_is_corrupted(base, size)) {
+ dev_err(smmu->dev,
+ "kdump: l2 stream table is corrupted\n");
+ return -EINVAL;
+ }
+
+ cfg->l2.l2ptrs[i] =
+ devm_memremap(smmu->dev, base, size, MEMREMAP_WB);
+ if (IS_ERR(cfg->l2.l2ptrs[i]))
+ return PTR_ERR(cfg->l2.l2ptrs[i]);
+ }
+
+ return 0;
+}
+
+static int arm_smmu_adopt_strtab_linear(struct arm_smmu_device *smmu,
+ u32 cfg_reg, dma_addr_t dma)
+{
+ u32 log2size = FIELD_GET(STRTAB_BASE_CFG_LOG2SIZE, cfg_reg);
+ struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+ unsigned int max_log2size;
+ phys_addr_t base;
+ size_t size;
+
+ /*
+ * Only a coherent SMMU is supported at this moment. For a non-coherent
+ * SMMU that wants to support ARM_SMMU_OPT_KDUMP_ADOPT, try MEMREMAP_WC.
+ */
+ if (WARN_ON(!(smmu->features & ARM_SMMU_FEAT_COHERENCY)))
+ return -EOPNOTSUPP;
+
+ /* cfg->linear.num_ents is unsigned int, so cap log2size at 31 */
+ max_log2size = min(smmu->sid_bits, 31U);
+ if (log2size > max_log2size) {
+ dev_err(smmu->dev, "kdump: unsupported log2size %u (> %u)\n",
+ log2size, max_log2size);
+ return -EINVAL;
+ }
+
+ cfg->linear.ste_dma = dma;
+ cfg->linear.num_ents = 1U << log2size;
+
+ base = dma_to_phys(smmu->dev, dma);
+ size = cfg->linear.num_ents * sizeof(struct arm_smmu_ste);
+ if (arm_smmu_kdump_phys_is_corrupted(base, size)) {
+ dev_err(smmu->dev, "kdump: stream table is corrupted\n");
+ return -EINVAL;
+ }
+
+ cfg->linear.table = devm_memremap(smmu->dev, base, size, MEMREMAP_WB);
+ if (IS_ERR(cfg->linear.table))
+ return PTR_ERR(cfg->linear.table);
+ return 0;
+}
+
+static int arm_smmu_adopt_strtab(struct arm_smmu_device *smmu)
+{
+ u32 cfg_reg = readl_relaxed(smmu->base + ARM_SMMU_STRTAB_BASE_CFG);
+ u64 base_reg = readq_relaxed(smmu->base + ARM_SMMU_STRTAB_BASE);
+ u32 fmt = FIELD_GET(STRTAB_BASE_CFG_FMT, cfg_reg);
+ dma_addr_t dma = base_reg & STRTAB_BASE_ADDR_MASK;
+ int ret;
+
+ dev_info(smmu->dev, "kdump: adopting crashed kernel's stream table\n");
+
+ if (fmt == STRTAB_BASE_CFG_FMT_2LVL) {
+ /*
+ * Both kernels run on the same hardware, so the kdump kernel
+ * must see the 2-level stream table support here as well.
+ */
+ if (WARN_ON(!(smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)))
+ return -EINVAL;
+ ret = arm_smmu_adopt_strtab_2lvl(smmu, cfg_reg, dma);
+ } else if (fmt == STRTAB_BASE_CFG_FMT_LINEAR) {
+ /*
+ * If the old kernel used the linear format for some reason,
+ * enforce the same format to match the adopted table.
+ */
+ smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
+ ret = arm_smmu_adopt_strtab_linear(smmu, cfg_reg, dma);
+ } else {
+ dev_err(smmu->dev, "kdump: invalid STRTAB format %u\n", fmt);
+ ret = -EINVAL;
+ }
+
+ if (ret) {
+ dev_warn(smmu->dev, "kdump: falling back to full reset\n");
+ smmu->options &= ~ARM_SMMU_OPT_KDUMP_ADOPT;
+ memset(&smmu->strtab_cfg, 0, sizeof(smmu->strtab_cfg));
+ }
+ return ret;
+}
+
static int arm_smmu_init_strtab(struct arm_smmu_device *smmu)
{
int ret;
+ if ((smmu->options & ARM_SMMU_OPT_KDUMP_ADOPT) &&
+ !arm_smmu_adopt_strtab(smmu))
+ goto out;
+
if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
ret = arm_smmu_init_strtab_2lvl(smmu);
else
@@ -4564,6 +4750,7 @@ static int arm_smmu_init_strtab(struct arm_smmu_device *smmu)
if (ret)
return ret;
+out:
ida_init(&smmu->vmid_map);
return 0;
--
2.43.0
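Reviewer note: arm_smmu_kdump_phys_is_corrupted() rejects any table range
that touches the kdump kernel's own RAM. The userspace sketch below shows
the same idea as a plain interval-overlap test against a caller-supplied
list of RAM ranges. It does not reproduce region_intersects(). The struct,
the function name, and the choice to also reject empty or wrapping ranges
are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range, standing in for a RAM resource. */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical helper: return true if [base, base + size) must not be
 * trusted, i.e. it is empty, wraps around, or overlaps any RAM range.
 */
static bool sketch_is_untrusted(uint64_t base, uint64_t size,
				const struct sketch_range *ram, size_t nr)
{
	uint64_t last;
	size_t i;

	assert(ram || nr == 0);

	/* Sketch-only policy: treat empty or wrapping ranges as untrusted. */
	if (size == 0 || base + size - 1 < base)
		return true;
	last = base + size - 1;

	for (i = 0; i < nr; i++) {
		/* Two inclusive ranges overlap if each starts before the other ends. */
		if (base <= ram[i].end && last >= ram[i].start)
			return true;
	}
	return false;
}
```

A range that ends one byte below a RAM range is accepted, while a range
that extends one byte into it is rejected.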
Thread overview:
2026-04-25 21:30 [PATCH rc v3 0/5] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen
2026-04-25 21:30 ` [PATCH rc v3 1/5] iommu/arm-smmu-v3: Add arm_smmu_adopt_strtab() for kdump Nicolin Chen [this message]
2026-04-25 21:30 ` [PATCH rc v3 2/5] iommu/arm-smmu-v3: Implement is_attach_deferred() for kdump Nicolin Chen
2026-04-25 21:30 ` [PATCH rc v3 3/5] iommu/arm-smmu-v3: Retain CR0_SMMUEN during kdump device reset Nicolin Chen
2026-04-25 21:30 ` [PATCH rc v3 4/5] iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup in kdump kernel Nicolin Chen
2026-04-25 21:30 ` [PATCH rc v3 5/5] iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP_ADOPT in probe() Nicolin Chen
2026-04-29 3:55 ` [PATCH rc v3 0/5] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen