* [PATCH rc v7 1/7] iommu/arm-smmu-v3: Add arm_smmu_kdump_adopt_strtab() for kdump
2026-06-30 6:15 [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen
@ 2026-06-30 6:15 ` Nicolin Chen
2026-06-30 6:15 ` [PATCH rc v7 2/7] iommu/arm-smmu-v3: Implement is_attach_deferred() " Nicolin Chen
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-06-30 6:15 UTC
To: will, robin.murphy, jgg
Cc: joro, praan, kees, baolu.lu, kevin.tian, miko.lenczewski,
smostafa, linux-arm-kernel, iommu, linux-kernel, stable, jamien
When transitioning to a kdump kernel, the primary kernel might have crashed
while endpoint devices were actively bus-mastering DMA. Currently, the SMMU
driver aggressively resets the hardware during probe by clearing CR0_SMMUEN
and setting the Global Bypass Attribute (GBPA) to ABORT.
In a kdump scenario, this aggressive reset is highly destructive:
a) If GBPA is set to ABORT, in-flight DMA will be aborted, generating fatal
PCIe AER or SErrors that may panic the kdump kernel.
b) If GBPA is set to BYPASS, in-flight DMA targeting some IOVAs will bypass
the SMMU and corrupt the physical memory at those 1:1 mapped IOVAs.
To safely absorb in-flight DMAs, a kdump kernel will have to leave SMMUEN=1
intact and avoid modifying STRTAB_BASE, allowing HW to continue translating
in-flight DMAs reusing the crashed kernel's page tables until the endpoint
device drivers probe and quiesce their respective hardware.
However, the ARM SMMUv3 architecture specification states that updating the
SMMU_STRTAB_BASE register while SMMUEN == 1 is UNPREDICTABLE or ignored.
This leaves a kdump kernel no choice but to adopt the stream table from the
crashed kernel.
Introduce ARM_SMMU_OPT_KDUMP_ADOPT and the adoption functions, which memremap
the stream tables located via STRTAB_BASE and STRTAB_BASE_CFG.
Note that adopting the crashed kernel's stream table follows strict rules,
since the old stream table might be compromised. Thus, apply some basic
validations to the values read from the registers. If any check fails, the
stream table cannot be trusted, so discard it entirely and fall back to a
full reset. To avoid OOM due to a potentially corrupted stream table, the
l2 tables are memremapped lazily, only when the kdump kernel needs them.
The new option will be set in a following change.
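As a rough illustration of the register validation described above, the
2-level checks can be modelled in plain C outside the kernel. This is only a
sketch: sketch_validate_2lvl() and the SKETCH_* constants are hypothetical
names, and the split and maximum L1 entry values are assumptions standing in
for the driver's STRTAB_SPLIT and STRTAB_MAX_L1_ENTRIES.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; the driver has its own definitions. */
#define SKETCH_STRTAB_SPLIT	8u
#define SKETCH_MAX_L1_ENTRIES	(1u << 17)

/*
 * Hypothetical helper: returns true and stores the L1 entry count when the
 * LOG2SIZE/SPLIT pair read back from STRTAB_BASE_CFG passes the same three
 * checks as arm_smmu_kdump_adopt_strtab_2lvl() in this patch.
 */
static bool sketch_validate_2lvl(uint32_t log2size, uint32_t split,
				 uint32_t sid_bits, uint32_t *num_l1_ents)
{
	uint32_t n;

	/* LOG2SIZE must cover at least one L2 table and fit the SID width */
	if (log2size < split || log2size > sid_bits)
		return false;
	/* Only the split that this kernel itself would program is accepted */
	if (split != SKETCH_STRTAB_SPLIT)
		return false;
	/* Bound the L1 table so a corrupted register cannot inflate it */
	n = 1u << (log2size - split);
	if (n > SKETCH_MAX_L1_ENTRIES)
		return false;

	*num_l1_ents = n;
	return true;
}
```

A caller would treat a false return the same way the patch treats -EINVAL:
drop the adopted table and take the full-reset path.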
Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 244 +++++++++++++++++++-
2 files changed, 242 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index c909c9a88538b..9d86dc89d8e2e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -928,6 +928,7 @@ struct arm_smmu_device {
#define ARM_SMMU_OPT_MSIPOLL (1 << 2)
#define ARM_SMMU_OPT_CMDQ_FORCE_SYNC (1 << 3)
#define ARM_SMMU_OPT_TEGRA241_CMDQV (1 << 4)
+#define ARM_SMMU_OPT_KDUMP_ADOPT (1 << 5)
u32 options;
struct arm_smmu_cmdq cmdq;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a10affb483a4f..af97a22c11696 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1933,16 +1933,67 @@ static void arm_smmu_init_initial_stes(struct arm_smmu_ste *strtab,
}
}
+static int arm_smmu_kdump_adopt_l2_strtab(struct arm_smmu_device *smmu, u32 sid,
+ phys_addr_t base, u32 span,
+ struct arm_smmu_strtab_l2 **l2table)
+{
+ struct arm_smmu_strtab_l2 *table;
+ size_t size;
+
+ /*
+ * Retest the span in case the L1 descriptor has been overwritten since
+ * the adopt. Reject this master's insert; panic or SMMU-disable would
+ * either lose the vmcore or cascade aborts. Do not try to fix it, as it
+ * would break all other SIDs in the same bus (PCI case). The corruption
+ * blast radius is already bounded to that bus range.
+ */
+ if (span != STRTAB_SPLIT + 1) {
+ dev_err(smmu->dev,
+ "kdump: L1[%u] span %u changed since adopt (was %u)\n",
+ arm_smmu_strtab_l1_idx(sid), span, STRTAB_SPLIT + 1);
+ return -EINVAL;
+ }
+
+ size = (1UL << (span - 1)) * sizeof(struct arm_smmu_ste);
+
+ /*
+ * This L2 table is mapped lazily per master; devres frees it at unbind,
+ * as with the dmam_alloc_coherent() used for a fresh L2.
+ */
+ table = devm_memremap(smmu->dev, base, size, MEMREMAP_WB);
+ if (IS_ERR(table)) {
+ dev_err(smmu->dev,
+ "kdump: failed to adopt l2 stream table for SID %u\n",
+ sid);
+ return PTR_ERR(table);
+ }
+
+ *l2table = table;
+ return 0;
+}
+
static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
{
dma_addr_t l2ptr_dma;
struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
struct arm_smmu_strtab_l2 **l2table;
+ u32 l1_idx = arm_smmu_strtab_l1_idx(sid);
- l2table = &cfg->l2.l2ptrs[arm_smmu_strtab_l1_idx(sid)];
+ l2table = &cfg->l2.l2ptrs[l1_idx];
if (*l2table)
return 0;
+ /* Deferred adoption of the crashed kernel's L2 table */
+ if (smmu->options & ARM_SMMU_OPT_KDUMP_ADOPT) {
+ u64 l2ptr = le64_to_cpu(cfg->l2.l1tab[l1_idx].l2ptr);
+ phys_addr_t base = l2ptr & STRTAB_L1_DESC_L2PTR_MASK;
+ u32 span = FIELD_GET(STRTAB_L1_DESC_SPAN, l2ptr);
+
+ if (span && base)
+ return arm_smmu_kdump_adopt_l2_strtab(smmu, sid, base,
+ span, l2table);
+ }
+
*l2table = dmam_alloc_coherent(smmu->dev, sizeof(**l2table),
&l2ptr_dma, GFP_KERNEL);
if (!*l2table) {
@@ -1954,8 +2005,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
arm_smmu_init_initial_stes((*l2table)->stes,
ARRAY_SIZE((*l2table)->stes));
- arm_smmu_write_strtab_l1_desc(&cfg->l2.l1tab[arm_smmu_strtab_l1_idx(sid)],
- l2ptr_dma);
+ arm_smmu_write_strtab_l1_desc(&cfg->l2.l1tab[l1_idx], l2ptr_dma);
return 0;
}
@@ -4490,10 +4540,197 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
return 0;
}
+static int arm_smmu_kdump_adopt_strtab_2lvl(struct arm_smmu_device *smmu,
+ u32 cfg_reg, phys_addr_t base)
+{
+ u32 log2size = FIELD_GET(STRTAB_BASE_CFG_LOG2SIZE, cfg_reg);
+ u32 split = FIELD_GET(STRTAB_BASE_CFG_SPLIT, cfg_reg);
+ struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+ u32 num_l1_ents;
+ size_t size;
+ int i;
+
+ if (log2size < split || log2size > smmu->sid_bits) {
+ dev_err(smmu->dev, "kdump: log2size %u out of range [%u, %u]\n",
+ log2size, split, smmu->sid_bits);
+ return -EINVAL;
+ }
+ if (split != STRTAB_SPLIT) {
+ dev_err(smmu->dev,
+ "kdump: unsupported STRTAB_SPLIT %u (expected %u)\n",
+ split, STRTAB_SPLIT);
+ return -EINVAL;
+ }
+
+ num_l1_ents = 1U << (log2size - split);
+ if (num_l1_ents > STRTAB_MAX_L1_ENTRIES) {
+ dev_err(smmu->dev, "kdump: l1 entries %u exceeds max %u\n",
+ num_l1_ents, STRTAB_MAX_L1_ENTRIES);
+ return -EINVAL;
+ }
+
+ cfg->l2.num_l1_ents = num_l1_ents;
+
+ size = num_l1_ents * sizeof(struct arm_smmu_strtab_l1);
+ cfg->l2.l1tab = memremap(base, size, MEMREMAP_WB);
+ if (!cfg->l2.l1tab)
+ return -ENOMEM;
+
+ cfg->l2.l2ptrs =
+ kcalloc(num_l1_ents, sizeof(*cfg->l2.l2ptrs), GFP_KERNEL);
+ if (!cfg->l2.l2ptrs)
+ return -ENOMEM;
+
+ for (i = 0; i < num_l1_ents; i++) {
+ u64 l2ptr = le64_to_cpu(cfg->l2.l1tab[i].l2ptr);
+ phys_addr_t l2_base = l2ptr & STRTAB_L1_DESC_L2PTR_MASK;
+ u32 span = FIELD_GET(STRTAB_L1_DESC_SPAN, l2ptr);
+
+ if (!span || !l2_base)
+ continue;
+
+ if (span != STRTAB_SPLIT + 1) {
+ dev_err(smmu->dev,
+ "kdump: L1[%u] unsupported span %u (vs %u)\n",
+ i, span, STRTAB_SPLIT + 1);
+ return -EINVAL;
+ }
+
+ /*
+ * If the crashed kernel's l1 descriptors are deeply corrupted,
+ * blindly memremapping every l2 table here could lead to OOM.
+ *
+ * Defer the l2 memremap to arm_smmu_init_l2_strtab(), so peak
+ * memory is bounded by the kdump kernel's actual demand.
+ */
+ }
+
+ return 0;
+}
+
+static int arm_smmu_kdump_adopt_strtab_linear(struct arm_smmu_device *smmu,
+ u32 cfg_reg, phys_addr_t base)
+{
+ u32 log2size = FIELD_GET(STRTAB_BASE_CFG_LOG2SIZE, cfg_reg);
+ struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+ unsigned int max_log2size;
+ size_t size;
+
+ /* Cap the size at what the kdump kernel itself would have allocated */
+ if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
+ max_log2size =
+ ilog2(STRTAB_MAX_L1_ENTRIES * STRTAB_NUM_L2_STES);
+ else
+ max_log2size = smmu->sid_bits;
+
+ /* cfg->linear.num_ents is unsigned int, so cap log2size at 31 */
+ max_log2size = min(max_log2size, 31U);
+ if (log2size > max_log2size) {
+ dev_err(smmu->dev, "kdump: unsupported log2size %u (> %u)\n",
+ log2size, max_log2size);
+ return -EINVAL;
+ }
+
+ /*
+ * We might end up with a num_ents != sid_bits, which is fine. In the
+ * ARM_SMMU_OPT_KDUMP_ADOPT case, arm_smmu_write_strtab() is bypassed.
+ */
+ cfg->linear.num_ents = 1U << log2size;
+
+ size = cfg->linear.num_ents * sizeof(struct arm_smmu_ste);
+ cfg->linear.table = memremap(base, size, MEMREMAP_WB);
+ if (!cfg->linear.table)
+ return -ENOMEM;
+ return 0;
+}
+
+static void arm_smmu_kdump_adopt_cleanup(void *data)
+{
+ struct arm_smmu_device *smmu = data;
+ struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+
+ if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
+ kfree(cfg->l2.l2ptrs);
+ if (cfg->l2.l1tab)
+ memunmap(cfg->l2.l1tab);
+ } else {
+ if (cfg->linear.table)
+ memunmap(cfg->linear.table);
+ }
+}
+
+static int arm_smmu_kdump_adopt_strtab(struct arm_smmu_device *smmu)
+{
+ u32 cfg_reg = readl_relaxed(smmu->base + ARM_SMMU_STRTAB_BASE_CFG);
+ u64 base_reg = readq_relaxed(smmu->base + ARM_SMMU_STRTAB_BASE);
+ bool was_2lvl = smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB;
+ phys_addr_t base = base_reg & STRTAB_BASE_ADDR_MASK;
+ u32 fmt = FIELD_GET(STRTAB_BASE_CFG_FMT, cfg_reg);
+ int ret;
+
+ dev_dbg(smmu->dev, "kdump: adopting crashed kernel's stream table\n");
+
+ if (fmt == STRTAB_BASE_CFG_FMT_2LVL) {
+ /*
+ * Both kernels run on the same hardware, so it's impossible for
+ * kdump kernel to see the support for linear stream table only.
+ */
+ if (WARN_ON(!(smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)))
+ ret = -EINVAL;
+ else
+ ret = arm_smmu_kdump_adopt_strtab_2lvl(smmu, cfg_reg,
+ base);
+ } else if (fmt == STRTAB_BASE_CFG_FMT_LINEAR) {
+ /*
+ * The kdump kernel need not match the crashed kernel. An older
+ * crashed kernel that predates two-level stream table support
+ * may have used a linear table on 2-level-capable hardware, so
+ * enforce the same format here to match the adopted table.
+ */
+ ret = arm_smmu_kdump_adopt_strtab_linear(smmu, cfg_reg, base);
+ if (!ret)
+ smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
+ } else {
+ dev_err(smmu->dev, "kdump: invalid STRTAB format %u\n", fmt);
+ ret = -EINVAL;
+ }
+
+ if (ret) {
+ arm_smmu_kdump_adopt_cleanup(smmu);
+ goto err;
+ }
+
+ ret = devm_add_action_or_reset(smmu->dev, arm_smmu_kdump_adopt_cleanup,
+ smmu);
+ /* devm_add_action_or_reset ran the cleanup upon failure */
+ if (ret) {
+ dev_warn(smmu->dev, "kdump: failed to set up cleanup action\n");
+ /*
+ * Undo the linear adoption's clearing of FEAT_2_LVL_STRTAB so
+ * the full-reset fallback uses the hardware-supported format.
+ */
+ if (was_2lvl)
+ smmu->features |= ARM_SMMU_FEAT_2_LVL_STRTAB;
+ goto err;
+ }
+
+ return 0;
+
+err:
+ dev_warn(smmu->dev, "kdump: falling back to full reset\n");
+ memset(&smmu->strtab_cfg, 0, sizeof(smmu->strtab_cfg));
+ smmu->options &= ~ARM_SMMU_OPT_KDUMP_ADOPT;
+ return ret;
+}
+
static int arm_smmu_init_strtab(struct arm_smmu_device *smmu)
{
int ret;
+ if ((smmu->options & ARM_SMMU_OPT_KDUMP_ADOPT) &&
+ !arm_smmu_kdump_adopt_strtab(smmu))
+ goto out;
+
if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
ret = arm_smmu_init_strtab_2lvl(smmu);
else
@@ -4501,6 +4738,7 @@ static int arm_smmu_init_strtab(struct arm_smmu_device *smmu)
if (ret)
return ret;
+out:
ida_init(&smmu->vmid_map);
return 0;
--
2.43.0
* [PATCH rc v7 2/7] iommu/arm-smmu-v3: Implement is_attach_deferred() for kdump
2026-06-30 6:15 [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen
2026-06-30 6:15 ` [PATCH rc v7 1/7] iommu/arm-smmu-v3: Add arm_smmu_kdump_adopt_strtab() for kdump Nicolin Chen
@ 2026-06-30 6:15 ` Nicolin Chen
2026-06-30 6:15 ` [PATCH rc v7 3/7] iommu/arm-smmu-v3: Do not enable EVTQ/PRIQ interrupts in kdump kernel Nicolin Chen
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-06-30 6:15 UTC
To: will, robin.murphy, jgg
Cc: joro, praan, kees, baolu.lu, kevin.tian, miko.lenczewski,
smostafa, linux-arm-kernel, iommu, linux-kernel, stable, jamien
Though the kdump kernel adopts the crashed kernel's stream table, the iommu
core will still try to attach each probed device to a default domain, which
overwrites the adopted STE and breaks in-flight DMA from that device.
Implement an is_attach_deferred() callback to prevent this. For each device
that has STE.V=1 and STE.Cfg!=Abort in the adopted table, defer the default
domain attachment until the device driver explicitly requests it.
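The deferral rule above can be sketched as a small userspace predicate over
the first 64-bit word of each STE. The SKETCH_* bit positions (V at bit 0,
Cfg in bits 3:1, Abort encoded as 0) are assumptions made for illustration,
and both helper names are hypothetical rather than taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed STE word-0 layout, for illustration only */
#define SKETCH_STE_0_V		(1ull << 0)
#define SKETCH_STE_0_CFG_SHIFT	1
#define SKETCH_STE_0_CFG_MASK	(0x7ull << SKETCH_STE_0_CFG_SHIFT)
#define SKETCH_STE_0_CFG_ABORT	0ull

/* True when the STE is valid and not configured to abort */
static bool sketch_ste_may_have_dma(uint64_t ent0)
{
	uint64_t cfg = (ent0 & SKETCH_STE_0_CFG_MASK) >> SKETCH_STE_0_CFG_SHIFT;

	return (ent0 & SKETCH_STE_0_V) && cfg != SKETCH_STE_0_CFG_ABORT;
}

/* Defer the attach if any of the device's streams might still carry DMA */
static bool sketch_is_attach_deferred(const uint64_t *ent0s, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++)
		if (sketch_ste_may_have_dma(ent0s[i]))
			return true;
	return false;
}
```

An invalid STE (V=0) never causes a deferral, whatever its Cfg bits say,
because the hardware does not use it for translation.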
Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 +++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index af97a22c11696..b4702945b7324 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4198,6 +4198,29 @@ static int arm_smmu_master_prepare_ats(struct arm_smmu_master *master)
return arm_smmu_alloc_cd_tables(master);
}
+static bool arm_smmu_is_attach_deferred(struct device *dev)
+{
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_device *smmu = master->smmu;
+ int i;
+
+ if (!(smmu->options & ARM_SMMU_OPT_KDUMP_ADOPT))
+ return false;
+
+ for (i = 0; i < master->num_streams; i++) {
+ struct arm_smmu_ste *ste =
+ arm_smmu_get_step_for_sid(smmu, master->streams[i].id);
+ u64 ent0 = le64_to_cpu(ste->data[0]);
+
+ /* Defer only when there might be in-flight DMAs */
+ if ((ent0 & STRTAB_STE_0_V) &&
+ FIELD_GET(STRTAB_STE_0_CFG, ent0) != STRTAB_STE_0_CFG_ABORT)
+ return true;
+ }
+
+ return false;
+}
+
static struct iommu_device *arm_smmu_probe_device(struct device *dev)
{
int ret;
@@ -4361,6 +4384,7 @@ static const struct iommu_ops arm_smmu_ops = {
.hw_info = arm_smmu_hw_info,
.domain_alloc_sva = arm_smmu_sva_domain_alloc,
.domain_alloc_paging_flags = arm_smmu_domain_alloc_paging_flags,
+ .is_attach_deferred = arm_smmu_is_attach_deferred,
.probe_device = arm_smmu_probe_device,
.release_device = arm_smmu_release_device,
.device_group = arm_smmu_device_group,
--
2.43.0
* [PATCH rc v7 3/7] iommu/arm-smmu-v3: Do not enable EVTQ/PRIQ interrupts in kdump kernel
2026-06-30 6:15 [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen
2026-06-30 6:15 ` [PATCH rc v7 1/7] iommu/arm-smmu-v3: Add arm_smmu_kdump_adopt_strtab() for kdump Nicolin Chen
2026-06-30 6:15 ` [PATCH rc v7 2/7] iommu/arm-smmu-v3: Implement is_attach_deferred() " Nicolin Chen
@ 2026-06-30 6:15 ` Nicolin Chen
2026-06-30 6:15 ` [PATCH rc v7 4/7] iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup " Nicolin Chen
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-06-30 6:15 UTC
To: will, robin.murphy, jgg
Cc: joro, praan, kees, baolu.lu, kevin.tian, miko.lenczewski,
smostafa, linux-arm-kernel, iommu, linux-kernel, stable, jamien
In kdump cases, the crashed kernel's CDs and page tables can be corrupted,
which could trigger event spamming. Also, we cannot serve page requests.
Skip the IRQ setup for EVTQ/PRIQ in arm_smmu_setup_irqs().
Skip their IRQ handler registration in unique-IRQ and combined-IRQ cases.
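The resulting IRQ_CTRL value can be pictured with a minimal sketch. The
SKETCH_IRQ_* bit positions are assumptions for illustration, and
sketch_irqen_flags() is a hypothetical helper that mirrors the irqen_flags
logic in arm_smmu_setup_irqs(), not a function from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed IRQ_CTRL bit positions, for illustration only */
#define SKETCH_IRQ_GERROR	(1u << 0)
#define SKETCH_IRQ_PRIQ		(1u << 1)
#define SKETCH_IRQ_EVTQ		(1u << 2)

/* GERROR is always enabled; EVTQ and PRIQ only outside of kdump */
static uint32_t sketch_irqen_flags(bool kdump, bool has_pri)
{
	uint32_t flags = SKETCH_IRQ_GERROR;

	if (!kdump) {
		flags |= SKETCH_IRQ_EVTQ;
		if (has_pri)
			flags |= SKETCH_IRQ_PRIQ;
	}
	return flags;
}
```

In the kdump case only the global error interrupt remains enabled, so the
kdump kernel can still see hardware errors without being flooded by events.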
Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 58 ++++++++++++++-------
1 file changed, 39 insertions(+), 19 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b4702945b7324..2c33de5128a09 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2344,7 +2344,11 @@ static irqreturn_t arm_smmu_combined_irq_thread(int irq, void *dev)
static irqreturn_t arm_smmu_combined_irq_handler(int irq, void *dev)
{
- arm_smmu_gerror_handler(irq, dev);
+ irqreturn_t ret = arm_smmu_gerror_handler(irq, dev);
+
+ /* In kdump, EVTQ/PRIQ are disabled and there is no thread to wake */
+ if (is_kdump_kernel())
+ return ret;
return IRQ_WAKE_THREAD;
}
@@ -4887,6 +4891,21 @@ static void arm_smmu_setup_unique_irqs(struct arm_smmu_device *smmu)
arm_smmu_setup_msis(smmu);
/* Request interrupt lines */
+ irq = smmu->gerr_irq;
+ if (irq) {
+ ret = devm_request_irq(smmu->dev, irq, arm_smmu_gerror_handler,
+ 0, "arm-smmu-v3-gerror", smmu);
+ if (ret < 0)
+ dev_warn(smmu->dev, "failed to enable gerror irq\n");
+ } else {
+ dev_warn(smmu->dev,
+ "no gerr irq - errors will not be reported!\n");
+ }
+
+ /* No EVTQ/PRIQ interrupts in kdump -- queues are disabled */
+ if (is_kdump_kernel())
+ return;
+
irq = smmu->evtq.q.irq;
if (irq) {
ret = devm_request_threaded_irq(smmu->dev, irq, NULL,
@@ -4899,16 +4918,6 @@ static void arm_smmu_setup_unique_irqs(struct arm_smmu_device *smmu)
dev_warn(smmu->dev, "no evtq irq - events will not be reported!\n");
}
- irq = smmu->gerr_irq;
- if (irq) {
- ret = devm_request_irq(smmu->dev, irq, arm_smmu_gerror_handler,
- 0, "arm-smmu-v3-gerror", smmu);
- if (ret < 0)
- dev_warn(smmu->dev, "failed to enable gerror irq\n");
- } else {
- dev_warn(smmu->dev, "no gerr irq - errors will not be reported!\n");
- }
-
if (smmu->features & ARM_SMMU_FEAT_PRI) {
irq = smmu->priq.q.irq;
if (irq) {
@@ -4929,7 +4938,7 @@ static void arm_smmu_setup_unique_irqs(struct arm_smmu_device *smmu)
static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
{
int ret, irq;
- u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
+ u32 irqen_flags = IRQ_CTRL_GERROR_IRQEN;
/* Disable IRQs first */
ret = arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
@@ -4944,19 +4953,30 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
/*
* Cavium ThunderX2 implementation doesn't support unique irq
* lines. Use a single irq line for all the SMMUv3 interrupts.
+ *
+ * In kdump, EVTQ/PRIQ are disabled, so no threaded handling.
*/
- ret = devm_request_threaded_irq(smmu->dev, irq,
- arm_smmu_combined_irq_handler,
- arm_smmu_combined_irq_thread,
- IRQF_ONESHOT,
- "arm-smmu-v3-combined-irq", smmu);
+ if (is_kdump_kernel())
+ ret = devm_request_irq(smmu->dev, irq,
+ arm_smmu_combined_irq_handler, 0,
+ "arm-smmu-v3-combined-irq",
+ smmu);
+ else
+ ret = devm_request_threaded_irq(
+ smmu->dev, irq, arm_smmu_combined_irq_handler,
+ arm_smmu_combined_irq_thread, IRQF_ONESHOT,
+ "arm-smmu-v3-combined-irq", smmu);
if (ret < 0)
dev_warn(smmu->dev, "failed to enable combined irq\n");
} else
arm_smmu_setup_unique_irqs(smmu);
- if (smmu->features & ARM_SMMU_FEAT_PRI)
- irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
+ /* No EVTQ/PRIQ IRQ generation in kdump -- queues are disabled */
+ if (!is_kdump_kernel()) {
+ irqen_flags |= IRQ_CTRL_EVTQ_IRQEN;
+ if (smmu->features & ARM_SMMU_FEAT_PRI)
+ irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
+ }
/* Enable interrupt generation on the SMMU */
ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
--
2.43.0
* [PATCH rc v7 4/7] iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup in kdump kernel
2026-06-30 6:15 [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen
` (2 preceding siblings ...)
2026-06-30 6:15 ` [PATCH rc v7 3/7] iommu/arm-smmu-v3: Do not enable EVTQ/PRIQ interrupts in kdump kernel Nicolin Chen
@ 2026-06-30 6:15 ` Nicolin Chen
2026-06-30 6:15 ` [PATCH rc v7 5/7] iommu/arm-smmu-v3: Retain CR0_SMMUEN during kdump device reset Nicolin Chen
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-06-30 6:15 UTC
To: will, robin.murphy, jgg
Cc: joro, praan, kees, baolu.lu, kevin.tian, miko.lenczewski,
smostafa, linux-arm-kernel, iommu, linux-kernel, stable, jamien
In kdump cases, the crashed kernel's CDs and page tables can be corrupted,
which could trigger event spamming. Also, we cannot serve page requests.
Skip the EVTQ/PRIQ setup entirely rather than enabling then disabling them.
Also add some inline comments explaining that.
Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 43 +++++++++++++--------
1 file changed, 27 insertions(+), 16 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2c33de5128a09..abcbc9874f252 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -5083,21 +5083,35 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
arm_smmu_cmdq_issue_cmd_with_sync(
smmu, arm_smmu_make_cmd_op(CMDQ_OP_TLBI_NSNH_ALL));
- /* Event queue */
- writeq_relaxed(smmu->evtq.q.q_base, smmu->base + ARM_SMMU_EVTQ_BASE);
- writel_relaxed(smmu->evtq.q.llq.prod, smmu->page1 + ARM_SMMU_EVTQ_PROD);
- writel_relaxed(smmu->evtq.q.llq.cons, smmu->page1 + ARM_SMMU_EVTQ_CONS);
-
- enables |= CR0_EVTQEN;
- ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
- ARM_SMMU_CR0ACK);
- if (ret) {
- dev_err(smmu->dev, "failed to enable event queue\n");
- return ret;
+ /*
+ * Event queue
+ *
+ * Do not enable in a kdump case, as the crashed kernel's CDs and page
+ * tables might be corrupted, triggering event spamming.
+ */
+ if (!is_kdump_kernel()) {
+ writeq_relaxed(smmu->evtq.q.q_base,
+ smmu->base + ARM_SMMU_EVTQ_BASE);
+ writel_relaxed(smmu->evtq.q.llq.prod,
+ smmu->page1 + ARM_SMMU_EVTQ_PROD);
+ writel_relaxed(smmu->evtq.q.llq.cons,
+ smmu->page1 + ARM_SMMU_EVTQ_CONS);
+
+ enables |= CR0_EVTQEN;
+ ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
+ ARM_SMMU_CR0ACK);
+ if (ret) {
+ dev_err(smmu->dev, "failed to enable event queue\n");
+ return ret;
+ }
}
- /* PRI queue */
- if (smmu->features & ARM_SMMU_FEAT_PRI) {
+ /*
+ * PRI queue
+ *
+ * Do not enable in a kdump case, as we cannot serve page requests.
+ */
+ if (!is_kdump_kernel() && (smmu->features & ARM_SMMU_FEAT_PRI)) {
writeq_relaxed(smmu->priq.q.q_base,
smmu->base + ARM_SMMU_PRIQ_BASE);
writel_relaxed(smmu->priq.q.llq.prod,
@@ -5130,9 +5144,6 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
return ret;
}
- if (is_kdump_kernel())
- enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
-
/* Enable the SMMU interface */
enables |= CR0_SMMUEN;
ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
--
2.43.0
* [PATCH rc v7 5/7] iommu/arm-smmu-v3: Retain CR0_SMMUEN during kdump device reset
2026-06-30 6:15 [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen
` (3 preceding siblings ...)
2026-06-30 6:15 ` [PATCH rc v7 4/7] iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup " Nicolin Chen
@ 2026-06-30 6:15 ` Nicolin Chen
2026-06-30 6:15 ` [PATCH rc v7 6/7] iommu/arm-smmu-v3: Skip RMR bypass for kdump adoption Nicolin Chen
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-06-30 6:15 UTC
To: will, robin.murphy, jgg
Cc: joro, praan, kees, baolu.lu, kevin.tian, miko.lenczewski,
smostafa, linux-arm-kernel, iommu, linux-kernel, stable, jamien
When ARM_SMMU_OPT_KDUMP_ADOPT is detected, do not disable SMMUEN and skip
the CR1/CR2/STRTAB_BASE update sequence in arm_smmu_device_reset(). Those
register writes are all CONSTRAINED UNPREDICTABLE while CR0_SMMUEN==1, so
leaving them intact lets in-flight DMAs continue to be translated by the
adopted stream table.
Initialize 'enables' to 0 so that it can carry CR0_SMMUEN in the kdump case,
and preserve that value when enabling the command queue.
Clear latched gerror bits if necessary.
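The gerror acknowledgement mentioned above relies on the GERROR/GERRORN
toggle protocol: an error is outstanding while the two registers differ in
that bit, and writing the GERROR value to GERRORN acknowledges it. A minimal
userspace model, assuming an error mask of 0x1fd and using hypothetical
helper names, might look like this:

```c
#include <stdint.h>

/* Assumed mask of architected error bits, for illustration only */
#define SKETCH_GERROR_ERR_MASK	0x1fdu

/* Bits that differ between GERROR and GERRORN are unacknowledged errors */
static uint32_t sketch_gerror_active(uint32_t gerror, uint32_t gerrorn)
{
	return (gerror ^ gerrorn) & SKETCH_GERROR_ERR_MASK;
}

/*
 * Returns the value GERRORN should hold afterwards: a copy of GERROR when
 * something is outstanding, otherwise the unchanged GERRORN.
 */
static uint32_t sketch_gerror_ack(uint32_t gerror, uint32_t gerrorn)
{
	if (sketch_gerror_active(gerror, gerrorn))
		return gerror;
	return gerrorn;
}
```

After the ack, the active set is empty, which is the state the command queue
needs before it is re-enabled.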
Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 54 +++++++++++++++++++--
1 file changed, 50 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index abcbc9874f252..55ef2e7470a42 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -5025,10 +5025,27 @@ static void arm_smmu_write_strtab(struct arm_smmu_device *smmu)
static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
{
int ret;
- u32 reg, enables;
+ u32 reg, enables = 0;
- /* Clear CR0 and sync (disables SMMU and queue processing) */
reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
+
+ /*
+ * In a kdump case (set when CR0_SMMUEN=1 and !GERROR_SFM_ERR), retain
+ * CR0_SMMUEN to avoid aborting in-flight DMA, and CR0_ATSCHK to carry
+ * on the ATS-check policy.
+ *
+ * According to spec, updating STRTAB_BASE/CR1/CR2 when CR0_SMMUEN=1 is
+ * CONSTRAINED UNPREDICTABLE. So, skip those register updates and rely
+ * on the adopted stream table from the crashed kernel.
+ */
+ if (smmu->options & ARM_SMMU_OPT_KDUMP_ADOPT) {
+ dev_info(smmu->dev,
+ "kdump: retaining SMMUEN for in-flight DMA\n");
+ enables = reg & (CR0_SMMUEN | CR0_ATSCHK);
+ goto reset_queues;
+ }
+
+ /* Clear CR0 and sync (disables SMMU and queue processing) */
if (reg & CR0_SMMUEN) {
dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
@@ -5058,12 +5075,36 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
/* Stream table */
arm_smmu_write_strtab(smmu);
+reset_queues:
+ if (smmu->options & ARM_SMMU_OPT_KDUMP_ADOPT) {
+ /* Disable queues since arm_smmu_device_disable() was skipped */
+ ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
+ ARM_SMMU_CR0ACK);
+ if (ret) {
+ dev_err(smmu->dev, "failed to disable queues\n");
+ return ret;
+ }
+ }
+
+ /*
+ * GERROR bits are latched. Read them after disabling the queues so that
+ * unhandled errors are visible. Ack everything before re-enabling the CMDQ,
+ * as a stale CMDQ_ERR would halt the CMDQ and new commands would time out.
+ */
+ if (is_kdump_kernel()) {
+ u32 gerror = readl_relaxed(smmu->base + ARM_SMMU_GERROR);
+ u32 gerrorn = readl_relaxed(smmu->base + ARM_SMMU_GERRORN);
+
+ if ((gerror ^ gerrorn) & GERROR_ERR_MASK)
+ writel(gerror, smmu->base + ARM_SMMU_GERRORN);
+ }
+
/* Command queue */
writeq_relaxed(smmu->cmdq.q.q_base, smmu->base + ARM_SMMU_CMDQ_BASE);
writel_relaxed(smmu->cmdq.q.llq.prod, smmu->base + ARM_SMMU_CMDQ_PROD);
writel_relaxed(smmu->cmdq.q.llq.cons, smmu->base + ARM_SMMU_CMDQ_CONS);
- enables = CR0_CMDQEN;
+ enables |= CR0_CMDQEN;
ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
ARM_SMMU_CR0ACK);
if (ret) {
@@ -5128,7 +5169,12 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
}
}
- if (smmu->features & ARM_SMMU_FEAT_ATS) {
+ /*
+ * In a kdump adopt case, retain the crashed kernel's ATS-check policy
+ * captured above rather than forcing it on.
+ */
+ if (!(smmu->options & ARM_SMMU_OPT_KDUMP_ADOPT) &&
+ (smmu->features & ARM_SMMU_FEAT_ATS)) {
enables |= CR0_ATSCHK;
ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
ARM_SMMU_CR0ACK);
--
2.43.0
* [PATCH rc v7 6/7] iommu/arm-smmu-v3: Skip RMR bypass for kdump adoption
2026-06-30 6:15 [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen
` (4 preceding siblings ...)
2026-06-30 6:15 ` [PATCH rc v7 5/7] iommu/arm-smmu-v3: Retain CR0_SMMUEN during kdump device reset Nicolin Chen
@ 2026-06-30 6:15 ` Nicolin Chen
2026-06-30 6:15 ` [PATCH rc v7 7/7] iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP_ADOPT in probe() Nicolin Chen
2026-06-30 13:17 ` [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Mostafa Saleh
7 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-06-30 6:15 UTC
To: will, robin.murphy, jgg
Cc: joro, praan, kees, baolu.lu, kevin.tian, miko.lenczewski,
smostafa, linux-arm-kernel, iommu, linux-kernel, stable, jamien
RMR bypass STEs are installed during SMMUv3 probe for StreamIDs listed by
IORT RMR nodes. A normal boot switches the driver to a fresh stream table
whose initial STEs abort, so those RMR SIDs need bypass entries before it
becomes live. This preserves firmware/guest-owned traffic, including vSMMU
guest MSI cases built around RMR-described SIDs.
ARM_SMMU_OPT_KDUMP_ADOPT is the opposite case: the driver keeps SMMUEN set
and adopts the crashed kernel's stream table, so RMR SIDs already have the
only translation state known to be safe for active in-flight DMA. Replacing
an adopted STE with bypass can turn translated DMA into physical DMA, then
point it at the wrong memory.
arm_smmu_make_bypass_ste() also rewrites the STE in place after clearing it
first. While the table is live, a concurrent hardware STE fetch can observe
V=0 or mixed old/new state.
Leaving the adopted STE unmodified keeps the kdump kernel using the crashed
kernel's translation. That gives the endpoint driver a chance to probe and
quiesce the device.
If the old STE was already abort or invalid, installing bypass would create
new DMA permission; leaving it alone is a safer failure mode. Later domain
setup still gets the RMR direct mappings through the reserved-region path.
Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Assisted-by: Codex:gpt-5.5
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 55ef2e7470a42..822ab73161969 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -5658,6 +5658,14 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
struct list_head rmr_list;
struct iommu_resv_region *e;
+ /*
+ * Kdump adoption keeps the crashed kernel's table live. Rewriting the
+ * adopted STE here could expose an in-flight fetch to a transient V=0
+ * entry, or change Cfg=translate to Cfg=bypass. Must skip here.
+ */
+ if (smmu->options & ARM_SMMU_OPT_KDUMP_ADOPT)
+ return;
+
INIT_LIST_HEAD(&rmr_list);
iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
@@ -5674,10 +5682,7 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
continue;
}
- /*
- * STE table is not programmed to HW, see
- * arm_smmu_initial_bypass_stes()
- */
+ /* The fresh stream table is not yet live. */
arm_smmu_make_bypass_ste(smmu,
arm_smmu_get_step_for_sid(smmu, rmr->sids[i]));
}
--
2.43.0
* [PATCH rc v7 7/7] iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP_ADOPT in probe()
2026-06-30 6:15 [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen
` (5 preceding siblings ...)
2026-06-30 6:15 ` [PATCH rc v7 6/7] iommu/arm-smmu-v3: Skip RMR bypass for kdump adoption Nicolin Chen
@ 2026-06-30 6:15 ` Nicolin Chen
2026-06-30 13:17 ` [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Mostafa Saleh
7 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-06-30 6:15 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, praan, kees, baolu.lu, kevin.tian, miko.lenczewski,
smostafa, linux-arm-kernel, iommu, linux-kernel, stable, jamien
arm_smmu_device_hw_probe() runs before arm_smmu_init_structures(), so it is
the natural place to decide whether the kdump kernel must adopt the crashed
kernel's stream table.
Since memremap is used to adopt the old stream table, set this option only
on a coherent SMMU.
Also make sure that the SMMU isn't in Service Failure Mode.
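As a rough sketch of the decision order described above, the checks can be
written as a pure function over register values. This is a user-space model
for illustration: kdump_decide(), kdump_decide_examples() and enum
kdump_decision are hypothetical names, and the bit positions are assumed to
match the driver's header and should be checked against it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; verify against the driver header before reuse. */
#define CR0_SMMUEN	(UINT32_C(1) << 0)
#define GERROR_CMDQ_ERR	(UINT32_C(1) << 0)
#define GERROR_SFM_ERR	(UINT32_C(1) << 8)

/* Hypothetical result type; the patch only sets or skips an option bit. */
enum kdump_decision {
	KDUMP_RESET_SMMU_DISABLED,	/* SMMUEN=0: no in-flight DMA to keep */
	KDUMP_RESET_NON_COHERENT,	/* memremap adoption needs coherency */
	KDUMP_RESET_SFM,		/* Service Failure Mode is active */
	KDUMP_ADOPT,
};

static enum kdump_decision kdump_decide(uint32_t cr0, bool coherent,
					uint32_t gerror, uint32_t gerrorn)
{
	uint32_t active;

	if (!(cr0 & CR0_SMMUEN))
		return KDUMP_RESET_SMMU_DISABLED;
	if (!coherent)
		return KDUMP_RESET_NON_COHERENT;

	/* An error is active while its GERROR bit differs from GERRORN. */
	active = gerror ^ gerrorn;
	if (active & GERROR_SFM_ERR)
		return KDUMP_RESET_SFM;

	return KDUMP_ADOPT;
}

/* Example inputs and the decision each one is expected to produce. */
void kdump_decide_examples(void)
{
	assert(kdump_decide(0, true, 0, 0) == KDUMP_RESET_SMMU_DISABLED);
	assert(kdump_decide(CR0_SMMUEN, false, 0, 0) == KDUMP_RESET_NON_COHERENT);
	assert(kdump_decide(CR0_SMMUEN, true, GERROR_SFM_ERR, 0) == KDUMP_RESET_SFM);
	/* SFM already acknowledged: both registers carry the same bit. */
	assert(kdump_decide(CR0_SMMUEN, true, GERROR_SFM_ERR, GERROR_SFM_ERR) ==
	       KDUMP_ADOPT);
	/* Other active errors, such as CMDQ_ERR, do not block adoption here. */
	assert(kdump_decide(CR0_SMMUEN, true, GERROR_CMDQ_ERR, 0) == KDUMP_ADOPT);
}
```

The XOR matters because a GERROR bit toggles on each new error, so a set bit
alone does not mean the error is still outstanding.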
Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 31 +++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 822ab73161969..bca9395b6a1ef 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -5280,6 +5280,33 @@ static void arm_smmu_get_httu(struct arm_smmu_device *smmu, u32 reg)
hw_features, fw_features);
}
+static void arm_smmu_device_hw_probe_kdump(struct arm_smmu_device *smmu)
+{
+ u32 gerror, gerrorn, active;
+
+ /* No adoption if SMMU is disabled (i.e., there is no in-flight DMA) */
+ if (!(readl_relaxed(smmu->base + ARM_SMMU_CR0) & CR0_SMMUEN))
+ return;
+
+ /* For now, only support a coherent SMMU that works with MEMREMAP_WB */
+ if (!(smmu->features & ARM_SMMU_FEAT_COHERENCY)) {
+ dev_warn(smmu->dev,
+ "kdump: non-coherent SMMU unsupported; reset to block all DMAs\n");
+ return;
+ }
+
+ gerror = readl_relaxed(smmu->base + ARM_SMMU_GERROR);
+ gerrorn = readl_relaxed(smmu->base + ARM_SMMU_GERRORN);
+ active = gerror ^ gerrorn;
+ if (active & GERROR_SFM_ERR) {
+ dev_warn(smmu->dev,
+ "kdump: SMMU in Service Failure Mode, must reset\n");
+ return;
+ }
+
+ smmu->options |= ARM_SMMU_OPT_KDUMP_ADOPT;
+}
+
static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
{
u32 reg;
@@ -5494,6 +5521,10 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
dev_info(smmu->dev, "oas %lu-bit (features 0x%08x)\n",
smmu->oas, smmu->features);
+
+ if (is_kdump_kernel())
+ arm_smmu_device_hw_probe_kdump(smmu);
+
return 0;
}
--
2.43.0
* Re: [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel
2026-06-30 6:15 [PATCH rc v7 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen
` (6 preceding siblings ...)
2026-06-30 6:15 ` [PATCH rc v7 7/7] iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP_ADOPT in probe() Nicolin Chen
@ 2026-06-30 13:17 ` Mostafa Saleh
7 siblings, 0 replies; 9+ messages in thread
From: Mostafa Saleh @ 2026-06-30 13:17 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, jgg, joro, praan, kees, baolu.lu, kevin.tian,
miko.lenczewski, linux-arm-kernel, iommu, linux-kernel, stable,
jamien
On Mon, Jun 29, 2026 at 11:15:33PM -0700, Nicolin Chen wrote:
> When transitioning to a kdump kernel, the primary kernel might have crashed
> while endpoint devices were actively bus-mastering DMA. Currently, the SMMU
> driver aggressively resets the hardware during probe by clearing CR0_SMMUEN
> and setting the Global Bypass Attribute (GBPA) to ABORT.
>
> In a kdump scenario, this aggressive reset is highly destructive:
> a) If GBPA is set to ABORT, in-flight DMA will be aborted, generating fatal
> PCIe AER or SErrors that may panic the kdump kernel
Can you please clarify those errors and the conditions that trigger them?
For example, patch 4 disables the EVTQ to avoid a potential flood of
events; why are those not fatal as well?
> b) If GBPA is set to BYPASS, in-flight DMA targeting some IOVAs will bypass
> the SMMU and corrupt the physical memory at those 1:1 mapped IOVAs.
>
> To safely absorb in-flight DMA, the kdump kernel must leave SMMUEN=1 intact
> and avoid modifying STRTAB_BASE. This allows HW to continue translating in-
> flight DMA using the crashed kernel's page tables until the endpoint device
> drivers probe and quiesce their respective hardware.
>
> However, the ARM SMMUv3 architecture specification states that updating the
> SMMU_STRTAB_BASE register while SMMUEN == 1 is UNPREDICTABLE or ignored.
>
> This leaves a kdump kernel no choice but to adopt the stream table from the
> crashed kernel.
In many cases the patches assume that the CDs/STEs might be corrupted,
but still attempt to retrieve them with some validation
(log2size/split...).
However, the base address might also be broken, the TLB state is unknown...
IMO, although this might improve the status quo, it still relies on
heuristics and adds noticeable complexity for transitioning the
stream tables. I wonder if FW could deal with AER in that case before
booting the kdump kernel.
Thanks,
Mostafa
>
> In this series:
> - Introduce an ARM_SMMU_OPT_KDUMP_ADOPT
> - Skip SMMUEN and STRTAB_BASE resets in arm_smmu_device_reset()
> - Skip EVENTQ/PRIQ setup including interrupts and their handlers
> - Memremap the crashed kernel's stream tables into the kdump kernel [*]
> - Defer any default domain attachment to retain STEs until device drivers
> explicitly request it.
>
> [*] For verification reasons, this series only fixes coherent SMMUs.
>
> For non-ARM_SMMU_OPT_KDUMP_ADOPT cases, keep a status quo since the commit
> 3f54c447df34f ("iommu/arm-smmu-v3: Don't disable SMMU in kdump kernel"):
> full reset followed by driver-initiated reattach, potentially rejecting any
> in-flight DMA.
>
> Note that the series requires Jason's work that was merged in v6.12: commit
> 85196f54743d ("iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg").
> I have a backported version that is verified with a v6.8 kernel. I can send
> if we see a strong need after this version is accepted.
>
> This is on Github:
> https://github.com/nicolinc/iommufd/commits/smmuv3_kdump-v7
>
> Changelog
> v7
> * Rebase v7.2-rc1
> * Add Reviewed-by from Pranjal
> * Reword the linear stream table adoption comment
> * Use dev_dbg for the stream table adoption message
> * Document why the lazy L2 adoption uses devm_memremap()
> * Drop redundant FEAT_COHERENCY checks in the adopt functions
> * Use feature bit instead of STRTAB_BASE_CFG in adopt cleanup
> * Skip CR0_ATSCHK update in adopt mode to retain the crashed policy
> * Restore FEAT_2_LVL_STRTAB if the cleanup action fails to register
> v6
> https://lore.kernel.org/all/cover.1779265413.git.nicolinc@nvidia.com/
> * Rebase v7.1-rc3
> * Add Reviewed-by from Jason
> * Replace dma_addr_t with phys_addr_t
> * Drop arm_smmu_kdump_phys_is_corrupted()
> * Skip threaded IRQ handlers for EVTQ and PRIQ
> * Bypass arm_smmu_rmr_install_bypass_ste() in kdump case
> * Drop devm_ for adopt-time allocations; set up cleanup function via
> devm_add_action_or_reset()
> v5
> https://lore.kernel.org/all/cover.1778416609.git.nicolinc@nvidia.com/
> * Add Reviewed-by from Kevin
> * Drop READ_ONCE on lazy-attach L1 read
> * Split "Skip EVTQ/PRIQ setup" into two patches
> * Tighten kdump probe comment and dev_warn message
> * Use MEM + BUSY in arm_smmu_kdump_phys_is_corrupted
> v4
> https://lore.kernel.org/all/cover.1777446969.git.nicolinc@nvidia.com/
> * Rebase v7.1-rc1
> * s/arm_smmu_adopt/arm_smmu_kdump_adopt
> * Revert alloc/memremap/fmt on fallback
> * Reorder patches to avoid bisect regression
> * Use IRQ_NONE for spurious evtq/priq entries
> * Cap linear log2size by kdump's allocation bound
> * Defer clearing FEAT_2_LVL_STRTAB on linear adopt
> * Add arm_smmu_kdump_phys_is_corrupted() validation
> * Defer l2 stream table memremap till master inserts
> * Re-validate L1 desc on master insert with READ_ONCE
> v3
> https://lore.kernel.org/all/cover.1777150307.git.nicolinc@nvidia.com/
> * s/OPT_KDUMP/OPT_KDUMP_ADOPT
> * Do not adopt if GERROR_SFM_ERR
> * Retain CR0_ATSCHK beside CR0_SMMUEN
> * Clear latched GERROR bits (e.g. CMDQ_ERR)
> * Assert ARM_SMMU_FEAT_COHERENCY in adopt functions
> * Add STE.Cfg check in arm_smmu_is_attach_deferred()
> * Fix validations on return codes from devm_memremap()
> * Sanitize crashed kernel register values in adopt functions
> * Drop unnecessary l2ptrs guard in arm_smmu_is_attach_deferred()
> * Don't enable PRIQ/EVTQ irqs and guard the irq functions for combined
> irq cases
> v2
> https://lore.kernel.org/all/cover.1776286352.git.nicolinc@nvidia.com/
> * Add warning in non-coherent SMMU cases
> * Keep eventq/priq disabled vs. enabling-and-disabling-later
> * Check KDUMP option in the beginning of arm_smmu_device_reset()
> * Validate STRTAB format matches HW capability instead of forcing flags
> v1:
> https://lore.kernel.org/all/cover.1775763475.git.nicolinc@nvidia.com/
>
> Nicolin Chen (7):
> iommu/arm-smmu-v3: Add arm_smmu_kdump_adopt_strtab() for kdump
> iommu/arm-smmu-v3: Implement is_attach_deferred() for kdump
> iommu/arm-smmu-v3: Do not enable EVTQ/PRIQ interrupts in kdump kernel
> iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup in kdump kernel
> iommu/arm-smmu-v3: Retain CR0_SMMUEN during kdump device reset
> iommu/arm-smmu-v3: Skip RMR bypass for kdump adoption
> iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP_ADOPT in probe()
>
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 467 ++++++++++++++++++--
> 2 files changed, 422 insertions(+), 46 deletions(-)
>
> --
> 2.43.0
>