The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Nicolin Chen <nicolinc@nvidia.com>
To: Pranjal Shrivastava <praan@google.com>
Cc: <will@kernel.org>, <robin.murphy@arm.com>, <jgg@nvidia.com>,
	<joro@8bytes.org>, <kees@kernel.org>, <baolu.lu@linux.intel.com>,
	<kevin.tian@intel.com>, <miko.lenczewski@arm.com>,
	<smostafa@google.com>, <linux-arm-kernel@lists.infradead.org>,
	<iommu@lists.linux.dev>, <linux-kernel@vger.kernel.org>,
	<stable@vger.kernel.org>, <jamien@nvidia.com>
Subject: Re: [PATCH rc v6 1/7] iommu/arm-smmu-v3: Add arm_smmu_kdump_adopt_strtab() for kdump
Date: Sun, 28 Jun 2026 21:01:45 -0700	[thread overview]
Message-ID: <akHuKRctyEv88ytr@nvidia.com> (raw)
In-Reply-To: <akGnm_dzvoapjwoI@google.com>

On Sun, Jun 28, 2026 at 11:00:43PM +0000, Pranjal Shrivastava wrote:
> On Wed, May 20, 2026 at 10:03:18AM -0700, Nicolin Chen wrote:
> > +static int arm_smmu_kdump_adopt_l2_strtab(struct arm_smmu_device *smmu, u32 sid,
> > +					  phys_addr_t base, u32 span,
> > +					  struct arm_smmu_strtab_l2 **l2table)
[...]
> > +
> > +	/*
> > +	 * Retest the span in case the L1 descriptor has been overwritten since
> > +	 * the adopt. Reject this master's insert; panic or SMMU-disable would
> > +	 * either lose the vmcore or cascade aborts. Do not try to fix it, as it
> > +	 * would break all other SIDs in the same bus (PCI case). The corruption
> > +	 * blast radius is already bounded to that bus range.
> > +	 */
> > +	if (span != STRTAB_SPLIT + 1) {
> > +		dev_err(smmu->dev,
> > +			"kdump: L1[%u] span %u changed since adopt (was %u)\n",
> > +			arm_smmu_strtab_l1_idx(sid), span, STRTAB_SPLIT + 1);
> > +		return -EINVAL;
> > +	}
> > +
> > +	size = (1UL << (span - 1)) * sizeof(struct arm_smmu_ste);
> > +
> > +	table = devm_memremap(smmu->dev, base, size, MEMREMAP_WB);
> 
> Why do we use devm_memremap() here but memremap() in the rest of the
> functions (even for the L1 ptr)?

Because it's called by arm_smmu_init_l2_strtab() that is a later
stage (arm_smmu_insert_master) than the other adopt functions.

This is deliberate. Otherwise, no one would undo memremap.

I can add a line of note here.

> > +static int arm_smmu_kdump_adopt_strtab_2lvl(struct arm_smmu_device *smmu,
> > +					    u32 cfg_reg, phys_addr_t base)
[...]
> > +
> > +	cfg->l2.l2ptrs =
> > +		kcalloc(num_l1_ents, sizeof(*cfg->l2.l2ptrs), GFP_KERNEL);
> > +	if (!cfg->l2.l2ptrs)
> > +		return -ENOMEM;
> 
> We shuold ummap cfg->l2.l1tab before returning -ENOMEM here

No. The caller would call cleanup() on the -ENOMEM.

> > +		if (span != STRTAB_SPLIT + 1) {
> > +			dev_err(smmu->dev,
> > +				"kdump: L1[%u] unsupported span %u (vs %u)\n",
> > +				i, span, STRTAB_SPLIT + 1);
> > +			return -EINVAL;
> 
> We leak kcalloc'd mem (l2.l2ptrs) here, also we should unmap cfg->l2.l1tab

Ditto.

> > +static int arm_smmu_kdump_adopt_strtab_linear(struct arm_smmu_device *smmu,
> > +					      u32 cfg_reg, phys_addr_t base)
> > +{
> > +	u32 log2size = FIELD_GET(STRTAB_BASE_CFG_LOG2SIZE, cfg_reg);
> > +	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> > +	unsigned int max_log2size;
> > +	size_t size;
> > +
> > +	/*
> > +	 * Only a coherent SMMU is supported at this moment. For a non-coherent
> > +	 * SMMU that wants to support ARM_SMMU_OPT_KDUMP_ADOPT, try MEMREMAP_WC.
> > +	 */
> > +	if (WARN_ON(!(smmu->features & ARM_SMMU_FEAT_COHERENCY)))
> > +		return -EOPNOTSUPP;
> > +
> > +	/* Cap the size at what the kdump kernel itself would have allocated */
> > +	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
> > +		max_log2size =
> > +			ilog2(STRTAB_MAX_L1_ENTRIES * STRTAB_NUM_L2_STES);
> 
> Looks like we'd never hit this if condition because we'd never support a
> "linear" strtab if the HW supports ARM_SMMU_FEAT_2_LVL_STRTAB. Please
> see arm_smmu_init_strtab:

It's a defensive check that Sashiko recommended, as we cannot be sure
that the crashed kernel was using the same driver, which I think it's
not bad to have..

> > +	size = cfg->linear.num_ents * sizeof(struct arm_smmu_ste);
> > +	cfg->linear.table = memremap(base, size, MEMREMAP_WB);
> > +	if (!cfg->linear.table)
> > +		return -ENOMEM;
> 
> We seem to skips initializing cfg->linear.ste_dma (it is populated in
> arm_smmu_init_strtab_linear)

Yes, deliberately since it's not used anywhere in kdump adopt mode.

> While the comment notes that arm_smmu_write_strtab() is bypassed in the
> kdump case, leaving cfg->linear.ste_dma uninitialized seems like a 
> ticking time bomb if any other part of the driver ever uses it.

I can't think of a reason...

> > +static void arm_smmu_kdump_adopt_cleanup(void *data)
> > +{
> > +	struct arm_smmu_device *smmu = data;
> > +	u32 cfg_reg = readl_relaxed(smmu->base + ARM_SMMU_STRTAB_BASE_CFG);
> 
> I'm worried about reading the HW register here, since this is a devres action, it
> can run after arm_smmu_device_remove() or arm_smmu_device_shutdown()
> (which would call rpm_put()). Please see __device_release_driver[1]:
> 
> 	pm_runtime_put_sync(dev); <--- HW turned off
> 
> 	device_remove(dev);
> 
> 	if (dev->bus && dev->bus->dma_cleanup)
> 		dev->bus->dma_cleanup(dev);
> 
> 	device_unbind_cleanup(dev); <--- This is where devm_release runs
> 	device_links_driver_cleanup(dev);
> 
> Thus, even if we call rpm_get() here it would make no sense as the 
> register contents would've been lost. Can we rely on some SW state
> here? smmu->features & 2LVL or maybe add an fmt in cfg? 

Yes. Sashiko pointed it out too. We can do feature bit.

> > +		 * format, enforce the same format to match the adopted table.
> > +		 */
> > +		ret = arm_smmu_kdump_adopt_strtab_linear(smmu, cfg_reg, base);
> > +		if (!ret)
> > +			smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
> 
> IIRC, this should NOT be set if we selected the linear format.
> Looking at arm_smmu_init_strtab(), if the HW supports 2-level tables,
> the driver unconditionally selects it. What is the expected scenario
> where the previous kernel would have allocated a linear table on 2-level
> capable hardware? IMO, it is a bug if we see linear fmt with this
> feature set. Am I missing something?

Again, the crashed kernel doesn't have to use the same driver.

I will address the rest of your comments.

Thanks
Nicolin

  reply	other threads:[~2026-06-29  4:02 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-20 17:03 [PATCH rc v6 0/7] iommu/arm-smmu-v3: Fix device crash on kdump kernel Nicolin Chen
2026-05-20 17:03 ` [PATCH rc v6 1/7] iommu/arm-smmu-v3: Add arm_smmu_kdump_adopt_strtab() for kdump Nicolin Chen
2026-06-28 23:00   ` Pranjal Shrivastava
2026-06-29  4:01     ` Nicolin Chen [this message]
2026-05-20 17:03 ` [PATCH rc v6 2/7] iommu/arm-smmu-v3: Implement is_attach_deferred() " Nicolin Chen
2026-06-28 23:06   ` Pranjal Shrivastava
2026-05-20 17:03 ` [PATCH rc v6 3/7] iommu/arm-smmu-v3: Do not enable EVTQ/PRIQ interrupts in kdump kernel Nicolin Chen
2026-05-20 17:03 ` [PATCH rc v6 4/7] iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup " Nicolin Chen
2026-05-20 17:03 ` [PATCH rc v6 5/7] iommu/arm-smmu-v3: Retain CR0_SMMUEN during kdump device reset Nicolin Chen
2026-05-20 17:03 ` [PATCH rc v6 6/7] iommu/arm-smmu-v3: Skip RMR bypass for kdump adoption Nicolin Chen
2026-05-20 17:03 ` [PATCH rc v6 7/7] iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP_ADOPT in probe() Nicolin Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=akHuKRctyEv88ytr@nvidia.com \
    --to=nicolinc@nvidia.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=iommu@lists.linux.dev \
    --cc=jamien@nvidia.com \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kees@kernel.org \
    --cc=kevin.tian@intel.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miko.lenczewski@arm.com \
    --cc=praan@google.com \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=stable@vger.kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox