Linux IOMMU Development
 help / color / mirror / Atom feed
From: Pranjal Shrivastava <praan@google.com>
To: Daniel Mentz <danielmentz@google.com>
Cc: iommu@lists.linux.dev, Will Deacon <will@kernel.org>,
	Joerg Roedel <joro@8bytes.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Mostafa Saleh <smostafa@google.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	Ashish Mhetre <amhetre@nvidia.com>
Subject: Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
Date: Wed, 22 Apr 2026 13:40:53 +0000	[thread overview]
Message-ID: <aejP5Yk72_FtZnIo@google.com> (raw)
In-Reply-To: <CAE2F3rCaBWtRruBbZcedrsLnkkETSniTWoxxYn+Lkac9ttsshQ@mail.gmail.com>

On Tue, Apr 21, 2026 at 09:25:12PM -0700, Daniel Mentz wrote:
> On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> > +/* Runtime PM helpers */
> > +__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
> 
> Please be consistent about placing the __maybe_unused attribute. I
> believe that in the other functions, the placement is between the
> return type and the function name.
> 

Ack.

> > +
> > +/*
> > + * This should always return true if devlinks are setup correctly
> > + * and the client using the SMMU is active.
> > + */
> > +__maybe_unused bool arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu)
> > +{
> > +       if (!arm_smmu_is_active(smmu))
> > +               return false;
> 
> I'm wondering if this check is redundant. What would happen if we removed it?
> 

You're right that pm_runtime_get_if_active() shall suffice. However, I
added the arm_smmu_is_active() check as a lockless fast path for unmaps

Consider a scneario where there're high-frequency unmaps (unmap-storm)
while the SMMU is suspended. Without this check, every thread would 
contend for the dev->power.lock spinlock inside the RPM core 
(pm_runtime_get_if_active) just to discover the device is not active.

This check allows CPUs to elide the command immediately without
contending for the dev->power.lock spinlock inside the RPM core
(get_if_active call) and avoids unnecessary cacheline bouncing when the
SMMU is suspended.

> > +
> > +       if (pm_runtime_enabled(smmu->dev))
> > +               return pm_runtime_get_if_active(smmu->dev) > 0;
> > +
> > +       return true;
> > +}
> 
> > +static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
> > +{
> > +       struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> > +       struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> > +       struct arm_smmu_ll_queue llq, head;
> > +       int timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
> > +       u32 enables, target;
> > +       u64 old;
> > +       int ret;
> > +
> > +       /* Abort all transactions before disable to avoid spurious bypass */
> > +       arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> > +
> > +       /* Disable the SMMU via CR0.EN and all queues except CMDQ */
> > +       enables = CR0_CMDQEN;
> > +       ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0, ARM_SMMU_CR0ACK);
> > +       if (ret) {
> > +               dev_err(smmu->dev, "failed to disable SMMUEN\n");
> 
> "failed to disable SMMU". "EN" is redundant.
> 

Ack. I attempted to be verbose highlighting that the attempt was to
disable non-CMDQ stuff but I don't think that's needed. I'll update this

> > +               return ret;
> > +       }
> > +
> > +       /*
> > +        * At this point the SMMU is completely disabled and won't access
> > +        * any translation/config structures, even speculative accesses
> > +        * aren't performed as per the IHI0070 spec (section 6.3.9.6).
> > +        */
> > +
> > +       /* Mark the CMDQ to stop */
> > +       llq.val = READ_ONCE(cmdq->q.llq.val);
> > +       do {
> > +               head = llq;
> > +               head.prod |= CMDQ_PROD_STOP_FLAG;
> > +
> > +               old = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> > +               if (old == llq.val)
> > +                       break;
> > +
> > +               llq.val = old;
> > +       } while (1);
> > +
> > +       /* Wait for the last committed owner to reach the hardware */
> > +       target = head.prod & ~CMDQ_PROD_STOP_FLAG;
> > +       while (atomic_read(&cmdq->owner_prod) != target && --timeout)
> > +               udelay(1);
> 
> Should we just use the following line from arm_smmu_cmdq_issue_cmdlist()?
> 
> atomic_cond_read_relaxed(&cmdq->owner_prod, VAL == target);
> 

I did consider atomic_cond_read_relaxed() but didn't go for it as it
doesn't have a timeout, we could wait here indefinitely.. (it's a
while(1) loop)

> > +
> > +       /*
> > +        * Entering suspend implies no active clients. A race or timeout here
> > +        * indicates a broken RPM or devlink dependency. We proceed anyway to
> > +        * prioritize memory safety (avoiding stale TLBs) over bus faults.
> 
> Can you elaborate on how broken dependencies will result in a stuck
> cmdq->owner_prod?
> 

I believe you are right, this comment needs to be updated.
With the CMDQ_PROD_STOP_FLAG gating mechanism, concurrent submissions are
handled safely even if device links are broken (they either complete if
committed before the flag, or bail out if they arrive after). A timeout
here would actually indicate something like a CMDQ stall rather than just
a missing device link. I will update the comment to reflect this.

> > +        */
> > +       if (!timeout)
> > +               dev_err(smmu->dev, "cmdq owner wait timeout, (check runtime PM + devlinks)\n");
> > +
> > +       /* Drain the CMDQs */
> > +       ret = arm_smmu_drain_queues(smmu);
> > +       if (ret)
> > +               dev_warn(smmu->dev, "failed to drain queues, forcing suspend\n");
> > +
> > +       /* Wait for cmdq->lock == 0 to ensure last CMDQ_CONS_REG is written */
> > +       timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
> > +       while (atomic_read(&cmdq->lock) != 0 && --timeout)
> 
> Can we use
> 
> atomic_cond_read_relaxed(&cmdq->lock, VAL == 0);
> 

Same as above, worried about a lockup due to CMDQ Stalls.

> > +               udelay(1);
> > +
> > +       /* Timing out here implies misconfigured Runtime PM or broken devlinks */
> > +       if (!timeout)
> > +               dev_err(smmu->dev, "cmdq lock != 0, forcing suspend. Polling CPUs may fault.\n");
> > +
> > +       /* Disable everything */
> > +       arm_smmu_device_disable(smmu);
> > +
> > +       /* Handle any pending gerrors before powering down */
> > +       arm_smmu_handle_gerror(smmu);
> > +
> > +       dev_dbg(dev, "suspended smmu\n");
> > +
> > +       return 0;
> > +}
> > +
> > +static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
> > +{
> > +       int ret;
> > +       struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> > +       struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> > +       struct arm_smmu_ll_queue llq;
> > +
> > +       /* Clear the STOP FLAG to allow CMDQ Submissions */
> 
> Clearing the STOP flag here seems premature. I propose the following sequence:
> 
> * Set up command queue (SMMU_CR1, SMMU_CMDQ_BASE etc.)
> * Enable command queue
> * Clear STOP flag
> * Enable SMMU
> 
> By clearing the STOP flag, you allow arm_smmu_cmdq_issue_cmdlist() to
> write to SMMU_CMDQ_PROD which then races against
> arm_smmu_device_reset() which also writes to SMMU_CMDQ_PROD.
> 

I agree, IIUC you're suggesting to clear the stop flag before the
TLBI_ALL and CFGI_ALL commands are issued by device_reset? We'll need
to clear the flag before device_reset issues the  TLBI_ALL AND CFGI_ALL.
Thus the sequence should be:

* Set up CMDQ
* Enable CMDQ
* Clear STOP flag
* Issue TLBI_ALL + CFGI_ALL
* Enable SMMU

Right?

> > +       llq.val = READ_ONCE(cmdq->q.llq.val);
> > +       while (1) {
> > +               u64 old_val;
> > +               struct arm_smmu_ll_queue head = llq;
> > +
> > +               if (!(head.prod & CMDQ_PROD_STOP_FLAG))
> > +                       break;
> > +
> > +               head.prod &= ~CMDQ_PROD_STOP_FLAG;
> > +               old_val = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> > +               if (old_val == llq.val)
> > +                       break;
> > +               llq.val = old_val;
> > +       }
> > +
> > +       /* Re-configure MSIs */
> > +       arm_smmu_resume_msis(smmu);
> > +
> > +       ret = arm_smmu_device_reset(smmu);
> > +       if (ret)
> > +               dev_err(dev, "failed to reset during resume operation: %d\n", ret);
> > +
> > +       dev_dbg(dev, "resumed device\n");
> 
> arm_smmu_runtime_suspend() prints "resumed smmu" (not "device"). Try
> to be consistent with the language used in messages.
> 

Ack. I'll update it to dev_dbg(dev, "resumed smmu\n");

> > +
> > +       return ret;
> > +}

Thanks,
Praan

  reply	other threads:[~2026-04-22 13:41 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 01/10] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 02/10] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 03/10] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 04/10] iommu/tegra241-cmdqv: Restore PROD and CONS after resume Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 05/10] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 06/10] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions Pranjal Shrivastava
2026-04-22  4:31   ` Daniel Mentz
2026-04-22 12:18     ` Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
2026-04-21  4:47   ` Daniel Mentz
2026-04-22 12:20     ` Pranjal Shrivastava
2026-04-21 15:46   ` Daniel Mentz
2026-04-22 12:23     ` Pranjal Shrivastava
2026-04-22  4:25   ` Daniel Mentz
2026-04-22 13:40     ` Pranjal Shrivastava [this message]
2026-04-24  5:24       ` Daniel Mentz
2026-04-24 15:16         ` Jason Gunthorpe
2026-05-04  9:01           ` Pranjal Shrivastava
2026-04-24 15:13       ` Jason Gunthorpe
2026-05-04 11:15         ` Pranjal Shrivastava
2026-05-19 20:50       ` Daniel Mentz
2026-04-14 19:47 ` [PATCH v6 08/10] iommu/arm-smmu-v3: Handle gerror during suspend Pranjal Shrivastava
2026-04-14 19:47 ` [PATCH v6 09/10] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
2026-04-14 19:47 ` [PATCH v6 10/10] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
2026-05-19 18:31   ` Daniel Mentz
2026-05-25 19:53     ` Pranjal Shrivastava

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aejP5Yk72_FtZnIo@google.com \
    --to=praan@google.com \
    --cc=amhetre@nvidia.com \
    --cc=danielmentz@google.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@ziepe.ca \
    --cc=joro@8bytes.org \
    --cc=nicolinc@nvidia.com \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox