Linux-ARM-Kernel Archive on lore.kernel.org
From: Pranjal Shrivastava <praan@google.com>
To: Nicolin Chen <nicolinc@nvidia.com>
Cc: iommu@lists.linux.dev, Will Deacon <will@kernel.org>,
	Joerg Roedel <joro@8bytes.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Mostafa Saleh <smostafa@google.com>,
	Daniel Mentz <danielmentz@google.com>,
	Ashish Mhetre <amhetre@nvidia.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v7 10/11] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
Date: Fri, 29 May 2026 14:48:32 +0000
Message-ID: <ahmnQN5xatcaF5hT@google.com>
In-Reply-To: <ahjNYqe8hjPUQIQD@Asurada-Nvidia>

On Thu, May 28, 2026 at 04:18:58PM -0700, Nicolin Chen wrote:
> On Thu, May 28, 2026 at 10:25:11PM +0000, Pranjal Shrivastava wrote:
> > On Thu, May 28, 2026 at 03:01:13PM -0700, Nicolin Chen wrote:
> > > On Thu, May 28, 2026 at 09:46:33PM +0000, Pranjal Shrivastava wrote:
> > > > On Thu, May 28, 2026 at 01:28:15PM -0700, Nicolin Chen wrote:
> > > > > On Wed, May 27, 2026 at 10:14:06PM +0000, Pranjal Shrivastava wrote:
> > > > > > TLB and CFG invalidations are
> > > > > > elided if the SMMU is suspended by observing the CMDQ_PROD_STOP_FLAG via
> > > > > > the arm_smmu_can_elide() helper.
> > > > > 
> > > > > All the arm_smmu_can_elide() call sites here would eventually elide
> > > > > the commands in arm_smmu_cmdq_issue_cmdlist() that is already gated
> > > > > by CMDQ_PROD_STOP_FLAG? It doesn't seem necessary to gate again?
> > > > 
> > > > While issue_cmdlist() would eventually elide these commands, the 
> > > > can_elide() check is necessary to return early during suspension. 
> > > > 
> > > > This avoids unnecessary stack allocation, cmd building, and spinlock
> > > > contention on the cmdq->lock for threads that are anyway about to be 
> > > > elided. 
> > > 
> > > We aren't in the perf sensitive path.. most of those aren't going
> > > to be that bad.
> > > 
> > > arm_smmu_cmdq_shared_lock() on the other hand is taken at step 2,
> > > and the STOP flag in the same function is gated at step 1?
> > 
> > DMA unmaps frequently occur from atomic contexts, interrupt handlers, etc.
> > The Step 1 check in issue_cmdlist() happens under local_irq_save().
> > We may argue that it doesn't happen for long though..
> 
> It shouldn't IMHO. At least most of the call sites in this patch
> are right before calling issue() functions, so they are merely a
> few cycles away from the STOP gate in issue_cmdlist()?
>

I agree that eliding right before calling issue_cmdlist() might seem
like an over-optimization. I guess we had this earlier because we didn't
have elision in the CMDQ. I'll think more about it (just in case we're
missing some scenario) and try to profile it to confirm there's no big
difference.

Otherwise, I guess I'll drop the "early-exit" in v8..
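
To make the trade-off concrete, here is a rough userspace model of the two
gates. It is not the driver code: MODEL_PROD_STOP_FLAG, struct model_cmdq and
the model_*() functions are hypothetical stand-ins, and the bit position of
the flag is an assumption. It only shows that the early exit saves the
command-building work, while the gate in the issue path drops the submission
either way.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CMDQ_PROD_STOP_FLAG; the bit position is assumed. */
#define MODEL_PROD_STOP_FLAG (1u << 30)

struct model_cmdq {
	_Atomic uint32_t prod;   /* producer index with the stop flag folded in */
	unsigned int built;      /* batches built by callers (stack + cmd building) */
	unsigned int submitted;  /* batches that got past the gate */
};

/* Models the early-exit helper: a cheap read of the stop flag. */
bool model_can_elide(struct model_cmdq *q)
{
	return atomic_load_explicit(&q->prod, memory_order_relaxed) &
	       MODEL_PROD_STOP_FLAG;
}

/* Models the issue path: the stop flag is checked before reserving space. */
bool model_issue_cmdlist(struct model_cmdq *q, uint32_t n)
{
	if (atomic_load(&q->prod) & MODEL_PROD_STOP_FLAG)
		return false;            /* elided at the gate */

	atomic_fetch_add(&q->prod, n);   /* reserve queue space */
	q->submitted++;
	return true;
}

/* Models an unmap-style caller, with or without the early exit. */
bool model_unmap(struct model_cmdq *q, bool early_exit)
{
	if (early_exit && model_can_elide(q))
		return false;            /* skip building the command at all */

	q->built++;                      /* models building the command */
	return model_issue_cmdlist(q, 1);
}
```

In this model the only difference the early exit makes is whether `built` is
incremented while the queue is stopped; whether that is measurable on real
hardware is what profiling would need to show.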

> The only place that might be slightly longer is the inv_range(),
> if the domain->invs is really long (e.g. nesting parent for VM),
> in which case, it might be plausible to add a gate. And even with
> that being said, it should be added to the top of the iteration (on
> invs->has_ats) rather than before submit()?
> 

I agree.. but I'm wondering: if we plan to remove the early exits, does it
make sense to keep this one? Ideally, we shouldn't be dealing with a
long domain->invs if we are in VMs (IOMMUFD & VFIO both get a pm_ref).
So, I guess if we're dropping the early exits everywhere, dropping this
one too would be fine.
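
For reference, the placement question can be sketched with a small userspace
model. Everything here is hypothetical (struct model_invs, model_inv_range()
and the `stopped` parameter are illustrative names, and the real iteration
does more than count entries); it only contrasts a gate at the top of the walk
with relying on the gate in the issue path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one invalidation entry in a domain's list. */
struct model_inv {
	int id;
};

struct model_invs {
	const struct model_inv *inv;
	size_t num;
};

/*
 * Walks the list and returns how many entries were visited.
 * *submitted is set to the number of commands that reached the queue.
 * gate_at_top models an early exit before the iteration starts;
 * without it, only the gate in the issue path drops the commands.
 */
size_t model_inv_range(const struct model_invs *invs, bool stopped,
		       bool gate_at_top, size_t *submitted)
{
	size_t walked = 0;

	*submitted = 0;
	if (gate_at_top && stopped)
		return 0;               /* nothing walked, nothing built */

	for (size_t i = 0; i < invs->num; i++)
		walked++;               /* models building a command per entry */

	if (!stopped)
		*submitted = walked;    /* models the gate in the issue path */

	return walked;
}
```

With a gate placed only before the submit, the whole list is still walked
while stopped; a gate at the top skips the walk, which only matters if the
list can actually be long in that state.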

> > > > By dropping these requests immediately, we significantly reduce cacheline
> > > > bouncing and contention during unmap storms.
> > > 
> > > How significantly, so as to justify invading every command issue()
> > > call site, which would be difficult to maintain? If we really need
> > > an early return, it would be nicer to have a common place at least.
> > 
> > Eliding early is more of an early-exit from the DMA unmap paths really..
> > If maintaining these high-level elision checks at 4 or 5 call sites is
> > a maintenance burden, maybe we could move the logic into the issue_cmd
> > macros? 
> 
> What kind of macro? Again, if it is added to just a few cycles
> right before issue_cmdlist(), it still wouldn't seem necessary.

I was referring to the arm_smmu_cmdq_issue_cmd* macros here.
But I suppose you're right..

Thanks,
Praan


