All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pranjal Shrivastava <praan@google.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: iommu@lists.linux.dev, Will Deacon <will@kernel.org>,
	Joerg Roedel <joro@8bytes.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Mostafa Saleh <smostafa@google.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	Daniel Mentz <danielmentz@google.com>
Subject: Re: [PATCH v4 8/8] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
Date: Thu, 20 Nov 2025 20:26:20 +0000	[thread overview]
Message-ID: <aR95bHwPMyW3U5hd@google.com> (raw)
In-Reply-To: <20251118001439.GF17968@ziepe.ca>

On Mon, Nov 17, 2025 at 08:14:39PM -0400, Jason Gunthorpe wrote:
> On Mon, Nov 17, 2025 at 07:14:33PM +0000, Pranjal Shrivastava wrote:
> > Invoke the pm_runtime helpers at all places before accessing the hw.
> > The idea is to invoke runtime_pm helpers at common points which are
> > used by exposed ops or interrupt handlers. Elide all TLB and CFG
> > invalidations if the smmu is suspended but not  ATC invalidations.
> > 
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
> > ---
> >  .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c     | 10 +-
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 95 +++++++++++++++----
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  3 +
> >  3 files changed, 91 insertions(+), 17 deletions(-)
> 
> I wonder if the cleanup.h macros would improve this?
> 
> > @@ -1122,6 +1124,10 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
> >  		break;
> >  	}
> >  
> > +	ret = arm_smmu_rpm_get(smmu);
> > +	if (ret < 0)
> > +		return;
> > +
> >  	arm_smmu_cmdq_issue_cmd(master->smmu, &cmd);
> >  	/*
> >  	 * Don't send a SYNC, it doesn't do anything for RESUME or PRI_RESP.
> > @@ -1129,6 +1135,7 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
> >  	 * terminated... at some point in the future. PRI_RESP is fire and
> >  	 * forget.
> >  	 */
> > +	arm_smmu_rpm_put(smmu);
> >  }
> 
> What are the rules for power managemnt? If the SMMU sleeps and GBPA is
> set to abort all then how can we have an outstanding concurrent DMA that
> needs page fault handling? 
> 
> ie if we resume the fault and the racily power down the SMMU isn't
> there a good chance the transaction will blow up anyhow?
> 
> Isn't that a bug we should detect?
> 
> Basically, shouldn't some of these be more like 'if not powered up
> then something is very wrong so log and fail' ?
> 

Yes, I think I assumed that this thread won't run while we're powered
off which was wrong. I'll add a check to confirm we're powered up. The
RPM calls are just to ensure that we don't suspend while issuing the
resume.

> > @@ -2152,13 +2171,24 @@ static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
> >  {
> >  	u32 gerror, gerrorn, active;
> >  	struct arm_smmu_device *smmu = dev;
> > +	int ret;
> > +
> > +	if (pm_runtime_enabled(smmu->dev)) {
> > +		ret = pm_runtime_get_if_active(smmu->dev);
> > +		if (ret == 0) {
> > +			dev_err(smmu->dev, "Ignoring gerror interrupt because device isn't rpm active\n");
> > +			return IRQ_NONE;
> > +		}
> > +	}
> 
> This looks troubled too, if powering down looses the gerror register,
> then this scheme is racily going to loose gerror events?
> 
> Seems like it should be checking that gerror is indeed 0 before
> powering down? Under the assumption there is no concurrent activity
> this should be race free? If it is not zero then it should be handled
> before it is lost.

Yes, I believe we can check it after disabling the CMDQ and log the
events.
> 
> >  /* IO_PGTABLE API */
> > @@ -2377,6 +2422,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
> >  	} else {
> >  		cmd.opcode	= CMDQ_OP_TLBI_S12_VMALL;
> >  		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
> > +
> >  		arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);> >  	}
> >  	arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
> > @@ -2471,6 +2517,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
> >  		cmd.opcode	= CMDQ_OP_TLBI_S2_IPA;
> >  		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
> >  	}
> > +
> >  	__arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
> >  
> >  	if (smmu_domain->nest_parent) {
> > @@ -3312,13 +3359,14 @@ static int arm_smmu_blocking_set_dev_pasid(struct iommu_domain *new_domain,
> >  		    sid_domain->type == IOMMU_DOMAIN_BLOCKED)
> >  			sid_domain->ops->attach_dev(sid_domain, dev);
> >  	}
> > +
> >  	return 0;
> 
> extra new lines??
> 

Ughh, my bad, I'll fix this, all of these are just some rebase-remnants

> > -static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
> > -				    struct device *dev,
> > -				    struct arm_smmu_ste *ste,
> > -				    unsigned int s1dss)
> > +static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
> > +				   struct device *dev,
> > +				   struct arm_smmu_ste *ste,
> > +				   unsigned int s1dss)
> >  {
> >  	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> >  	struct arm_smmu_attach_state state = {
> > @@ -3361,6 +3409,7 @@ static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
> >  	 * descriptor from arm_smmu_share_asid().
> >  	 */
> >  	arm_smmu_clear_cd(master, IOMMU_NO_PASID);
> > +	return 0;
> >  }
> 
> ??
> 
> >  
> >  static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
> > @@ -3371,8 +3420,8 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
> >  
> >  	arm_smmu_master_clear_vmaster(master);
> >  	arm_smmu_make_bypass_ste(master->smmu, &ste);
> > -	arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS);
> > -	return 0;
> > +	return arm_smmu_attach_dev_ste(domain, dev, &ste,
> > +				       STRTAB_STE_1_S1DSS_BYPASS);
> >  }
> 
> These things are not allowed to fail, please be careful not to
> propogate errors beyond where they should be. This is the biggest
> thing I don't like about this series, we already have no error
> recovery and now we are adding more cases where errors can't be
> propogated. 
> 
> For instance this call really cannot fail and propogating an error is
> nonsense.
> 
> If you really couldn't power it up then you are better to keep it
> powered down and update the in memory structures knowing that a
> powered down device cannot cache them.
> 

Ughh, sorry, these are just some rebase-remnants. I'll fix them.

> Jason
> 

Thanks!
Praan

      reply	other threads:[~2025-11-20 20:26 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-17 19:14 [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 1/8] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 2/8] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
2025-11-17 19:31   ` Nicolin Chen
2025-11-18  3:48   ` Daniel Mentz
2025-11-20 20:45     ` Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 3/8] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 4/8] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 5/8] iommu/arm-smmu-v3: Add a usage counter for cmdq Pranjal Shrivastava
2025-11-18  6:05   ` Sairaj Kodilkar
2025-11-20 20:48     ` Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 6/8] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
2025-12-24  7:39   ` Ashish Mhetre
2026-01-12  8:50     ` Pranjal Shrivastava
2026-01-13  5:06       ` Ashish Mhetre
2025-11-17 19:14 ` [PATCH v4 7/8] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 8/8] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
2025-11-18  0:14   ` Jason Gunthorpe
2025-11-20 20:26     ` Pranjal Shrivastava [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aR95bHwPMyW3U5hd@google.com \
    --to=praan@google.com \
    --cc=danielmentz@google.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@ziepe.ca \
    --cc=joro@8bytes.org \
    --cc=nicolinc@nvidia.com \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.