From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pl1-f177.google.com (mail-pl1-f177.google.com [209.85.214.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 869832FFDD6 for ; Fri, 3 Apr 2026 02:20:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.177 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775182816; cv=none; b=baIlxq1S73p59AOj2tLTHj+j76HAShT6za5yD8Y3SmdFKEqCsNw89r0tHIpgMGMoRJTURxFraSUjD3ILwqHdIVr9Ed52JJ/YSxjYoma234jP+o0tDvtNpBsr6+LSvFKtCViqSNO4r5oGrviBhRXkgr2OAFVjNaTDKBpEHHecgE8= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775182816; c=relaxed/simple; bh=bda+2S/CRRI24v+0YtjXyLzM6YHcwsrVhQlTDGYXBi4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=LQpMd1kel8JH32DBhzbQElX/W2TJ9H2bXZ84u37211rgq9G7bd/6mTJzO/7wVMNEfj7V8z7gO/DM+rwZTntXq9CXBBqEgqlNivrlnqjwv+KqpVbhGjgMEi3PYfMX0d7BZ47VeafkUOIhLjwCB9p7ob0XSbpwUo6QMdZcrlC8BU8= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=W958vsGF; arc=none smtp.client-ip=209.85.214.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="W958vsGF" Received: by mail-pl1-f177.google.com with SMTP id d9443c01a7336-2b243198058so23805ad.1 for ; Thu, 02 Apr 2026 19:20:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775182814; x=1775787614; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=YLyfrpphk/pri+vsSR8ZKPbndfkUJotLPgfVmPM7TEQ=; b=W958vsGFUdCI2tabe1gORw7jMM058RIXSXTAAdq6uYZaIE8HAm2QJ97DGE1EkYvjm/ ixLe2j+fpACgmlZ54RZUy9IoNRvVGAc+15IrLDRRUpKIgMAav4MqmTRTbPNrQMTqZkyh o9YOrS65TwsUT4F4lP/dMe+H2lK3D+4rwOcfREshb+LQF8OlTnDoHoi1n1Z/1H0s5xpQ fdhqNRcRCip21GgY3oQm56BAx9vtrkj9QNCBSa/8Ois6Y9FmE2jdBvqWVS9HRxNz33J2 Y4tIRm22CtH9YU/+usv3IgE3s1/EhpeTawxAE/jtkyV3/xauS39Mnyu4gNw0m5DU9ghp aeFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775182814; x=1775787614; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-gg:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YLyfrpphk/pri+vsSR8ZKPbndfkUJotLPgfVmPM7TEQ=; b=hWcPxZoKnUY07RGCuCh4Is7IXsO3osBHJ50PJiGJEi8fxgqCrFAucJtDeTbY7+yrT0 OLablvaVHM5qH+MGdulzz4nvVf8h2SKsh1NmNggsNi0uNhV/VLxdkoYZR8RTKg/uC5CF CoDUIiHToX/dd37RLddK54AFB6fz7X4v5C4vfKOzpnp6L5rqb4wBoIVPa8FCeJOaXBZG Y17XwCd9YejnsP8weat+T6Igb5FwT+DZpdgccU29dOOPeeQSi405D+cyvkcLBY5vxrFM +BT6gVQ6A6anUz/pFZbUWCut0+sUzHC9fBx9hFcemWXK/RY198ufNo3VRuk5Ujv8Din8 9BBA== X-Forwarded-Encrypted: i=1; AJvYcCWKF0PBnlUl034HhDo0WlmB6f9qDMI3B6eBzFTfU+hIWLtLTNIaS8MlfXzGXBhTD3tzhneGcmmgc8rlqBA=@vger.kernel.org X-Gm-Message-State: AOJu0Yxwc2+UAVav7vbtT+KfMIqr1Z6htL+mtwKvFgLnOhnOYRYsFYRY yLZBTsh5UMX6boSPExq5B0n0dnZG+pWZZrbY7/WWRws9hMnW1PJ8tKC45AebwhW1Jg== X-Gm-Gg: AeBDietc+RTPZUZSWFIJXjfq0f7sbulMMQdLw/fcNdbIwLQXM+dNnRFp88thCYIVTd5 WT043JjSckuN0F+go8Xh20eNGWYrzIzSPux6GHca9ya3h83KYJHFotn/QbomC7EdqXzyX/yQfrZ Hiqpex8MsFAKf2t2bbLGhWsfgQh3C1EKVv1L4WnhnC9/MUMzWgnS6iGA7b0NfiGL5BV7RwSF3kS Trcq8h2s4UsqiRdUIeB1qo50YzvUbAWsavfAXMN8nUeN2kc+ZuAKzBTXCTsf6WVOf2vtfiNDq66 WMzPWbFi3fxluNXZ4tb8vfPJoLRHeVjyhL/6wXyDwDEmLfiI28mYbNPzxhd0sGQpCGkfOyguP3r ddDO5VrLEsnpTsJUKDQiapDYITJRQeAgIPCvAxad3fqMdBtHu6ofpa4OwgpgynnNwg6QrNRassv 5r2bZLFn0eixkRm97IrkRc4TlH9C8D/5NbDWglfSG7Z7mrQnBjdXKhSRYSDA== X-Received: by 2002:a17:903:2f0d:b0:2ae:4808:bd99 with SMTP id d9443c01a7336-2b282ed3780mr1613755ad.2.1775182813067; Thu, 02 Apr 2026 19:20:13 -0700 (PDT) Received: from google.com (10.129.124.34.bc.googleusercontent.com. [34.124.129.10]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-82cf9c71e62sm4165949b3a.44.2026.04.02.19.20.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 02 Apr 2026 19:20:12 -0700 (PDT) Date: Fri, 3 Apr 2026 02:20:06 +0000 From: Pranjal Shrivastava To: Prakash Gupta Cc: Will Deacon , Robin Murphy , Joerg Roedel , Rob Clark , Connor Abbott , linux-arm-msm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Akhil P Oommen , Pratyush Brahma Subject: Re: [PATCH v2] iommu/arm-smmu: Use pm_runtime in fault handlers Message-ID: References: <20260313-smmu-rpm-v2-1-8c2236b402b0@oss.qualcomm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20260313-smmu-rpm-v2-1-8c2236b402b0@oss.qualcomm.com> On Fri, Mar 13, 2026 at 03:53:53PM +0530, Prakash Gupta wrote: > Commit d4a44f0750bb ("iommu/arm-smmu: Invoke pm_runtime across the driver") > enabled pm_runtime for the arm-smmu device. On systems where the SMMU > sits in a power domain, all register accesses must be done while the > device is runtime active to avoid unclocked register reads and > potential NoC errors. > > So far, this has not been an issue for most SMMU clients because > stall-on-fault is enabled by default. While a translation fault is > being handled, the SMMU stalls further translations for that context > bank, so the fault handler would not race with a powered-down > SMMU. > > Adreno SMMU now disables stall-on-fault in the presence of fault > storms to avoid saturating SMMU resources and hanging the GMU. With > stall-on-fault disabled, the SMMU can generate faults while its power > domain may no longer be enabled, which makes unclocked accesses to > fault-status registers in the SMMU fault handlers possible. > > Guard the context and global fault handlers with pm_runtime_get_if_active() > and pm_runtime_put_autosuspend() so that all SMMU fault register accesses > are done with the SMMU powered. In case pm_runtime is not active we can > safely ignore the fault as for pm runtime resume the smmu device is > reset and fault registers are cleared. > > Fixes: b13044092c1e ("drm/msm: Temporarily disable stall-on-fault after a page fault") > Co-developed-by: Pratyush Brahma > Signed-off-by: Pratyush Brahma > Signed-off-by: Prakash Gupta > --- > Changes in v2: > - Switched from arm_smmu_rpm_get()/arm_smmu_rpm_put() wrappers to > pm_runtime_get_if_active()/pm_runtime_put_autosuspend() APIs > - Added support for smmu->impl->global_fault callback in global fault handler > - Remove threaded irq context fault restriction to allow modifying stall > mode for adreno smmu > - Link to v1: https://patch.msgid.link/20260127-smmu-rpm-v1-1-2ef2f4c85305@oss.qualcomm.com > --- > drivers/iommu/arm/arm-smmu/arm-smmu.c | 60 +++++++++++++++++++++++------------ > 1 file changed, 39 insertions(+), 21 deletions(-) > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c > index 5e690cf85ec9..f4c46491a03d 100644 > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c > @@ -462,10 +462,20 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev) > int idx = smmu_domain->cfg.cbndx; > int ret; > > + if (!pm_runtime_get_if_active(smmu->dev)) Note that the pm_runtime_get_if_active(dev) only returns a positive value if the device state is exactly RPM_ACTIVE. If the device is in the middle of its runtime_suspend callback, its state is RPM_SUSPENDING. Thus, if a fault races with the suspend callback, we'll return IRQ_NONE and the suspend callback doesn't seem to be disabling interrupts. This isn't any better if we're in a fault-storm caused by level-triggered interrupts, we'd simply keep entering this handler and return. I believe we should clear / handle any pending faults/interrupts or atleast mask interrupt in the suspend handler to avoid this situation. > + return IRQ_NONE; > + Maybe we could add another wrapped-helper to maintain consistency: static inline int arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu) { if (!pm_runtime_enabled(smmu->dev)) return 1; // Assume active/powered if RPM is not used return pm_runtime_get_if_active(smmu->dev); } This returns -EINVAL otherwise which isn't a problem for the if condition but slightly cleaner. > + if (smmu->impl && smmu->impl->context_fault) { > + ret = smmu->impl->context_fault(irq, dev); > + goto out_power_off; > + } > + We've moved impl-specific handlers here, I don't see a functional change. This looks fine. > arm_smmu_read_context_fault_info(smmu, idx, &cfi); > > - if (!(cfi.fsr & ARM_SMMU_CB_FSR_FAULT)) > - return IRQ_NONE; > + if (!(cfi.fsr & ARM_SMMU_CB_FSR_FAULT)) { > + ret = IRQ_NONE; > + goto out_power_off; > + } > > ret = report_iommu_fault(&smmu_domain->domain, NULL, cfi.iova, > cfi.fsynr & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ); > @@ -480,7 +490,12 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev) > ret == -EAGAIN ? 0 : ARM_SMMU_RESUME_TERMINATE); > } > > - return IRQ_HANDLED; > + ret = IRQ_HANDLED; > + > +out_power_off: > + pm_runtime_put_autosuspend(smmu->dev); Nit: Please use arm_smmu_rpm_put() here.. while at it, I guess we can also bring back pm_runtime_put_autosuspend() in arm_smmu_rpm_put() since it is updated now to also mark last busy. > + > + return ret; > } > > static irqreturn_t arm_smmu_global_fault(int irq, void *dev) > @@ -489,14 +504,25 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev) > struct arm_smmu_device *smmu = dev; > static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL, > DEFAULT_RATELIMIT_BURST); > + int ret; > + > + if (!pm_runtime_get_if_active(smmu->dev)) > + return IRQ_NONE; > + Same here. > + if (smmu->impl && smmu->impl->global_fault) { > + ret = smmu->impl->global_fault(irq, dev); > + goto out_power_off; > + } > > gfsr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sGFSR); > gfsynr0 = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sGFSYNR0); > gfsynr1 = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sGFSYNR1); > gfsynr2 = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sGFSYNR2); > > - if (!gfsr) > - return IRQ_NONE; > + if (!gfsr) { > + ret = IRQ_NONE; > + goto out_power_off; > + } > > if (__ratelimit(&rs)) { > if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) && > @@ -513,7 +539,11 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev) > } > > arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sGFSR, gfsr); > - return IRQ_HANDLED; > + ret = IRQ_HANDLED; > + > +out_power_off: > + pm_runtime_put_autosuspend(smmu->dev); > + return ret; > } > > static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain, > @@ -683,7 +713,6 @@ static int arm_smmu_init_domain_context(struct arm_smmu_domain *smmu_domain, > enum io_pgtable_fmt fmt; > struct iommu_domain *domain = &smmu_domain->domain; > struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > - irqreturn_t (*context_fault)(int irq, void *dev); > > mutex_lock(&smmu_domain->init_mutex); > if (smmu_domain->smmu) > @@ -850,19 +879,14 @@ static int arm_smmu_init_domain_context(struct arm_smmu_domain *smmu_domain, > */ > irq = smmu->irqs[cfg->irptndx]; > > - if (smmu->impl && smmu->impl->context_fault) > - context_fault = smmu->impl->context_fault; > - else > - context_fault = arm_smmu_context_fault; > - > if (smmu->impl && smmu->impl->context_fault_needs_threaded_irq) > ret = devm_request_threaded_irq(smmu->dev, irq, NULL, > - context_fault, > + arm_smmu_context_fault, > IRQF_ONESHOT | IRQF_SHARED, > "arm-smmu-context-fault", > smmu_domain); > else > - ret = devm_request_irq(smmu->dev, irq, context_fault, IRQF_SHARED, > + ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault, IRQF_SHARED, > "arm-smmu-context-fault", smmu_domain); > > if (ret < 0) { > @@ -2125,7 +2149,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev) > struct device *dev = &pdev->dev; > int num_irqs, i, err; > u32 global_irqs, pmu_irqs; > - irqreturn_t (*global_fault)(int irq, void *dev); > > smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL); > if (!smmu) { > @@ -2205,18 +2228,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev) > smmu->num_context_irqs = smmu->num_context_banks; > } > > - if (smmu->impl && smmu->impl->global_fault) > - global_fault = smmu->impl->global_fault; > - else > - global_fault = arm_smmu_global_fault; > - > for (i = 0; i < global_irqs; i++) { > int irq = platform_get_irq(pdev, i); > > if (irq < 0) > return irq; > > - err = devm_request_irq(dev, irq, global_fault, IRQF_SHARED, > + err = devm_request_irq(dev, irq, arm_smmu_global_fault, IRQF_SHARED, > "arm-smmu global fault", smmu); > if (err) > return dev_err_probe(dev, err, > Thanks, Praan