* [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops
@ 2026-04-14 19:46 Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 01/10] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
` (9 more replies)
0 siblings, 10 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:46 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava
As arm-smmu-v3 rapidly finds its way into SoCs designed for hand-held
devices, power management capabilities, similar to its predecessors, are
crucial for these applications. This series introduces power management
support for the arm-smmu-v3 driver.
Design
=======
The arm-smmu-v3 primarily operates with in-memory data structures
through HW registers pointing to these data structures. The proposed design
makes use of this fact for implementing suspend and resume ops, centered
around a software gate embedded in the command queue.
1. CMDQ Gate (CMDQ_PROD_STOP_FLAG)
To safely manage runtime PM without regressing performance on high core
count servers or systems not opting for runtime power management,
this series introduces a CMDQ_PROD_STOP_FLAG (bit 30) in the command
queue's producer index. The flag acts as a Point of Commitment in the
cmpxchg loop of arm_smmu_cmdq_issue_cmdlist(), ensuring no new indices
are reserved once suspension begins.
2. Suspend / Resume Flow
The "suspend" operation follows a multi-stage quiesce sequence:
a. Stop Traffic: Sets SMMUEN=0 and GBPA=Abort to halt new transactions.
b. Gate CMDQ: Sets the CMDQ_PROD_STOP_FLAG to block new submissions.
c. Command Flush: Waits for any in-flight "owner" threads to commit
their reserved indices to hardware.
d. HW Drain: Polls the CMDQ until all committed commands are consumed.
e. SW Quiesce: Waits for all concurrent threads to release the shared
cmdq->lock, ensuring no CPUs are left polling CONS register.
Entering the suspend sequence implies that the device has no active clients
Any racing command submissions or failure to quiesce at this stage indicates
a bug in the Runtime PM or device link dependencies. Such races should not
happen in practice, which justifies the non-failing nature of the suspend
sequence in favor of memory safety.
The "resume" operation clears the STOP_FLAG & performs a full device
reset via arm_smmu_device_reset(), which re-initializes the HW using the
SW-copies maintained by the driver and clears all cached configurations.
3. Guarding Hardware Access and Elision
The driver ensures the SMMU is active before hardware access via
arm_smmu_rpm_get() and arm_smmu_rpm_put() helpers. An
arm_smmu_rpm_get_if_active() helper is introduced to elide TLB, CFG, and
ATC invalidations if the SMMU is suspended. For ATC invalidations,
device links must guarantee the SMMU is active if the endpoint is active;
a WARN_ON_ONCE() catches inconsistencies. Elision is safe because
hardware reset on resume invalidates caches, and PCIe device links handle
power dependencies for ATC.
4. Interrupt Re-config
a. Wired irqs: The series refactors arm_smmu_setup_irqs to allow
separate installation of handlers, aiding in correct re-initialization.
b. MSIs: The series caches the msi_msg and restores it during resume
via a new arm_smmu_resume_msis() helper.
c. GERROR: Late-breaking global errors are captured and handled
immediately after SMMU disablement during suspend to ensure no diagnostic
information is lost.
Scalability and Performance
===========================
A key design goal of this series is to ensure that high-performance systems
(typically servers) that do not enable runtime PM are not penalized. By
embedding a stop flag in the command queue's producer index and designing
RPM helpers to perform only read-only checks when RPM is disabled, command
submission on these systems incurs negligible overhead. Power-managed
systems only utilize runtime PM atomics as necessary, ensuring that
scalability is maintained across all hardware classes.
Call for review
Any insights/comments on the proposed changes are appreciated,
especially regarding the synchronization & elision logic.
[v6]
- Replaced the atomic nr_cmdq_users counter with CMDQ_PROD_STOP_FLAG
to eliminate atomic overhead on high-core count servers.
- Implemented a 5-step quiesce sequence in runtime_suspend including
pipeline flushes and software completion barriers.
- Introduced arm_smmu_rpm_get_if_active() to elide TLB/CFG/ATC
invalidations when the SMMU is suspended.
- Added WARN_ON_ONCE() in invalidation paths to detect inconsistent
power states for active endpoints.
- Refined batch submission in __arm_smmu_domain_inv_range() to ensure
clean state when dropping batches.
- Refactored GERROR handling for better integration with suspend.
- Added Suggested-by tags for Daniel Mentz.
[v5]
- https://lore.kernel.org/all/20260126151157.3418145-1-praan@google.com/
- Refactored GERROR handling into a helper function and invoked it during
runtime suspend after disabling the SMMU to capture any late-breaking
gerrors as suggested by Jason.
- Updated `arm_smmu_page_response` to be power-state aware and drop
page faults received while suspended.
- Included a patch from Ashish to correctly restore PROD and CONS
indices for tegra241-cmdqv after a hardware reset.
- Collected Reviewed-bys from Mostafa and Nicolin.
[v4]
- https://lore.kernel.org/all/20251117191433.3360130-1-praan@google.com/
- Dropped the `pm_runtime_get_if_not_suspended()` API in favor of a
simpler, driver-specific biased counter (`nr_cmdq_users`) to manage
runtime PM state.
- Reworked the suspend callback to poll on the biased counter before
disabling the SMMU.
- Addressed comments for the MSI refactor.
[v3]
- https://lore.kernel.org/all/20250616203149.2649118-1-praan@google.com/
- Introduced `pm_runtime_get_if_not_suspended` API to avoid races due
to bouncing RPM states while eliding TLBIs as pointed out by Daniel.
- Addressed Nicolin's comments regarding msi_resume and CMDQV flush
- Addressed Daniel's comments about CMDQ locking and draining
- Addressed issues related to draining the evtq and priq
- Dropped the code to identify and track user-space attachments
[v2]
- https://lore.kernel.org/all/20250418233409.3926715-1-praan@google.com/
- Introduced `arm_smmu_rpm_get_if_active` for eliding TLBIs & CFGIs
- Updated the rpm helper invocation strategy.
- Drained all queues in suspend callback (including tegra241-cmdv)
- Cache and restore msi_msg instead of free-ing realloc-ing on resume
- Added support to identify and track user-space attachments
- Fixed the setup_irqs as per Nicolin & Mostafa's suggestions
- Used force_runtime_suspend/resume instead as per Mostafa's suggestion.
- Added "Reviewed-by" line from Mostafa on an unchanged patch
[v1]
- https://lore.kernel.org/all/20250319004254.2547950-1-praan@google.com/
Ashish Mhetre (1):
iommu/tegra241-cmdqv: Restore PROD and CONS after resume
Pranjal Shrivastava (9):
iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs
iommu/arm-smmu-v3: Add a helper to drain cmd queues
iommu/tegra241-cmdqv: Add a helper to drain VCMDQs
iommu/arm-smmu-v3: Cache and restore MSI config
iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions
iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
iommu/arm-smmu-v3: Handle gerror during suspend
iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks
iommu/arm-smmu-v3: Invoke pm_runtime before hw access
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 18 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 449 +++++++++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 18 +
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 29 ++
4 files changed, 485 insertions(+), 29 deletions(-)
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v6 01/10] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
@ 2026-04-14 19:46 ` Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 02/10] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
` (8 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:46 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava
Refactor arm_smmu_setup_irqs by splitting it into two parts, one for
registering interrupt handlers and the other one for enabling interrupt
generation in the hardware. This refactor helps in re-initialization of
hardware interrupts as part of a subsequent patch that enables runtime
power management for the arm-smmu-v3 driver.
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 47 +++++++++++++--------
1 file changed, 30 insertions(+), 17 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f6901c5437ed..a77e0472e5c0 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4727,14 +4727,32 @@ static void arm_smmu_setup_unique_irqs(struct arm_smmu_device *smmu)
}
}
+static void arm_smmu_enable_irqs(struct arm_smmu_device *smmu)
+{
+ int ret;
+ u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
+
+ if (smmu->features & ARM_SMMU_FEAT_PRI)
+ irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
+
+ ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
+ ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
+ if (ret)
+ dev_warn(smmu->dev, "failed to enable irqs\n");
+}
+
+static int arm_smmu_disable_irqs(struct arm_smmu_device *smmu)
+{
+ return arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
+ ARM_SMMU_IRQ_CTRLACK);
+}
+
static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
{
int ret, irq;
- u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
/* Disable IRQs first */
- ret = arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
- ARM_SMMU_IRQ_CTRLACK);
+ ret = arm_smmu_disable_irqs(smmu);
if (ret) {
dev_err(smmu->dev, "failed to disable irqs\n");
return ret;
@@ -4756,15 +4774,6 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
} else
arm_smmu_setup_unique_irqs(smmu);
- if (smmu->features & ARM_SMMU_FEAT_PRI)
- irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
-
- /* Enable interrupt generation on the SMMU */
- ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
- ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
- if (ret)
- dev_warn(smmu->dev, "failed to enable irqs\n");
-
return 0;
}
@@ -4907,11 +4916,8 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
}
}
- ret = arm_smmu_setup_irqs(smmu);
- if (ret) {
- dev_err(smmu->dev, "failed to setup irqs\n");
- return ret;
- }
+ /* Enable interrupt generation on the SMMU */
+ arm_smmu_enable_irqs(smmu);
if (is_kdump_kernel())
enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
@@ -5545,6 +5551,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
/* Check for RMRs and install bypass STEs if any */
arm_smmu_rmr_install_bypass_ste(smmu);
+ /* Setup interrupt handlers */
+ ret = arm_smmu_setup_irqs(smmu);
+ if (ret) {
+ dev_err(smmu->dev, "failed to setup irqs\n");
+ goto err_free_iopf;
+ }
+
/* Reset the device */
ret = arm_smmu_device_reset(smmu);
if (ret)
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v6 02/10] iommu/arm-smmu-v3: Add a helper to drain cmd queues
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 01/10] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
@ 2026-04-14 19:46 ` Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 03/10] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
` (7 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:46 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava
Before we suspend SMMU, we want to ensure that all commands (especially
ATC_INV) have been flushed by the CMDQ, i.e. the CMDQs are empty.
Add a helper function that polls till the queues are dained.
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 34 +++++++++++++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 ++
2 files changed, 37 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a77e0472e5c0..088a7d1c002b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -993,6 +993,40 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
cmds->num, true);
}
+int arm_smmu_queue_poll_until_empty(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q)
+{
+ struct arm_smmu_queue_poll qp;
+ struct arm_smmu_ll_queue *llq = &q->llq;
+ int ret = 0;
+
+ queue_poll_init(smmu, &qp);
+ do {
+ if (queue_empty(llq))
+ break;
+
+ ret = queue_poll(&qp);
+ WRITE_ONCE(llq->cons, readl_relaxed(q->cons_reg));
+
+ } while (!ret);
+
+ return ret;
+}
+
+static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
+{
+ int ret;
+
+ /*
+ * Since this is only called from the suspend callback where
+ * exclusive access is ensured as CMDQ_PROD_STOP_FLAG blocks new
+ * command submissions to the cmdq, we can skip the cmdq locking.
+ */
+ ret = arm_smmu_queue_poll_until_empty(smmu, &smmu->cmdq.q);
+
+ return ret;
+}
+
static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused,
struct iommu_page_response *resp)
{
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index ef42df4753ec..47799d58d10c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1098,6 +1098,9 @@ int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
int arm_smmu_cmdq_init(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq);
+int arm_smmu_queue_poll_until_empty(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q);
+
static inline bool arm_smmu_master_canwbs(struct arm_smmu_master *master)
{
return dev_iommu_fwspec_get(master->dev)->flags &
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v6 03/10] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 01/10] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 02/10] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
@ 2026-04-14 19:46 ` Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 04/10] iommu/tegra241-cmdqv: Restore PROD and CONS after resume Pranjal Shrivastava
` (6 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:46 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava
The tegra241-cmdqv driver supports vCMDQs which need to be drained
before suspending the SMMU. The current driver implementation only uses
VINTF0 for vCMDQs owned by the kernel which need to be drained. Add a
helper that drains all the enabled vCMDQs under VINTF0.
Add another function ptr to arm_smmu_impl_ops to drain implementation
specified queues and call it within `arm_smmu_drain_queues`.
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 7 +++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 27 +++++++++++++++++++
3 files changed, 35 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 088a7d1c002b..e865a8aa2210 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1024,6 +1024,13 @@ static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
*/
ret = arm_smmu_queue_poll_until_empty(smmu, &smmu->cmdq.q);
+ if (ret)
+ goto out;
+
+ /* Drain all implementation-specific queues */
+ if (smmu->impl_ops && smmu->impl_ops->drain_queues)
+ ret = smmu->impl_ops->drain_queues(smmu);
+out:
return ret;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 47799d58d10c..001734bc3c7d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -818,6 +818,7 @@ struct arm_smmu_impl_ops {
size_t (*get_viommu_size)(enum iommu_viommu_type viommu_type);
int (*vsmmu_init)(struct arm_vsmmu *vsmmu,
const struct iommu_user_data *user_data);
+ int (*drain_queues)(struct arm_smmu_device *smmu);
};
/* An SMMUv3 instance */
diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index 83f6e9f6c51d..b3ed2149f72b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -414,6 +414,32 @@ tegra241_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
return &vcmdq->cmdq;
}
+static int tegra241_cmdqv_drain_vintf0_lvcmdqs(struct arm_smmu_device *smmu)
+{
+ struct tegra241_cmdqv *cmdqv =
+ container_of(smmu, struct tegra241_cmdqv, smmu);
+ struct tegra241_vintf *vintf = cmdqv->vintfs[0];
+ int ret = 0;
+ u16 lidx;
+
+ /* Kernel only uses VINTF0. Return if it's disabled */
+ if (!READ_ONCE(vintf->enabled))
+ return 0;
+
+ for (lidx = 0; lidx < cmdqv->num_lvcmdqs_per_vintf; lidx++) {
+ struct tegra241_vcmdq *vcmdq = vintf->lvcmdqs[lidx];
+
+ if (!vcmdq || !READ_ONCE(vcmdq->enabled))
+ continue;
+
+ ret = arm_smmu_queue_poll_until_empty(smmu, &vcmdq->cmdq.q);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
/* HW Reset Functions */
/*
@@ -845,6 +871,7 @@ static struct arm_smmu_impl_ops tegra241_cmdqv_impl_ops = {
.get_secondary_cmdq = tegra241_cmdqv_get_cmdq,
.device_reset = tegra241_cmdqv_hw_reset,
.device_remove = tegra241_cmdqv_remove,
+ .drain_queues = tegra241_cmdqv_drain_vintf0_lvcmdqs,
/* For user-space use */
.hw_info = tegra241_cmdqv_hw_info,
.get_viommu_size = tegra241_cmdqv_get_vintf_size,
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v6 04/10] iommu/tegra241-cmdqv: Restore PROD and CONS after resume
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (2 preceding siblings ...)
2026-04-14 19:46 ` [PATCH v6 03/10] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
@ 2026-04-14 19:46 ` Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 05/10] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
` (5 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:46 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava
From: Ashish Mhetre <amhetre@nvidia.com>
PROD and CONS indices for vcmdqs are getting set to 0 after resume.
Because of this the vcmdq is not consuming commands after resume.
Fix this by restoring PROD and CONS indices after resume from
saved pointers.
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index b3ed2149f72b..eff6b7f6d52e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -511,6 +511,8 @@ static int tegra241_vcmdq_hw_init(struct tegra241_vcmdq *vcmdq)
/* Configure and enable VCMDQ */
writeq_relaxed(vcmdq->cmdq.q.q_base, REG_VCMDQ_PAGE1(vcmdq, BASE));
+ writel_relaxed(vcmdq->cmdq.q.llq.prod, REG_VCMDQ_PAGE0(vcmdq, PROD));
+ writel_relaxed(vcmdq->cmdq.q.llq.cons, REG_VCMDQ_PAGE0(vcmdq, CONS));
ret = vcmdq_write_config(vcmdq, VCMDQ_EN);
if (ret) {
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v6 05/10] iommu/arm-smmu-v3: Cache and restore MSI config
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (3 preceding siblings ...)
2026-04-14 19:46 ` [PATCH v6 04/10] iommu/tegra241-cmdqv: Restore PROD and CONS after resume Pranjal Shrivastava
@ 2026-04-14 19:46 ` Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 06/10] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions Pranjal Shrivastava
` (4 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:46 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava
The SMMU's MSI configuration registers (*_IRQ_CFGn) containing target
address, data and memory attributes lose their state when the SMMU is
powered down. We'll need to cache and restore their contents to ensure
that MSIs work after the system resumes.
To address this, cache the original `msi_msg` within the `msi_desc`
when the configuration is first written by `arm_smmu_write_msi_msg`.
This primarily includes the target address and data since the memory
attributes are fixed.
Introduce a new helper `arm_smmu_resume_msis` which will later be called
during the driver's resume callback. The helper would retrieve the
cached MSI message for each relevant interrupt (evtq, gerr, priq) via
get_cached_msi_msg & re-config the registers via arm_smmu_write_msi_msg.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 37 +++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e865a8aa2210..4fa452465b4d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4677,6 +4677,9 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
struct arm_smmu_device *smmu = dev_get_drvdata(dev);
phys_addr_t *cfg = arm_smmu_msi_cfg[desc->msi_index];
+ /* Cache the msi_msg for resume */
+ desc->msg = *msg;
+
doorbell = (((u64)msg->address_hi) << 32) | msg->address_lo;
doorbell &= MSI_CFG0_ADDR_MASK;
@@ -4685,6 +4688,40 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
writel_relaxed(ARM_SMMU_MEMATTR_DEVICE_nGnRE, smmu->base + cfg[2]);
}
+static void arm_smmu_resume_msi(struct arm_smmu_device *smmu,
+ unsigned int irq, const char *name)
+{
+ struct msi_desc *desc;
+ struct msi_msg msg;
+
+ if (!irq)
+ return;
+
+ desc = irq_get_msi_desc(irq);
+ if (!desc) {
+ dev_err(smmu->dev, "Failed to resume msi: %s", name);
+ return;
+ }
+
+ get_cached_msi_msg(irq, &msg);
+ arm_smmu_write_msi_msg(desc, &msg);
+}
+
+static void arm_smmu_resume_msis(struct arm_smmu_device *smmu)
+{
+ if (!(smmu->features & ARM_SMMU_FEAT_MSI))
+ return;
+
+ if (!dev_get_msi_domain(smmu->dev))
+ return;
+
+ arm_smmu_resume_msi(smmu, smmu->gerr_irq, "gerror");
+ arm_smmu_resume_msi(smmu, smmu->evtq.q.irq, "evtq");
+
+ if (smmu->features & ARM_SMMU_FEAT_PRI)
+ arm_smmu_resume_msi(smmu, smmu->priq.q.irq, "priq");
+}
+
static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
{
int ret, nvec = ARM_SMMU_MAX_MSIS;
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v6 06/10] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (4 preceding siblings ...)
2026-04-14 19:46 ` [PATCH v6 05/10] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
@ 2026-04-14 19:46 ` Pranjal Shrivastava
2026-04-22 4:31 ` Daniel Mentz
2026-04-14 19:46 ` [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
` (3 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:46 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava
Introduce a new bit flag, CMDQ_PROD_STOP_FLAG (bit 30), in the command
queue's producer index to safely gate command submissions during device
suspension.
The flag embeds the suspend state directly into the existing global state
The flag checked in the compxchg loop in arm_smmu_cmdq_issue_cmdlist(),
which acts as a Point of Commitment, ensuring that no indices are
reserved or committed once the SMMU begins suspending.
This prevents a situation of "abandoned batches" where indices are
incremented but commands are never written, which would otherwise
lead to timeout during the drain poll.
Update queue_inc_prod_n() to preserve this flag during index
calculations, ensuring that any in-flight commands that successfully
passed the point of commitment can proceed to completion while the
flag remains set.
Suggested-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 +++++++++++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 4fa452465b4d..93ce9ea81991 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -209,7 +209,8 @@ static int queue_sync_prod_in(struct arm_smmu_queue *q)
static u32 queue_inc_prod_n(struct arm_smmu_ll_queue *q, int n)
{
u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + n;
- return Q_OVF(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
+
+ return Q_OVF(q->prod) | Q_STOP(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
}
static void queue_poll_init(struct arm_smmu_device *smmu,
@@ -818,8 +819,25 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
do {
u64 old;
+ /*
+ * If the SMMU is suspended/suspending, any new CMDs are elided.
+ * This loop is the Point of Commitment. If we haven't cmpxchg'd
+ * our new indices yet, we can safely bail. Once the indices are
+ * committed, we MUST write valid commands to those slots to
+ * avoid indefinite polling in the drain function.
+ */
+ if (Q_STOP(llq.prod)) {
+ local_irq_restore(flags);
+ return 0;
+ }
+
while (!queue_has_space(&llq, n + sync)) {
local_irq_restore(flags);
+
+ /* Avoid waiting for space if the SMMU is suspending */
+ if (Q_STOP(READ_ONCE(cmdq->q.llq.prod)))
+ return 0;
+
if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
local_irq_save(flags);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 001734bc3c7d..4bace6a85d29 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -389,6 +389,8 @@ static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
#define CMDQ_ERR_CERROR_ATC_INV_IDX 3
#define CMDQ_PROD_OWNED_FLAG Q_OVERFLOW_FLAG
+#define CMDQ_PROD_STOP_FLAG (1U << 30)
+#define Q_STOP(p) ((p) & CMDQ_PROD_STOP_FLAG)
/*
* This is used to size the command queue and therefore must be at least
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (5 preceding siblings ...)
2026-04-14 19:46 ` [PATCH v6 06/10] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions Pranjal Shrivastava
@ 2026-04-14 19:46 ` Pranjal Shrivastava
2026-04-21 4:47 ` Daniel Mentz
` (2 more replies)
2026-04-14 19:47 ` [PATCH v6 08/10] iommu/arm-smmu-v3: Handle gerror during suspend Pranjal Shrivastava
` (2 subsequent siblings)
9 siblings, 3 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:46 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava
Implement pm_runtime and system sleep ops for arm-smmu-v3.
The suspend callback configures the SMMU to abort new transactions,
disables the main translation unit and then drains the command queue
to ensure completion of any in-flight commands. A software gate
(STOP_FLAG) and synchronization barriers are used to quiesce the command
submission pipeline and ensure state consistency before power-off.
The resume callback restores the MSI configuration and performs a full
device reset via `arm_smmu_device_reset` to bring the SMMU back to an
operational state. The MSIs are cached during the msi_write and are
restored during the resume operation by using the helper.
Suggested-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 179 ++++++++++++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 +
2 files changed, 187 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 93ce9ea81991..a6aee4538b57 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -28,6 +28,7 @@
#include <linux/platform_device.h>
#include <linux/sort.h>
#include <linux/string_choices.h>
+#include <linux/pm_runtime.h>
#include <kunit/visibility.h>
#include <uapi/linux/iommufd.h>
@@ -109,6 +110,48 @@ static const char * const event_class_str[] = {
static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
+/* Runtime PM helpers */
+__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
+{
+ int ret;
+
+ if (pm_runtime_enabled(smmu->dev)) {
+ ret = pm_runtime_resume_and_get(smmu->dev);
+ if (ret < 0) {
+ dev_err(smmu->dev, "failed to resume device: %d\n", ret);
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+__maybe_unused static void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
+{
+ int ret;
+
+ if (pm_runtime_enabled(smmu->dev)) {
+ ret = pm_runtime_put_autosuspend(smmu->dev);
+ if (ret < 0)
+ dev_err(smmu->dev, "failed to suspend device: %d\n", ret);
+ }
+}
+
+/*
+ * This should always return true if devlinks are setup correctly
+ * and the client using the SMMU is active.
+ */
+__maybe_unused bool arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu)
+{
+ if (!arm_smmu_is_active(smmu))
+ return false;
+
+ if (pm_runtime_enabled(smmu->dev))
+ return pm_runtime_get_if_active(smmu->dev) > 0;
+
+ return true;
+}
+
static void parse_driver_options(struct arm_smmu_device *smmu)
{
int i = 0;
@@ -5700,6 +5743,141 @@ static void arm_smmu_device_shutdown(struct platform_device *pdev)
arm_smmu_device_disable(smmu);
}
+static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
+{
+ struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+ struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
+ struct arm_smmu_ll_queue llq, head;
+ int timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
+ u32 enables, target;
+ u64 old;
+ int ret;
+
+ /* Abort all transactions before disable to avoid spurious bypass */
+ arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
+
+ /* Disable the SMMU via CR0.EN and all queues except CMDQ */
+ enables = CR0_CMDQEN;
+ ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0, ARM_SMMU_CR0ACK);
+ if (ret) {
+ dev_err(smmu->dev, "failed to disable SMMUEN\n");
+ return ret;
+ }
+
+ /*
+ * At this point the SMMU is completely disabled and won't access
+ * any translation/config structures, even speculative accesses
+ * aren't performed as per the IHI0070 spec (section 6.3.9.6).
+ */
+
+ /* Mark the CMDQ to stop */
+ llq.val = READ_ONCE(cmdq->q.llq.val);
+ do {
+ head = llq;
+ head.prod |= CMDQ_PROD_STOP_FLAG;
+
+ old = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
+ if (old == llq.val)
+ break;
+
+ llq.val = old;
+ } while (1);
+
+ /* Wait for the last committed owner to reach the hardware */
+ target = head.prod & ~CMDQ_PROD_STOP_FLAG;
+ while (atomic_read(&cmdq->owner_prod) != target && --timeout)
+ udelay(1);
+
+ /*
+ * Entering suspend implies no active clients. A race or timeout here
+ * indicates a broken RPM or devlink dependency. We proceed anyway to
+ * prioritize memory safety (avoiding stale TLBs) over bus faults.
+ */
+ if (!timeout)
+ dev_err(smmu->dev, "cmdq owner wait timeout, (check runtime PM + devlinks)\n");
+
+ /* Drain the CMDQs */
+ ret = arm_smmu_drain_queues(smmu);
+ if (ret)
+ dev_warn(smmu->dev, "failed to drain queues, forcing suspend\n");
+
+ /* Wait for cmdq->lock == 0 to ensure last CMDQ_CONS_REG is written */
+ timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
+ while (atomic_read(&cmdq->lock) != 0 && --timeout)
+ udelay(1);
+
+ /* Timing out here implies misconfigured Runtime PM or broken devlinks */
+ if (!timeout)
+ dev_err(smmu->dev, "cmdq lock != 0, forcing suspend. Polling CPUs may fault.\n");
+
+ /* Disable everything */
+ arm_smmu_device_disable(smmu);
+
+ /* Handle any pending gerrors before powering down */
+ arm_smmu_handle_gerror(smmu);
+
+ dev_dbg(dev, "suspended smmu\n");
+
+ return 0;
+}
+
+static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
+{
+ int ret;
+ struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+ struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
+ struct arm_smmu_ll_queue llq;
+
+ /* Clear the STOP FLAG to allow CMDQ Submissions */
+ llq.val = READ_ONCE(cmdq->q.llq.val);
+ while (1) {
+ u64 old_val;
+ struct arm_smmu_ll_queue head = llq;
+
+ if (!(head.prod & CMDQ_PROD_STOP_FLAG))
+ break;
+
+ head.prod &= ~CMDQ_PROD_STOP_FLAG;
+ old_val = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
+ if (old_val == llq.val)
+ break;
+ llq.val = old_val;
+ }
+
+ /* Re-configure MSIs */
+ arm_smmu_resume_msis(smmu);
+
+ ret = arm_smmu_device_reset(smmu);
+ if (ret)
+ dev_err(dev, "failed to reset during resume operation: %d\n", ret);
+
+ dev_dbg(dev, "resumed device\n");
+
+ return ret;
+}
+
+static int __maybe_unused arm_smmu_pm_suspend(struct device *dev)
+{
+ if (pm_runtime_suspended(dev))
+ return 0;
+
+ return arm_smmu_runtime_suspend(dev);
+}
+
+static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
+{
+ if (pm_runtime_suspended(dev))
+ return 0;
+
+ return arm_smmu_runtime_resume(dev);
+}
+
+static const struct dev_pm_ops arm_smmu_pm_ops = {
+ SET_SYSTEM_SLEEP_PM_OPS(arm_smmu_pm_suspend, arm_smmu_pm_resume)
+ SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
+ arm_smmu_runtime_resume, NULL)
+};
+
static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,smmu-v3", },
{ },
@@ -5716,6 +5894,7 @@ static struct platform_driver arm_smmu_driver = {
.driver = {
.name = "arm-smmu-v3",
.of_match_table = arm_smmu_of_match,
+ .pm = &arm_smmu_pm_ops,
.suppress_bind_attrs = true,
},
.probe = arm_smmu_device_probe,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 4bace6a85d29..3a20a553105e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -504,11 +504,14 @@ static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
/* High-level queue structures */
#define ARM_SMMU_POLL_TIMEOUT_US 1000000 /* 1s! */
+#define ARM_SMMU_SUSPEND_TIMEOUT_US 1000000 /* 1s! */
#define ARM_SMMU_POLL_SPIN_COUNT 10
#define MSI_IOVA_BASE 0x8000000
#define MSI_IOVA_LENGTH 0x100000
+#define RPM_AUTOSUSPEND_DELAY_MS 15
+
enum pri_resp {
PRI_RESP_DENY = 0,
PRI_RESP_FAIL = 1,
@@ -1149,6 +1152,11 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq, u64 *cmds, int n,
bool sync);
+static inline bool arm_smmu_is_active(struct arm_smmu_device *smmu)
+{
+ return !Q_STOP(READ_ONCE(smmu->cmdq.q.llq.prod));
+}
+
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
void arm_smmu_sva_notifier_synchronize(void);
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v6 08/10] iommu/arm-smmu-v3: Handle gerror during suspend
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (6 preceding siblings ...)
2026-04-14 19:46 ` [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
@ 2026-04-14 19:47 ` Pranjal Shrivastava
2026-04-14 19:47 ` [PATCH v6 09/10] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
2026-04-14 19:47 ` [PATCH v6 10/10] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
9 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:47 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava, Jason Gunthorpe
The GERROR register's state might be lost when the SMMU is powered down
during runtime suspend. Handle any pending errors before suspending.
Refactor the gerror handling logic into a helper function and invoke it
from the runtime suspend callback after disabling the SMMU. This ensures
that any late-breaking gerrors are logged and ack'ed before the hardware
state is lost.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a6aee4538b57..28d7e82e4692 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2453,10 +2453,9 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev)
static int arm_smmu_device_disable(struct arm_smmu_device *smmu);
-static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
+static irqreturn_t arm_smmu_handle_gerror(struct arm_smmu_device *smmu)
{
u32 gerror, gerrorn, active;
- struct arm_smmu_device *smmu = dev;
gerror = readl_relaxed(smmu->base + ARM_SMMU_GERROR);
gerrorn = readl_relaxed(smmu->base + ARM_SMMU_GERRORN);
@@ -2496,9 +2495,17 @@ static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
arm_smmu_cmdq_skip_err(smmu);
writel(gerror, smmu->base + ARM_SMMU_GERRORN);
+
return IRQ_HANDLED;
}
+static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
+{
+ struct arm_smmu_device *smmu = dev;
+
+ return arm_smmu_handle_gerror(smmu);
+}
+
static irqreturn_t arm_smmu_combined_irq_thread(int irq, void *dev)
{
struct arm_smmu_device *smmu = dev;
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v6 09/10] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (7 preceding siblings ...)
2026-04-14 19:47 ` [PATCH v6 08/10] iommu/arm-smmu-v3: Handle gerror during suspend Pranjal Shrivastava
@ 2026-04-14 19:47 ` Pranjal Shrivastava
2026-04-14 19:47 ` [PATCH v6 10/10] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
9 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:47 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava
Enable PM runtime for SMMUs having a power-domain during smmu probe.
Add a devlink between the clients and SMMU device. The absence of a
power domain effectively disables runtime power management.
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 28d7e82e4692..be091739af49 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4375,6 +4375,9 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
pci_prepare_ats(to_pci_dev(dev), stu);
}
+ device_link_add(dev, smmu->dev,
+ DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER);
+
return &smmu->iommu;
err_free_master:
@@ -5709,11 +5712,18 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
if (ret)
goto err_disable;
+ if (dev->pm_domain) {
+ pm_runtime_set_active(dev);
+ pm_runtime_use_autosuspend(dev);
+ pm_runtime_set_autosuspend_delay(dev, RPM_AUTOSUSPEND_DELAY_MS);
+ pm_runtime_enable(dev);
+ }
+
/* And we're up. Go go go! */
ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL,
"smmu3.%pa", &ioaddr);
if (ret)
- goto err_disable;
+ goto err_rpm_disable;
ret = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev);
if (ret) {
@@ -5725,6 +5735,8 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
err_free_sysfs:
iommu_device_sysfs_remove(&smmu->iommu);
+err_rpm_disable:
+ pm_runtime_disable(dev);
err_disable:
arm_smmu_device_disable(smmu);
err_free_iopf:
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v6 10/10] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (8 preceding siblings ...)
2026-04-14 19:47 ` [PATCH v6 09/10] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
@ 2026-04-14 19:47 ` Pranjal Shrivastava
2026-05-19 18:31 ` Daniel Mentz
9 siblings, 1 reply; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-14 19:47 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
Pranjal Shrivastava
Invoke the pm_runtime helpers at all places before accessing the hw.
The idea is to invoke runtime_pm helpers at common points which are
used by exposed ops or interrupt handlers. Elide all TLB and CFG
invalidations if the smmu is suspended but not ATC invalidations.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 18 ++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 108 ++++++++++++++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 +
3 files changed, 118 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index ddae0b07c76b..b92e891035bb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -15,6 +15,11 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length,
struct iommu_hw_info_arm_smmuv3 *info;
u32 __iomem *base_idr;
unsigned int i;
+ int ret;
+
+ ret = arm_smmu_rpm_get(master->smmu);
+ if (ret < 0)
+ return ERR_PTR(-EIO);
if (*type != IOMMU_HW_INFO_TYPE_DEFAULT &&
*type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3) {
@@ -24,8 +29,10 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length,
}
info = kzalloc_obj(*info);
- if (!info)
+ if (!info) {
+ arm_smmu_rpm_put(master->smmu);
return ERR_PTR(-ENOMEM);
+ }
base_idr = master->smmu->base + ARM_SMMU_IDR0;
for (i = 0; i <= 5; i++)
@@ -36,6 +43,7 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length,
*length = sizeof(*info);
*type = IOMMU_HW_INFO_TYPE_ARM_SMMUV3;
+ arm_smmu_rpm_put(master->smmu);
return info;
}
@@ -386,8 +394,12 @@ int arm_vsmmu_cache_invalidate(struct iommufd_viommu *viommu,
continue;
/* FIXME always uses the main cmdq rather than trying to group by type */
- ret = arm_smmu_cmdq_issue_cmdlist(smmu, &smmu->cmdq, last->cmd,
- cur - last, true);
+ if (arm_smmu_rpm_get_if_active(smmu)) {
+ ret = arm_smmu_cmdq_issue_cmdlist(smmu, &smmu->cmdq, last->cmd,
+ cur - last, true);
+ arm_smmu_rpm_put(smmu);
+ }
+
if (ret) {
cur--;
goto out;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index be091739af49..983042fd7a5b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -111,7 +111,7 @@ static const char * const event_class_str[] = {
static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
/* Runtime PM helpers */
-__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
+int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
{
int ret;
@@ -126,7 +126,7 @@ __maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
return 0;
}
-__maybe_unused static void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
+void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
{
int ret;
@@ -141,7 +141,7 @@ __maybe_unused static void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
* This should always return true if devlinks are setup correctly
* and the client using the SMMU is active.
*/
-__maybe_unused bool arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu)
+bool arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu)
{
if (!arm_smmu_is_active(smmu))
return false;
@@ -1100,7 +1100,9 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
{
struct arm_smmu_cmdq_ent cmd = {0};
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_device *smmu = master->smmu;
int sid = master->streams[0].id;
+ int ret;
if (WARN_ON(!master->stall_enabled))
return;
@@ -1120,6 +1122,25 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
break;
}
+ /*
+ * The SMMU is guaranteed to be active via device_link if any master is
+ * active. Furthermore, on suspend we set GBPA to abort, flushing any
+ * pending stalled transactions.
+ *
+ * Receiving a page fault while suspended implies a bug in the power
+ * dependency chain or a stale event. Since the SMMU is powered down
+ * and the command queue is inaccessible, we cannot issue the
+ * RESUME command and must drop it.
+ */
+ if (!arm_smmu_is_active(smmu)) {
+ dev_err(smmu->dev, "Ignoring page fault while suspended\n");
+ return;
+ }
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return;
+
arm_smmu_cmdq_issue_cmd(master->smmu, &cmd);
/*
* Don't send a SYNC, it doesn't do anything for RESUME or PRI_RESP.
@@ -1127,6 +1148,7 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
* terminated... at some point in the future. PRI_RESP is fire and
* forget.
*/
+ arm_smmu_rpm_put(smmu);
}
/* Invalidation array manipulation functions */
@@ -1652,6 +1674,9 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
},
};
+ if (!arm_smmu_rpm_get_if_active(smmu))
+ return;
+
arm_smmu_cmdq_batch_init(smmu, &cmds, &cmd);
for (i = 0; i < master->num_streams; i++) {
cmd.cfgi.sid = master->streams[i].id;
@@ -1659,6 +1684,7 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
}
arm_smmu_cmdq_batch_submit(smmu, &cmds);
+ arm_smmu_rpm_put(smmu);
}
static void arm_smmu_write_cd_l1_desc(struct arm_smmu_cdtab_l1 *dst,
@@ -1949,6 +1975,7 @@ static void arm_smmu_ste_writer_sync_entry(struct arm_smmu_entry_writer *writer)
{
struct arm_smmu_ste_writer *ste_writer =
container_of(writer, struct arm_smmu_ste_writer, writer);
+ struct arm_smmu_device *smmu = writer->master->smmu;
struct arm_smmu_cmdq_ent cmd = {
.opcode = CMDQ_OP_CFGI_STE,
.cfgi = {
@@ -1957,7 +1984,11 @@ static void arm_smmu_ste_writer_sync_entry(struct arm_smmu_entry_writer *writer)
},
};
- arm_smmu_cmdq_issue_cmd_with_sync(writer->master->smmu, &cmd);
+ if (!arm_smmu_rpm_get_if_active(smmu))
+ return;
+
+ arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
+ arm_smmu_rpm_put(smmu);
}
static const struct arm_smmu_entry_writer_ops arm_smmu_ste_writer_ops = {
@@ -2362,6 +2393,7 @@ static void arm_smmu_dump_event(struct arm_smmu_device *smmu, u64 *raw,
static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
{
+ int ret;
u64 evt[EVTQ_ENT_DWORDS];
struct arm_smmu_event event = {0};
struct arm_smmu_device *smmu = dev;
@@ -2370,6 +2402,10 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
DEFAULT_RATELIMIT_BURST);
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return IRQ_NONE;
+
do {
while (!queue_remove_raw(q, evt)) {
arm_smmu_decode_event(smmu, evt, &event);
@@ -2390,6 +2426,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
/* Sync our overflow flag, as we believe we're up to speed */
queue_sync_cons_ovf(q);
+ arm_smmu_rpm_put(smmu);
return IRQ_HANDLED;
}
@@ -2437,6 +2474,11 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev)
struct arm_smmu_queue *q = &smmu->priq.q;
struct arm_smmu_ll_queue *llq = &q->llq;
u64 evt[PRIQ_ENT_DWORDS];
+ int ret;
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return IRQ_NONE;
do {
while (!queue_remove_raw(q, evt))
@@ -2448,6 +2490,7 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev)
/* Sync our overflow flag, as we believe we're up to speed */
queue_sync_cons_ovf(q);
+ arm_smmu_rpm_put(smmu);
return IRQ_HANDLED;
}
@@ -2502,8 +2545,19 @@ static irqreturn_t arm_smmu_handle_gerror(struct arm_smmu_device *smmu)
static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
{
struct arm_smmu_device *smmu = dev;
+ irqreturn_t ret;
+
+ /* If we are suspended, this is a spurious interrupt */
+ if (arm_smmu_rpm_get_if_active(smmu) <= 0) {
+ dev_err(smmu->dev,
+ "Ignoring gerror interrupt because the SMMU is suspended\n");
+ return IRQ_NONE;
+ }
+
+ ret = arm_smmu_handle_gerror(smmu);
+ arm_smmu_rpm_put(smmu);
- return arm_smmu_handle_gerror(smmu);
+ return ret;
}
static irqreturn_t arm_smmu_combined_irq_thread(int irq, void *dev)
@@ -2593,19 +2647,25 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
ioasid_t ssid)
{
- int i;
+ int i, ret = 0;
struct arm_smmu_cmdq_ent cmd;
struct arm_smmu_cmdq_batch cmds;
arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd);
+ /* Shouldn't be elided if there's no devlink inconsistency */
+ if (WARN_ON_ONCE(!arm_smmu_rpm_get_if_active(master->smmu)))
+ return 0;
+
arm_smmu_cmdq_batch_init(master->smmu, &cmds, &cmd);
for (i = 0; i < master->num_streams; i++) {
cmd.atc.sid = master->streams[i].id;
arm_smmu_cmdq_batch_add(master->smmu, &cmds, &cmd);
}
- return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
+ ret = arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
+ arm_smmu_rpm_put(master->smmu);
+ return ret;
}
/* IO_PGTABLE API */
@@ -2823,7 +2883,16 @@ static void __arm_smmu_domain_inv_range(struct arm_smmu_invs *invs,
if (cmds.num &&
(next == end || arm_smmu_invs_end_batch(cur, next))) {
- arm_smmu_cmdq_batch_submit(smmu, &cmds);
+ bool active = arm_smmu_rpm_get_if_active(smmu);
+
+ if (arm_smmu_inv_is_ats(cur))
+ WARN_ON_ONCE(!active);
+
+ if (active) {
+ arm_smmu_cmdq_batch_submit(smmu, &cmds);
+ arm_smmu_rpm_put(smmu);
+ }
+ /* Drop this batch to ensure the next one's fresh */
cmds.num = 0;
}
cur = next;
@@ -3540,6 +3609,9 @@ static void arm_smmu_inv_flush_iotlb_tag(struct arm_smmu_inv *inv)
{
struct arm_smmu_cmdq_ent cmd = {};
+ if (!arm_smmu_rpm_get_if_active(inv->smmu))
+ return;
+
switch (inv->type) {
case INV_TYPE_S1_ASID:
cmd.tlbi.asid = inv->id;
@@ -3549,11 +3621,14 @@ static void arm_smmu_inv_flush_iotlb_tag(struct arm_smmu_inv *inv)
cmd.tlbi.vmid = inv->id;
break;
default:
- return;
+ goto out_rpm_put;
}
cmd.opcode = inv->nsize_opcode;
arm_smmu_cmdq_issue_cmd_with_sync(inv->smmu, &cmd);
+
+out_rpm_put:
+ arm_smmu_rpm_put(inv->smmu);
}
/* Should be installed after arm_smmu_install_ste_for_dev() */
@@ -5747,10 +5822,19 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
static void arm_smmu_device_remove(struct platform_device *pdev)
{
struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
+ int ret;
iommu_device_unregister(&smmu->iommu);
iommu_device_sysfs_remove(&smmu->iommu);
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ goto free_iopf;
+
arm_smmu_device_disable(smmu);
+ arm_smmu_rpm_put(smmu);
+
+free_iopf:
iopf_queue_free(smmu->evtq.iopf);
ida_destroy(&smmu->vmid_map);
}
@@ -5758,8 +5842,14 @@ static void arm_smmu_device_remove(struct platform_device *pdev)
static void arm_smmu_device_shutdown(struct platform_device *pdev)
{
struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
+ int ret;
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return;
arm_smmu_device_disable(smmu);
+ arm_smmu_rpm_put(smmu);
}
static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 3a20a553105e..a697fa3c457b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1152,11 +1152,15 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq, u64 *cmds, int n,
bool sync);
+int arm_smmu_rpm_get(struct arm_smmu_device *smmu);
+void arm_smmu_rpm_put(struct arm_smmu_device *smmu);
static inline bool arm_smmu_is_active(struct arm_smmu_device *smmu)
{
return !Q_STOP(READ_ONCE(smmu->cmdq.q.llq.prod));
}
+bool arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu);
+
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
void arm_smmu_sva_notifier_synchronize(void);
--
2.54.0.rc0.605.g598a273b03-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-14 19:46 ` [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
@ 2026-04-21 4:47 ` Daniel Mentz
2026-04-22 12:20 ` Pranjal Shrivastava
2026-04-21 15:46 ` Daniel Mentz
2026-04-22 4:25 ` Daniel Mentz
2 siblings, 1 reply; 27+ messages in thread
From: Daniel Mentz @ 2026-04-21 4:47 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> + /* Mark the CMDQ to stop */
> + llq.val = READ_ONCE(cmdq->q.llq.val);
> + do {
> + head = llq;
> + head.prod |= CMDQ_PROD_STOP_FLAG;
> +
> + old = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> + if (old == llq.val)
> + break;
> +
> + llq.val = old;
> + } while (1);
Can the equivalent be achieved with the following?
atomic_or(CMDQ_PROD_STOP_FLAG, &cmdq->q.llq.atomic.prod);
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-14 19:46 ` [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
2026-04-21 4:47 ` Daniel Mentz
@ 2026-04-21 15:46 ` Daniel Mentz
2026-04-22 12:23 ` Pranjal Shrivastava
2026-04-22 4:25 ` Daniel Mentz
2 siblings, 1 reply; 27+ messages in thread
From: Daniel Mentz @ 2026-04-21 15:46 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> + /* Clear the STOP FLAG to allow CMDQ Submissions */
> + llq.val = READ_ONCE(cmdq->q.llq.val);
> + while (1) {
> + u64 old_val;
> + struct arm_smmu_ll_queue head = llq;
> +
> + if (!(head.prod & CMDQ_PROD_STOP_FLAG))
> + break;
> +
> + head.prod &= ~CMDQ_PROD_STOP_FLAG;
> + old_val = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> + if (old_val == llq.val)
> + break;
> + llq.val = old_val;
> + }
Can you use
atomic_andnot(CMDQ_PROD_STOP_FLAG, &cmdq->q.llq.atomic.prod);
here?
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-14 19:46 ` [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
2026-04-21 4:47 ` Daniel Mentz
2026-04-21 15:46 ` Daniel Mentz
@ 2026-04-22 4:25 ` Daniel Mentz
2026-04-22 13:40 ` Pranjal Shrivastava
2 siblings, 1 reply; 27+ messages in thread
From: Daniel Mentz @ 2026-04-22 4:25 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> +/* Runtime PM helpers */
> +__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
Please be consistent about placing the __maybe_unused attribute. I
believe that in the other functions, the placement is between the
return type and the function name.
> +
> +/*
> + * This should always return true if devlinks are setup correctly
> + * and the client using the SMMU is active.
> + */
> +__maybe_unused bool arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu)
> +{
> + if (!arm_smmu_is_active(smmu))
> + return false;
I'm wondering if this check is redundant. What would happen if we removed it?
> +
> + if (pm_runtime_enabled(smmu->dev))
> + return pm_runtime_get_if_active(smmu->dev) > 0;
> +
> + return true;
> +}
> +static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
> +{
> + struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> + struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> + struct arm_smmu_ll_queue llq, head;
> + int timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
> + u32 enables, target;
> + u64 old;
> + int ret;
> +
> + /* Abort all transactions before disable to avoid spurious bypass */
> + arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> +
> + /* Disable the SMMU via CR0.EN and all queues except CMDQ */
> + enables = CR0_CMDQEN;
> + ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0, ARM_SMMU_CR0ACK);
> + if (ret) {
> + dev_err(smmu->dev, "failed to disable SMMUEN\n");
"failed to disable SMMU". "EN" is redundant.
> + return ret;
> + }
> +
> + /*
> + * At this point the SMMU is completely disabled and won't access
> + * any translation/config structures, even speculative accesses
> + * aren't performed as per the IHI0070 spec (section 6.3.9.6).
> + */
> +
> + /* Mark the CMDQ to stop */
> + llq.val = READ_ONCE(cmdq->q.llq.val);
> + do {
> + head = llq;
> + head.prod |= CMDQ_PROD_STOP_FLAG;
> +
> + old = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> + if (old == llq.val)
> + break;
> +
> + llq.val = old;
> + } while (1);
> +
> + /* Wait for the last committed owner to reach the hardware */
> + target = head.prod & ~CMDQ_PROD_STOP_FLAG;
> + while (atomic_read(&cmdq->owner_prod) != target && --timeout)
> + udelay(1);
Should we just use the following line from arm_smmu_cmdq_issue_cmdlist()?
atomic_cond_read_relaxed(&cmdq->owner_prod, VAL == target);
> +
> + /*
> + * Entering suspend implies no active clients. A race or timeout here
> + * indicates a broken RPM or devlink dependency. We proceed anyway to
> + * prioritize memory safety (avoiding stale TLBs) over bus faults.
Can you elaborate on how broken dependencies will result in a stuck
cmdq->owner_prod?
> + */
> + if (!timeout)
> + dev_err(smmu->dev, "cmdq owner wait timeout, (check runtime PM + devlinks)\n");
> +
> + /* Drain the CMDQs */
> + ret = arm_smmu_drain_queues(smmu);
> + if (ret)
> + dev_warn(smmu->dev, "failed to drain queues, forcing suspend\n");
> +
> + /* Wait for cmdq->lock == 0 to ensure last CMDQ_CONS_REG is written */
> + timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
> + while (atomic_read(&cmdq->lock) != 0 && --timeout)
Can we use
atomic_cond_read_relaxed(&cmdq->lock, VAL == 0);
> + udelay(1);
> +
> + /* Timing out here implies misconfigured Runtime PM or broken devlinks */
> + if (!timeout)
> + dev_err(smmu->dev, "cmdq lock != 0, forcing suspend. Polling CPUs may fault.\n");
> +
> + /* Disable everything */
> + arm_smmu_device_disable(smmu);
> +
> + /* Handle any pending gerrors before powering down */
> + arm_smmu_handle_gerror(smmu);
> +
> + dev_dbg(dev, "suspended smmu\n");
> +
> + return 0;
> +}
> +
> +static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
> +{
> + int ret;
> + struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> + struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> + struct arm_smmu_ll_queue llq;
> +
> + /* Clear the STOP FLAG to allow CMDQ Submissions */
Clearing the STOP flag here seems premature. I propose the following sequence:
* Set up command queue (SMMU_CR1, SMMU_CMDQ_BASE etc.)
* Enable command queue
* Clear STOP flag
* Enable SMMU
By clearing the STOP flag, you allow arm_smmu_cmdq_issue_cmdlist() to
write to SMMU_CMDQ_PROD which then races against
arm_smmu_device_reset() which also writes to SMMU_CMDQ_PROD.
> + llq.val = READ_ONCE(cmdq->q.llq.val);
> + while (1) {
> + u64 old_val;
> + struct arm_smmu_ll_queue head = llq;
> +
> + if (!(head.prod & CMDQ_PROD_STOP_FLAG))
> + break;
> +
> + head.prod &= ~CMDQ_PROD_STOP_FLAG;
> + old_val = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> + if (old_val == llq.val)
> + break;
> + llq.val = old_val;
> + }
> +
> + /* Re-configure MSIs */
> + arm_smmu_resume_msis(smmu);
> +
> + ret = arm_smmu_device_reset(smmu);
> + if (ret)
> + dev_err(dev, "failed to reset during resume operation: %d\n", ret);
> +
> + dev_dbg(dev, "resumed device\n");
arm_smmu_runtime_suspend() prints "resumed smmu" (not "device"). Try
to be consistent with the language used in messages.
> +
> + return ret;
> +}
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 06/10] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions
2026-04-14 19:46 ` [PATCH v6 06/10] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions Pranjal Shrivastava
@ 2026-04-22 4:31 ` Daniel Mentz
2026-04-22 12:18 ` Pranjal Shrivastava
0 siblings, 1 reply; 27+ messages in thread
From: Daniel Mentz @ 2026-04-22 4:31 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -209,7 +209,8 @@ static int queue_sync_prod_in(struct arm_smmu_queue *q)
> static u32 queue_inc_prod_n(struct arm_smmu_ll_queue *q, int n)
> {
> u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + n;
> - return Q_OVF(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
> +
> + return Q_OVF(q->prod) | Q_STOP(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
Is there a situation where we need to increment prod and retain the
STOP flag? What happens if we leave this particular line as is?
> }
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 06/10] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions
2026-04-22 4:31 ` Daniel Mentz
@ 2026-04-22 12:18 ` Pranjal Shrivastava
0 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-22 12:18 UTC (permalink / raw)
To: Daniel Mentz
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Tue, Apr 21, 2026 at 09:31:33PM -0700, Daniel Mentz wrote:
> On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -209,7 +209,8 @@ static int queue_sync_prod_in(struct arm_smmu_queue *q)
> > static u32 queue_inc_prod_n(struct arm_smmu_ll_queue *q, int n)
> > {
> > u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + n;
> > - return Q_OVF(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
> > +
> > + return Q_OVF(q->prod) | Q_STOP(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
>
> Is there a situation where we need to increment prod and retain the
> STOP flag? What happens if we leave this particular line as is?
>
The CMDQ_PROD_STOP_FLAG must be preserved in queue_inc_prod_n() to
prevent it from being accidentally cleared during after being set.
This can happen in the following cases:
1. Point of Commitment Races: In arm_smmu_cmdq_issue_cmdlist(), a thread
could pass the initial Q_STOP check and successfully cmpxchg (commit)
its indices. If the suspend handler then sets the STOP_FLAG before the
thread reaches the if (sync) block, the thread will subsequently call
queue_inc_prod_n() to calculate indices for the sync command. If the
flag is not sticky here, this call would overwrite the global state
and inadvertently clear the STOP_FLAG, re-opening the queue for new
submissions while the SMMU is trying to suspend.
2. GERROR: If a Gerror occurs during the suspend, the handler
(arm_smmu_cmdq_skip_err) may increment the producer index to bypass a
faulty command. Since this handler could be asynchronous, we must ensure
that the STOP_FLAG is respoected. Preserving the flag ensures that we
don't accidentally clear the flag.
> > }
Thanks,
Praan
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-21 4:47 ` Daniel Mentz
@ 2026-04-22 12:20 ` Pranjal Shrivastava
0 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-22 12:20 UTC (permalink / raw)
To: Daniel Mentz
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Mon, Apr 20, 2026 at 09:47:36PM -0700, Daniel Mentz wrote:
> On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> > + /* Mark the CMDQ to stop */
> > + llq.val = READ_ONCE(cmdq->q.llq.val);
> > + do {
> > + head = llq;
> > + head.prod |= CMDQ_PROD_STOP_FLAG;
> > +
> > + old = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> > + if (old == llq.val)
> > + break;
> > +
> > + llq.val = old;
> > + } while (1);
>
> Can the equivalent be achieved with the following?
>
> atomic_or(CMDQ_PROD_STOP_FLAG, &cmdq->q.llq.atomic.prod);
Ack, I'll update it to use atomic_fetch_or_relaxed() in the next version
Thanks,
Praan
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-21 15:46 ` Daniel Mentz
@ 2026-04-22 12:23 ` Pranjal Shrivastava
0 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-22 12:23 UTC (permalink / raw)
To: Daniel Mentz
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Tue, Apr 21, 2026 at 08:46:12AM -0700, Daniel Mentz wrote:
> On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> > + /* Clear the STOP FLAG to allow CMDQ Submissions */
> > + llq.val = READ_ONCE(cmdq->q.llq.val);
> > + while (1) {
> > + u64 old_val;
> > + struct arm_smmu_ll_queue head = llq;
> > +
> > + if (!(head.prod & CMDQ_PROD_STOP_FLAG))
> > + break;
> > +
> > + head.prod &= ~CMDQ_PROD_STOP_FLAG;
> > + old_val = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> > + if (old_val == llq.val)
> > + break;
> > + llq.val = old_val;
> > + }
>
> Can you use
>
> atomic_andnot(CMDQ_PROD_STOP_FLAG, &cmdq->q.llq.atomic.prod);
>
> here?
Ack, I'll use atomic_andnot_relaxed(); for resume.
Thanks,
Praan
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-22 4:25 ` Daniel Mentz
@ 2026-04-22 13:40 ` Pranjal Shrivastava
2026-04-24 5:24 ` Daniel Mentz
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-04-22 13:40 UTC (permalink / raw)
To: Daniel Mentz
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Tue, Apr 21, 2026 at 09:25:12PM -0700, Daniel Mentz wrote:
> On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> > +/* Runtime PM helpers */
> > +__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
>
> Please be consistent about placing the __maybe_unused attribute. I
> believe that in the other functions, the placement is between the
> return type and the function name.
>
Ack.
> > +
> > +/*
> > + * This should always return true if devlinks are setup correctly
> > + * and the client using the SMMU is active.
> > + */
> > +__maybe_unused bool arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu)
> > +{
> > + if (!arm_smmu_is_active(smmu))
> > + return false;
>
> I'm wondering if this check is redundant. What would happen if we removed it?
>
You're right that pm_runtime_get_if_active() shall suffice. However, I
added the arm_smmu_is_active() check as a lockless fast path for unmaps
Consider a scneario where there're high-frequency unmaps (unmap-storm)
while the SMMU is suspended. Without this check, every thread would
contend for the dev->power.lock spinlock inside the RPM core
(pm_runtime_get_if_active) just to discover the device is not active.
This check allows CPUs to elide the command immediately without
contending for the dev->power.lock spinlock inside the RPM core
(get_if_active call) and avoids unnecessary cacheline bouncing when the
SMMU is suspended.
> > +
> > + if (pm_runtime_enabled(smmu->dev))
> > + return pm_runtime_get_if_active(smmu->dev) > 0;
> > +
> > + return true;
> > +}
>
> > +static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
> > +{
> > + struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> > + struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> > + struct arm_smmu_ll_queue llq, head;
> > + int timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
> > + u32 enables, target;
> > + u64 old;
> > + int ret;
> > +
> > + /* Abort all transactions before disable to avoid spurious bypass */
> > + arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> > +
> > + /* Disable the SMMU via CR0.EN and all queues except CMDQ */
> > + enables = CR0_CMDQEN;
> > + ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0, ARM_SMMU_CR0ACK);
> > + if (ret) {
> > + dev_err(smmu->dev, "failed to disable SMMUEN\n");
>
> "failed to disable SMMU". "EN" is redundant.
>
Ack. I attempted to be verbose highlighting that the attempt was to
disable non-CMDQ stuff but I don't think that's needed. I'll update this
> > + return ret;
> > + }
> > +
> > + /*
> > + * At this point the SMMU is completely disabled and won't access
> > + * any translation/config structures, even speculative accesses
> > + * aren't performed as per the IHI0070 spec (section 6.3.9.6).
> > + */
> > +
> > + /* Mark the CMDQ to stop */
> > + llq.val = READ_ONCE(cmdq->q.llq.val);
> > + do {
> > + head = llq;
> > + head.prod |= CMDQ_PROD_STOP_FLAG;
> > +
> > + old = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> > + if (old == llq.val)
> > + break;
> > +
> > + llq.val = old;
> > + } while (1);
> > +
> > + /* Wait for the last committed owner to reach the hardware */
> > + target = head.prod & ~CMDQ_PROD_STOP_FLAG;
> > + while (atomic_read(&cmdq->owner_prod) != target && --timeout)
> > + udelay(1);
>
> Should we just use the following line from arm_smmu_cmdq_issue_cmdlist()?
>
> atomic_cond_read_relaxed(&cmdq->owner_prod, VAL == target);
>
I did consider atomic_cond_read_relaxed() but didn't go for it as it
doesn't have a timeout, we could wait here indefinitely.. (it's a
while(1) loop)
> > +
> > + /*
> > + * Entering suspend implies no active clients. A race or timeout here
> > + * indicates a broken RPM or devlink dependency. We proceed anyway to
> > + * prioritize memory safety (avoiding stale TLBs) over bus faults.
>
> Can you elaborate on how broken dependencies will result in a stuck
> cmdq->owner_prod?
>
I believe you are right, this comment needs to be updated.
With the CMDQ_PROD_STOP_FLAG gating mechanism, concurrent submissions are
handled safely even if device links are broken (they either complete if
committed before the flag, or bail out if they arrive after). A timeout
here would actually indicate something like a CMDQ stall rather than just
a missing device link. I will update the comment to reflect this.
> > + */
> > + if (!timeout)
> > + dev_err(smmu->dev, "cmdq owner wait timeout, (check runtime PM + devlinks)\n");
> > +
> > + /* Drain the CMDQs */
> > + ret = arm_smmu_drain_queues(smmu);
> > + if (ret)
> > + dev_warn(smmu->dev, "failed to drain queues, forcing suspend\n");
> > +
> > + /* Wait for cmdq->lock == 0 to ensure last CMDQ_CONS_REG is written */
> > + timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
> > + while (atomic_read(&cmdq->lock) != 0 && --timeout)
>
> Can we use
>
> atomic_cond_read_relaxed(&cmdq->lock, VAL == 0);
>
Same as above, worried about a lockup due to CMDQ Stalls.
> > + udelay(1);
> > +
> > + /* Timing out here implies misconfigured Runtime PM or broken devlinks */
> > + if (!timeout)
> > + dev_err(smmu->dev, "cmdq lock != 0, forcing suspend. Polling CPUs may fault.\n");
> > +
> > + /* Disable everything */
> > + arm_smmu_device_disable(smmu);
> > +
> > + /* Handle any pending gerrors before powering down */
> > + arm_smmu_handle_gerror(smmu);
> > +
> > + dev_dbg(dev, "suspended smmu\n");
> > +
> > + return 0;
> > +}
> > +
> > +static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
> > +{
> > + int ret;
> > + struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> > + struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> > + struct arm_smmu_ll_queue llq;
> > +
> > + /* Clear the STOP FLAG to allow CMDQ Submissions */
>
> Clearing the STOP flag here seems premature. I propose the following sequence:
>
> * Set up command queue (SMMU_CR1, SMMU_CMDQ_BASE etc.)
> * Enable command queue
> * Clear STOP flag
> * Enable SMMU
>
> By clearing the STOP flag, you allow arm_smmu_cmdq_issue_cmdlist() to
> write to SMMU_CMDQ_PROD which then races against
> arm_smmu_device_reset() which also writes to SMMU_CMDQ_PROD.
>
I agree, IIUC you're suggesting to clear the stop flag before the
TLBI_ALL and CFGI_ALL commands are issued by device_reset? We'll need
to clear the flag before device_reset issues the TLBI_ALL AND CFGI_ALL.
Thus the sequence should be:
* Set up CMDQ
* Enable CMDQ
* Clear STOP flag
* Issue TLBI_ALL + CFGI_ALL
* Enable SMMU
Right?
> > + llq.val = READ_ONCE(cmdq->q.llq.val);
> > + while (1) {
> > + u64 old_val;
> > + struct arm_smmu_ll_queue head = llq;
> > +
> > + if (!(head.prod & CMDQ_PROD_STOP_FLAG))
> > + break;
> > +
> > + head.prod &= ~CMDQ_PROD_STOP_FLAG;
> > + old_val = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
> > + if (old_val == llq.val)
> > + break;
> > + llq.val = old_val;
> > + }
> > +
> > + /* Re-configure MSIs */
> > + arm_smmu_resume_msis(smmu);
> > +
> > + ret = arm_smmu_device_reset(smmu);
> > + if (ret)
> > + dev_err(dev, "failed to reset during resume operation: %d\n", ret);
> > +
> > + dev_dbg(dev, "resumed device\n");
>
> arm_smmu_runtime_suspend() prints "resumed smmu" (not "device"). Try
> to be consistent with the language used in messages.
>
Ack. I'll update it to dev_dbg(dev, "resumed smmu\n");
> > +
> > + return ret;
> > +}
Thanks,
Praan
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-22 13:40 ` Pranjal Shrivastava
@ 2026-04-24 5:24 ` Daniel Mentz
2026-04-24 15:16 ` Jason Gunthorpe
2026-04-24 15:13 ` Jason Gunthorpe
2026-05-19 20:50 ` Daniel Mentz
2 siblings, 1 reply; 27+ messages in thread
From: Daniel Mentz @ 2026-04-24 5:24 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Wed, Apr 22, 2026 at 6:41 AM Pranjal Shrivastava <praan@google.com> wrote:
> > > + /* Clear the STOP FLAG to allow CMDQ Submissions */
> >
> > Clearing the STOP flag here seems premature. I propose the following sequence:
> >
> > * Set up command queue (SMMU_CR1, SMMU_CMDQ_BASE etc.)
> > * Enable command queue
> > * Clear STOP flag
> > * Enable SMMU
> >
> > By clearing the STOP flag, you allow arm_smmu_cmdq_issue_cmdlist() to
> > write to SMMU_CMDQ_PROD which then races against
> > arm_smmu_device_reset() which also writes to SMMU_CMDQ_PROD.
> >
>
> I agree, IIUC you're suggesting to clear the stop flag before the
> TLBI_ALL and CFGI_ALL commands are issued by device_reset? We'll need
> to clear the flag before device_reset issues the TLBI_ALL AND CFGI_ALL.
> Thus the sequence should be:
>
> * Set up CMDQ
> * Enable CMDQ
> * Clear STOP flag
> * Issue TLBI_ALL + CFGI_ALL
> * Enable SMMU
>
> Right?
Yes
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-22 13:40 ` Pranjal Shrivastava
2026-04-24 5:24 ` Daniel Mentz
@ 2026-04-24 15:13 ` Jason Gunthorpe
2026-05-04 11:15 ` Pranjal Shrivastava
2026-05-19 20:50 ` Daniel Mentz
2 siblings, 1 reply; 27+ messages in thread
From: Jason Gunthorpe @ 2026-04-24 15:13 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: Daniel Mentz, iommu, Will Deacon, Joerg Roedel, Robin Murphy,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Wed, Apr 22, 2026 at 01:40:53PM +0000, Pranjal Shrivastava wrote:
> > > +/*
> > > + * This should always return true if devlinks are setup correctly
> > > + * and the client using the SMMU is active.
> > > + */
> > > +__maybe_unused bool arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu)
> > > +{
> > > + if (!arm_smmu_is_active(smmu))
> > > + return false;
> >
> > I'm wondering if this check is redundant. What would happen if we removed it?
> >
>
> You're right that pm_runtime_get_if_active() shall suffice. However, I
> added the arm_smmu_is_active() check as a lockless fast path for unmaps
>
> Consider a scneario where there're high-frequency unmaps (unmap-storm)
> while the SMMU is suspended. Without this check, every thread would
> contend for the dev->power.lock spinlock inside the RPM core
> (pm_runtime_get_if_active) just to discover the device is not active.
>
> This check allows CPUs to elide the command immediately without
> contending for the dev->power.lock spinlock inside the RPM core
> (get_if_active call) and avoids unnecessary cacheline bouncing when the
> SMMU is suspended.
But these sorts of pre checks are almost always racy, I think you need a
comment or lockdep statement explaining how the locking works.
Jason
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-24 5:24 ` Daniel Mentz
@ 2026-04-24 15:16 ` Jason Gunthorpe
2026-05-04 9:01 ` Pranjal Shrivastava
0 siblings, 1 reply; 27+ messages in thread
From: Jason Gunthorpe @ 2026-04-24 15:16 UTC (permalink / raw)
To: Daniel Mentz
Cc: Pranjal Shrivastava, iommu, Will Deacon, Joerg Roedel,
Robin Murphy, Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Thu, Apr 23, 2026 at 10:24:30PM -0700, Daniel Mentz wrote:
> On Wed, Apr 22, 2026 at 6:41 AM Pranjal Shrivastava <praan@google.com> wrote:
> > > > + /* Clear the STOP FLAG to allow CMDQ Submissions */
> > >
> > > Clearing the STOP flag here seems premature. I propose the following sequence:
> > >
> > > * Set up command queue (SMMU_CR1, SMMU_CMDQ_BASE etc.)
> > > * Enable command queue
> > > * Clear STOP flag
> > > * Enable SMMU
> > >
> > > By clearing the STOP flag, you allow arm_smmu_cmdq_issue_cmdlist() to
> > > write to SMMU_CMDQ_PROD which then races against
> > > arm_smmu_device_reset() which also writes to SMMU_CMDQ_PROD.
> > >
> >
> > I agree, IIUC you're suggesting to clear the stop flag before the
> > TLBI_ALL and CFGI_ALL commands are issued by device_reset? We'll need
> > to clear the flag before device_reset issues the TLBI_ALL AND CFGI_ALL.
> > Thus the sequence should be:
> >
> > * Set up CMDQ
> > * Enable CMDQ
> > * Clear STOP flag
> > * Issue TLBI_ALL + CFGI_ALL
> > * Enable SMMU
> >
> > Right?
>
> Yes
This is making a very similar fencing argument to how Nicolin's series
works.
After clearing stop but before enablie SMMU you have to acquire all
concurrent threads updated data structure memory into this thread
because as soon as enable SMMU happens it will start reading that
memory and it must be 'either we see the latest copy' or 'we see an
old copy and guarenteed to see the invalidation'
Exactly the same as the reasoning used for attach with the
invalidation list. So I think you should look at the barrier design
there and ensure it exists here too.
Jason
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-24 15:16 ` Jason Gunthorpe
@ 2026-05-04 9:01 ` Pranjal Shrivastava
0 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-05-04 9:01 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Daniel Mentz, iommu, Will Deacon, Joerg Roedel, Robin Murphy,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Fri, Apr 24, 2026 at 12:16:39PM -0300, Jason Gunthorpe wrote:
> On Thu, Apr 23, 2026 at 10:24:30PM -0700, Daniel Mentz wrote:
> > On Wed, Apr 22, 2026 at 6:41 AM Pranjal Shrivastava <praan@google.com> wrote:
> > > > > + /* Clear the STOP FLAG to allow CMDQ Submissions */
> > > >
> > > > Clearing the STOP flag here seems premature. I propose the following sequence:
> > > >
> > > > * Set up command queue (SMMU_CR1, SMMU_CMDQ_BASE etc.)
> > > > * Enable command queue
> > > > * Clear STOP flag
> > > > * Enable SMMU
> > > >
> > > > By clearing the STOP flag, you allow arm_smmu_cmdq_issue_cmdlist() to
> > > > write to SMMU_CMDQ_PROD which then races against
> > > > arm_smmu_device_reset() which also writes to SMMU_CMDQ_PROD.
> > > >
> > >
> > > I agree, IIUC you're suggesting to clear the stop flag before the
> > > TLBI_ALL and CFGI_ALL commands are issued by device_reset? We'll need
> > > to clear the flag before device_reset issues the TLBI_ALL AND CFGI_ALL.
> > > Thus the sequence should be:
> > >
> > > * Set up CMDQ
> > > * Enable CMDQ
> > > * Clear STOP flag
> > > * Issue TLBI_ALL + CFGI_ALL
> > > * Enable SMMU
> > >
> > > Right?
> >
> > Yes
>
> This is making a very similar fencing argument to how Nicolin's series
> works.
>
The invs array series?
> After clearing stop but before enablie SMMU you have to acquire all
> concurrent threads updated data structure memory into this thread
> because as soon as enable SMMU happens it will start reading that
> memory and it must be 'either we see the latest copy' or 'we see an
> old copy and guarenteed to see the invalidation'
>
> Exactly the same as the reasoning used for attach with the
> invalidation list. So I think you should look at the barrier design
> there and ensure it exists here too.
>
Hmm.. I see your point. We need to ensure that any structure updates
(like STE/CD or page table changes) made by concurrent threads are
globally visible before we send the MMIO write to enable the SMMU,
i.e. we're talking about memory ordering between other CPUs and
the Resume CPU (CPU running resume), without a barrier on the Resume
CPU, the SMMUEN=1 write could potentially beat a client thread's memory
update. If the SMMU is enabled and fetches a stale configuration before
the RAM update lands, it could cache an inconsistent state.
I guess I'll add an smp_mb() before the SMMUEN enable to act as a
system-wide synchronization point.
Thanks,
Praan
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-24 15:13 ` Jason Gunthorpe
@ 2026-05-04 11:15 ` Pranjal Shrivastava
0 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-05-04 11:15 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Daniel Mentz, iommu, Will Deacon, Joerg Roedel, Robin Murphy,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Fri, Apr 24, 2026 at 12:13:25PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 22, 2026 at 01:40:53PM +0000, Pranjal Shrivastava wrote:
>
> > > > +/*
> > > > + * This should always return true if devlinks are setup correctly
> > > > + * and the client using the SMMU is active.
> > > > + */
> > > > +__maybe_unused bool arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu)
> > > > +{
> > > > + if (!arm_smmu_is_active(smmu))
> > > > + return false;
> > >
> > > I'm wondering if this check is redundant. What would happen if we removed it?
> > >
> >
> > You're right that pm_runtime_get_if_active() shall suffice. However, I
> > added the arm_smmu_is_active() check as a lockless fast path for unmaps
> >
> > Consider a scneario where there're high-frequency unmaps (unmap-storm)
> > while the SMMU is suspended. Without this check, every thread would
> > contend for the dev->power.lock spinlock inside the RPM core
> > (pm_runtime_get_if_active) just to discover the device is not active.
> >
> > This check allows CPUs to elide the command immediately without
> > contending for the dev->power.lock spinlock inside the RPM core
> > (get_if_active call) and avoids unnecessary cacheline bouncing when the
> > SMMU is suspended.
>
> But these sorts of pre checks are almost always racy, I think you need a
> comment or lockdep statement explaining how the locking works.
Ack. It is a benign race in this case and the check serves as a
performance heuristic to avoid dev->power.lock contention during unmap
storms while the SMMU is suspended.
The race is safe because the check is layered:
if a thread incorrectly passes this pre-check while the SMMU is
suspending, it will either be caught by the subsequent call to
pm_runtime_get_if_active() or by the final, Q_STOP check in the
arm_smmu_cmdq_issue_cmdlist() at the point of commitment.
I'll add a comment explaining this approach to clarify the intent.
Thanks,
Praan
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 10/10] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2026-04-14 19:47 ` [PATCH v6 10/10] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
@ 2026-05-19 18:31 ` Daniel Mentz
2026-05-25 19:53 ` Pranjal Shrivastava
0 siblings, 1 reply; 27+ messages in thread
From: Daniel Mentz @ 2026-05-19 18:31 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> /* IO_PGTABLE API */
> @@ -2823,7 +2883,16 @@ static void __arm_smmu_domain_inv_range(struct arm_smmu_invs *invs,
>
> if (cmds.num &&
> (next == end || arm_smmu_invs_end_batch(cur, next))) {
> - arm_smmu_cmdq_batch_submit(smmu, &cmds);
> + bool active = arm_smmu_rpm_get_if_active(smmu);
> +
> + if (arm_smmu_inv_is_ats(cur))
> + WARN_ON_ONCE(!active);
> +
> + if (active) {
> + arm_smmu_cmdq_batch_submit(smmu, &cmds);
> + arm_smmu_rpm_put(smmu);
> + }
> + /* Drop this batch to ensure the next one's fresh */
> cmds.num = 0;
> }
> cur = next;
Correct me if I'm wrong, but I don't think this is safe.
Let's say arm_smmu_is_active() returns true and
pm_runtime_get_if_active() returns 0 because dev->power.runtime_status
!= RPM_ACTIVE. In this situation, is it safe to skip
arm_smmu_cmdq_batch_submit()? I think the answer is no.
My understanding is that it is only safe to elide invalidations when
SMMUEN in SMMU_CR0 is cleared. Imagine runtime_status is
RPM_SUSPENDING. pm_runtime_get_if_active() returns 0, we elide
invalidations despite SMMU still being enabled.
I think that to decide whether to elide invalidations, we should
exclusively rely on the Q_STOP test in arm_smmu_cmdq_issue_cmdlist(),
not on the RPM framework.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-04-22 13:40 ` Pranjal Shrivastava
2026-04-24 5:24 ` Daniel Mentz
2026-04-24 15:13 ` Jason Gunthorpe
@ 2026-05-19 20:50 ` Daniel Mentz
2 siblings, 0 replies; 27+ messages in thread
From: Daniel Mentz @ 2026-05-19 20:50 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Wed, Apr 22, 2026 at 6:41 AM Pranjal Shrivastava <praan@google.com> wrote:
> > > +
> > > +/*
> > > + * This should always return true if devlinks are setup correctly
> > > + * and the client using the SMMU is active.
> > > + */
> > > +__maybe_unused bool arm_smmu_rpm_get_if_active(struct arm_smmu_device *smmu)
> > > +{
> > > + if (!arm_smmu_is_active(smmu))
> > > + return false;
> >
> > I'm wondering if this check is redundant. What would happen if we removed it?
> >
>
> You're right that pm_runtime_get_if_active() shall suffice. However, I
> added the arm_smmu_is_active() check as a lockless fast path for maps
It appears as if you're optimizing the RPM_SUSPENDED case at the
expense of the RPM_ACTIVE case. In the RPM_ACTIVE case, we now perform
an additional READ_ONCE(smmu->cmdq.q.llq.prod) that wouldn't otherwise
be necessary.
I expect unmap operations to complete much quicker in the
RPM_SUSPENDED case anyway, as we skip all communication with the SMMU.
Consequently, there is less motivation to optimize for that specific
scenario.
>
> Consider a scneario where there're high-frequency unmaps (unmap-storm)
> while the SMMU is suspended. Without this check, every thread would
> contend for the dev->power.lock spinlock inside the RPM core
> (pm_runtime_get_if_active) just to discover the device is not active.
>
> This check allows CPUs to elide the command immediately without
> contending for the dev->power.lock spinlock inside the RPM core
> (get_if_active call) and avoids unnecessary cacheline bouncing when the
> SMMU is suspended.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v6 10/10] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2026-05-19 18:31 ` Daniel Mentz
@ 2026-05-25 19:53 ` Pranjal Shrivastava
0 siblings, 0 replies; 27+ messages in thread
From: Pranjal Shrivastava @ 2026-05-25 19:53 UTC (permalink / raw)
To: Daniel Mentz
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Ashish Mhetre
On Tue, May 19, 2026 at 11:31:58AM -0700, Daniel Mentz wrote:
> On Tue, Apr 14, 2026 at 12:47 PM Pranjal Shrivastava <praan@google.com> wrote:
> > /* IO_PGTABLE API */
> > @@ -2823,7 +2883,16 @@ static void __arm_smmu_domain_inv_range(struct arm_smmu_invs *invs,
> >
> > if (cmds.num &&
> > (next == end || arm_smmu_invs_end_batch(cur, next))) {
> > - arm_smmu_cmdq_batch_submit(smmu, &cmds);
> > + bool active = arm_smmu_rpm_get_if_active(smmu);
> > +
> > + if (arm_smmu_inv_is_ats(cur))
> > + WARN_ON_ONCE(!active);
> > +
> > + if (active) {
> > + arm_smmu_cmdq_batch_submit(smmu, &cmds);
> > + arm_smmu_rpm_put(smmu);
> > + }
> > + /* Drop this batch to ensure the next one's fresh */
> > cmds.num = 0;
> > }
> > cur = next;
>
> Correct me if I'm wrong, but I don't think this is safe.
>
> Let's say arm_smmu_is_active() returns true and
> pm_runtime_get_if_active() returns 0 because dev->power.runtime_status
> != RPM_ACTIVE. In this situation, is it safe to skip
> arm_smmu_cmdq_batch_submit()? I think the answer is no.
>
> My understanding is that it is only safe to elide invalidations when
> SMMUEN in SMMU_CR0 is cleared. Imagine runtime_status is
> RPM_SUSPENDING. pm_runtime_get_if_active() returns 0, we elide
> invalidations despite SMMU still being enabled.
>
> I think that to decide whether to elide invalidations, we should
> exclusively rely on the Q_STOP test in arm_smmu_cmdq_issue_cmdlist(),
> not on the RPM framework.
Hmm.. the Q_STOP flag indicates the SMMUEN state 1:1.
Alright, I'll drop reliance on get_if_active everywhere and simply elide
based on Q_STOP. Even if we race with suspend/resume we should be safe
in that case because of the Q_STOP checks within the issue_cmdlist
(point of commitment).
Thanks,
Praan
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2026-05-25 19:53 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-14 19:46 [PATCH v6 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 01/10] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 02/10] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 03/10] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 04/10] iommu/tegra241-cmdqv: Restore PROD and CONS after resume Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 05/10] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 06/10] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions Pranjal Shrivastava
2026-04-22 4:31 ` Daniel Mentz
2026-04-22 12:18 ` Pranjal Shrivastava
2026-04-14 19:46 ` [PATCH v6 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
2026-04-21 4:47 ` Daniel Mentz
2026-04-22 12:20 ` Pranjal Shrivastava
2026-04-21 15:46 ` Daniel Mentz
2026-04-22 12:23 ` Pranjal Shrivastava
2026-04-22 4:25 ` Daniel Mentz
2026-04-22 13:40 ` Pranjal Shrivastava
2026-04-24 5:24 ` Daniel Mentz
2026-04-24 15:16 ` Jason Gunthorpe
2026-05-04 9:01 ` Pranjal Shrivastava
2026-04-24 15:13 ` Jason Gunthorpe
2026-05-04 11:15 ` Pranjal Shrivastava
2026-05-19 20:50 ` Daniel Mentz
2026-04-14 19:47 ` [PATCH v6 08/10] iommu/arm-smmu-v3: Handle gerror during suspend Pranjal Shrivastava
2026-04-14 19:47 ` [PATCH v6 09/10] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
2026-04-14 19:47 ` [PATCH v6 10/10] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
2026-05-19 18:31 ` Daniel Mentz
2026-05-25 19:53 ` Pranjal Shrivastava
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.