* [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops
@ 2026-05-27 22:13 Pranjal Shrivastava
2026-05-27 22:13 ` [PATCH v7 01/11] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
` (11 more replies)
0 siblings, 12 replies; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:13 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
As arm-smmu-v3 rapidly finds its way into SoCs designed for hand-held
devices, power management capabilities, similar to its predecessors, are
crucial for these applications. This series introduces power management
support for the arm-smmu-v3 driver.
Design
======
The arm-smmu-v3 primarily operates with in-memory data structures
through HW registers pointing to these data structures. The proposed design
makes use of this fact for implementing suspend and resume ops, centered
around a software gate embedded in the command queue.
1. CMDQ Gate (CMDQ_PROD_STOP_FLAG)
To safely manage runtime PM without regressing performance on high core
count servers or systems not opting for runtime power management,
this series introduces a CMDQ_PROD_STOP_FLAG (bit 30) in the command
queue's producer index. The flag acts as a Point of Commitment in the
cmpxchg loop of arm_smmu_cmdq_issue_cmdlist(), ensuring no new indices
are reserved once suspension begins.
2. Suspend / Resume Flow
The "suspend" operation follows a multi-stage quiesce sequence:
a. Stop Traffic: Sets SMMUEN=0 and GBPA=Abort to halt new transactions.
b. Gate CMDQ: Sets the CMDQ_PROD_STOP_FLAG to block new submissions.
c. Command Flush: Waits for any in-flight "owner" threads to commit
their reserved indices to hardware.
d. HW Drain: Polls the CMDQ until all committed commands are consumed.
e. SW Quiesce: Waits for all concurrent threads to release the shared
cmdq->lock, ensuring no CPUs are left polling CONS register.
Entering the suspend sequence implies that the device has no active clients
Any racing command submissions or failure to quiesce at this stage indicate
a bug in the Runtime PM or device link dependencies. Such races should not
happen in practice, which justifies the non-failing nature of the suspend
sequence in favor of memory safety.
The "resume" operation clears the STOP_FLAG & performs a full device
reset via arm_smmu_device_reset(), which re-initializes the HW using the
SW-copies maintained by the driver and clears all cached configurations.
3. Guarding Hardware Access and Elision
The driver ensures the SMMU is active before hardware access via
arm_smmu_rpm_get() and arm_smmu_rpm_put() helpers. Commands are elided
if the SMMU is suspended by observing the CMDQ_PROD_STOP_FLAG
(via arm_smmu_can_elide()). For ATC invalidations, devlinks must
guarantee the SMMU is active if the endpoint is active, a
WARN_ON_ONCE() catches inconsistencies. Elision is safe because
hardware reset on resume invalidates caches, and PCIe device links handle
power dependencies for ATC.
4. Interrupt Re-config
a. Wired irqs: The series refactors arm_smmu_setup_irqs to allow
separate installation of handlers, aiding in correct re-initialization.
b. MSIs: The series caches the msi_msg and restores it during resume
via a new arm_smmu_resume_msis() helper.
c. GERROR: Late-breaking global errors are captured and handled
immediately after SMMU disablement during suspend to ensure no diagnostic
information is lost.
Scalability and Performance
===========================
A key design goal of this series is to ensure that high-performance systems
(typically servers) that do not enable runtime PM are not penalized. By
embedding a stop flag in the command queue's producer index and designing
RPM helpers to perform only read-only checks when RPM is disabled, command
submission on these systems incurs negligible overhead. Power-managed
systems only utilize runtime PM atomics as necessary, ensuring that
scalability is maintained across all hardware classes.
Call for review
===============
Any insights/comments on the proposed changes are appreciated,
especially regarding the synchronization & elision logic.
[v7]
- Rebased on the latest arm/smmu/updates branch (which has the removal of
struct arm_smmu_cmdq_ent merged)
- Converted manual cmpxchg loops in suspend/resume to use atomics
- Re-worked to elide invalidations solely based on the CMDQ_PROD_STOP_FLAG
via arm_smmu_can_elide(), dropping any need for pm_runtime_get_if_active
- Added an smp_mb() fence in the reset sequence to ensure that the SMMU
acquires all RAM updates made by newly un-gated threads before SMMUEN=1
- Implemented bitwise masking for the PROD register to prevent software
metadata (STOP_FLAG) bits from being written to physical hardware.
- Introduced a KUnit test suite to verify the CMDQ gating algorithm
[v6]
- https://lore.kernel.org/all/20260414194702.1229094-1-praan@google.com/
- Replaced the atomic nr_cmdq_users counter with CMDQ_PROD_STOP_FLAG
to eliminate atomic overhead on high-core count servers.
- Implemented a 5-step quiesce sequence in runtime_suspend including
pipeline flushes and software completion barriers.
- Introduced arm_smmu_rpm_get_if_active() to elide TLB/CFG/ATC
invalidations when the SMMU is suspended.
- Added WARN_ON_ONCE() in invalidation paths to detect inconsistent
power states for active endpoints.
- Refined batch submission in __arm_smmu_domain_inv_range() to ensure
clean state when dropping batches.
- Refactored GERROR handling for better integration with suspend.
- Added Suggested-by tags for Daniel Mentz.
[v5]
- https://lore.kernel.org/all/20260126151157.3418145-1-praan@google.com/
- Refactored GERROR handling into a helper function and invoked it during
runtime suspend after disabling the SMMU to capture any late-breaking
gerrors as suggested by Jason.
- Updated `arm_smmu_page_response` to be power-state aware and drop
page faults received while suspended.
- Included a patch from Ashish to correctly restore PROD and CONS
indices for tegra241-cmdqv after a hardware reset.
- Collected Reviewed-bys from Mostafa and Nicolin.
[v4]
- https://lore.kernel.org/all/20251117191433.3360130-1-praan@google.com/
- Dropped the `pm_runtime_get_if_not_suspended()` API in favor of a
simpler, driver-specific biased counter (`nr_cmdq_users`) to manage
runtime PM state.
- Reworked the suspend callback to poll on the biased counter before
disabling the SMMU.
- Addressed comments for the MSI refactor.
[v3]
- https://lore.kernel.org/all/20250616203149.2649118-1-praan@google.com/
- Introduced `pm_runtime_get_if_not_suspended` API to avoid races due
to bouncing RPM states while eliding TLBIs as pointed out by Daniel.
- Addressed Nicolin's comments regarding msi_resume and CMDQV flush
- Addressed Daniel's comments about CMDQ locking and draining
- Addressed issues related to draining the evtq and priq
- Dropped the code to identify and track user-space attachments
[v2]
- https://lore.kernel.org/all/20250418233409.3926715-1-praan@google.com/
- Introduced `arm_smmu_rpm_get_if_active` for eliding TLBIs & CFGIs
- Updated the rpm helper invocation strategy.
- Drained all queues in suspend callback (including tegra241-cmdv)
- Cache and restore msi_msg instead of free-ing realloc-ing on resume
- Added support to identify and track user-space attachments
- Fixed the setup_irqs as per Nicolin & Mostafa's suggestions
- Used force_runtime_suspend/resume instead as per Mostafa's suggestion.
- Added "Reviewed-by" line from Mostafa on an unchanged patch
[v1]
- https://lore.kernel.org/all/20250319004254.2547950-1-praan@google.com/
Ashish Mhetre (1):
iommu/tegra241-cmdqv: Restore PROD and CONS after resume
Pranjal Shrivastava (10):
iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs
iommu/arm-smmu-v3: Add a helper to drain cmd queues
iommu/tegra241-cmdqv: Add a helper to drain VCMDQs
iommu/arm-smmu-v3: Cache and restore MSI config
iommu/arm-smmu-v3: Handle gerror during suspend
iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions
iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks
iommu/arm-smmu-v3: Invoke pm_runtime before hw access
iommu/arm-smmu-v3: Add KUnit unit tests for Runtime PM
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 15 +-
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 150 ++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 445 ++++++++++++++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 27 ++
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 29 ++
5 files changed, 636 insertions(+), 30 deletions(-)
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v7 01/11] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
@ 2026-05-27 22:13 ` Pranjal Shrivastava
2026-05-27 22:13 ` [PATCH v7 02/11] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
` (10 subsequent siblings)
11 siblings, 0 replies; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:13 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
Refactor arm_smmu_setup_irqs by splitting it into two parts, one for
registering interrupt handlers and the other one for enabling interrupt
generation in the hardware. This refactor helps in re-initialization of
hardware interrupts as part of a subsequent patch that enables runtime
power management for the arm-smmu-v3 driver.
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 47 +++++++++++++--------
1 file changed, 30 insertions(+), 17 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8ce3e801eda3..3dad2c2d3283 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4601,14 +4601,32 @@ static void arm_smmu_setup_unique_irqs(struct arm_smmu_device *smmu)
}
}
+static void arm_smmu_enable_irqs(struct arm_smmu_device *smmu)
+{
+ int ret;
+ u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
+
+ if (smmu->features & ARM_SMMU_FEAT_PRI)
+ irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
+
+ ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
+ ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
+ if (ret)
+ dev_warn(smmu->dev, "failed to enable irqs\n");
+}
+
+static int arm_smmu_disable_irqs(struct arm_smmu_device *smmu)
+{
+ return arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
+ ARM_SMMU_IRQ_CTRLACK);
+}
+
static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
{
int ret, irq;
- u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
/* Disable IRQs first */
- ret = arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
- ARM_SMMU_IRQ_CTRLACK);
+ ret = arm_smmu_disable_irqs(smmu);
if (ret) {
dev_err(smmu->dev, "failed to disable irqs\n");
return ret;
@@ -4630,15 +4648,6 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
} else
arm_smmu_setup_unique_irqs(smmu);
- if (smmu->features & ARM_SMMU_FEAT_PRI)
- irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
-
- /* Enable interrupt generation on the SMMU */
- ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
- ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
- if (ret)
- dev_warn(smmu->dev, "failed to enable irqs\n");
-
return 0;
}
@@ -4779,11 +4788,8 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
}
}
- ret = arm_smmu_setup_irqs(smmu);
- if (ret) {
- dev_err(smmu->dev, "failed to setup irqs\n");
- return ret;
- }
+ /* Enable interrupt generation on the SMMU */
+ arm_smmu_enable_irqs(smmu);
if (is_kdump_kernel())
enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
@@ -5417,6 +5423,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
/* Check for RMRs and install bypass STEs if any */
arm_smmu_rmr_install_bypass_ste(smmu);
+ /* Setup interrupt handlers */
+ ret = arm_smmu_setup_irqs(smmu);
+ if (ret) {
+ dev_err(smmu->dev, "failed to setup irqs\n");
+ goto err_free_iopf;
+ }
+
/* Reset the device */
ret = arm_smmu_device_reset(smmu);
if (ret)
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v7 02/11] iommu/arm-smmu-v3: Add a helper to drain cmd queues
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2026-05-27 22:13 ` [PATCH v7 01/11] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
@ 2026-05-27 22:13 ` Pranjal Shrivastava
2026-05-28 1:35 ` Nicolin Chen
2026-05-27 22:13 ` [PATCH v7 03/11] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
` (9 subsequent siblings)
11 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:13 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
Before we suspend SMMU, we want to ensure that all commands (especially
ATC_INV) have been flushed by the CMDQ, i.e. the CMDQs are empty.
Add a helper function that polls till the queues are dained.
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 34 +++++++++++++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 ++
2 files changed, 37 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 3dad2c2d3283..0e77ef1e4523 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -884,6 +884,40 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
cmds->num, true);
}
+int arm_smmu_queue_poll_until_empty(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q)
+{
+ struct arm_smmu_queue_poll qp;
+ struct arm_smmu_ll_queue *llq = &q->llq;
+ int ret = 0;
+
+ queue_poll_init(smmu, &qp);
+ do {
+ if (queue_empty(llq))
+ break;
+
+ ret = queue_poll(&qp);
+ WRITE_ONCE(llq->cons, readl_relaxed(q->cons_reg));
+
+ } while (!ret);
+
+ return ret;
+}
+
+static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
+{
+ int ret;
+
+ /*
+ * Since this is only called from the suspend callback where
+ * exclusive access is ensured as CMDQ_PROD_STOP_FLAG blocks new
+ * command submissions to the cmdq, we can skip the cmdq locking.
+ */
+ ret = arm_smmu_queue_poll_until_empty(smmu, &smmu->cmdq.q);
+
+ return ret;
+}
+
static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused,
struct iommu_page_response *resp)
{
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 16353596e08a..c855ab4962ed 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1165,6 +1165,9 @@ int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
int arm_smmu_cmdq_init(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq);
+int arm_smmu_queue_poll_until_empty(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q);
+
static inline bool arm_smmu_master_canwbs(struct arm_smmu_master *master)
{
return dev_iommu_fwspec_get(master->dev)->flags &
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v7 03/11] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2026-05-27 22:13 ` [PATCH v7 01/11] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
2026-05-27 22:13 ` [PATCH v7 02/11] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
@ 2026-05-27 22:13 ` Pranjal Shrivastava
2026-05-27 22:14 ` [PATCH v7 04/11] iommu/tegra241-cmdqv: Restore PROD and CONS after resume Pranjal Shrivastava
` (8 subsequent siblings)
11 siblings, 0 replies; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:13 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
The tegra241-cmdqv driver supports vCMDQs which need to be drained
before suspending the SMMU. The current driver implementation only uses
VINTF0 for vCMDQs owned by the kernel which need to be drained. Add a
helper that drains all the enabled vCMDQs under VINTF0.
Add another function ptr to arm_smmu_impl_ops to drain implementation
specified queues and call it within `arm_smmu_drain_queues`.
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 7 +++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 27 +++++++++++++++++++
3 files changed, 35 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0e77ef1e4523..8682be5060ed 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -915,6 +915,13 @@ static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
*/
ret = arm_smmu_queue_poll_until_empty(smmu, &smmu->cmdq.q);
+ if (ret)
+ goto out;
+
+ /* Drain all implementation-specific queues */
+ if (smmu->impl_ops && smmu->impl_ops->drain_queues)
+ ret = smmu->impl_ops->drain_queues(smmu);
+out:
return ret;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index c855ab4962ed..24d5e28eea88 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -885,6 +885,7 @@ struct arm_smmu_impl_ops {
size_t (*get_viommu_size)(enum iommu_viommu_type viommu_type);
int (*vsmmu_init)(struct arm_vsmmu *vsmmu,
const struct iommu_user_data *user_data);
+ int (*drain_queues)(struct arm_smmu_device *smmu);
};
/* An SMMUv3 instance */
diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index 67be62a6e764..cb1e75e4ee91 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -414,6 +414,32 @@ tegra241_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
return &vcmdq->cmdq;
}
+static int tegra241_cmdqv_drain_vintf0_lvcmdqs(struct arm_smmu_device *smmu)
+{
+ struct tegra241_cmdqv *cmdqv =
+ container_of(smmu, struct tegra241_cmdqv, smmu);
+ struct tegra241_vintf *vintf = cmdqv->vintfs[0];
+ int ret = 0;
+ u16 lidx;
+
+ /* Kernel only uses VINTF0. Return if it's disabled */
+ if (!READ_ONCE(vintf->enabled))
+ return 0;
+
+ for (lidx = 0; lidx < cmdqv->num_lvcmdqs_per_vintf; lidx++) {
+ struct tegra241_vcmdq *vcmdq = vintf->lvcmdqs[lidx];
+
+ if (!vcmdq || !READ_ONCE(vcmdq->enabled))
+ continue;
+
+ ret = arm_smmu_queue_poll_until_empty(smmu, &vcmdq->cmdq.q);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
/* HW Reset Functions */
/*
@@ -845,6 +871,7 @@ static struct arm_smmu_impl_ops tegra241_cmdqv_impl_ops = {
.get_secondary_cmdq = tegra241_cmdqv_get_cmdq,
.device_reset = tegra241_cmdqv_hw_reset,
.device_remove = tegra241_cmdqv_remove,
+ .drain_queues = tegra241_cmdqv_drain_vintf0_lvcmdqs,
/* For user-space use */
.hw_info = tegra241_cmdqv_hw_info,
.get_viommu_size = tegra241_cmdqv_get_vintf_size,
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v7 04/11] iommu/tegra241-cmdqv: Restore PROD and CONS after resume
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (2 preceding siblings ...)
2026-05-27 22:13 ` [PATCH v7 03/11] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
@ 2026-05-27 22:14 ` Pranjal Shrivastava
2026-05-28 18:14 ` Nicolin Chen
2026-05-27 22:14 ` [PATCH v7 05/11] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
` (7 subsequent siblings)
11 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
From: Ashish Mhetre <amhetre@nvidia.com>
PROD and CONS indices for vcmdqs are getting set to 0 after resume.
Because of this the vcmdq is not consuming commands after resume.
Fix this by restoring PROD and CONS indices after resume from
saved pointers.
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index cb1e75e4ee91..866cae7b73e5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -511,6 +511,8 @@ static int tegra241_vcmdq_hw_init(struct tegra241_vcmdq *vcmdq)
/* Configure and enable VCMDQ */
writeq_relaxed(vcmdq->cmdq.q.q_base, REG_VCMDQ_PAGE1(vcmdq, BASE));
+ writel_relaxed(vcmdq->cmdq.q.llq.prod, REG_VCMDQ_PAGE0(vcmdq, PROD));
+ writel_relaxed(vcmdq->cmdq.q.llq.cons, REG_VCMDQ_PAGE0(vcmdq, CONS));
ret = vcmdq_write_config(vcmdq, VCMDQ_EN);
if (ret) {
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v7 05/11] iommu/arm-smmu-v3: Cache and restore MSI config
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (3 preceding siblings ...)
2026-05-27 22:14 ` [PATCH v7 04/11] iommu/tegra241-cmdqv: Restore PROD and CONS after resume Pranjal Shrivastava
@ 2026-05-27 22:14 ` Pranjal Shrivastava
2026-05-28 18:36 ` Nicolin Chen
2026-05-27 22:14 ` [PATCH v7 06/11] iommu/arm-smmu-v3: Handle gerror during suspend Pranjal Shrivastava
` (6 subsequent siblings)
11 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
The SMMU's MSI configuration registers (*_IRQ_CFGn) containing target
address, data and memory attributes lose their state when the SMMU is
powered down. We'll need to cache and restore their contents to ensure
that MSIs work after the system resumes.
To address this, cache the original `msi_msg` within the `msi_desc`
when the configuration is first written by `arm_smmu_write_msi_msg`.
This primarily includes the target address and data since the memory
attributes are fixed.
Introduce a new helper `arm_smmu_resume_msis` which will later be called
during the driver's resume callback. The helper would retrieve the
cached MSI message for each relevant interrupt (evtq, gerr, priq) via
get_cached_msi_msg & re-config the registers via arm_smmu_write_msi_msg.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 37 +++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8682be5060ed..93cee32f6c99 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4551,6 +4551,9 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
struct arm_smmu_device *smmu = dev_get_drvdata(dev);
phys_addr_t *cfg = arm_smmu_msi_cfg[desc->msi_index];
+ /* Cache the msi_msg for resume */
+ desc->msg = *msg;
+
doorbell = (((u64)msg->address_hi) << 32) | msg->address_lo;
doorbell &= MSI_CFG0_ADDR_MASK;
@@ -4559,6 +4562,40 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
writel_relaxed(ARM_SMMU_MEMATTR_DEVICE_nGnRE, smmu->base + cfg[2]);
}
+static void arm_smmu_resume_msi(struct arm_smmu_device *smmu,
+ unsigned int irq, const char *name)
+{
+ struct msi_desc *desc;
+ struct msi_msg msg;
+
+ if (!irq)
+ return;
+
+ desc = irq_get_msi_desc(irq);
+ if (!desc) {
+ dev_err(smmu->dev, "Failed to resume msi: %s", name);
+ return;
+ }
+
+ get_cached_msi_msg(irq, &msg);
+ arm_smmu_write_msi_msg(desc, &msg);
+}
+
+static void arm_smmu_resume_msis(struct arm_smmu_device *smmu)
+{
+ if (!(smmu->features & ARM_SMMU_FEAT_MSI))
+ return;
+
+ if (!dev_get_msi_domain(smmu->dev))
+ return;
+
+ arm_smmu_resume_msi(smmu, smmu->gerr_irq, "gerror");
+ arm_smmu_resume_msi(smmu, smmu->evtq.q.irq, "evtq");
+
+ if (smmu->features & ARM_SMMU_FEAT_PRI)
+ arm_smmu_resume_msi(smmu, smmu->priq.q.irq, "priq");
+}
+
static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
{
int ret, nvec = ARM_SMMU_MAX_MSIS;
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v7 06/11] iommu/arm-smmu-v3: Handle gerror during suspend
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (4 preceding siblings ...)
2026-05-27 22:14 ` [PATCH v7 05/11] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
@ 2026-05-27 22:14 ` Pranjal Shrivastava
2026-05-28 18:53 ` Nicolin Chen
2026-05-27 22:14 ` [PATCH v7 07/11] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions Pranjal Shrivastava
` (5 subsequent siblings)
11 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava, Jason Gunthorpe
The GERROR register's state might be lost when the SMMU is powered down
during runtime suspend. Handle any pending errors before suspending.
Refactor the gerror handling logic into a helper function and invoke it
from the runtime suspend callback after disabling the SMMU. This ensures
that any late-breaking gerrors are logged and ack'ed before the hardware
state is lost.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 93cee32f6c99..191b5b5b805c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2257,10 +2257,9 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev)
static int arm_smmu_device_disable(struct arm_smmu_device *smmu);
-static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
+static irqreturn_t arm_smmu_handle_gerror(struct arm_smmu_device *smmu)
{
u32 gerror, gerrorn, active;
- struct arm_smmu_device *smmu = dev;
gerror = readl_relaxed(smmu->base + ARM_SMMU_GERROR);
gerrorn = readl_relaxed(smmu->base + ARM_SMMU_GERRORN);
@@ -2300,9 +2299,17 @@ static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
arm_smmu_cmdq_skip_err(smmu);
writel(gerror, smmu->base + ARM_SMMU_GERRORN);
+
return IRQ_HANDLED;
}
+static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
+{
+ struct arm_smmu_device *smmu = dev;
+
+ return arm_smmu_handle_gerror(smmu);
+}
+
static irqreturn_t arm_smmu_combined_irq_thread(int irq, void *dev)
{
struct arm_smmu_device *smmu = dev;
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v7 07/11] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (5 preceding siblings ...)
2026-05-27 22:14 ` [PATCH v7 06/11] iommu/arm-smmu-v3: Handle gerror during suspend Pranjal Shrivastava
@ 2026-05-27 22:14 ` Pranjal Shrivastava
2026-05-28 19:41 ` Nicolin Chen
2026-05-27 22:14 ` [PATCH v7 08/11] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
` (4 subsequent siblings)
11 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
Introduce a new bit flag, CMDQ_PROD_STOP_FLAG (bit 30), in the command
queue's producer index to safely gate command submissions during device
suspension.
The flag embeds the suspend state directly into the existing global state
The flag checked in the compxchg loop in arm_smmu_cmdq_issue_cmdlist(),
which acts as a Point of Commitment, ensuring that no indices are
reserved or committed once the SMMU begins suspending.
This prevents a situation of "abandoned batches" where indices are
incremented but commands are never written, which would otherwise
lead to timeout during the drain poll.
Update queue_inc_prod_n() to preserve this flag during index
calculations, ensuring that any in-flight commands that successfully
passed the point of commitment can proceed to completion while the
flag remains set.
Suggested-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 +++++++++++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 191b5b5b805c..a28417108a8a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -210,7 +210,8 @@ static int queue_sync_prod_in(struct arm_smmu_queue *q)
static u32 queue_inc_prod_n(struct arm_smmu_ll_queue *q, int n)
{
u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + n;
- return Q_OVF(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
+
+ return Q_OVF(q->prod) | Q_STOP(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
}
static void queue_poll_init(struct arm_smmu_device *smmu,
@@ -718,8 +719,25 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
do {
u64 old;
+ /*
+ * If the SMMU is suspended/suspending, any new CMDs are elided.
+ * This loop is the Point of Commitment. If we haven't cmpxchg'd
+ * our new indices yet, we can safely bail. Once the indices are
+ * committed, we MUST write valid commands to those slots to
+ * avoid indefinite polling in the drain function.
+ */
+ if (Q_STOP(llq.prod)) {
+ local_irq_restore(flags);
+ return 0;
+ }
+
while (!queue_has_space(&llq, n + sync)) {
local_irq_restore(flags);
+
+ /* Avoid waiting for space if the SMMU is suspending */
+ if (Q_STOP(READ_ONCE(cmdq->q.llq.prod)))
+ return 0;
+
if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
local_irq_save(flags);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 24d5e28eea88..a3c8417c87d8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -389,6 +389,8 @@ static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
#define CMDQ_ERR_CERROR_ATC_INV_IDX 3
#define CMDQ_PROD_OWNED_FLAG Q_OVERFLOW_FLAG
+#define CMDQ_PROD_STOP_FLAG (1U << 30)
+#define Q_STOP(p) ((p) & CMDQ_PROD_STOP_FLAG)
struct arm_smmu_cmd {
u64 data[CMDQ_ENT_DWORDS];
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v7 08/11] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (6 preceding siblings ...)
2026-05-27 22:14 ` [PATCH v7 07/11] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions Pranjal Shrivastava
@ 2026-05-27 22:14 ` Pranjal Shrivastava
2026-05-28 19:39 ` Nicolin Chen
2026-05-27 22:14 ` [PATCH v7 09/11] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
` (3 subsequent siblings)
11 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
Implement pm_runtime and system sleep ops for arm-smmu-v3.
The suspend callback configures the SMMU to abort new transactions,
disables the main translation unit and then drains the command queue
to ensure completion of any in-flight commands. A software gate
(STOP_FLAG) and synchronization barriers are used to quiesce the command
submission pipeline and ensure state consistency before power-off.
To prevent software metadata flags from leaking into physical registers
or polluting the tracking pointer, a newly introduced bitmask
(CMDQ_PROD_IDX_MASK) is applied to all fast-path register writes and
tracking updates, avoiding post-resume deadlocks.
The resume callback restores the MSI configuration and performs a full
device reset via `arm_smmu_device_reset` to bring the SMMU back to an
operational state. The MSIs are cached during the msi_write and are
restored during the resume operation by using the helper. The STOP_FLAG
is cleared only after the CMDQ is enabled in hardware.
Suggested-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 165 +++++++++++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 18 +++
2 files changed, 181 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a28417108a8a..6cbb1724b377 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -28,6 +28,7 @@
#include <linux/platform_device.h>
#include <linux/sort.h>
#include <linux/string_choices.h>
+#include <linux/pm_runtime.h>
#include <kunit/visibility.h>
#include <uapi/linux/iommufd.h>
@@ -110,6 +111,38 @@ static const char * const event_class_str[] = {
static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
static bool arm_smmu_ats_supported(struct arm_smmu_master *master);
+/* Runtime PM helpers */
+__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
+{
+ int ret;
+
+ if (pm_runtime_enabled(smmu->dev)) {
+ ret = pm_runtime_resume_and_get(smmu->dev);
+ if (ret < 0) {
+ dev_err(smmu->dev, "failed to resume device: %d\n", ret);
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+__maybe_unused static void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
+{
+ int ret;
+
+ if (pm_runtime_enabled(smmu->dev)) {
+ ret = pm_runtime_put_autosuspend(smmu->dev);
+ if (ret < 0)
+ dev_err(smmu->dev, "failed to suspend device: %d\n", ret);
+ }
+}
+
+static inline u32 arm_smmu_cmdq_owner_prod_idx(struct arm_smmu_cmdq *cmdq)
+{
+ return atomic_read(&cmdq->owner_prod) & CMDQ_PROD_IDX_MASK;
+}
+
static void parse_driver_options(struct arm_smmu_device *smmu)
{
int i = 0;
@@ -789,7 +822,8 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
/* b. Stop gathering work by clearing the owned flag */
prod = atomic_fetch_andnot_relaxed(CMDQ_PROD_OWNED_FLAG,
&cmdq->q.llq.atomic.prod);
- prod &= ~CMDQ_PROD_OWNED_FLAG;
+ /* Strip all metadata flags */
+ prod &= CMDQ_PROD_IDX_MASK;
/*
* c. Wait for any gathered work to be written to the queue.
@@ -4827,7 +4861,8 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
/* Command queue */
writeq_relaxed(smmu->cmdq.q.q_base, smmu->base + ARM_SMMU_CMDQ_BASE);
- writel_relaxed(smmu->cmdq.q.llq.prod, smmu->base + ARM_SMMU_CMDQ_PROD);
+ writel_relaxed(smmu->cmdq.q.llq.prod & CMDQ_PROD_IDX_MASK,
+ smmu->base + ARM_SMMU_CMDQ_PROD);
writel_relaxed(smmu->cmdq.q.llq.cons, smmu->base + ARM_SMMU_CMDQ_CONS);
enables = CR0_CMDQEN;
@@ -4838,6 +4873,10 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
return ret;
}
+ /* Clear any flags from the previous life */
+ atomic_andnot(CMDQ_PROD_STOP_FLAG, &smmu->cmdq.owner_prod);
+ atomic_andnot(CMDQ_PROD_STOP_FLAG, &smmu->cmdq.q.llq.atomic.prod);
+
/* Invalidate any cached configuration */
arm_smmu_cmdq_issue_cmd_with_sync(smmu, arm_smmu_make_cmd_cfgi_all());
@@ -4897,6 +4936,21 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
if (is_kdump_kernel())
enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
+ /*
+ * While the SMMU was suspended, concurrent CPU threads may have
+ * updated in-memory structures (such as STEs, CDs, and PTEs).
+ * Any invalidations corresponding to those updates were safely
+ * elided because the command queue was stopped (STOP_FLAG == 1).
+ *
+ * Since the reset invalidate-all commands above have fully cleared
+ * the HW TLBs and config caches, the SMMU will fetch these descriptors
+ * directly from RAM as soon as translation is enabled.
+ *
+ * Add a memory barrier to collect all prior RAM writes to ensure the
+ * SMMU sees a consistent view of memory before translation is enabled.
+ */
+ smp_mb();
+
/* Enable the SMMU interface */
enables |= CR0_SMMUEN;
ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
@@ -5579,6 +5633,112 @@ static void arm_smmu_device_shutdown(struct platform_device *pdev)
arm_smmu_device_disable(smmu);
}
+static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
+{
+ struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+ struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
+ int timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
+ u32 enables, target;
+ int ret;
+
+ /* Abort all transactions before disable to avoid spurious bypass */
+ arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
+
+ /* Disable the SMMU via CR0.EN and all queues except CMDQ */
+ enables = CR0_CMDQEN;
+ ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0, ARM_SMMU_CR0ACK);
+ if (ret) {
+ dev_err(smmu->dev, "failed to disable SMMU\n");
+ return ret;
+ }
+
+ /*
+ * At this point the SMMU is completely disabled and won't access
+ * any translation/config structures, even speculative accesses
+ * aren't performed as per the IHI0070 spec (section 6.3.9.6).
+ */
+
+ /* Mark the CMDQ to stop and get the target index before the stop */
+ target = atomic_fetch_or_relaxed(CMDQ_PROD_STOP_FLAG, &cmdq->q.llq.atomic.prod);
+ target &= CMDQ_PROD_IDX_MASK;
+
+
+ /* Wait for the last committed owner to reach the hardware */
+ while ((arm_smmu_cmdq_owner_prod_idx(cmdq) != target) && --timeout)
+ udelay(1);
+
+ /*
+ * Entering suspend implies no active clients. A timeout here
+ * indicates a fatal CMDQ lockup or hardware stall. We proceed
+ * anyway to prioritize memory safety (avoiding stale TLBs)
+ */
+ if (!timeout)
+ dev_err(smmu->dev, "cmdq owner wait timeout, (check runtime PM + devlinks)\n");
+
+ /* Drain the CMDQs */
+ ret = arm_smmu_drain_queues(smmu);
+ if (ret)
+ dev_warn(smmu->dev, "failed to drain queues, forcing suspend\n");
+
+ /* Wait for cmdq->lock == 0 to ensure last CMDQ_CONS_REG is written */
+ timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
+ while (atomic_read(&cmdq->lock) != 0 && --timeout)
+ udelay(1);
+
+ /* Timing out here implies misconfigured Runtime PM or broken devlinks */
+ if (!timeout)
+ dev_err(smmu->dev, "cmdq lock != 0, forcing suspend. Polling CPUs may fault.\n");
+
+ /* Disable everything */
+ arm_smmu_device_disable(smmu);
+
+ /* Handle any pending gerrors before powering down */
+ arm_smmu_handle_gerror(smmu);
+
+ dev_dbg(dev, "suspended smmu\n");
+
+ return 0;
+}
+
+static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
+{
+ struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+ int ret;
+
+ /* Re-configure MSIs */
+ arm_smmu_resume_msis(smmu);
+
+ ret = arm_smmu_device_reset(smmu);
+ if (ret)
+ dev_err(dev, "failed to reset during resume operation: %d\n", ret);
+
+ dev_dbg(dev, "resumed smmu\n");
+
+ return ret;
+}
+
+static int __maybe_unused arm_smmu_pm_suspend(struct device *dev)
+{
+ if (pm_runtime_suspended(dev))
+ return 0;
+
+ return arm_smmu_runtime_suspend(dev);
+}
+
+static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
+{
+ if (pm_runtime_suspended(dev))
+ return 0;
+
+ return arm_smmu_runtime_resume(dev);
+}
+
+static const struct dev_pm_ops arm_smmu_pm_ops = {
+ SET_SYSTEM_SLEEP_PM_OPS(arm_smmu_pm_suspend, arm_smmu_pm_resume)
+ SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
+ arm_smmu_runtime_resume, NULL)
+};
+
static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,smmu-v3", },
{ },
@@ -5595,6 +5755,7 @@ static struct platform_driver arm_smmu_driver = {
.driver = {
.name = "arm-smmu-v3",
.of_match_table = arm_smmu_of_match,
+ .pm = &arm_smmu_pm_ops,
.suppress_bind_attrs = true,
},
.probe = arm_smmu_device_probe,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index a3c8417c87d8..97ec9d9caffe 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -392,6 +392,9 @@ static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
#define CMDQ_PROD_STOP_FLAG (1U << 30)
#define Q_STOP(p) ((p) & CMDQ_PROD_STOP_FLAG)
+/* Mask out software-only metadata flags to get the pure queue index/wrap bits */
+#define CMDQ_PROD_IDX_MASK ~(CMDQ_PROD_STOP_FLAG | CMDQ_PROD_OWNED_FLAG)
+
struct arm_smmu_cmd {
u64 data[CMDQ_ENT_DWORDS];
};
@@ -655,11 +658,14 @@ arm_smmu_make_cmd_tlbi(enum arm_smmu_cmdq_opcode op, u16 asid, u16 vmid)
/* High-level queue structures */
#define ARM_SMMU_POLL_TIMEOUT_US 1000000 /* 1s! */
+#define ARM_SMMU_SUSPEND_TIMEOUT_US 1000000 /* 1s! */
#define ARM_SMMU_POLL_SPIN_COUNT 10
#define MSI_IOVA_BASE 0x8000000
#define MSI_IOVA_LENGTH 0x100000
+#define RPM_AUTOSUSPEND_DELAY_MS 15
+
struct arm_smmu_ll_queue {
union {
u64 val;
@@ -1217,6 +1223,18 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
struct arm_smmu_cmd *cmds, int n,
bool sync);
+/*
+ * Lockless pre-check to elide invalidations if SMMU is suspended.
+ * Races with concurrent suspend are benign: the cmpxchg loop in
+ * arm_smmu_cmdq_issue_cmdlist() acts as the true commit point.
+ * If we lose the race, that loop observes Q_STOP == 1 and safely
+ * drops the command. If we win, the suspend thread waits for us.
+ */
+static inline bool arm_smmu_can_elide(struct arm_smmu_device *smmu)
+{
+ return !!Q_STOP(READ_ONCE(smmu->cmdq.q.llq.prod));
+}
+
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
void arm_smmu_sva_notifier_synchronize(void);
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v7 09/11] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (7 preceding siblings ...)
2026-05-27 22:14 ` [PATCH v7 08/11] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
@ 2026-05-27 22:14 ` Pranjal Shrivastava
2026-05-28 20:13 ` Nicolin Chen
2026-05-27 22:14 ` [PATCH v7 10/11] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
` (2 subsequent siblings)
11 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
Enable PM runtime for SMMUs having a power-domain during smmu probe.
Add a devlink between the clients and SMMU device. The absence of a
power domain effectively disables runtime power management.
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6cbb1724b377..34e249656ab4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4240,6 +4240,9 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
pci_prepare_ats(to_pci_dev(dev), stu);
}
+ device_link_add(dev, smmu->dev,
+ DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER);
+
return &smmu->iommu;
err_free_master:
@@ -5592,11 +5595,18 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
if (ret)
goto err_disable;
+ if (dev->pm_domain) {
+ pm_runtime_set_active(dev);
+ pm_runtime_use_autosuspend(dev);
+ pm_runtime_set_autosuspend_delay(dev, RPM_AUTOSUSPEND_DELAY_MS);
+ pm_runtime_enable(dev);
+ }
+
/* And we're up. Go go go! */
ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL,
"smmu3.%pa", &ioaddr);
if (ret)
- goto err_disable;
+ goto err_rpm_disable;
ret = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev);
if (ret) {
@@ -5608,6 +5618,8 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
err_free_sysfs:
iommu_device_sysfs_remove(&smmu->iommu);
+err_rpm_disable:
+ pm_runtime_disable(dev);
err_disable:
arm_smmu_device_disable(smmu);
err_free_iopf:
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v7 10/11] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (8 preceding siblings ...)
2026-05-27 22:14 ` [PATCH v7 09/11] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
@ 2026-05-27 22:14 ` Pranjal Shrivastava
2026-05-28 20:28 ` Nicolin Chen
2026-05-27 22:14 ` [PATCH v7 11/11] iommu/arm-smmu-v3: Add KUnit unit tests for Runtime PM Pranjal Shrivastava
2026-05-28 18:05 ` [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Nicolin Chen
11 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
Invoke the pm_runtime helpers at all places before accessing the hw.
The idea is to invoke runtime_pm helpers at common points which are used
by exposed ops or interrupt handlers. TLB and CFG invalidations are
elided if the SMMU is suspended by observing the CMDQ_PROD_STOP_FLAG via
the arm_smmu_can_elide() helper.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 15 ++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 115 ++++++++++++++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
3 files changed, 123 insertions(+), 10 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index 1e9f7d2de344..c10de7aaf399 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -15,6 +15,11 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length,
struct iommu_hw_info_arm_smmuv3 *info;
u32 __iomem *base_idr;
unsigned int i;
+ int ret;
+
+ ret = arm_smmu_rpm_get(master->smmu);
+ if (ret < 0)
+ return ERR_PTR(-EIO);
if (*type != IOMMU_HW_INFO_TYPE_DEFAULT &&
*type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3) {
@@ -24,8 +29,10 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length,
}
info = kzalloc_obj(*info);
- if (!info)
+ if (!info) {
+ arm_smmu_rpm_put(master->smmu);
return ERR_PTR(-ENOMEM);
+ }
base_idr = master->smmu->base + ARM_SMMU_IDR0;
for (i = 0; i <= 5; i++)
@@ -36,6 +43,7 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length,
*length = sizeof(*info);
*type = IOMMU_HW_INFO_TYPE_ARM_SMMUV3;
+ arm_smmu_rpm_put(master->smmu);
return info;
}
@@ -386,8 +394,9 @@ int arm_vsmmu_cache_invalidate(struct iommufd_viommu *viommu,
continue;
/* FIXME always uses the main cmdq rather than trying to group by type */
- ret = arm_smmu_cmdq_issue_cmdlist(smmu, &smmu->cmdq, &last->cmd,
- cur - last, true);
+ if (!arm_smmu_can_elide(smmu))
+ ret = arm_smmu_cmdq_issue_cmdlist(smmu, &smmu->cmdq, &last->cmd,
+ cur - last, true);
if (ret) {
cur--;
goto out;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 34e249656ab4..d42f78777571 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -112,7 +112,7 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
static bool arm_smmu_ats_supported(struct arm_smmu_master *master);
/* Runtime PM helpers */
-__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
+int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
{
int ret;
@@ -127,7 +127,7 @@ __maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
return 0;
}
-__maybe_unused static void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
+void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
{
int ret;
@@ -981,7 +981,9 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
struct iommu_page_response *resp)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_device *smmu = master->smmu;
u8 resume_resp;
+ int ret;
if (WARN_ON(!master->stall_enabled))
return;
@@ -999,6 +1001,25 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
break;
}
+ /*
+ * The SMMU is guaranteed to be active via device_link if any master is
+ * active. Furthermore, on suspend we set GBPA to abort, flushing any
+ * pending stalled transactions.
+ *
+ * Receiving a page fault while suspended implies a bug in the power
+ * dependency chain or a stale event. Since the SMMU is powered down
+ * and the command queue is inaccessible, we cannot issue the
+ * RESUME command and must drop it.
+ */
+ if (arm_smmu_can_elide(smmu)) {
+ dev_err(smmu->dev, "Ignoring page fault while suspended\n");
+ return;
+ }
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return;
+
arm_smmu_cmdq_issue_cmd(master->smmu,
arm_smmu_make_cmd_resume(master->streams[0].id,
resp->grpid,
@@ -1009,6 +1030,7 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
* terminated... at some point in the future. PRI_RESP is fire and
* forget.
*/
+ arm_smmu_rpm_put(smmu);
}
/* Invalidation array manipulation functions */
@@ -1528,13 +1550,15 @@ static void arm_smmu_sync_cd(struct arm_smmu_master *master,
struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_cmd cmd = arm_smmu_make_cmd_cfgi_cd(0, ssid, leaf);
+ if (arm_smmu_can_elide(smmu))
+ return;
+
arm_smmu_cmdq_batch_init_cmd(smmu, &cmds, &cmd);
for (i = 0; i < master->num_streams; i++)
arm_smmu_cmdq_batch_add_cmd(
smmu, &cmds,
arm_smmu_make_cmd_cfgi_cd(master->streams[i].id, ssid,
leaf));
-
arm_smmu_cmdq_batch_submit(smmu, &cmds);
}
@@ -1826,9 +1850,12 @@ static void arm_smmu_ste_writer_sync_entry(struct arm_smmu_entry_writer *writer)
{
struct arm_smmu_ste_writer *ste_writer =
container_of(writer, struct arm_smmu_ste_writer, writer);
+ struct arm_smmu_device *smmu = writer->master->smmu;
- arm_smmu_cmdq_issue_cmd_with_sync(
- writer->master->smmu,
+ if (arm_smmu_can_elide(smmu))
+ return;
+
+ arm_smmu_cmdq_issue_cmd_with_sync(smmu,
arm_smmu_make_cmd_cfgi_ste(ste_writer->sid, true));
}
@@ -2228,6 +2255,7 @@ static void arm_smmu_dump_event(struct arm_smmu_device *smmu, u64 *raw,
static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
{
+ int ret;
u64 evt[EVTQ_ENT_DWORDS];
struct arm_smmu_event event = {0};
struct arm_smmu_device *smmu = dev;
@@ -2236,6 +2264,10 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
DEFAULT_RATELIMIT_BURST);
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return IRQ_NONE;
+
do {
while (!queue_remove_raw(q, evt)) {
arm_smmu_decode_event(smmu, evt, &event);
@@ -2256,6 +2288,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
/* Sync our overflow flag, as we believe we're up to speed */
queue_sync_cons_ovf(q);
+ arm_smmu_rpm_put(smmu);
return IRQ_HANDLED;
}
@@ -2293,6 +2326,11 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev)
struct arm_smmu_queue *q = &smmu->priq.q;
struct arm_smmu_ll_queue *llq = &q->llq;
u64 evt[PRIQ_ENT_DWORDS];
+ int ret;
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return IRQ_NONE;
do {
while (!queue_remove_raw(q, evt))
@@ -2304,6 +2342,7 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev)
/* Sync our overflow flag, as we believe we're up to speed */
queue_sync_cons_ovf(q);
+ arm_smmu_rpm_put(smmu);
return IRQ_HANDLED;
}
@@ -2358,8 +2397,33 @@ static irqreturn_t arm_smmu_handle_gerror(struct arm_smmu_device *smmu)
static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
{
struct arm_smmu_device *smmu = dev;
+ irqreturn_t ret;
+
+ /*
+ * Global Errors are only processed if the SMMU is active.
+ *
+ * If the STOP_FLAG is set (can_elide == true), the hardware is
+ * either already disabled or in the process of being disabled.
+ * Any errors captured during the quiesce/drain phase will be
+ * handled by the explicit arm_smmu_handle_gerror() call at the
+ * end of arm_smmu_runtime_suspend() callback. On resume, the
+ * STOP_FLAG is cleared before interrupts are re-enabled, ensuring
+ * no valid errors are missed.
+ *
+ * A lockless check is favoured here over a dynamic PM core check
+ * since the runtime_pm_get_if_active would return false during
+ * transient states like RPM_RESUMING & ignore level-triggered
+ * interrupts.
+ */
+ if (arm_smmu_can_elide(smmu)) {
+ dev_err(smmu->dev,
+ "Ignoring gerror interrupt because the SMMU is suspended\n");
+ return IRQ_NONE;
+ }
+
+ ret = arm_smmu_handle_gerror(smmu);
- return arm_smmu_handle_gerror(smmu);
+ return ret;
}
static irqreturn_t arm_smmu_combined_irq_thread(int irq, void *dev)
@@ -2442,6 +2506,10 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
struct arm_smmu_cmd cmd;
struct arm_smmu_cmdq_batch cmds;
+ /* Shouldn't be elided if there's no devlink inconsistency */
+ if (WARN_ON_ONCE(arm_smmu_can_elide(master->smmu)))
+ return 0;
+
cmd = arm_smmu_make_cmd_atc_inv_all(0, IOMMU_NO_PASID);
arm_smmu_cmdq_batch_init_cmd(master->smmu, &cmds, &cmd);
for (i = 0; i < master->num_streams; i++)
@@ -2687,7 +2755,22 @@ static void __arm_smmu_domain_inv_range(struct arm_smmu_invs *invs,
if (cmds.num &&
(next == end || arm_smmu_invs_end_batch(cur, next))) {
- arm_smmu_cmdq_batch_submit(smmu, &cmds);
+
+ /*
+ * Concurrent suspend races are benign: the cmdq allocation cmpxchg
+ * loop acts as the serialization point to safely drop the batch
+ * without MMIO accesses. Concurrent resume is caught by the HW
+ * reset cache invalidation, ensuring state consistency.
+ */
+ bool elide = arm_smmu_can_elide(smmu);
+
+ if (arm_smmu_inv_is_ats(cur))
+ WARN_ON_ONCE(elide);
+
+ if (!elide)
+ arm_smmu_cmdq_batch_submit(smmu, &cmds);
+
+ /* Drop this batch to ensure the next one's fresh */
cmds.num = 0;
}
cur = next;
@@ -3404,6 +3487,9 @@ arm_smmu_install_new_domain_invs(struct arm_smmu_attach_state *state)
static void arm_smmu_inv_flush_iotlb_tag(struct arm_smmu_inv *inv)
{
+ if (arm_smmu_can_elide(inv->smmu))
+ return;
+
switch (inv->type) {
case INV_TYPE_S1_ASID:
arm_smmu_cmdq_issue_cmd_with_sync(
@@ -5630,10 +5716,19 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
static void arm_smmu_device_remove(struct platform_device *pdev)
{
struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
+ int ret;
iommu_device_unregister(&smmu->iommu);
iommu_device_sysfs_remove(&smmu->iommu);
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ goto free_iopf;
+
arm_smmu_device_disable(smmu);
+ arm_smmu_rpm_put(smmu);
+
+free_iopf:
iopf_queue_free(smmu->evtq.iopf);
ida_destroy(&smmu->vmid_map);
}
@@ -5641,8 +5736,14 @@ static void arm_smmu_device_remove(struct platform_device *pdev)
static void arm_smmu_device_shutdown(struct platform_device *pdev)
{
struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
+ int ret;
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return;
arm_smmu_device_disable(smmu);
+ arm_smmu_rpm_put(smmu);
}
static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 97ec9d9caffe..39fb26552e57 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1235,6 +1235,9 @@ static inline bool arm_smmu_can_elide(struct arm_smmu_device *smmu)
return !!Q_STOP(READ_ONCE(smmu->cmdq.q.llq.prod));
}
+int arm_smmu_rpm_get(struct arm_smmu_device *smmu);
+void arm_smmu_rpm_put(struct arm_smmu_device *smmu);
+
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
void arm_smmu_sva_notifier_synchronize(void);
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v7 11/11] iommu/arm-smmu-v3: Add KUnit unit tests for Runtime PM
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (9 preceding siblings ...)
2026-05-27 22:14 ` [PATCH v7 10/11] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
@ 2026-05-27 22:14 ` Pranjal Shrivastava
2026-05-28 21:43 ` Nicolin Chen
2026-05-28 18:05 ` [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Nicolin Chen
11 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-27 22:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Ashish Mhetre,
linux-arm-kernel, Pranjal Shrivastava
Introduce a kunit selftests to verify the Runtime PM elision gating,
post-suspend elisions and progress on resumption under active
invalidation load. Simulate concurrent HW suspension using a timer.
Mock all HW registers and CMDQ buffers by allocating them on RAM.
Make the mock CMDQ self-consuming to avoid hitting queue_full scenarios.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 150 ++++++++++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
2 files changed, 151 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index add671363c82..ef9a17343ed8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -3,6 +3,10 @@
* Copyright 2024 Google LLC.
*/
#include <kunit/test.h>
+#include <linux/delay.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/timer.h>
#include <linux/io-pgtable.h>
#include "arm-smmu-v3.h"
@@ -771,7 +775,153 @@ static void arm_smmu_v3_invs_test(struct kunit *test)
kfree(test_b);
}
+struct arm_smmu_test_pm_context {
+ u32 *mock_prod_reg;
+ u64 *mock_base;
+ unsigned long *mock_valid_map;
+};
+
+/*
+ * Shared helper to allocate and map a fully functional, self-consuming mock
+ * CMDQ. By redirecting both prod_reg and cons_reg to the same memory location,
+ * every producer write is instantly reflected as a consumer completion.
+ */
+static struct arm_smmu_test_pm_context *
+arm_smmu_v3_test_init_mock_cmdq(struct kunit *test, struct arm_smmu_device *smmu)
+{
+ struct arm_smmu_test_pm_context *ctx;
+ struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
+
+ ctx = kunit_kzalloc(test, sizeof(*ctx), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, ctx);
+
+ /* Use a standard 1024-entry queue for safe word bitmask alignment */
+ ctx->mock_prod_reg = kunit_kzalloc(test, sizeof(u32), GFP_KERNEL);
+ ctx->mock_base = kunit_kzalloc(test, 1024 * 16, GFP_KERNEL);
+ ctx->mock_valid_map = kunit_kzalloc(test, BITS_TO_LONGS(1024) * sizeof(long), GFP_KERNEL);
+
+ KUNIT_ASSERT_NOT_NULL(test, ctx->mock_prod_reg);
+ KUNIT_ASSERT_NOT_NULL(test, ctx->mock_base);
+ KUNIT_ASSERT_NOT_NULL(test, ctx->mock_valid_map);
+
+ smmu->features = 0;
+ /* 1024 entries */
+ cmdq->q.llq.max_n_shift = 10;
+ /* SMMUv3 commands are 2 dwords (16 bytes) */
+ cmdq->q.ent_dwords = 2;
+ cmdq->q.base = (__le64 *)ctx->mock_base;
+ cmdq->valid_map = (atomic_long_t *)ctx->mock_valid_map;
+
+ /* Self-Consuming, prod == cons always ensures queue empty */
+ cmdq->q.prod_reg = (__force u32 __iomem *)ctx->mock_prod_reg;
+ cmdq->q.cons_reg = (__force u32 __iomem *)ctx->mock_prod_reg;
+
+ /* Initialize memory indices */
+ atomic_set(&cmdq->q.llq.atomic.prod, 0);
+ atomic_set(&cmdq->q.llq.atomic.cons, 0);
+ atomic_set(&cmdq->owner_prod, 0);
+ *ctx->mock_prod_reg = 0;
+
+ return ctx;
+}
+
+struct arm_smmu_test_timer_context {
+ struct kunit *test;
+ struct arm_smmu_device *smmu;
+ struct arm_smmu_test_pm_context *pm_ctx;
+ struct timer_list timer;
+ bool stopped;
+};
+
+/* Asynchronous timer callback running in softirq context */
+static void arm_smmu_v3_test_rpm_timer_callback(struct timer_list *t)
+{
+ struct arm_smmu_test_timer_context *ctx =
+ timer_container_of(ctx, t, timer);
+ struct arm_smmu_cmdq *cmdq = &ctx->smmu->cmdq;
+
+ /*
+ * Asynchronously set the STOP_FLAG (Close the gate).
+ * Simulate a concurrent runtime suspend event interrupting
+ * the invalidations happening under active load.
+ */
+ atomic_or(CMDQ_PROD_STOP_FLAG, &cmdq->q.llq.atomic.prod);
+
+ /* Signal to the main storm loop that suspend has triggered */
+ WRITE_ONCE(ctx->stopped, true);
+}
+
+/*
+ * Verify SMMU PM Runtime gating, elision, and post-suspend resumption
+ * safety sequentially under active stress.
+ */
+static void arm_smmu_v3_test_rpm_stress_race(struct kunit *test)
+{
+ struct arm_smmu_device *smmu = kunit_kzalloc(test, sizeof(*smmu), GFP_KERNEL);
+ struct arm_smmu_test_pm_context *pm_ctx;
+ struct arm_smmu_test_timer_context *timer_ctx;
+ struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
+ struct arm_smmu_cmd cmd = arm_smmu_make_cmd_cfgi_all();
+ u32 stopped_prod;
+ int i, ret;
+
+ KUNIT_ASSERT_NOT_NULL(test, smmu);
+ pm_ctx = arm_smmu_v3_test_init_mock_cmdq(test, smmu);
+
+ timer_ctx = kunit_kmalloc(test, sizeof(*timer_ctx), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, timer_ctx);
+ timer_ctx->test = test;
+ timer_ctx->smmu = smmu;
+ timer_ctx->pm_ctx = pm_ctx;
+ timer_ctx->stopped = false;
+
+ /* Setup the kernel software timer */
+ timer_setup(&timer_ctx->timer, arm_smmu_v3_test_rpm_timer_callback, 0);
+
+ /* Start the timer to fire in 10 jiffies */
+ mod_timer(&timer_ctx->timer, jiffies + msecs_to_jiffies(10));
+
+ /* Execute the unmap storm until the timer triggers */
+ while (!READ_ONCE(timer_ctx->stopped)) {
+ ret = arm_smmu_cmdq_issue_cmdlist(smmu, cmdq, &cmd, 1, false);
+ if (ret != 0) {
+ KUNIT_FAIL(test, "Unmap Storm loop failed to submit cmd list: %d\n", ret);
+ break;
+ }
+
+ usleep_range(50, 100);
+ }
+
+ timer_delete_sync(&timer_ctx->timer);
+
+ /* Establish the globally stable, post-storm register baseline index */
+ stopped_prod = *pm_ctx->mock_prod_reg;
+
+ /*
+ * Attempt multiple unmaps while the SMMU is disabled (STOP_GATE is set)
+ * Every single invalidation must be instantly elided, returning 0
+ * immediately and leaving the mock MMIO register completely frozen.
+ */
+ for (i = 0; i < 1000; i++) {
+ ret = arm_smmu_cmdq_issue_cmdlist(smmu, cmdq, &cmd, 1, false);
+ if (ret != 0) {
+ KUNIT_FAIL(test, "Gated storm loop failed at iteration %d: %d\n", i, ret);
+ break;
+ }
+ }
+ KUNIT_EXPECT_EQ(test, stopped_prod, *pm_ctx->mock_prod_reg);
+
+ /*
+ * Clear the STOP_FLAG (resume SMMU). A new invalidation must now
+ * successfully commit, moving the prod_reg by exactly 1 slot
+ */
+ atomic_andnot(CMDQ_PROD_STOP_FLAG, &cmdq->q.llq.atomic.prod);
+ KUNIT_EXPECT_EQ(test, 0, arm_smmu_cmdq_issue_cmdlist(smmu, cmdq, &cmd, 1, false));
+ KUNIT_EXPECT_EQ(test, stopped_prod + 1, *pm_ctx->mock_prod_reg);
+}
+
static struct kunit_case arm_smmu_v3_test_cases[] = {
+ KUNIT_CASE(arm_smmu_v3_test_rpm_stress_race),
KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_abort),
KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_bypass),
KUNIT_CASE(arm_smmu_v3_write_ste_test_cdtable_to_abort),
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index d42f78777571..dd7cfd3c16f4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -871,6 +871,7 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
local_irq_restore(flags);
return ret;
}
+EXPORT_SYMBOL_IF_KUNIT(arm_smmu_cmdq_issue_cmdlist);
static int arm_smmu_cmdq_issue_cmd_p(struct arm_smmu_device *smmu,
struct arm_smmu_cmd *cmd, bool sync)
--
2.54.0.794.g4f17f83d09-goog
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v7 02/11] iommu/arm-smmu-v3: Add a helper to drain cmd queues
2026-05-27 22:13 ` [PATCH v7 02/11] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
@ 2026-05-28 1:35 ` Nicolin Chen
2026-05-28 10:34 ` Pranjal Shrivastava
0 siblings, 1 reply; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 1:35 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Wed, May 27, 2026 at 10:13:58PM +0000, Pranjal Shrivastava wrote:
> +static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
So, this function has a very generic name for queues but its use
case is limited to cmdq in a suspend context, and might not work
for other queues.
I am adding another helper for in-flight evetq and priq:
- Drain a queue for a detaching master; other attached masters
might still advance the prod, so queue wouldn't be empty.
- evtq and priq sets wfe=false.
So, there is a naming conflict here.
Maybe:
arm_smmu_drain_queue_for_cmdq()
arm_smmu_drain_queue_for_iopf()
?
Nicolin
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 02/11] iommu/arm-smmu-v3: Add a helper to drain cmd queues
2026-05-28 1:35 ` Nicolin Chen
@ 2026-05-28 10:34 ` Pranjal Shrivastava
2026-05-28 22:09 ` Nicolin Chen
0 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-28 10:34 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Wed, May 27, 2026 at 06:35:56PM -0700, Nicolin Chen wrote:
> On Wed, May 27, 2026 at 10:13:58PM +0000, Pranjal Shrivastava wrote:
> > +static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
>
> So, this function has a very generic name for queues but its use
> case is limited to cmdq in a suspend context, and might not work
> for other queues.
>
> I am adding another helper for in-flight evetq and priq:
> - Drain a queue for a detaching master; other attached masters
> might still advance the prod, so queue wouldn't be empty.
> - evtq and priq sets wfe=false.
>
> So, there is a naming conflict here.
>
> Maybe:
> arm_smmu_drain_queue_for_cmdq()
> arm_smmu_drain_queue_for_iopf()
> ?
Ack. Sorry about that, the earlier versions were draining the other
queues too but that has changed from the last few versions.
I'm planning to rename it to arm_smmu_drain_cmdq(), sounds good?
Thanks,
Praan
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (10 preceding siblings ...)
2026-05-27 22:14 ` [PATCH v7 11/11] iommu/arm-smmu-v3: Add KUnit unit tests for Runtime PM Pranjal Shrivastava
@ 2026-05-28 18:05 ` Nicolin Chen
11 siblings, 0 replies; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 18:05 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Wed, May 27, 2026 at 10:13:56PM +0000, Pranjal Shrivastava wrote:
> [v7]
> - Rebased on the latest arm/smmu/updates branch (which has the removal of
> struct arm_smmu_cmdq_ent merged)
Sashiko failed to apply:
https://sashiko.dev/#/patchset/20260527221407.1756491-1-praan%40google.com
Let's add a base-commit to the series and see if it would work?
Nicolin
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 04/11] iommu/tegra241-cmdqv: Restore PROD and CONS after resume
2026-05-27 22:14 ` [PATCH v7 04/11] iommu/tegra241-cmdqv: Restore PROD and CONS after resume Pranjal Shrivastava
@ 2026-05-28 18:14 ` Nicolin Chen
0 siblings, 0 replies; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 18:14 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Wed, May 27, 2026 at 10:14:00PM +0000, Pranjal Shrivastava wrote:
> From: Ashish Mhetre <amhetre@nvidia.com>
>
> PROD and CONS indices for vcmdqs are getting set to 0 after resume.
> Because of this the vcmdq is not consuming commands after resume.
> Fix this by restoring PROD and CONS indices after resume from
> saved pointers.
>
> Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 05/11] iommu/arm-smmu-v3: Cache and restore MSI config
2026-05-27 22:14 ` [PATCH v7 05/11] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
@ 2026-05-28 18:36 ` Nicolin Chen
2026-05-28 21:57 ` Pranjal Shrivastava
0 siblings, 1 reply; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 18:36 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Wed, May 27, 2026 at 10:14:01PM +0000, Pranjal Shrivastava wrote:
> The SMMU's MSI configuration registers (*_IRQ_CFGn) containing target
> address, data and memory attributes lose their state when the SMMU is
> powered down. We'll need to cache and restore their contents to ensure
> that MSIs work after the system resumes.
>
> To address this, cache the original `msi_msg` within the `msi_desc`
> when the configuration is first written by `arm_smmu_write_msi_msg`.
> This primarily includes the target address and data since the memory
> attributes are fixed.
>
> Introduce a new helper `arm_smmu_resume_msis` which will later be called
> during the driver's resume callback. The helper would retrieve the
> cached MSI message for each relevant interrupt (evtq, gerr, priq) via
> get_cached_msi_msg & re-config the registers via arm_smmu_write_msi_msg.
>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> +static void arm_smmu_resume_msi(struct arm_smmu_device *smmu,
> + unsigned int irq, const char *name)
> +{
> + struct msi_desc *desc;
> + struct msi_msg msg;
> +
> + if (!irq)
> + return;
> +
> + desc = irq_get_msi_desc(irq);
> + if (!desc) {
> + dev_err(smmu->dev, "Failed to resume msi: %s", name);
> + return;
> + }
> +
> + get_cached_msi_msg(irq, &msg);
I wonder if we could use __get_cached_msi_msg() since we got desc?
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 06/11] iommu/arm-smmu-v3: Handle gerror during suspend
2026-05-27 22:14 ` [PATCH v7 06/11] iommu/arm-smmu-v3: Handle gerror during suspend Pranjal Shrivastava
@ 2026-05-28 18:53 ` Nicolin Chen
2026-05-28 21:59 ` Pranjal Shrivastava
0 siblings, 1 reply; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 18:53 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel,
Jason Gunthorpe
On Wed, May 27, 2026 at 10:14:02PM +0000, Pranjal Shrivastava wrote:
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 93cee32f6c99..191b5b5b805c 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2257,10 +2257,9 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev)
>
> static int arm_smmu_device_disable(struct arm_smmu_device *smmu);
>
> -static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
> +static irqreturn_t arm_smmu_handle_gerror(struct arm_smmu_device *smmu)
We don't need a lock because the new caller would disable the IRQ.
Yet, maybe add a line of comments:
/* Lockless; must ensure that there are no concurrent callers */
Otherwise,
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 08/11] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-05-27 22:14 ` [PATCH v7 08/11] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
@ 2026-05-28 19:39 ` Nicolin Chen
2026-05-28 21:21 ` Pranjal Shrivastava
0 siblings, 1 reply; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 19:39 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Wed, May 27, 2026 at 10:14:04PM +0000, Pranjal Shrivastava wrote:
> +/* Runtime PM helpers */
> +__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
> +{
> + int ret;
> +
> + if (pm_runtime_enabled(smmu->dev)) {
> + ret = pm_runtime_resume_and_get(smmu->dev);
> + if (ret < 0) {
> + dev_err(smmu->dev, "failed to resume device: %d\n", ret);
> + return ret;
> + }
> + }
> +
> + return 0;
Nit: ret is used within the first if.
Yet, I wouldn't like it to move. So, maybe do an early return:
if (!pm_runtime_enabled(smmu->dev))
return 0;
> +__maybe_unused static void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
> +{
> + int ret;
> +
> + if (pm_runtime_enabled(smmu->dev)) {
> + ret = pm_runtime_put_autosuspend(smmu->dev);
> + if (ret < 0)
> + dev_err(smmu->dev, "failed to suspend device: %d\n", ret);
> + }
> +}
Ditto
> +
> +static inline u32 arm_smmu_cmdq_owner_prod_idx(struct arm_smmu_cmdq *cmdq)
> +{
> + return atomic_read(&cmdq->owner_prod) & CMDQ_PROD_IDX_MASK;
> +}
> +
> static void parse_driver_options(struct arm_smmu_device *smmu)
> {
> int i = 0;
> @@ -789,7 +822,8 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> /* b. Stop gathering work by clearing the owned flag */
> prod = atomic_fetch_andnot_relaxed(CMDQ_PROD_OWNED_FLAG,
> &cmdq->q.llq.atomic.prod);
> - prod &= ~CMDQ_PROD_OWNED_FLAG;
> + /* Strip all metadata flags */
> + prod &= CMDQ_PROD_IDX_MASK;
Should its prior atomic_fetch_andnot_relaxed() call do something
about the CMDQ_PROD_STOP_FLAG as well?
> +/*
> + * Lockless pre-check to elide invalidations if SMMU is suspended.
> + * Races with concurrent suspend are benign: the cmpxchg loop in
> + * arm_smmu_cmdq_issue_cmdlist() acts as the true commit point.
> + * If we lose the race, that loop observes Q_STOP == 1 and safely
> + * drops the command. If we win, the suspend thread waits for us.
> + */
> +static inline bool arm_smmu_can_elide(struct arm_smmu_device *smmu)
> +{
> + return !!Q_STOP(READ_ONCE(smmu->cmdq.q.llq.prod));
> +}
arm_smmu_cmdq_can_elide()
Should it handle a secondary_cmdq?
Nicolin
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 07/11] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions
2026-05-27 22:14 ` [PATCH v7 07/11] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions Pranjal Shrivastava
@ 2026-05-28 19:41 ` Nicolin Chen
2026-05-28 21:57 ` Pranjal Shrivastava
0 siblings, 1 reply; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 19:41 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Wed, May 27, 2026 at 10:14:03PM +0000, Pranjal Shrivastava wrote:
> Introduce a new bit flag, CMDQ_PROD_STOP_FLAG (bit 30), in the command
> queue's producer index to safely gate command submissions during device
> suspension.
>
> The flag embeds the suspend state directly into the existing global state
> The flag checked in the compxchg loop in arm_smmu_cmdq_issue_cmdlist(),
s/compxchg/cmpxchg
> which acts as a Point of Commitment, ensuring that no indices are
> reserved or committed once the SMMU begins suspending.
>
> This prevents a situation of "abandoned batches" where indices are
> incremented but commands are never written, which would otherwise
> lead to timeout during the drain poll.
>
> Update queue_inc_prod_n() to preserve this flag during index
> calculations, ensuring that any in-flight commands that successfully
> passed the point of commitment can proceed to completion while the
> flag remains set.
>
> Suggested-by: Daniel Mentz <danielmentz@google.com>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 09/11] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks
2026-05-27 22:14 ` [PATCH v7 09/11] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
@ 2026-05-28 20:13 ` Nicolin Chen
2026-05-28 21:36 ` Pranjal Shrivastava
0 siblings, 1 reply; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 20:13 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Wed, May 27, 2026 at 10:14:05PM +0000, Pranjal Shrivastava wrote:
> @@ -5592,11 +5595,18 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
> if (ret)
> goto err_disable;
>
> + if (dev->pm_domain) {
> + pm_runtime_set_active(dev);
> + pm_runtime_use_autosuspend(dev);
> + pm_runtime_set_autosuspend_delay(dev, RPM_AUTOSUSPEND_DELAY_MS);
> + pm_runtime_enable(dev);
> + }
> +
> /* And we're up. Go go go! */
> ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL,
> "smmu3.%pa", &ioaddr);
> if (ret)
> - goto err_disable;
> + goto err_rpm_disable;
>
> ret = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev);
> if (ret) {
Can we move the entire pm_runtime_* piece at the end of this probe
function, like arm-smmu v2 driver does?
It is no-fail, so we wouldn't need to add the revert path.
Otherwise,
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 10/11] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2026-05-27 22:14 ` [PATCH v7 10/11] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
@ 2026-05-28 20:28 ` Nicolin Chen
2026-05-28 21:46 ` Pranjal Shrivastava
0 siblings, 1 reply; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 20:28 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Wed, May 27, 2026 at 10:14:06PM +0000, Pranjal Shrivastava wrote:
> TLB and CFG invalidations are
> elided if the SMMU is suspended by observing the CMDQ_PROD_STOP_FLAG via
> the arm_smmu_can_elide() helper.
All the arm_smmu_can_elide() call sites here would eventually elide
the commands in arm_smmu_cmdq_issue_cmdlist() that is already gated
by CMDQ_PROD_STOP_FLAG? It doesn't seem necessary to gate again?
Nicolin
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 08/11] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-05-28 19:39 ` Nicolin Chen
@ 2026-05-28 21:21 ` Pranjal Shrivastava
2026-05-28 22:13 ` Nicolin Chen
0 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-28 21:21 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Thu, May 28, 2026 at 12:39:46PM -0700, Nicolin Chen wrote:
> On Wed, May 27, 2026 at 10:14:04PM +0000, Pranjal Shrivastava wrote:
> > +/* Runtime PM helpers */
> > +__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
> > +{
> > + int ret;
> > +
> > + if (pm_runtime_enabled(smmu->dev)) {
> > + ret = pm_runtime_resume_and_get(smmu->dev);
> > + if (ret < 0) {
> > + dev_err(smmu->dev, "failed to resume device: %d\n", ret);
> > + return ret;
> > + }
> > + }
> > +
> > + return 0;
>
> Nit: ret is used within the first if.
>
> Yet, I wouldn't like it to move. So, maybe do an early return:
>
> if (!pm_runtime_enabled(smmu->dev))
> return 0;
>
Ack. I'll add an early return.
> > +__maybe_unused static void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
> > +{
> > + int ret;
> > +
> > + if (pm_runtime_enabled(smmu->dev)) {
> > + ret = pm_runtime_put_autosuspend(smmu->dev);
> > + if (ret < 0)
> > + dev_err(smmu->dev, "failed to suspend device: %d\n", ret);
> > + }
> > +}
>
> Ditto
>
Same here.
> > +
> > +static inline u32 arm_smmu_cmdq_owner_prod_idx(struct arm_smmu_cmdq *cmdq)
> > +{
> > + return atomic_read(&cmdq->owner_prod) & CMDQ_PROD_IDX_MASK;
> > +}
> > +
> > static void parse_driver_options(struct arm_smmu_device *smmu)
> > {
> > int i = 0;
> > @@ -789,7 +822,8 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> > /* b. Stop gathering work by clearing the owned flag */
> > prod = atomic_fetch_andnot_relaxed(CMDQ_PROD_OWNED_FLAG,
> > &cmdq->q.llq.atomic.prod);
> > - prod &= ~CMDQ_PROD_OWNED_FLAG;
> > + /* Strip all metadata flags */
> > + prod &= CMDQ_PROD_IDX_MASK;
>
> Should its prior atomic_fetch_andnot_relaxed() call do something
> about the CMDQ_PROD_STOP_FLAG as well?
Umm.. No, the atomic_fetch_andnot_relaxed() call must leave the STOP_FLAG.
This block is the owner-publish phase, which occurs *after* the Point of
Commitment. If a submission successfully reserved its indices before the
gate closed, it shall be allowed to finish.
If the owner thread cleared the STOP_FLAG here in the global memory, it
would prematurely re-open the gate, allowing new racing submissions to
leak in during the suspend sequence.
The algorithm works mainly because the RPM callbacks are the only ones
that are allowed to manipulate this flag.
>
> > +/*
> > + * Lockless pre-check to elide invalidations if SMMU is suspended.
> > + * Races with concurrent suspend are benign: the cmpxchg loop in
> > + * arm_smmu_cmdq_issue_cmdlist() acts as the true commit point.
> > + * If we lose the race, that loop observes Q_STOP == 1 and safely
> > + * drops the command. If we win, the suspend thread waits for us.
> > + */
> > +static inline bool arm_smmu_can_elide(struct arm_smmu_device *smmu)
> > +{
> > + return !!Q_STOP(READ_ONCE(smmu->cmdq.q.llq.prod));
> > +}
>
> arm_smmu_cmdq_can_elide()
Nice name! I'll update it.
>
> Should it handle a secondary_cmdq?
No, I don't think we need to check secondary queues here. The STOP_FLAG
being set on the primary CMDQ during the suspend should suffice to
indicate the entire SMMU's power state. That's why the function takes in
the smmu ptr and not the cmdq.
Altough, thanks for discussing this, I noticed that while rebasing I
missed the elision check at one place: arm_smmu_write_ste()
I'll take add a check in that too in the next version.
Did you have another scenario / use-case in mind where this might not
work?
Thanks,
Praan
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 09/11] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks
2026-05-28 20:13 ` Nicolin Chen
@ 2026-05-28 21:36 ` Pranjal Shrivastava
0 siblings, 0 replies; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-28 21:36 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Thu, May 28, 2026 at 01:13:40PM -0700, Nicolin Chen wrote:
> On Wed, May 27, 2026 at 10:14:05PM +0000, Pranjal Shrivastava wrote:
> > @@ -5592,11 +5595,18 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
> > if (ret)
> > goto err_disable;
> >
> > + if (dev->pm_domain) {
> > + pm_runtime_set_active(dev);
> > + pm_runtime_use_autosuspend(dev);
> > + pm_runtime_set_autosuspend_delay(dev, RPM_AUTOSUSPEND_DELAY_MS);
> > + pm_runtime_enable(dev);
> > + }
> > +
> > /* And we're up. Go go go! */
> > ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL,
> > "smmu3.%pa", &ioaddr);
> > if (ret)
> > - goto err_disable;
> > + goto err_rpm_disable;
> >
> > ret = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev);
> > if (ret) {
>
> Can we move the entire pm_runtime_* piece at the end of this probe
> function, like arm-smmu v2 driver does?
>
> It is no-fail, so we wouldn't need to add the revert path.
Hmm.. I kept it earlier since I thought iommu_device_register starts the
whole chain of probing masters, def_domain attach etc. but I guess it
could work, the client devices can attach during the initial probe
as pm_runtime remains disabled until registration is complete.
I'll just check if there are some weird edge cases here, just to be sure
>
> Otherwise,
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Thanks!
Praan
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 11/11] iommu/arm-smmu-v3: Add KUnit unit tests for Runtime PM
2026-05-27 22:14 ` [PATCH v7 11/11] iommu/arm-smmu-v3: Add KUnit unit tests for Runtime PM Pranjal Shrivastava
@ 2026-05-28 21:43 ` Nicolin Chen
0 siblings, 0 replies; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 21:43 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Wed, May 27, 2026 at 10:14:07PM +0000, Pranjal Shrivastava wrote:
> Introduce a kunit selftests to verify the Runtime PM elision gating,
> post-suspend elisions and progress on resumption under active
> invalidation load. Simulate concurrent HW suspension using a timer.
> Mock all HW registers and CMDQ buffers by allocating them on RAM.
> Make the mock CMDQ self-consuming to avoid hitting queue_full scenarios.
>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
Maybe I am wrong, yet this reads like an entirely AI-generated code.
So, should it have an Assisted-by?
Documentation/process/submitting-patches.rst:637:Using Assisted-by:
Documentation/process/submitting-patches.rst:641:you need to acknowledge that use by adding an Assisted-by tag. Failure to
Documentation/process/coding-assistants.rst:44:Contributions should include an Assisted-by tag in the following format::
Documentation/process/coding-assistants.rst:46: Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
Documentation/process/coding-assistants.rst:59: Assisted-by: Claude:claude-3-opus coccinelle sparse
> +struct arm_smmu_test_pm_context {
> + u32 *mock_prod_reg;
> + u64 *mock_base;
> + unsigned long *mock_valid_map;
> +};
It seems all about cmdq only, so maybe:
struct arm_smmu_mock_cmdq
?
> +/*
> + * Shared helper to allocate and map a fully functional, self-consuming mock
Is it really a "Shared" helper? I see only one caller..
> + * CMDQ. By redirecting both prod_reg and cons_reg to the same memory location,
> + * every producer write is instantly reflected as a consumer completion.
> + */
> +static struct arm_smmu_test_pm_context *
> +arm_smmu_v3_test_init_mock_cmdq(struct kunit *test, struct arm_smmu_device *smmu)
> +{
> + struct arm_smmu_test_pm_context *ctx;
> + struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> +
> + ctx = kunit_kzalloc(test, sizeof(*ctx), GFP_KERNEL);
> + KUNIT_ASSERT_NOT_NULL(test, ctx);
This ctx struct isn't that large. Maybe the test function (caller)
can put on the stack and pass its point int than allocation?
> + /* Use a standard 1024-entry queue for safe word bitmask alignment */
> + ctx->mock_prod_reg = kunit_kzalloc(test, sizeof(u32), GFP_KERNEL);
Is a u32 allocation necessary?
> + ctx->mock_base = kunit_kzalloc(test, 1024 * 16, GFP_KERNEL);
s/16/sizeof(struct arm_smmu_cmd)
> + ctx->mock_valid_map = kunit_kzalloc(test, BITS_TO_LONGS(1024) * sizeof(long), GFP_KERNEL);
> +
> + KUNIT_ASSERT_NOT_NULL(test, ctx->mock_prod_reg);
> + KUNIT_ASSERT_NOT_NULL(test, ctx->mock_base);
> + KUNIT_ASSERT_NOT_NULL(test, ctx->mock_valid_map);
> +
> + smmu->features = 0;
> + /* 1024 entries */
> + cmdq->q.llq.max_n_shift = 10;
> + /* SMMUv3 commands are 2 dwords (16 bytes) */
> + cmdq->q.ent_dwords = 2;
CMDQ_ENT_DWORDS?
> + cmdq->q.base = (__le64 *)ctx->mock_base;
ctx->mock_base doesn't seem to have any other use? Why does it have
to be in the ctx structure?
> + cmdq->valid_map = (atomic_long_t *)ctx->mock_valid_map;
Same questions.
> + /* Self-Consuming, prod == cons always ensures queue empty */
> + cmdq->q.prod_reg = (__force u32 __iomem *)ctx->mock_prod_reg;
> + cmdq->q.cons_reg = (__force u32 __iomem *)ctx->mock_prod_reg;
> +
> + /* Initialize memory indices */
> + atomic_set(&cmdq->q.llq.atomic.prod, 0);
> + atomic_set(&cmdq->q.llq.atomic.cons, 0);
> + atomic_set(&cmdq->owner_prod, 0);
> + *ctx->mock_prod_reg = 0;
Arguably, mock_prod_reg is only used to test against a prod snapshot
so it doesn't seem necessary to be in a structure.
> + return ctx;
> +}
> +
> +struct arm_smmu_test_timer_context {
> + struct kunit *test;
> + struct arm_smmu_device *smmu;
> + struct arm_smmu_test_pm_context *pm_ctx;
pm_ctx doesn't seem to be used by the timer callback function?
Is it necessary in this structure? Even "test" isn't used.
>. + struct timer_list timer;
> + bool stopped;
> +};
> +
> +/* Asynchronous timer callback running in softirq context */
Would be nicer to replace this with the inline comments:
/* Simulate a concurrent runtime suspend event interrupting the invalidations */
> +static void arm_smmu_v3_test_rpm_timer_callback(struct timer_list *t)
> +{
> + struct arm_smmu_test_timer_context *ctx =
> + timer_container_of(ctx, t, timer);
> + struct arm_smmu_cmdq *cmdq = &ctx->smmu->cmdq;
> +
> + /*
> + * Asynchronously set the STOP_FLAG (Close the gate).
> + * Simulate a concurrent runtime suspend event interrupting
> + * the invalidations happening under active load.
> + */
> + atomic_or(CMDQ_PROD_STOP_FLAG, &cmdq->q.llq.atomic.prod);
> +
> + /* Signal to the main storm loop that suspend has triggered */
> + WRITE_ONCE(ctx->stopped, true);
Then, it should be named "suspended".
> +}
> +
> +/*
> + * Verify SMMU PM Runtime gating, elision, and post-suspend resumption
> + * safety sequentially under active stress.
> + */
> +static void arm_smmu_v3_test_rpm_stress_race(struct kunit *test)
> +{
> + struct arm_smmu_device *smmu = kunit_kzalloc(test, sizeof(*smmu), GFP_KERNEL);
arm-smmu-v3-test has a global arm_smmu_device to use.
> + struct arm_smmu_test_pm_context *pm_ctx;
> + struct arm_smmu_test_timer_context *timer_ctx;
> + struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> + struct arm_smmu_cmd cmd = arm_smmu_make_cmd_cfgi_all();
Can we organize this a bit? Maybe in an inverted xmas tree?
Also, it would be nicer to run git-clang-format.
> + u32 stopped_prod;
> + int i, ret;
> +
> + KUNIT_ASSERT_NOT_NULL(test, smmu);
> + pm_ctx = arm_smmu_v3_test_init_mock_cmdq(test, smmu);
> +
> + timer_ctx = kunit_kmalloc(test, sizeof(*timer_ctx), GFP_KERNEL);
> + KUNIT_ASSERT_NOT_NULL(test, timer_ctx);
The allocation doesn't seem necessary either. We use kunit_kzalloc
in this file only when the memory (usually an array) is too large.
> + timer_ctx->stopped = false;
Then, should have been kunit_kzalloc() at least. Or " = {0};", if
keep it on the stack.
> + /* Setup the kernel software timer */
> + timer_setup(&timer_ctx->timer, arm_smmu_v3_test_rpm_timer_callback, 0);
> +
> + /* Start the timer to fire in 10 jiffies */
> + mod_timer(&timer_ctx->timer, jiffies + msecs_to_jiffies(10));
Both lines reads pretty clear without that verbose comments...
> +
> + /* Execute the unmap storm until the timer triggers */
> + while (!READ_ONCE(timer_ctx->stopped)) {
> + ret = arm_smmu_cmdq_issue_cmdlist(smmu, cmdq, &cmd, 1, false);
> + if (ret != 0) {
> + KUNIT_FAIL(test, "Unmap Storm loop failed to submit cmd list: %d\n", ret);
> + break;
> + }
As of now, arm_smmu_cmdq_issue_cmdlist() fails only on -ETIMEDOUT
and spits a print already. So, maybe simplify it:
if (arm_smmu_cmdq_issue_cmdlist())
break;
> static struct kunit_case arm_smmu_v3_test_cases[] = {
> + KUNIT_CASE(arm_smmu_v3_test_rpm_stress_race),
> KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_abort),
> KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_bypass),
> KUNIT_CASE(arm_smmu_v3_write_ste_test_cdtable_to_abort),
Usually we append to the end. And following the naming convention:
arm_smmu_v3_rpm_test_stress_race
Nicolin
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 10/11] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2026-05-28 20:28 ` Nicolin Chen
@ 2026-05-28 21:46 ` Pranjal Shrivastava
2026-05-28 22:01 ` Nicolin Chen
0 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-28 21:46 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Thu, May 28, 2026 at 01:28:15PM -0700, Nicolin Chen wrote:
> On Wed, May 27, 2026 at 10:14:06PM +0000, Pranjal Shrivastava wrote:
> > TLB and CFG invalidations are
> > elided if the SMMU is suspended by observing the CMDQ_PROD_STOP_FLAG via
> > the arm_smmu_can_elide() helper.
>
> All the arm_smmu_can_elide() call sites here would eventually elide
> the commands in arm_smmu_cmdq_issue_cmdlist() that is already gated
> by CMDQ_PROD_STOP_FLAG? It doesn't seem necessary to gate again?
While issue_cmdlist() would eventually elide these commands, the
can_elide() check is necessary to return early during suspension.
This avoids unnecessary stack allocation, cmd building, and spinlock
contention on the cmdq->lock for threads that are anyway about to be
elided.
By dropping these requests immediately, we significantly reduce cacheline
bouncing and contention during unmap storms. Furthermore, the early check
also allows us to specifically trigger the WARN_ON_ONCE() for broken
devlinks.
Thanks,
Praan
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 05/11] iommu/arm-smmu-v3: Cache and restore MSI config
2026-05-28 18:36 ` Nicolin Chen
@ 2026-05-28 21:57 ` Pranjal Shrivastava
2026-05-28 22:03 ` Nicolin Chen
0 siblings, 1 reply; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-28 21:57 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Thu, May 28, 2026 at 11:36:06AM -0700, Nicolin Chen wrote:
> On Wed, May 27, 2026 at 10:14:01PM +0000, Pranjal Shrivastava wrote:
> > The SMMU's MSI configuration registers (*_IRQ_CFGn) containing target
> > address, data and memory attributes lose their state when the SMMU is
> > powered down. We'll need to cache and restore their contents to ensure
> > that MSIs work after the system resumes.
> >
> > To address this, cache the original `msi_msg` within the `msi_desc`
> > when the configuration is first written by `arm_smmu_write_msi_msg`.
> > This primarily includes the target address and data since the memory
> > attributes are fixed.
> >
> > Introduce a new helper `arm_smmu_resume_msis` which will later be called
> > during the driver's resume callback. The helper would retrieve the
> > cached MSI message for each relevant interrupt (evtq, gerr, priq) via
> > get_cached_msi_msg & re-config the registers via arm_smmu_write_msi_msg.
> >
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
>
> > +static void arm_smmu_resume_msi(struct arm_smmu_device *smmu,
> > + unsigned int irq, const char *name)
> > +{
> > + struct msi_desc *desc;
> > + struct msi_msg msg;
> > +
> > + if (!irq)
> > + return;
> > +
> > + desc = irq_get_msi_desc(irq);
> > + if (!desc) {
> > + dev_err(smmu->dev, "Failed to resume msi: %s", name);
> > + return;
> > + }
> > +
> > + get_cached_msi_msg(irq, &msg);
>
> I wonder if we could use __get_cached_msi_msg() since we got desc?
We could've (and that's a nice suggestion), but unfortunately it
(__get_cached_msi_msg) is not exported via EXPORT_SYMBOL_GPL() :(
We could break modular builds as CONFIG_ARM_SMMU_V3 is tristate.
Let me know if you have a strong preference about exporting it, we can
add a patch in the next versions.
Thanks,
Praan
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 07/11] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions
2026-05-28 19:41 ` Nicolin Chen
@ 2026-05-28 21:57 ` Pranjal Shrivastava
0 siblings, 0 replies; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-28 21:57 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Thu, May 28, 2026 at 12:41:35PM -0700, Nicolin Chen wrote:
> On Wed, May 27, 2026 at 10:14:03PM +0000, Pranjal Shrivastava wrote:
> > Introduce a new bit flag, CMDQ_PROD_STOP_FLAG (bit 30), in the command
> > queue's producer index to safely gate command submissions during device
> > suspension.
> >
> > The flag embeds the suspend state directly into the existing global state
> > The flag checked in the compxchg loop in arm_smmu_cmdq_issue_cmdlist(),
>
> s/compxchg/cmpxchg
Ack. I'll fix it.
>
> > which acts as a Point of Commitment, ensuring that no indices are
> > reserved or committed once the SMMU begins suspending.
> >
> > This prevents a situation of "abandoned batches" where indices are
> > incremented but commands are never written, which would otherwise
> > lead to timeout during the drain poll.
> >
> > Update queue_inc_prod_n() to preserve this flag during index
> > calculations, ensuring that any in-flight commands that successfully
> > passed the point of commitment can proceed to completion while the
> > flag remains set.
> >
> > Suggested-by: Daniel Mentz <danielmentz@google.com>
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Thanks!
Praan
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 06/11] iommu/arm-smmu-v3: Handle gerror during suspend
2026-05-28 18:53 ` Nicolin Chen
@ 2026-05-28 21:59 ` Pranjal Shrivastava
0 siblings, 0 replies; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-28 21:59 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel,
Jason Gunthorpe
On Thu, May 28, 2026 at 11:53:42AM -0700, Nicolin Chen wrote:
> On Wed, May 27, 2026 at 10:14:02PM +0000, Pranjal Shrivastava wrote:
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 93cee32f6c99..191b5b5b805c 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -2257,10 +2257,9 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev)
> >
> > static int arm_smmu_device_disable(struct arm_smmu_device *smmu);
> >
> > -static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
> > +static irqreturn_t arm_smmu_handle_gerror(struct arm_smmu_device *smmu)
>
> We don't need a lock because the new caller would disable the IRQ.
>
> Yet, maybe add a line of comments:
> /* Lockless; must ensure that there are no concurrent callers */
Ack, I'll add the comment.
>
> Otherwise,
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Thanks!
Praan
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 10/11] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2026-05-28 21:46 ` Pranjal Shrivastava
@ 2026-05-28 22:01 ` Nicolin Chen
2026-05-28 22:25 ` Pranjal Shrivastava
0 siblings, 1 reply; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 22:01 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Thu, May 28, 2026 at 09:46:33PM +0000, Pranjal Shrivastava wrote:
> On Thu, May 28, 2026 at 01:28:15PM -0700, Nicolin Chen wrote:
> > On Wed, May 27, 2026 at 10:14:06PM +0000, Pranjal Shrivastava wrote:
> > > TLB and CFG invalidations are
> > > elided if the SMMU is suspended by observing the CMDQ_PROD_STOP_FLAG via
> > > the arm_smmu_can_elide() helper.
> >
> > All the arm_smmu_can_elide() call sites here would eventually elide
> > the commands in arm_smmu_cmdq_issue_cmdlist() that is already gated
> > by CMDQ_PROD_STOP_FLAG? It doesn't seem necessary to gate again?
>
> While issue_cmdlist() would eventually elide these commands, the
> can_elide() check is necessary to return early during suspension.
>
> This avoids unnecessary stack allocation, cmd building, and spinlock
> contention on the cmdq->lock for threads that are anyway about to be
> elided.
We aren't in the perf sensitive path.. most of those aren't going
to be that bad.
arm_smmu_cmdq_shared_lock() on the other hand is taken at step 2,
and the STOP flag in the same function is gated at step 1?
> By dropping these requests immediately, we significantly reduce cacheline
> bouncing and contention during unmap storms.
How significantly, so as to justify invading every command issue()
call site, which would be difficult to maintain? If we really need
an early return, it would be nicer to have a common place at least.
> Furthermore, the early check
> also allows us to specifically trigger the WARN_ON_ONCE() for broken
> devlinks.
This doesn't sound very appealing to me.. And even if there is a
very plausible reason, it should be noted in the commit message.
Nicolin
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 05/11] iommu/arm-smmu-v3: Cache and restore MSI config
2026-05-28 21:57 ` Pranjal Shrivastava
@ 2026-05-28 22:03 ` Nicolin Chen
0 siblings, 0 replies; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 22:03 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Thu, May 28, 2026 at 09:57:04PM +0000, Pranjal Shrivastava wrote:
> On Thu, May 28, 2026 at 11:36:06AM -0700, Nicolin Chen wrote:
> > On Wed, May 27, 2026 at 10:14:01PM +0000, Pranjal Shrivastava wrote:
> > > + get_cached_msi_msg(irq, &msg);
> >
> > I wonder if we could use __get_cached_msi_msg() since we got desc?
>
> We could've (and that's a nice suggestion), but unfortunately it
> (__get_cached_msi_msg) is not exported via EXPORT_SYMBOL_GPL() :(
> We could break modular builds as CONFIG_ARM_SMMU_V3 is tristate.
>
> Let me know if you have a strong preference about exporting it, we can
> add a patch in the next versions.
Not necessary. I see all drivers use get_cached_msi_msg() too.
Nicolin
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 02/11] iommu/arm-smmu-v3: Add a helper to drain cmd queues
2026-05-28 10:34 ` Pranjal Shrivastava
@ 2026-05-28 22:09 ` Nicolin Chen
0 siblings, 0 replies; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 22:09 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Thu, May 28, 2026 at 10:34:42AM +0000, Pranjal Shrivastava wrote:
> On Wed, May 27, 2026 at 06:35:56PM -0700, Nicolin Chen wrote:
> > On Wed, May 27, 2026 at 10:13:58PM +0000, Pranjal Shrivastava wrote:
> > > +static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
> >
> > So, this function has a very generic name for queues but its use
> > case is limited to cmdq in a suspend context, and might not work
> > for other queues.
> >
> > I am adding another helper for in-flight evetq and priq:
> > - Drain a queue for a detaching master; other attached masters
> > might still advance the prod, so queue wouldn't be empty.
> > - evtq and priq sets wfe=false.
> >
> > So, there is a naming conflict here.
> >
> > Maybe:
> > arm_smmu_drain_queue_for_cmdq()
> > arm_smmu_drain_queue_for_iopf()
> > ?
>
> Ack. Sorry about that, the earlier versions were draining the other
> queues too but that has changed from the last few versions.
>
> I'm planning to rename it to arm_smmu_drain_cmdq(), sounds good?
That sounds okay.
Though my series is still at the early review stage, I would like
to merge those two functions at some point
Do you know if my implementation can work for cmdq too? Seems that
only the WFE would be different, which can be a bool?
Nicolin
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 08/11] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-05-28 21:21 ` Pranjal Shrivastava
@ 2026-05-28 22:13 ` Nicolin Chen
0 siblings, 0 replies; 34+ messages in thread
From: Nicolin Chen @ 2026-05-28 22:13 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Thu, May 28, 2026 at 09:21:22PM +0000, Pranjal Shrivastava wrote:
> On Thu, May 28, 2026 at 12:39:46PM -0700, Nicolin Chen wrote:
> > On Wed, May 27, 2026 at 10:14:04PM +0000, Pranjal Shrivastava wrote:
> > > @@ -789,7 +822,8 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> > > /* b. Stop gathering work by clearing the owned flag */
> > > prod = atomic_fetch_andnot_relaxed(CMDQ_PROD_OWNED_FLAG,
> > > &cmdq->q.llq.atomic.prod);
> > > - prod &= ~CMDQ_PROD_OWNED_FLAG;
> > > + /* Strip all metadata flags */
> > > + prod &= CMDQ_PROD_IDX_MASK;
> >
> > Should its prior atomic_fetch_andnot_relaxed() call do something
> > about the CMDQ_PROD_STOP_FLAG as well?
>
> Umm.. No, the atomic_fetch_andnot_relaxed() call must leave the STOP_FLAG.
> This block is the owner-publish phase, which occurs *after* the Point of
> Commitment. If a submission successfully reserved its indices before the
> gate closed, it shall be allowed to finish.
>
> If the owner thread cleared the STOP_FLAG here in the global memory, it
> would prematurely re-open the gate, allowing new racing submissions to
> leak in during the suspend sequence.
>
> The algorithm works mainly because the RPM callbacks are the only ones
> that are allowed to manipulate this flag.
That reads plausible to me. It would be nicer to leave a brief note
inline.
> > Should it handle a secondary_cmdq?
>
> No, I don't think we need to check secondary queues here. The STOP_FLAG
> being set on the primary CMDQ during the suspend should suffice to
> indicate the entire SMMU's power state. That's why the function takes in
> the smmu ptr and not the cmdq.
But if we don't set STOP flag to the secondary_cmdq, wouldn't the
drain() timeout on the secondary_cmdq that does not elide?
Nicolin
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v7 10/11] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2026-05-28 22:01 ` Nicolin Chen
@ 2026-05-28 22:25 ` Pranjal Shrivastava
0 siblings, 0 replies; 34+ messages in thread
From: Pranjal Shrivastava @ 2026-05-28 22:25 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz, Ashish Mhetre, linux-arm-kernel
On Thu, May 28, 2026 at 03:01:13PM -0700, Nicolin Chen wrote:
> On Thu, May 28, 2026 at 09:46:33PM +0000, Pranjal Shrivastava wrote:
> > On Thu, May 28, 2026 at 01:28:15PM -0700, Nicolin Chen wrote:
> > > On Wed, May 27, 2026 at 10:14:06PM +0000, Pranjal Shrivastava wrote:
> > > > TLB and CFG invalidations are
> > > > elided if the SMMU is suspended by observing the CMDQ_PROD_STOP_FLAG via
> > > > the arm_smmu_can_elide() helper.
> > >
> > > All the arm_smmu_can_elide() call sites here would eventually elide
> > > the commands in arm_smmu_cmdq_issue_cmdlist() that is already gated
> > > by CMDQ_PROD_STOP_FLAG? It doesn't seem necessary to gate again?
> >
> > While issue_cmdlist() would eventually elide these commands, the
> > can_elide() check is necessary to return early during suspension.
> >
> > This avoids unnecessary stack allocation, cmd building, and spinlock
> > contention on the cmdq->lock for threads that are anyway about to be
> > elided.
>
> We aren't in the perf sensitive path.. most of those aren't going
> to be that bad.
>
> arm_smmu_cmdq_shared_lock() on the other hand is taken at step 2,
> and the STOP flag in the same function is gated at step 1?
DMA unmaps frequently occur from atomic contexts, interrupt handlers etc
Thee Step 1 check in issue_cmdlist() happens under local_irq_save().
We may argue that it doesn't happen for long though..
>
> > By dropping these requests immediately, we significantly reduce cacheline
> > bouncing and contention during unmap storms.
>
> How significantly, so as to justify invading every command issue()
> call site, which would be difficult to maintain? If we really need
> an early return, it would be nicer to have a common place at least.
Eliding early is more of an early-exit from the DMA unmap paths really..
If maintaining these high-level elision checks at 4 or 5 call sites is
a maintenance burden, maybe we could move the logic into the issue_cmd
macros?
Thanks,
Praan
^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2026-05-28 22:25 UTC | newest]
Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-27 22:13 [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2026-05-27 22:13 ` [PATCH v7 01/11] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
2026-05-27 22:13 ` [PATCH v7 02/11] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
2026-05-28 1:35 ` Nicolin Chen
2026-05-28 10:34 ` Pranjal Shrivastava
2026-05-28 22:09 ` Nicolin Chen
2026-05-27 22:13 ` [PATCH v7 03/11] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
2026-05-27 22:14 ` [PATCH v7 04/11] iommu/tegra241-cmdqv: Restore PROD and CONS after resume Pranjal Shrivastava
2026-05-28 18:14 ` Nicolin Chen
2026-05-27 22:14 ` [PATCH v7 05/11] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
2026-05-28 18:36 ` Nicolin Chen
2026-05-28 21:57 ` Pranjal Shrivastava
2026-05-28 22:03 ` Nicolin Chen
2026-05-27 22:14 ` [PATCH v7 06/11] iommu/arm-smmu-v3: Handle gerror during suspend Pranjal Shrivastava
2026-05-28 18:53 ` Nicolin Chen
2026-05-28 21:59 ` Pranjal Shrivastava
2026-05-27 22:14 ` [PATCH v7 07/11] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions Pranjal Shrivastava
2026-05-28 19:41 ` Nicolin Chen
2026-05-28 21:57 ` Pranjal Shrivastava
2026-05-27 22:14 ` [PATCH v7 08/11] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
2026-05-28 19:39 ` Nicolin Chen
2026-05-28 21:21 ` Pranjal Shrivastava
2026-05-28 22:13 ` Nicolin Chen
2026-05-27 22:14 ` [PATCH v7 09/11] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
2026-05-28 20:13 ` Nicolin Chen
2026-05-28 21:36 ` Pranjal Shrivastava
2026-05-27 22:14 ` [PATCH v7 10/11] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
2026-05-28 20:28 ` Nicolin Chen
2026-05-28 21:46 ` Pranjal Shrivastava
2026-05-28 22:01 ` Nicolin Chen
2026-05-28 22:25 ` Pranjal Shrivastava
2026-05-27 22:14 ` [PATCH v7 11/11] iommu/arm-smmu-v3: Add KUnit unit tests for Runtime PM Pranjal Shrivastava
2026-05-28 21:43 ` Nicolin Chen
2026-05-28 18:05 ` [PATCH v7 00/11] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Nicolin Chen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox