* [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops
@ 2025-11-17 19:14 Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 1/8] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
` (7 more replies)
0 siblings, 8 replies; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-17 19:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Pranjal Shrivastava
As arm-smmu-v3 rapidly finds its way into SoCs designed for hand-held
devices, power management capabilities, similar to its predecessors, are
crucial for these applications. This series introduces power management
support for the arm-smmu-v3 driver.
Design
=======
The arm-smmu-v3 primarily operates with in-memory data structures
through HW registers pointing to these data structures. The proposed design
makes use of this fact for implementing suspend and resume ops, centered
around a driver-specific biased usage counter.
1. Biased Usage Counter
To safely manage runtime PM state, this series introduces an atomic
`nr_cmdq_users` counter. This counter is "biased" by being initialized to
1, representing an "idle but active" state.
The suspend callback waits for this counter to be exactly 1 and then
atomically transitions it to 0, signifying a "suspended" state. Any
other value indicates that the command queue is in use, and the suspend
operation is aborted.
Code paths that submit commands to the hardware (the "fast path") use
`atomic_inc_not_zero()` to acquire a reference. This operation only
succeeds if the counter is not zero (i.e., not suspended), effectively
gating hardware access. The reference is dropped using
`atomic_dec_return_release()` after the hardware interaction is
complete.
2. Suspend / Resume Flow
The "suspend" op now polls for a short period (100us) for the usage
counter to become 1. Once successful, it configures the GBPA register
to abort new transactions and disables the SMMU, EVTQ, and PRIQ, leaving
only the CMDQ enabled to drain any in-flight commands.
The "resume" operation uses the `arm_smmu_device_reset` function, which
re-initializes the HW using the SW-copies maintained by the driver
(e.g., prod/cons pointers, queue base addresses) and clears all cached
configurations.
3. Guarding Hardware Access
Instead of eliding operations, the driver now ensures the SMMU is active
before any hardware access. This is managed by `arm_smmu_rpm_get()` and
`arm_smmu_rpm_put()` helpers, which wrap the standard runtime PM calls.
These wrappers are now used throughout the driver in interrupt handlers,
TLB invalidation paths, and any other function that touches hardware
registers. This ensures that operations are implicitly queued until the
device resumes.
4. Interrupt Re-config
a. Wired irqs: The series refactors the `arm_smmu_setup_irqs` to be
able to enable/disable irqs and install their handlers separately to
help with the re-initialization of the interrupts correctly.
b. MSIs: The series relies on caching the `msi_msg` and retrieving it
through a newly introduced helper `arm_smmu_resume_msis()` which
re-configures the *_IRQ_CFGn registers by writing back the cached `msi_msg`.
Call for review
Any insights/comments on the proposed changes are appreciated,
especially in areas related to locking, atomic contexts, and potential
optimizations.
Note: The series isn't tested with MSIs and is weakly tested for PCIe
clients. The same holds true for tegra241_cmdv changes. Any help in
reviewing and testing these parts is much appreciated.
[v4]
- Dropped the `pm_runtime_get_if_not_suspended()` API in favor of a
simpler, driver-specific biased counter (`nr_cmdq_users`) to manage
runtime PM state.
- Reworked the suspend callback to poll on the biased counter before
disabling the SMMU.
- Addressed comments for the MSI refactor.
[v3]
- https://lore.kernel.org/all/20250616203149.2649118-1-praan@google.com/
- Introduced `pm_runtime_get_if_not_suspended` API to avoid races due
to bouncing RPM states while eliding TLBIs as pointed out by Daniel.
- Addressed Nicolin's comments regarding msi_resume and CMDQV flush
- Addressed Daniel's comments about CMDQ locking and draining
- Addressed issues related to draining the evtq and priq
- Dropped the code to identify and track user-space attachments
[v2]
- https://lore.kernel.org/all/20250418233409.3926715-1-praan@google.com/
- Introduced `arm_smmu_rpm_get_if_active` for eliding TLBIs & CFGIs
- Updated the rpm helper invocation strategy.
- Drained all queues in suspend callback (including tegra241-cmdv)
- Cache and restore msi_msg instead of free-ing realloc-ing on resume
- Added support to identify and track user-space attachments
- Fixed the setup_irqs as per Nicolin & Mostafa's suggestions
- Used force_runtime_suspend/resume instead as per Mostafa's suggestion.
- Added "Reviewed-by" line from Mostafa on an unchanged patch
[v1]
- https://lore.kernel.org/all/20250319004254.2547950-1-praan@google.com/
Pranjal Shrivastava (8):
iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs
iommu/arm-smmu-v3: Add a helper to drain cmd queues
iommu/tegra241-cmdqv: Add a helper to drain VCMDQs
iommu/arm-smmu-v3: Cache and restore MSI config
iommu/arm-smmu-v3: Add a usage counter for cmdq
iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks
iommu/arm-smmu-v3: Invoke pm_runtime before hw access
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 10 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 411 ++++++++++++++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 13 +
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 27 ++
4 files changed, 417 insertions(+), 44 deletions(-)
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v4 1/8] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs
2025-11-17 19:14 [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
@ 2025-11-17 19:14 ` Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 2/8] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
` (6 subsequent siblings)
7 siblings, 0 replies; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-17 19:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Pranjal Shrivastava
Refactor arm_smmu_setup_irqs by splitting it into two parts, one for
registering interrupt handlers and the other one for enabling interrupt
generation in the hardware. This refactor helps in re-initialization of
hardware interrupts as part of a subsequent patch that enables runtime
power management for the arm-smmu-v3 driver.
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 47 +++++++++++++--------
1 file changed, 30 insertions(+), 17 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2a8b46b948f0..c5efd9713775 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4035,14 +4035,32 @@ static void arm_smmu_setup_unique_irqs(struct arm_smmu_device *smmu)
}
}
+static void arm_smmu_enable_irqs(struct arm_smmu_device *smmu)
+{
+ int ret;
+ u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
+
+ if (smmu->features & ARM_SMMU_FEAT_PRI)
+ irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
+
+ ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
+ ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
+ if (ret)
+ dev_warn(smmu->dev, "failed to enable irqs\n");
+}
+
+static int arm_smmu_disable_irqs(struct arm_smmu_device *smmu)
+{
+ return arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
+ ARM_SMMU_IRQ_CTRLACK);
+}
+
static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
{
int ret, irq;
- u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
/* Disable IRQs first */
- ret = arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
- ARM_SMMU_IRQ_CTRLACK);
+ ret = arm_smmu_disable_irqs(smmu);
if (ret) {
dev_err(smmu->dev, "failed to disable irqs\n");
return ret;
@@ -4064,15 +4082,6 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
} else
arm_smmu_setup_unique_irqs(smmu);
- if (smmu->features & ARM_SMMU_FEAT_PRI)
- irqen_flags |= IRQ_CTRL_PRIQ_IRQEN;
-
- /* Enable interrupt generation on the SMMU */
- ret = arm_smmu_write_reg_sync(smmu, irqen_flags,
- ARM_SMMU_IRQ_CTRL, ARM_SMMU_IRQ_CTRLACK);
- if (ret)
- dev_warn(smmu->dev, "failed to enable irqs\n");
-
return 0;
}
@@ -4215,11 +4224,8 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
}
}
- ret = arm_smmu_setup_irqs(smmu);
- if (ret) {
- dev_err(smmu->dev, "failed to setup irqs\n");
- return ret;
- }
+ /* Enable interrupt generation on the SMMU */
+ arm_smmu_enable_irqs(smmu);
if (is_kdump_kernel())
enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
@@ -4818,6 +4824,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
/* Check for RMRs and install bypass STEs if any */
arm_smmu_rmr_install_bypass_ste(smmu);
+ /* Setup interrupt handlers */
+ ret = arm_smmu_setup_irqs(smmu);
+ if (ret) {
+ dev_err(smmu->dev, "failed to setup irqs\n");
+ goto err_free_iopf;
+ }
+
/* Reset the device */
ret = arm_smmu_device_reset(smmu);
if (ret)
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v4 2/8] iommu/arm-smmu-v3: Add a helper to drain cmd queues
2025-11-17 19:14 [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 1/8] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
@ 2025-11-17 19:14 ` Pranjal Shrivastava
2025-11-17 19:31 ` Nicolin Chen
2025-11-18 3:48 ` Daniel Mentz
2025-11-17 19:14 ` [PATCH v4 3/8] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
` (5 subsequent siblings)
7 siblings, 2 replies; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-17 19:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Pranjal Shrivastava
Before we suspend SMMU, we want to ensure that all commands (especially
ATC_INV) have been flushed by the CMDQ, i.e. the CMDQs are empty.
Add a helper function that polls till the queues are dained.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 40 +++++++++++++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 ++
2 files changed, 43 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c5efd9713775..c50cd975920e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -981,6 +981,46 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
cmds->num, true);
}
+int arm_smmu_queue_poll_until_empty(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q)
+{
+ struct arm_smmu_queue_poll qp;
+ struct arm_smmu_ll_queue *llq = &q->llq;
+ int ret = 0;
+
+ queue_poll_init(smmu, &qp);
+ do {
+ if (queue_empty(llq))
+ break;
+
+ ret = queue_poll(&qp);
+ WRITE_ONCE(llq->cons, readl_relaxed(q->cons_reg));
+
+ } while (!ret);
+
+ return ret;
+}
+
+static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
+{
+ struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
+ unsigned long flags;
+ int ret;
+
+ /*
+ * Since this is only called from the suspend callback, we
+ * should be able to acquire the exclusive lock without failing.
+ */
+ arm_smmu_cmdq_exclusive_trylock_irqsave(cmdq, flags);
+ ret = arm_smmu_queue_poll_until_empty(smmu, &smmu->cmdq.q);
+ arm_smmu_cmdq_exclusive_unlock_irqrestore(cmdq, flags);
+
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused,
struct iommu_page_response *resp)
{
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index ae23aacc3840..b52ea821361e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -983,6 +983,9 @@ int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
int arm_smmu_cmdq_init(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq);
+int arm_smmu_queue_poll_until_empty(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q);
+
static inline bool arm_smmu_master_canwbs(struct arm_smmu_master *master)
{
return dev_iommu_fwspec_get(master->dev)->flags &
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v4 3/8] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs
2025-11-17 19:14 [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 1/8] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 2/8] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
@ 2025-11-17 19:14 ` Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 4/8] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
` (4 subsequent siblings)
7 siblings, 0 replies; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-17 19:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Pranjal Shrivastava
The tegra241-cmdqv driver supports vCMDQs which need to be drained
before suspending the SMMU. The current driver implementation only uses
VINTF0 for vCMDQs owned by the kernel which need to be drained. Add a
helper that drains all the enabled vCMDQs under VINTF0.
Add another function ptr to arm_smmu_impl_ops to drain implementation
specified queues and call it within `arm_smmu_drain_queues`.
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 ++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 27 +++++++++++++++++++
3 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c50cd975920e..31041ce93e03 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1018,7 +1018,11 @@ static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
if (ret)
return ret;
- return 0;
+ /* Drain all implementation-specific queues */
+ if (smmu->impl_ops && smmu->impl_ops->drain_queues)
+ ret = smmu->impl_ops->drain_queues(smmu);
+
+ return ret;
}
static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index b52ea821361e..2db0ec14b2ed 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -732,6 +732,7 @@ struct arm_smmu_impl_ops {
size_t (*get_viommu_size)(enum iommu_viommu_type viommu_type);
int (*vsmmu_init)(struct arm_vsmmu *vsmmu,
const struct iommu_user_data *user_data);
+ int (*drain_queues)(struct arm_smmu_device *smmu);
};
/* An SMMUv3 instance */
diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index 378104cd395e..de15bc44f9b3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -416,6 +416,32 @@ tegra241_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
return &vcmdq->cmdq;
}
+static int tegra241_cmdqv_drain_vintf0_lvcmdqs(struct arm_smmu_device *smmu)
+{
+ struct tegra241_cmdqv *cmdqv =
+ container_of(smmu, struct tegra241_cmdqv, smmu);
+ struct tegra241_vintf *vintf = cmdqv->vintfs[0];
+ int ret = 0;
+ u16 lidx;
+
+ /* Kernel only uses VINTF0. Return if it's disabled */
+ if (!READ_ONCE(vintf->enabled))
+ return 0;
+
+ for (lidx = 0; lidx < cmdqv->num_lvcmdqs_per_vintf; lidx++) {
+ struct tegra241_vcmdq *vcmdq = vintf->lvcmdqs[lidx];
+
+ if (!vcmdq || !READ_ONCE(vcmdq->enabled))
+ continue;
+
+ ret = arm_smmu_queue_poll_until_empty(smmu, &vcmdq->cmdq.q);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
/* HW Reset Functions */
/*
@@ -846,6 +872,7 @@ static struct arm_smmu_impl_ops tegra241_cmdqv_impl_ops = {
.get_secondary_cmdq = tegra241_cmdqv_get_cmdq,
.device_reset = tegra241_cmdqv_hw_reset,
.device_remove = tegra241_cmdqv_remove,
+ .drain_queues = tegra241_cmdqv_drain_vintf0_lvcmdqs,
/* For user-space use */
.hw_info = tegra241_cmdqv_hw_info,
.get_viommu_size = tegra241_cmdqv_get_vintf_size,
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v4 4/8] iommu/arm-smmu-v3: Cache and restore MSI config
2025-11-17 19:14 [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (2 preceding siblings ...)
2025-11-17 19:14 ` [PATCH v4 3/8] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
@ 2025-11-17 19:14 ` Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 5/8] iommu/arm-smmu-v3: Add a usage counter for cmdq Pranjal Shrivastava
` (3 subsequent siblings)
7 siblings, 0 replies; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-17 19:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Pranjal Shrivastava
The SMMU's MSI configuration registers (*_IRQ_CFGn) containing target
address, data and memory attributes lose their state when the SMMU is
powered down. We'll need to cache and restore their contents to ensure
that MSIs work after the system resumes.
To address this, cache the original `msi_msg` within the `msi_desc`
when the configuration is first written by `arm_smmu_write_msi_msg`.
This primarily includes the target address and data since the memory
attributes are fixed.
Introduce a new helper `arm_smmu_resume_msis` which will later be called
during the driver's resume callback. The helper would retrieve the
cached MSI message for each relevant interrupt (evtq, gerr, priq) via
get_cached_msi_msg & re-config the registers via arm_smmu_write_msi_msg.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 37 +++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 31041ce93e03..1d6c60bee7dd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3988,6 +3988,9 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
struct arm_smmu_device *smmu = dev_get_drvdata(dev);
phys_addr_t *cfg = arm_smmu_msi_cfg[desc->msi_index];
+ /* Cache the msi_msg for resume */
+ desc->msg = *msg;
+
doorbell = (((u64)msg->address_hi) << 32) | msg->address_lo;
doorbell &= MSI_CFG0_ADDR_MASK;
@@ -3996,6 +3999,40 @@ static void arm_smmu_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
writel_relaxed(ARM_SMMU_MEMATTR_DEVICE_nGnRE, smmu->base + cfg[2]);
}
+static void arm_smmu_resume_msi(struct arm_smmu_device *smmu,
+ unsigned int irq, const char *name)
+{
+ struct msi_desc *desc;
+ struct msi_msg msg;
+
+ if (!irq)
+ return;
+
+ desc = irq_get_msi_desc(irq);
+ if (!desc) {
+ dev_err(smmu->dev, "Failed to resume msi: %s", name);
+ return;
+ }
+
+ get_cached_msi_msg(irq, &msg);
+ arm_smmu_write_msi_msg(desc, &msg);
+}
+
+static void arm_smmu_resume_msis(struct arm_smmu_device *smmu)
+{
+ if (!(smmu->features & ARM_SMMU_FEAT_MSI))
+ return;
+
+ if (!dev_get_msi_domain(smmu->dev))
+ return;
+
+ arm_smmu_resume_msi(smmu, smmu->gerr_irq, "gerror");
+ arm_smmu_resume_msi(smmu, smmu->evtq.q.irq, "evtq");
+
+ if (smmu->features & ARM_SMMU_FEAT_PRI)
+ arm_smmu_resume_msi(smmu, smmu->priq.q.irq, "priq");
+}
+
static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
{
int ret, nvec = ARM_SMMU_MAX_MSIS;
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v4 5/8] iommu/arm-smmu-v3: Add a usage counter for cmdq
2025-11-17 19:14 [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (3 preceding siblings ...)
2025-11-17 19:14 ` [PATCH v4 4/8] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
@ 2025-11-17 19:14 ` Pranjal Shrivastava
2025-11-18 6:05 ` Sairaj Kodilkar
2025-11-17 19:14 ` [PATCH v4 6/8] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
` (2 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-17 19:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Pranjal Shrivastava
Introduce a biased counter to track the number of active cmdq owners as
a preparatory step for the runtime PM implementation.
The counter will be used to gate command submission, preventing the
submission of new commands while the device is suspended and deferring
suspend while the command submissions are in-flight.
The counter is biased to a value of 1 during device reset. A cmdq owner
or a thread issuing cmds with sync, increment it before accessing HW
registers and decrements it with release semantics afterwards.
A value of 1 represents an idle (but active) state. A suspend operation
will set it to from 1 -> 0 representing the suspended state.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 69 +++++++++++++++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
2 files changed, 61 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 1d6c60bee7dd..d6e75d1646d6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -794,7 +794,7 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
u64 cmd_sync[CMDQ_ENT_DWORDS];
u32 prod;
unsigned long flags;
- bool owner;
+ bool owner, has_ref = false;
struct arm_smmu_ll_queue llq, head;
int ret = 0;
@@ -808,8 +808,15 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
while (!queue_has_space(&llq, n + sync)) {
local_irq_restore(flags);
+
+ if (!atomic_inc_not_zero(&smmu->nr_cmdq_users))
+ /* Device is suspended, don't wait for space */
+ return 0;
+
if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
+
+ atomic_dec_return_release(&smmu->nr_cmdq_users);
local_irq_save(flags);
}
@@ -868,10 +875,35 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
arm_smmu_cmdq_poll_valid_map(cmdq, llq.prod, prod);
/*
- * d. Advance the hardware prod pointer
+ * d. Advance the hardware prod pointer (if smmu is still active)
* Control dependency ordering from the entries becoming valid.
*/
- writel_relaxed(prod, cmdq->q.prod_reg);
+ if (atomic_inc_not_zero(&smmu->nr_cmdq_users)) {
+ writel_relaxed(prod, cmdq->q.prod_reg);
+
+ if (sync) {
+ has_ref = true;
+ } else {
+ /*
+ * Use release semantics to enforce ordering without a full barrier.
+ * This ensures the prior writel_relaxed() is ordered/visible
+ * before the refcount decrement, avoiding the heavy pipeline
+ * stall of a full wmb().
+ *
+ * We need the atomic_dec_return_release() below and the
+ * atomic_set_release() in step (e) below doesn't suffice.
+ *
+ * Specifically, without release semantics on the decrement,
+ * the CPU is free to reorder the independent atomic_dec_relaxed()
+ * before the writel_relaxed().
+ *
+ * If this happens, the refcount could drop to zero, allowing the PM
+ * suspend path (running on another CPU) to disable the SMMU before
+ * the register write completes, resulting in a bus fault.
+ */
+ atomic_dec_return_release(&smmu->nr_cmdq_users);
+ }
+ }
/*
* e. Tell the next owner we're done
@@ -883,14 +915,23 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
/* 5. If we are inserting a CMD_SYNC, we must wait for it to complete */
if (sync) {
- llq.prod = queue_inc_prod_n(&llq, n);
- ret = arm_smmu_cmdq_poll_until_sync(smmu, cmdq, &llq);
- if (ret) {
- dev_err_ratelimited(smmu->dev,
- "CMD_SYNC timeout at 0x%08x [hwprod 0x%08x, hwcons 0x%08x]\n",
- llq.prod,
- readl_relaxed(cmdq->q.prod_reg),
- readl_relaxed(cmdq->q.cons_reg));
+
+ /* If we are not the owner, check if we're suspended */
+ if (!has_ref) {
+ if (atomic_inc_not_zero(&smmu->nr_cmdq_users))
+ has_ref = true;
+ }
+
+ if (has_ref) {
+ llq.prod = queue_inc_prod_n(&llq, n);
+ ret = arm_smmu_cmdq_poll_until_sync(smmu, cmdq, &llq);
+ if (ret) {
+ dev_err_ratelimited(smmu->dev,
+ "CMD_SYNC timeout at 0x%08x [hwprod 0x%08x, hwcons 0x%08x]\n",
+ llq.prod,
+ readl_relaxed(cmdq->q.prod_reg),
+ readl_relaxed(cmdq->q.cons_reg));
+ }
}
/*
@@ -903,6 +944,9 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
}
}
+ if (has_ref)
+ atomic_dec_return_release(&smmu->nr_cmdq_users);
+
local_irq_restore(flags);
return ret;
}
@@ -4251,6 +4295,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
return ret;
}
+ /* Set the cmdq to be active before issuing any commands */
+ atomic_set(&smmu->nr_cmdq_users, 1);
+
/* Invalidate any cached configuration */
cmd.opcode = CMDQ_OP_CFGI_ALL;
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 2db0ec14b2ed..924580610ce0 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -806,6 +806,9 @@ struct arm_smmu_device {
struct rb_root streams;
struct mutex streams_mutex;
+
+ /* Tracks the cmdq usage count for runtime PM */
+ atomic_t nr_cmdq_users;
};
struct arm_smmu_stream {
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v4 6/8] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2025-11-17 19:14 [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (4 preceding siblings ...)
2025-11-17 19:14 ` [PATCH v4 5/8] iommu/arm-smmu-v3: Add a usage counter for cmdq Pranjal Shrivastava
@ 2025-11-17 19:14 ` Pranjal Shrivastava
2025-12-24 7:39 ` Ashish Mhetre
2025-11-17 19:14 ` [PATCH v4 7/8] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 8/8] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
7 siblings, 1 reply; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-17 19:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Pranjal Shrivastava
Implement pm_runtime and system sleep ops for arm-smmu-v3.
The suspend callback configures the SMMU to abort new transactions,
disables the main translation unit and then drains the command queue
to ensure completion of any in-flight commands.
The resume callback restores the MSI configuration and performs a full
device reset via `arm_smmu_device_reset` to bring the SMMU back to an
operational state. The MSIs are cached during the msi_write and are
restored during the resume operation by using the helper.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 109 ++++++++++++++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
2 files changed, 112 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index d6e75d1646d6..44875c526183 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -26,6 +26,7 @@
#include <linux/pci.h>
#include <linux/pci-ats.h>
#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
#include <linux/string_choices.h>
#include <kunit/visibility.h>
#include <uapi/linux/iommufd.h>
@@ -108,6 +109,33 @@ static const char * const event_class_str[] = {
static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
+/* Runtime PM helpers */
+__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
+{
+ int ret;
+
+ if (pm_runtime_enabled(smmu->dev)) {
+ ret = pm_runtime_resume_and_get(smmu->dev);
+ if (ret < 0) {
+ dev_err(smmu->dev, "Failed to resume device: %d\n", ret);
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+__maybe_unused static void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
+{
+ int ret;
+
+ if (pm_runtime_enabled(smmu->dev)) {
+ ret = pm_runtime_put_autosuspend(smmu->dev);
+ if (ret < 0)
+ dev_err(smmu->dev, "Failed to suspend device: %d\n", ret);
+ }
+}
+
static void parse_driver_options(struct arm_smmu_device *smmu)
{
int i = 0;
@@ -5005,6 +5033,86 @@ static void arm_smmu_device_shutdown(struct platform_device *pdev)
arm_smmu_device_disable(smmu);
}
+static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
+{
+ struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+ int timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
+ u32 enables;
+ int ret;
+
+ /* Try to suspend the device, wait for in-flight submissions */
+ do {
+ if (atomic_cmpxchg(&smmu->nr_cmdq_users, 1, 0) == 1)
+ break;
+
+ udelay(1);
+ } while (--timeout);
+
+ if (!timeout) {
+ dev_warn(smmu->dev, "SMMU in use, aborting suspend\n");
+ return -EAGAIN;
+ }
+
+ /* Abort all transactions before disable to avoid spurious bypass */
+ arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
+
+ /* Disable the SMMU via CR0.EN and all queues except CMDQ */
+ enables = CR0_CMDQEN;
+ ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0, ARM_SMMU_CR0ACK);
+ if (ret) {
+ dev_err(smmu->dev, "Timed-out while disabling smmu\n");
+ atomic_set(&smmu->nr_cmdq_users, 1);
+ return ret;
+ }
+
+ /*
+ * At this point the SMMU is completely disabled and won't access
+ * any translation/config structures, even speculative accesses
+ * aren't performed as per the IHI0070 spec (section 6.3.9.6).
+ */
+
+ /* Wait for the CMDQs to be drained to flush any pending commands */
+ ret = arm_smmu_drain_queues(smmu);
+ if (ret)
+ dev_err(smmu->dev, "Draining queues timed-out..forcing suspend\n");
+
+ /* Disable everything */
+ arm_smmu_device_disable(smmu);
+ dev_dbg(dev, "Suspended smmu\n");
+
+ return 0;
+}
+
+static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
+{
+ int ret;
+ struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+
+ dev_dbg(dev, "Resuming device\n");
+
+ /* Re-configure MSIs */
+ arm_smmu_resume_msis(smmu);
+
+ /*
+ * The reset will re-initialize all the base addresses, queues,
+ * prod and cons maintained within struct arm_smmu_device as well as
+ * re-enable the interrupts.
+ */
+ ret = arm_smmu_device_reset(smmu);
+
+ if (ret)
+ dev_err(dev, "Failed to reset during resume operation: %d\n", ret);
+
+ return ret;
+}
+
+static const struct dev_pm_ops arm_smmu_pm_ops = {
+ SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+ pm_runtime_force_resume)
+ SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
+ arm_smmu_runtime_resume, NULL)
+};
+
static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,smmu-v3", },
{ },
@@ -5021,6 +5129,7 @@ static struct platform_driver arm_smmu_driver = {
.driver = {
.name = "arm-smmu-v3",
.of_match_table = arm_smmu_of_match,
+ .pm = &arm_smmu_pm_ops,
.suppress_bind_attrs = true,
},
.probe = arm_smmu_device_probe,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 924580610ce0..eefa5853033c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -503,11 +503,14 @@ static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
/* High-level queue structures */
#define ARM_SMMU_POLL_TIMEOUT_US 1000000 /* 1s! */
+#define ARM_SMMU_SUSPEND_TIMEOUT_US 100 /* 100us! */
#define ARM_SMMU_POLL_SPIN_COUNT 10
#define MSI_IOVA_BASE 0x8000000
#define MSI_IOVA_LENGTH 0x100000
+#define RPM_AUTOSUSPEND_DELAY_MS 15
+
enum pri_resp {
PRI_RESP_DENY = 0,
PRI_RESP_FAIL = 1,
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v4 7/8] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks
2025-11-17 19:14 [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (5 preceding siblings ...)
2025-11-17 19:14 ` [PATCH v4 6/8] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
@ 2025-11-17 19:14 ` Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 8/8] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
7 siblings, 0 replies; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-17 19:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Pranjal Shrivastava
Enable PM runtime for SMMUs having a power-domain during smmu probe.
Add a devlink between the clients and SMMU device. The absence of a
power domain effectively disables runtime power management.
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 44875c526183..49a3a46994b1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3685,6 +3685,9 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
pci_prepare_ats(to_pci_dev(dev), stu);
}
+ device_link_add(dev, smmu->dev,
+ DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER);
+
return &smmu->iommu;
err_free_master:
@@ -4992,11 +4995,18 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
if (ret)
goto err_disable;
+ if (dev->pm_domain) {
+ pm_runtime_set_active(dev);
+ pm_runtime_use_autosuspend(dev);
+ pm_runtime_set_autosuspend_delay(dev, RPM_AUTOSUSPEND_DELAY_MS);
+ pm_runtime_enable(dev);
+ }
+
/* And we're up. Go go go! */
ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL,
"smmu3.%pa", &ioaddr);
if (ret)
- goto err_disable;
+ goto err_rpm_disable;
ret = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev);
if (ret) {
@@ -5008,6 +5018,8 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
err_free_sysfs:
iommu_device_sysfs_remove(&smmu->iommu);
+err_rpm_disable:
+ pm_runtime_disable(dev);
err_disable:
arm_smmu_device_disable(smmu);
err_free_iopf:
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v4 8/8] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2025-11-17 19:14 [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
` (6 preceding siblings ...)
2025-11-17 19:14 ` [PATCH v4 7/8] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
@ 2025-11-17 19:14 ` Pranjal Shrivastava
2025-11-18 0:14 ` Jason Gunthorpe
7 siblings, 1 reply; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-17 19:14 UTC (permalink / raw)
To: iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, Pranjal Shrivastava
Invoke the pm_runtime helpers at all places before accessing the hw.
The idea is to invoke runtime_pm helpers at common points which are
used by exposed ops or interrupt handlers. Elide all TLB and CFG
invalidations if the smmu is suspended but not ATC invalidations.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 10 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 95 +++++++++++++++----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
3 files changed, 91 insertions(+), 17 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index 8cd8929bbfdf..d2c4c67600cf 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -15,6 +15,11 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length,
struct iommu_hw_info_arm_smmuv3 *info;
u32 __iomem *base_idr;
unsigned int i;
+ int ret;
+
+ ret = arm_smmu_rpm_get(master->smmu);
+ if (ret < 0)
+ return ERR_PTR(-EIO);
if (*type != IOMMU_HW_INFO_TYPE_DEFAULT &&
*type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3) {
@@ -24,8 +29,10 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length,
}
info = kzalloc(sizeof(*info), GFP_KERNEL);
- if (!info)
+ if (!info) {
+ arm_smmu_rpm_put(master->smmu);
return ERR_PTR(-ENOMEM);
+ }
base_idr = master->smmu->base + ARM_SMMU_IDR0;
for (i = 0; i <= 5; i++)
@@ -36,6 +43,7 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length,
*length = sizeof(*info);
*type = IOMMU_HW_INFO_TYPE_ARM_SMMUV3;
+ arm_smmu_rpm_put(master->smmu);
return info;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 49a3a46994b1..ec84e6381416 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -110,7 +110,7 @@ static const char * const event_class_str[] = {
static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
/* Runtime PM helpers */
-__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
+int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
{
int ret;
@@ -125,7 +125,7 @@ __maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
return 0;
}
-__maybe_unused static void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
+void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
{
int ret;
@@ -1102,7 +1102,9 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
{
struct arm_smmu_cmdq_ent cmd = {0};
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_device *smmu = master->smmu;
int sid = master->streams[0].id;
+ int ret;
if (WARN_ON(!master->stall_enabled))
return;
@@ -1122,6 +1124,10 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
break;
}
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return;
+
arm_smmu_cmdq_issue_cmd(master->smmu, &cmd);
/*
* Don't send a SYNC, it doesn't do anything for RESUME or PRI_RESP.
@@ -1129,6 +1135,7 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
* terminated... at some point in the future. PRI_RESP is fire and
* forget.
*/
+ arm_smmu_rpm_put(smmu);
}
/* Context descriptor manipulation functions */
@@ -2057,6 +2064,7 @@ static void arm_smmu_dump_event(struct arm_smmu_device *smmu, u64 *raw,
static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
{
+ int ret;
u64 evt[EVTQ_ENT_DWORDS];
struct arm_smmu_event event = {0};
struct arm_smmu_device *smmu = dev;
@@ -2065,6 +2073,10 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
DEFAULT_RATELIMIT_BURST);
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return IRQ_NONE;
+
do {
while (!queue_remove_raw(q, evt)) {
arm_smmu_decode_event(smmu, evt, &event);
@@ -2085,6 +2097,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
/* Sync our overflow flag, as we believe we're up to speed */
queue_sync_cons_ovf(q);
+ arm_smmu_rpm_put(smmu);
return IRQ_HANDLED;
}
@@ -2132,6 +2145,11 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev)
struct arm_smmu_queue *q = &smmu->priq.q;
struct arm_smmu_ll_queue *llq = &q->llq;
u64 evt[PRIQ_ENT_DWORDS];
+ int ret;
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return IRQ_NONE;
do {
while (!queue_remove_raw(q, evt))
@@ -2143,6 +2161,7 @@ static irqreturn_t arm_smmu_priq_thread(int irq, void *dev)
/* Sync our overflow flag, as we believe we're up to speed */
queue_sync_cons_ovf(q);
+ arm_smmu_rpm_put(smmu);
return IRQ_HANDLED;
}
@@ -2152,13 +2171,24 @@ static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
{
u32 gerror, gerrorn, active;
struct arm_smmu_device *smmu = dev;
+ int ret;
+
+ if (pm_runtime_enabled(smmu->dev)) {
+ ret = pm_runtime_get_if_active(smmu->dev);
+ if (ret == 0) {
+ dev_err(smmu->dev, "Ignoring gerror interrupt because device isn't rpm active\n");
+ return IRQ_NONE;
+ }
+ }
gerror = readl_relaxed(smmu->base + ARM_SMMU_GERROR);
gerrorn = readl_relaxed(smmu->base + ARM_SMMU_GERRORN);
active = gerror ^ gerrorn;
- if (!(active & GERROR_ERR_MASK))
+ if (!(active & GERROR_ERR_MASK)) {
+ arm_smmu_rpm_put(smmu);
return IRQ_NONE; /* No errors pending */
+ }
dev_warn(smmu->dev,
"unexpected global error reported (0x%08x), this could be serious\n",
@@ -2191,6 +2221,7 @@ static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
arm_smmu_cmdq_skip_err(smmu);
writel(gerror, smmu->base + ARM_SMMU_GERRORN);
+ arm_smmu_rpm_put(smmu);
return IRQ_HANDLED;
}
@@ -2281,26 +2312,33 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
ioasid_t ssid)
{
- int i;
+ int i, ret;
struct arm_smmu_cmdq_ent cmd;
struct arm_smmu_cmdq_batch cmds;
arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd);
+ /* ATC invalidations shouldn't be elided */
+ ret = arm_smmu_rpm_get(master->smmu);
+ if (ret < 0)
+ return ret;
+
arm_smmu_cmdq_batch_init(master->smmu, &cmds, &cmd);
for (i = 0; i < master->num_streams; i++) {
cmd.atc.sid = master->streams[i].id;
arm_smmu_cmdq_batch_add(master->smmu, &cmds, &cmd);
}
- return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
+ ret = arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
+ arm_smmu_rpm_put(master->smmu);
+ return ret;
}
int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
unsigned long iova, size_t size)
{
struct arm_smmu_master_domain *master_domain;
- int i;
+ int i, ret;
unsigned long flags;
struct arm_smmu_cmdq_ent cmd = {
.opcode = CMDQ_OP_ATC_INV,
@@ -2327,6 +2365,11 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
if (!atomic_read(&smmu_domain->nr_ats_masters))
return 0;
+ /* ATC invalidations shouldn't be elided */
+ ret = arm_smmu_rpm_get(smmu_domain->smmu);
+ if (ret < 0)
+ return ret;
+
arm_smmu_cmdq_batch_init(smmu_domain->smmu, &cmds, &cmd);
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
@@ -2355,7 +2398,9 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
- return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
+ ret = arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
+ arm_smmu_rpm_put(smmu_domain->smmu);
+ return ret;
}
/* IO_PGTABLE API */
@@ -2377,6 +2422,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
} else {
cmd.opcode = CMDQ_OP_TLBI_S12_VMALL;
cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
+
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
}
arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
@@ -2471,6 +2517,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
cmd.opcode = CMDQ_OP_TLBI_S2_IPA;
cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
}
+
__arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
if (smmu_domain->nest_parent) {
@@ -3312,13 +3359,14 @@ static int arm_smmu_blocking_set_dev_pasid(struct iommu_domain *new_domain,
sid_domain->type == IOMMU_DOMAIN_BLOCKED)
sid_domain->ops->attach_dev(sid_domain, dev);
}
+
return 0;
}
-static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
- struct device *dev,
- struct arm_smmu_ste *ste,
- unsigned int s1dss)
+static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
+ struct device *dev,
+ struct arm_smmu_ste *ste,
+ unsigned int s1dss)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_attach_state state = {
@@ -3361,6 +3409,7 @@ static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
* descriptor from arm_smmu_share_asid().
*/
arm_smmu_clear_cd(master, IOMMU_NO_PASID);
+ return 0;
}
static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
@@ -3371,8 +3420,8 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
arm_smmu_master_clear_vmaster(master);
arm_smmu_make_bypass_ste(master->smmu, &ste);
- arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS);
- return 0;
+ return arm_smmu_attach_dev_ste(domain, dev, &ste,
+ STRTAB_STE_1_S1DSS_BYPASS);
}
static const struct iommu_domain_ops arm_smmu_identity_ops = {
@@ -3392,9 +3441,8 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
arm_smmu_master_clear_vmaster(master);
arm_smmu_make_abort_ste(&ste);
- arm_smmu_attach_dev_ste(domain, dev, &ste,
- STRTAB_STE_1_S1DSS_TERMINATE);
- return 0;
+ return arm_smmu_attach_dev_ste(domain, dev, &ste,
+ STRTAB_STE_1_S1DSS_TERMINATE);
}
static const struct iommu_domain_ops arm_smmu_blocked_ops = {
@@ -5030,10 +5078,19 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
static void arm_smmu_device_remove(struct platform_device *pdev)
{
struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
+ int ret;
iommu_device_unregister(&smmu->iommu);
iommu_device_sysfs_remove(&smmu->iommu);
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ goto free_iopf;
+
arm_smmu_device_disable(smmu);
+ arm_smmu_rpm_put(smmu);
+
+free_iopf:
iopf_queue_free(smmu->evtq.iopf);
ida_destroy(&smmu->vmid_map);
}
@@ -5041,8 +5098,14 @@ static void arm_smmu_device_remove(struct platform_device *pdev)
static void arm_smmu_device_shutdown(struct platform_device *pdev)
{
struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
+ int ret;
+
+ ret = arm_smmu_rpm_get(smmu);
+ if (ret < 0)
+ return;
arm_smmu_device_disable(smmu);
+ arm_smmu_rpm_put(smmu);
}
static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index eefa5853033c..8486a2e95b03 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1021,6 +1021,9 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq, u64 *cmds, int n,
bool sync);
+int arm_smmu_rpm_get(struct arm_smmu_device *smmu);
+void arm_smmu_rpm_put(struct arm_smmu_device *smmu);
+
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
void arm_smmu_sva_notifier_synchronize(void);
--
2.52.0.rc1.455.g30608eb744-goog
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v4 2/8] iommu/arm-smmu-v3: Add a helper to drain cmd queues
2025-11-17 19:14 ` [PATCH v4 2/8] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
@ 2025-11-17 19:31 ` Nicolin Chen
2025-11-18 3:48 ` Daniel Mentz
1 sibling, 0 replies; 19+ messages in thread
From: Nicolin Chen @ 2025-11-17 19:31 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Daniel Mentz
On Mon, Nov 17, 2025 at 07:14:27PM +0000, Pranjal Shrivastava wrote:
> Before we suspend SMMU, we want to ensure that all commands (especially
> ATC_INV) have been flushed by the CMDQ, i.e. the CMDQs are empty.
> Add a helper function that polls till the queues are dained.
>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v4 8/8] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2025-11-17 19:14 ` [PATCH v4 8/8] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
@ 2025-11-18 0:14 ` Jason Gunthorpe
2025-11-20 20:26 ` Pranjal Shrivastava
0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2025-11-18 0:14 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Mostafa Saleh,
Nicolin Chen, Daniel Mentz
On Mon, Nov 17, 2025 at 07:14:33PM +0000, Pranjal Shrivastava wrote:
> Invoke the pm_runtime helpers at all places before accessing the hw.
> The idea is to invoke runtime_pm helpers at common points which are
> used by exposed ops or interrupt handlers. Elide all TLB and CFG
> invalidations if the smmu is suspended but not ATC invalidations.
>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
> ---
> .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 10 +-
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 95 +++++++++++++++----
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
> 3 files changed, 91 insertions(+), 17 deletions(-)
I wonder if the cleanup.h macros would improve this?
> @@ -1122,6 +1124,10 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
> break;
> }
>
> + ret = arm_smmu_rpm_get(smmu);
> + if (ret < 0)
> + return;
> +
> arm_smmu_cmdq_issue_cmd(master->smmu, &cmd);
> /*
> * Don't send a SYNC, it doesn't do anything for RESUME or PRI_RESP.
> @@ -1129,6 +1135,7 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
> * terminated... at some point in the future. PRI_RESP is fire and
> * forget.
> */
> + arm_smmu_rpm_put(smmu);
> }
What are the rules for power managemnt? If the SMMU sleeps and GBPA is
set to abort all then how can we have an outstanding concurrent DMA that
needs page fault handling?
ie if we resume the fault and the racily power down the SMMU isn't
there a good chance the transaction will blow up anyhow?
Isn't that a bug we should detect?
Basically, shouldn't some of these be more like 'if not powered up
then something is very wrong so log and fail' ?
> @@ -2152,13 +2171,24 @@ static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
> {
> u32 gerror, gerrorn, active;
> struct arm_smmu_device *smmu = dev;
> + int ret;
> +
> + if (pm_runtime_enabled(smmu->dev)) {
> + ret = pm_runtime_get_if_active(smmu->dev);
> + if (ret == 0) {
> + dev_err(smmu->dev, "Ignoring gerror interrupt because device isn't rpm active\n");
> + return IRQ_NONE;
> + }
> + }
This looks troubled too, if powering down looses the gerror register,
then this scheme is racily going to loose gerror events?
Seems like it should be checking that gerror is indeed 0 before
powering down? Under the assumption there is no concurrent activity
this should be race free? If it is not zero then it should be handled
before it is lost.
> /* IO_PGTABLE API */
> @@ -2377,6 +2422,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
> } else {
> cmd.opcode = CMDQ_OP_TLBI_S12_VMALL;
> cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
> +
> arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
> }
> arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
> @@ -2471,6 +2517,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
> cmd.opcode = CMDQ_OP_TLBI_S2_IPA;
> cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
> }
> +
> __arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
>
> if (smmu_domain->nest_parent) {
> @@ -3312,13 +3359,14 @@ static int arm_smmu_blocking_set_dev_pasid(struct iommu_domain *new_domain,
> sid_domain->type == IOMMU_DOMAIN_BLOCKED)
> sid_domain->ops->attach_dev(sid_domain, dev);
> }
> +
> return 0;
extra new lines??
> -static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
> - struct device *dev,
> - struct arm_smmu_ste *ste,
> - unsigned int s1dss)
> +static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
> + struct device *dev,
> + struct arm_smmu_ste *ste,
> + unsigned int s1dss)
> {
> struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> struct arm_smmu_attach_state state = {
> @@ -3361,6 +3409,7 @@ static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
> * descriptor from arm_smmu_share_asid().
> */
> arm_smmu_clear_cd(master, IOMMU_NO_PASID);
> + return 0;
> }
??
>
> static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
> @@ -3371,8 +3420,8 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
>
> arm_smmu_master_clear_vmaster(master);
> arm_smmu_make_bypass_ste(master->smmu, &ste);
> - arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS);
> - return 0;
> + return arm_smmu_attach_dev_ste(domain, dev, &ste,
> + STRTAB_STE_1_S1DSS_BYPASS);
> }
These things are not allowed to fail, please be careful not to
propogate errors beyond where they should be. This is the biggest
thing I don't like about this series, we already have no error
recovery and now we are adding more cases where errors can't be
propogated.
For instance this call really cannot fail and propogating an error is
nonsense.
If you really couldn't power it up then you are better to keep it
powered down and update the in memory structures knowing that a
powered down device cannot cache them.
Jason
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v4 2/8] iommu/arm-smmu-v3: Add a helper to drain cmd queues
2025-11-17 19:14 ` [PATCH v4 2/8] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
2025-11-17 19:31 ` Nicolin Chen
@ 2025-11-18 3:48 ` Daniel Mentz
2025-11-20 20:45 ` Pranjal Shrivastava
1 sibling, 1 reply; 19+ messages in thread
From: Daniel Mentz @ 2025-11-18 3:48 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen
On Mon, Nov 17, 2025 at 11:15 AM Pranjal Shrivastava <praan@google.com> wrote:
> +int arm_smmu_queue_poll_until_empty(struct arm_smmu_device *smmu,
> + struct arm_smmu_queue *q)
Since this function is only ever used with the Command Queue (cmdq),
could you simplify the signature by obtaining the queue pointer
directly from &smmu->cmdq.q within the function?
> +{
> + struct arm_smmu_queue_poll qp;
> + struct arm_smmu_ll_queue *llq = &q->llq;
> + int ret = 0;
I believe you don't need to initialize this to 0. It won't be used
uninitialized.
> +
> + queue_poll_init(smmu, &qp);
> + do {
> + if (queue_empty(llq))
> + break;
> +
> + ret = queue_poll(&qp);
> + WRITE_ONCE(llq->cons, readl_relaxed(q->cons_reg));
> +
> + } while (!ret);
> +
> + return ret;
> +}
> +
> +static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
> +{
> + struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> + unsigned long flags;
> + int ret;
> +
> + /*
> + * Since this is only called from the suspend callback, we
> + * should be able to acquire the exclusive lock without failing.
> + */
> + arm_smmu_cmdq_exclusive_trylock_irqsave(cmdq, flags);
If you assume that we're the exclusive user and then ignore the return
value of arm_smmu_cmdq_exclusive_trylock_irqsave, I'm wondering if you
can just skip the locking entirely.
> + ret = arm_smmu_queue_poll_until_empty(smmu, &smmu->cmdq.q);
> + arm_smmu_cmdq_exclusive_unlock_irqrestore(cmdq, flags);
> +
> + if (ret)
> + return ret;
> +
> + return 0;
Can you just always "return ret;". Wouldn't that be the same?
> +}
> +
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v4 5/8] iommu/arm-smmu-v3: Add a usage counter for cmdq
2025-11-17 19:14 ` [PATCH v4 5/8] iommu/arm-smmu-v3: Add a usage counter for cmdq Pranjal Shrivastava
@ 2025-11-18 6:05 ` Sairaj Kodilkar
2025-11-20 20:48 ` Pranjal Shrivastava
0 siblings, 1 reply; 19+ messages in thread
From: Sairaj Kodilkar @ 2025-11-18 6:05 UTC (permalink / raw)
To: Pranjal Shrivastava, iommu
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz
On 11/18/2025 12:44 AM, Pranjal Shrivastava wrote:
> Introduce a biased counter to track the number of active cmdq owners as
> a preparatory step for the runtime PM implementation.
>
> The counter will be used to gate command submission, preventing the
> submission of new commands while the device is suspended and deferring
> suspend while the command submissions are in-flight.
>
> The counter is biased to a value of 1 during device reset. A cmdq owner
> or a thread issuing cmds with sync, increment it before accessing HW
> registers and decrements it with release semantics afterwards.
>
> A value of 1 represents an idle (but active) state. A suspend operation
> will set it to from 1 -> 0 representing the suspended state.
>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 69 +++++++++++++++++----
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
> 2 files changed, 61 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 1d6c60bee7dd..d6e75d1646d6 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -794,7 +794,7 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> u64 cmd_sync[CMDQ_ENT_DWORDS];
> u32 prod;
> unsigned long flags;
> - bool owner;
> + bool owner, has_ref = false;
> struct arm_smmu_ll_queue llq, head;
> int ret = 0;
>
> @@ -808,8 +808,15 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
>
> while (!queue_has_space(&llq, n + sync)) {
> local_irq_restore(flags);
> +
> + if (!atomic_inc_not_zero(&smmu->nr_cmdq_users))
> + /* Device is suspended, don't wait for space */
> + return 0;
> +
> if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
> dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
> +
> + atomic_dec_return_release(&smmu->nr_cmdq_users);
> local_irq_save(flags);
> }
>
> @@ -868,10 +875,35 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> arm_smmu_cmdq_poll_valid_map(cmdq, llq.prod, prod);
>
> /*
> - * d. Advance the hardware prod pointer
> + * d. Advance the hardware prod pointer (if smmu is still active)
> * Control dependency ordering from the entries becoming valid.
> */
> - writel_relaxed(prod, cmdq->q.prod_reg);
> + if (atomic_inc_not_zero(&smmu->nr_cmdq_users)) {
> + writel_relaxed(prod, cmdq->q.prod_reg);
> +
> + if (sync) {
> + has_ref = true;
> + } else {
> + /*
> + * Use release semantics to enforce ordering without a full barrier.
> + * This ensures the prior writel_relaxed() is ordered/visible
> + * before the refcount decrement, avoiding the heavy pipeline
> + * stall of a full wmb().
> + *
> + * We need the atomic_dec_return_release() below and the
> + * atomic_set_release() in step (e) below doesn't suffice.
> + *
> + * Specifically, without release semantics on the decrement,
> + * the CPU is free to reorder the independent atomic_dec_relaxed()
> + * before the writel_relaxed().
> + *
> + * If this happens, the refcount could drop to zero, allowing the PM
> + * suspend path (running on another CPU) to disable the SMMU before
> + * the register write completes, resulting in a bus fault.
> + */
> + atomic_dec_return_release(&smmu->nr_cmdq_users);
> + }
> + }
>
> /*
> * e. Tell the next owner we're done
> @@ -883,14 +915,23 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
>
> /* 5. If we are inserting a CMD_SYNC, we must wait for it to complete */
> if (sync) {
> - llq.prod = queue_inc_prod_n(&llq, n);
> - ret = arm_smmu_cmdq_poll_until_sync(smmu, cmdq, &llq);
> - if (ret) {
> - dev_err_ratelimited(smmu->dev,
> - "CMD_SYNC timeout at 0x%08x [hwprod 0x%08x, hwcons 0x%08x]\n",
> - llq.prod,
> - readl_relaxed(cmdq->q.prod_reg),
> - readl_relaxed(cmdq->q.cons_reg));
> +
> + /* If we are not the owner, check if we're suspended */
> + if (!has_ref) {
> + if (atomic_inc_not_zero(&smmu->nr_cmdq_users))
> + has_ref = true;
> + }
> +
> + if (has_ref) {
You can merge above two if condition as follow
if (has_ref || atomic_inc_not_zero(&smmu->nr_cmdq_users)) {
has_ref = true;
....
}
Thanks
Sairaj
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v4 8/8] iommu/arm-smmu-v3: Invoke pm_runtime before hw access
2025-11-18 0:14 ` Jason Gunthorpe
@ 2025-11-20 20:26 ` Pranjal Shrivastava
0 siblings, 0 replies; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-20 20:26 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Mostafa Saleh,
Nicolin Chen, Daniel Mentz
On Mon, Nov 17, 2025 at 08:14:39PM -0400, Jason Gunthorpe wrote:
> On Mon, Nov 17, 2025 at 07:14:33PM +0000, Pranjal Shrivastava wrote:
> > Invoke the pm_runtime helpers at all places before accessing the hw.
> > The idea is to invoke runtime_pm helpers at common points which are
> > used by exposed ops or interrupt handlers. Elide all TLB and CFG
> > invalidations if the smmu is suspended but not ATC invalidations.
> >
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
> > ---
> > .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 10 +-
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 95 +++++++++++++++----
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
> > 3 files changed, 91 insertions(+), 17 deletions(-)
>
> I wonder if the cleanup.h macros would improve this?
>
> > @@ -1122,6 +1124,10 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
> > break;
> > }
> >
> > + ret = arm_smmu_rpm_get(smmu);
> > + if (ret < 0)
> > + return;
> > +
> > arm_smmu_cmdq_issue_cmd(master->smmu, &cmd);
> > /*
> > * Don't send a SYNC, it doesn't do anything for RESUME or PRI_RESP.
> > @@ -1129,6 +1135,7 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
> > * terminated... at some point in the future. PRI_RESP is fire and
> > * forget.
> > */
> > + arm_smmu_rpm_put(smmu);
> > }
>
> What are the rules for power managemnt? If the SMMU sleeps and GBPA is
> set to abort all then how can we have an outstanding concurrent DMA that
> needs page fault handling?
>
> ie if we resume the fault and the racily power down the SMMU isn't
> there a good chance the transaction will blow up anyhow?
>
> Isn't that a bug we should detect?
>
> Basically, shouldn't some of these be more like 'if not powered up
> then something is very wrong so log and fail' ?
>
Yes, I think I assumed that this thread won't run while we're powered
off which was wrong. I'll add a check to confirm we're powered up. The
RPM calls are just to ensure that we don't suspend while issuing the
resume.
> > @@ -2152,13 +2171,24 @@ static irqreturn_t arm_smmu_gerror_handler(int irq, void *dev)
> > {
> > u32 gerror, gerrorn, active;
> > struct arm_smmu_device *smmu = dev;
> > + int ret;
> > +
> > + if (pm_runtime_enabled(smmu->dev)) {
> > + ret = pm_runtime_get_if_active(smmu->dev);
> > + if (ret == 0) {
> > + dev_err(smmu->dev, "Ignoring gerror interrupt because device isn't rpm active\n");
> > + return IRQ_NONE;
> > + }
> > + }
>
> This looks troubled too, if powering down looses the gerror register,
> then this scheme is racily going to loose gerror events?
>
> Seems like it should be checking that gerror is indeed 0 before
> powering down? Under the assumption there is no concurrent activity
> this should be race free? If it is not zero then it should be handled
> before it is lost.
Yes, I believe we can check it after disabling the CMDQ and log the
events.
>
> > /* IO_PGTABLE API */
> > @@ -2377,6 +2422,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
> > } else {
> > cmd.opcode = CMDQ_OP_TLBI_S12_VMALL;
> > cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
> > +
> > arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);> > }
> > arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
> > @@ -2471,6 +2517,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
> > cmd.opcode = CMDQ_OP_TLBI_S2_IPA;
> > cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
> > }
> > +
> > __arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
> >
> > if (smmu_domain->nest_parent) {
> > @@ -3312,13 +3359,14 @@ static int arm_smmu_blocking_set_dev_pasid(struct iommu_domain *new_domain,
> > sid_domain->type == IOMMU_DOMAIN_BLOCKED)
> > sid_domain->ops->attach_dev(sid_domain, dev);
> > }
> > +
> > return 0;
>
> extra new lines??
>
Ughh, my bad, I'll fix this, all of these are just some rebase-remnants
> > -static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
> > - struct device *dev,
> > - struct arm_smmu_ste *ste,
> > - unsigned int s1dss)
> > +static int arm_smmu_attach_dev_ste(struct iommu_domain *domain,
> > + struct device *dev,
> > + struct arm_smmu_ste *ste,
> > + unsigned int s1dss)
> > {
> > struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > struct arm_smmu_attach_state state = {
> > @@ -3361,6 +3409,7 @@ static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
> > * descriptor from arm_smmu_share_asid().
> > */
> > arm_smmu_clear_cd(master, IOMMU_NO_PASID);
> > + return 0;
> > }
>
> ??
>
> >
> > static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
> > @@ -3371,8 +3420,8 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
> >
> > arm_smmu_master_clear_vmaster(master);
> > arm_smmu_make_bypass_ste(master->smmu, &ste);
> > - arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS);
> > - return 0;
> > + return arm_smmu_attach_dev_ste(domain, dev, &ste,
> > + STRTAB_STE_1_S1DSS_BYPASS);
> > }
>
> These things are not allowed to fail, please be careful not to
> propogate errors beyond where they should be. This is the biggest
> thing I don't like about this series, we already have no error
> recovery and now we are adding more cases where errors can't be
> propogated.
>
> For instance this call really cannot fail and propogating an error is
> nonsense.
>
> If you really couldn't power it up then you are better to keep it
> powered down and update the in memory structures knowing that a
> powered down device cannot cache them.
>
Ughh, sorry, these are just some rebase-remnants. I'll fix them.
> Jason
>
Thanks!
Praan
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v4 2/8] iommu/arm-smmu-v3: Add a helper to drain cmd queues
2025-11-18 3:48 ` Daniel Mentz
@ 2025-11-20 20:45 ` Pranjal Shrivastava
0 siblings, 0 replies; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-20 20:45 UTC (permalink / raw)
To: Daniel Mentz
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen
On Mon, Nov 17, 2025 at 07:48:12PM -0800, Daniel Mentz wrote:
> On Mon, Nov 17, 2025 at 11:15 AM Pranjal Shrivastava <praan@google.com> wrote:
> > +int arm_smmu_queue_poll_until_empty(struct arm_smmu_device *smmu,
> > + struct arm_smmu_queue *q)
>
> Since this function is only ever used with the Command Queue (cmdq),
> could you simplify the signature by obtaining the queue pointer
> directly from &smmu->cmdq.q within the function?
>
Ack. I'll pick up the cmdq from the smmu ptr.
> > +{
> > + struct arm_smmu_queue_poll qp;
> > + struct arm_smmu_ll_queue *llq = &q->llq;
> > + int ret = 0;
>
> I believe you don't need to initialize this to 0. It won't be used
> uninitialized.
>
Ack.
> > +
> > + queue_poll_init(smmu, &qp);
> > + do {
> > + if (queue_empty(llq))
> > + break;
> > +
> > + ret = queue_poll(&qp);
> > + WRITE_ONCE(llq->cons, readl_relaxed(q->cons_reg));
> > +
> > + } while (!ret);
> > +
> > + return ret;
> > +}
> > +
> > +static int arm_smmu_drain_queues(struct arm_smmu_device *smmu)
> > +{
> > + struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> > + unsigned long flags;
> > + int ret;
> > +
> > + /*
> > + * Since this is only called from the suspend callback, we
> > + * should be able to acquire the exclusive lock without failing.
> > + */
> > + arm_smmu_cmdq_exclusive_trylock_irqsave(cmdq, flags);
>
> If you assume that we're the exclusive user and then ignore the return
> value of arm_smmu_cmdq_exclusive_trylock_irqsave, I'm wondering if you
> can just skip the locking entirely.
>
Correct, with this approach we can be sure that we are the only owner as
we'll be executing this much after setting the nr_users = 0;
> > + ret = arm_smmu_queue_poll_until_empty(smmu, &smmu->cmdq.q);
> > + arm_smmu_cmdq_exclusive_unlock_irqrestore(cmdq, flags);
> > +
> > + if (ret)
> > + return ret;
> > +
> > + return 0;
>
> Can you just always "return ret;". Wouldn't that be the same?
>
We can, since we're suspending anyway even if this times out.
> > +}
> > +
Thanks,
Praan
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v4 5/8] iommu/arm-smmu-v3: Add a usage counter for cmdq
2025-11-18 6:05 ` Sairaj Kodilkar
@ 2025-11-20 20:48 ` Pranjal Shrivastava
0 siblings, 0 replies; 19+ messages in thread
From: Pranjal Shrivastava @ 2025-11-20 20:48 UTC (permalink / raw)
To: Sairaj Kodilkar
Cc: iommu, Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz
On Tue, Nov 18, 2025 at 11:35:01AM +0530, Sairaj Kodilkar wrote:
> On 11/18/2025 12:44 AM, Pranjal Shrivastava wrote:
> > Introduce a biased counter to track the number of active cmdq owners as
> > a preparatory step for the runtime PM implementation.
> >
> > The counter will be used to gate command submission, preventing the
> > submission of new commands while the device is suspended and deferring
> > suspend while the command submissions are in-flight.
> >
> > The counter is biased to a value of 1 during device reset. A cmdq owner
> > or a thread issuing cmds with sync, increment it before accessing HW
> > registers and decrements it with release semantics afterwards.
> >
> > A value of 1 represents an idle (but active) state. A suspend operation
> > will set it to from 1 -> 0 representing the suspended state.
> >
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
> > ---
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 69 +++++++++++++++++----
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
> > 2 files changed, 61 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 1d6c60bee7dd..d6e75d1646d6 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -794,7 +794,7 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> > u64 cmd_sync[CMDQ_ENT_DWORDS];
> > u32 prod;
> > unsigned long flags;
> > - bool owner;
> > + bool owner, has_ref = false;
> > struct arm_smmu_ll_queue llq, head;
> > int ret = 0;
> > @@ -808,8 +808,15 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> > while (!queue_has_space(&llq, n + sync)) {
> > local_irq_restore(flags);
> > +
> > + if (!atomic_inc_not_zero(&smmu->nr_cmdq_users))
> > + /* Device is suspended, don't wait for space */
> > + return 0;
> > +
> > if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
> > dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
> > +
> > + atomic_dec_return_release(&smmu->nr_cmdq_users);
> > local_irq_save(flags);
> > }
> > @@ -868,10 +875,35 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> > arm_smmu_cmdq_poll_valid_map(cmdq, llq.prod, prod);
> > /*
> > - * d. Advance the hardware prod pointer
> > + * d. Advance the hardware prod pointer (if smmu is still active)
> > * Control dependency ordering from the entries becoming valid.
> > */
> > - writel_relaxed(prod, cmdq->q.prod_reg);
> > + if (atomic_inc_not_zero(&smmu->nr_cmdq_users)) {
> > + writel_relaxed(prod, cmdq->q.prod_reg);
> > +
> > + if (sync) {
> > + has_ref = true;
> > + } else {
> > + /*
> > + * Use release semantics to enforce ordering without a full barrier.
> > + * This ensures the prior writel_relaxed() is ordered/visible
> > + * before the refcount decrement, avoiding the heavy pipeline
> > + * stall of a full wmb().
> > + *
> > + * We need the atomic_dec_return_release() below and the
> > + * atomic_set_release() in step (e) below doesn't suffice.
> > + *
> > + * Specifically, without release semantics on the decrement,
> > + * the CPU is free to reorder the independent atomic_dec_relaxed()
> > + * before the writel_relaxed().
> > + *
> > + * If this happens, the refcount could drop to zero, allowing the PM
> > + * suspend path (running on another CPU) to disable the SMMU before
> > + * the register write completes, resulting in a bus fault.
> > + */
> > + atomic_dec_return_release(&smmu->nr_cmdq_users);
> > + }
> > + }
> > /*
> > * e. Tell the next owner we're done
> > @@ -883,14 +915,23 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> > /* 5. If we are inserting a CMD_SYNC, we must wait for it to complete */
> > if (sync) {
> > - llq.prod = queue_inc_prod_n(&llq, n);
> > - ret = arm_smmu_cmdq_poll_until_sync(smmu, cmdq, &llq);
> > - if (ret) {
> > - dev_err_ratelimited(smmu->dev,
> > - "CMD_SYNC timeout at 0x%08x [hwprod 0x%08x, hwcons 0x%08x]\n",
> > - llq.prod,
> > - readl_relaxed(cmdq->q.prod_reg),
> > - readl_relaxed(cmdq->q.cons_reg));
> > +
> > + /* If we are not the owner, check if we're suspended */
> > + if (!has_ref) {
> > + if (atomic_inc_not_zero(&smmu->nr_cmdq_users))
> > + has_ref = true;
> > + }
> > +
> > + if (has_ref) {
>
> You can merge above two if condition as follow
>
> if (has_ref || atomic_inc_not_zero(&smmu->nr_cmdq_users)) {
> has_ref = true;
> ....
> }
>
Hmm.. yes, I guess redundantly setting has_ref = true again is fine..
> Thanks
> Sairaj
>
Thanks,
Praan
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v4 6/8] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2025-11-17 19:14 ` [PATCH v4 6/8] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
@ 2025-12-24 7:39 ` Ashish Mhetre
2026-01-12 8:50 ` Pranjal Shrivastava
0 siblings, 1 reply; 19+ messages in thread
From: Ashish Mhetre @ 2025-12-24 7:39 UTC (permalink / raw)
To: Pranjal Shrivastava, iommu, Nicolin Chen
Cc: Will Deacon, Joerg Roedel, Robin Murphy, Jason Gunthorpe,
Mostafa Saleh, Nicolin Chen, Daniel Mentz, amhetre
On 11/18/2025 12:44 AM, Pranjal Shrivastava wrote:
> Implement pm_runtime and system sleep ops for arm-smmu-v3.
>
> The suspend callback configures the SMMU to abort new transactions,
> disables the main translation unit and then drains the command queue
> to ensure completion of any in-flight commands.
>
> The resume callback restores the MSI configuration and performs a full
> device reset via `arm_smmu_device_reset` to bring the SMMU back to an
> operational state. The MSIs are cached during the msi_write and are
> restored during the resume operation by using the helper.
>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 109 ++++++++++++++++++++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
> 2 files changed, 112 insertions(+)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index d6e75d1646d6..44875c526183 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -26,6 +26,7 @@
> #include <linux/pci.h>
> #include <linux/pci-ats.h>
> #include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
> #include <linux/string_choices.h>
> #include <kunit/visibility.h>
> #include <uapi/linux/iommufd.h>
> @@ -108,6 +109,33 @@ static const char * const event_class_str[] = {
>
> static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
>
> +/* Runtime PM helpers */
> +__maybe_unused static int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
> +{
> + int ret;
> +
> + if (pm_runtime_enabled(smmu->dev)) {
> + ret = pm_runtime_resume_and_get(smmu->dev);
> + if (ret < 0) {
> + dev_err(smmu->dev, "Failed to resume device: %d\n", ret);
> + return ret;
> + }
> + }
> +
> + return 0;
> +}
> +
> +__maybe_unused static void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
> +{
> + int ret;
> +
> + if (pm_runtime_enabled(smmu->dev)) {
> + ret = pm_runtime_put_autosuspend(smmu->dev);
> + if (ret < 0)
> + dev_err(smmu->dev, "Failed to suspend device: %d\n", ret);
> + }
> +}
> +
> static void parse_driver_options(struct arm_smmu_device *smmu)
> {
> int i = 0;
> @@ -5005,6 +5033,86 @@ static void arm_smmu_device_shutdown(struct platform_device *pdev)
> arm_smmu_device_disable(smmu);
> }
>
> +static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
> +{
> + struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> + int timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
> + u32 enables;
> + int ret;
> +
> + /* Try to suspend the device, wait for in-flight submissions */
> + do {
> + if (atomic_cmpxchg(&smmu->nr_cmdq_users, 1, 0) == 1)
> + break;
> +
> + udelay(1);
> + } while (--timeout);
> +
> + if (!timeout) {
> + dev_warn(smmu->dev, "SMMU in use, aborting suspend\n");
> + return -EAGAIN;
> + }
> +
> + /* Abort all transactions before disable to avoid spurious bypass */
> + arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> +
> + /* Disable the SMMU via CR0.EN and all queues except CMDQ */
> + enables = CR0_CMDQEN;
> + ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0, ARM_SMMU_CR0ACK);
> + if (ret) {
> + dev_err(smmu->dev, "Timed-out while disabling smmu\n");
> + atomic_set(&smmu->nr_cmdq_users, 1);
> + return ret;
> + }
> +
> + /*
> + * At this point the SMMU is completely disabled and won't access
> + * any translation/config structures, even speculative accesses
> + * aren't performed as per the IHI0070 spec (section 6.3.9.6).
> + */
> +
> + /* Wait for the CMDQs to be drained to flush any pending commands */
> + ret = arm_smmu_drain_queues(smmu);
> + if (ret)
> + dev_err(smmu->dev, "Draining queues timed-out..forcing suspend\n");
> +
> + /* Disable everything */
> + arm_smmu_device_disable(smmu);
> + dev_dbg(dev, "Suspended smmu\n");
> +
> + return 0;
> +}
> +
> +static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
> +{
> + int ret;
> + struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> +
> + dev_dbg(dev, "Resuming device\n");
> +
> + /* Re-configure MSIs */
> + arm_smmu_resume_msis(smmu);
> +
> + /*
> + * The reset will re-initialize all the base addresses, queues,
> + * prod and cons maintained within struct arm_smmu_device as well as
> + * re-enable the interrupts.
> + */
> + ret = arm_smmu_device_reset(smmu);
> +
> + if (ret)
> + dev_err(dev, "Failed to reset during resume operation: %d\n", ret);
> +
> + return ret;
> +}
> +
> +static const struct dev_pm_ops arm_smmu_pm_ops = {
> + SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
> + pm_runtime_force_resume)
> + SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
> + arm_smmu_runtime_resume, NULL)
> +};
> +
Hi Pranjal, Nic,
I tested these patches on Tegra264 with CMDQV enabled and 16 vcmdqs
assigned to guest. I found an issue in resume path of CMDQV.
In tegra241_cmdqv_hw_reset() function, the PROD and CONS pointers for
VCMDQs are being set to 0 on resume. Because of this, commands from
VCMDQs we not being consumed post resume and I got CMD_SYNC timeouts.
We need to restore the prod and cons indices for VCMDQs on resume.
This is similar to what is done for SMMU's physical CMDQ.
Also, current implementation uses pm_runtime_force_suspend/resume()
which requires runtime PM to be enabled, but runtime PM is only enabled
when dev->pm_domain exists. I had to use dedicated system sleep callbacks
that directly call arm_smmu_runtime_suspend/resume(), bypassing the runtime
PM dependency to get suspend/resume working on Tegra264.
I got it working by making following changes on top of your series and
validated that after resume all commands are being consumed and SMMU clients
are working fine:
From 8ead50a7e2fbdfa30fbe2c927c47a440b447c864 Mon Sep 17 00:00:00 2001
From: Ashish Mhetre<amhetre@nvidia.com>
Date: Tue, 23 Dec 2025 08:42:31 +0000
Subject: [PATCH] iommu/tegra241-cmdqv: Restore PROD and CONS after resume
X-NVConfidentiality: public
PROD and CONS indices for vcmdqs are getting set to 0 after resume.
Because of this the vcmdq is not consuming commands after resume.
Fix this by restoring PROD and CONS indices after resume from
saved pointers.
Signed-off-by: Ashish Mhetre<amhetre@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index 378104cd395e..00ec684fe3a4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -483,6 +483,8 @@ static int tegra241_vcmdq_hw_init(struct tegra241_vcmdq *vcmdq)
/* Configure and enable VCMDQ */
writeq_relaxed(vcmdq->cmdq.q.q_base, REG_VCMDQ_PAGE1(vcmdq, BASE));
+ writel_relaxed(vcmdq->cmdq.q.llq.prod, REG_VCMDQ_PAGE0(vcmdq, PROD));
+ writel_relaxed(vcmdq->cmdq.q.llq.cons, REG_VCMDQ_PAGE0(vcmdq, CONS));
ret = vcmdq_write_config(vcmdq, VCMDQ_EN);
if (ret) {
--
2.25.1
From 51a11812bb40d341e11233129a01ce248957bd25 Mon Sep 17 00:00:00 2001
From: Ashish Mhetre <amhetre@nvidia.com>
Date: Tue, 23 Dec 2025 09:30:04 +0000
Subject: [PATCH] iommu/arm-smmu-v3: Fix system suspend/resume when
runtime PM
is not enabled
X-NVConfidentiality: public The current implementation uses
pm_runtime_force_suspend() and
pm_runtime_force_resume() as system sleep callbacks. These generic PM
helpers are designed to bridge runtime PM with system sleep by forcing
the device through the runtime PM path.
However, these helpers only work correctly when runtime PM is enabled
for the device. In arm_smmu_device_probe(), runtime PM is conditionally
enabled:
if (dev->pm_domain) {
pm_runtime_set_active(dev);
pm_runtime_enable(dev);
}
On platforms where the SMMU does not have an associated power domain,
pm_runtime_enable() is never called. As a result, when the system
enters suspend, pm_runtime_force_suspend() effectively becomes a
no-op, it does not invoke arm_smmu_runtime_suspend(), leaving the
SMMU hardware in an undefined state during system sleep. Fix this by
introducing dedicated system sleep callbacks
(arm_smmu_pm_suspend/resume) that directly invoke the existing runtime
suspend/resume functions. This ensures the SMMU is properly suspended
and resumed during system sleep, regardless of whether runtime PM is
enabled.
The pm_runtime_suspended() check ensures we don't double-suspend the
device if it was already suspended via runtime PM during normal
operation. Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 +++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b1c9c733ed8d..4707adc63d86 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -5180,9 +5180,35 @@ static int __maybe_unused
arm_smmu_runtime_resume(struct device *dev)
return ret;
}
+static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
+{
+ if (pm_runtime_suspended(dev))
+ return 0;
+
+ return arm_smmu_runtime_resume(dev);
+}
+
+static int __maybe_unused arm_smmu_pm_suspend(struct device *dev)
+{
+ if (pm_runtime_suspended(dev))
+ return 0;
+
+ return arm_smmu_runtime_suspend(dev);
+}
+
static const struct dev_pm_ops arm_smmu_pm_ops = {
- SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
- pm_runtime_force_resume)
+ SET_SYSTEM_SLEEP_PM_OPS(arm_smmu_pm_suspend,
+ arm_smmu_pm_resume)
SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
arm_smmu_runtime_resume, NULL)
};
--
2.25.1
Please see if these changes make sense and squash to the series if they do.
> static const struct of_device_id arm_smmu_of_match[] = {
> { .compatible = "arm,smmu-v3", },
> { },
> @@ -5021,6 +5129,7 @@ static struct platform_driver arm_smmu_driver = {
> .driver = {
> .name = "arm-smmu-v3",
> .of_match_table = arm_smmu_of_match,
> + .pm = &arm_smmu_pm_ops,
> .suppress_bind_attrs = true,
> },
> .probe = arm_smmu_device_probe,
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 924580610ce0..eefa5853033c 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -503,11 +503,14 @@ static inline unsigned int arm_smmu_cdtab_l2_idx(unsigned int ssid)
>
> /* High-level queue structures */
> #define ARM_SMMU_POLL_TIMEOUT_US 1000000 /* 1s! */
> +#define ARM_SMMU_SUSPEND_TIMEOUT_US 100 /* 100us! */
> #define ARM_SMMU_POLL_SPIN_COUNT 10
>
> #define MSI_IOVA_BASE 0x8000000
> #define MSI_IOVA_LENGTH 0x100000
>
> +#define RPM_AUTOSUSPEND_DELAY_MS 15
> +
> enum pri_resp {
> PRI_RESP_DENY = 0,
> PRI_RESP_FAIL = 1,
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v4 6/8] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2025-12-24 7:39 ` Ashish Mhetre
@ 2026-01-12 8:50 ` Pranjal Shrivastava
2026-01-13 5:06 ` Ashish Mhetre
0 siblings, 1 reply; 19+ messages in thread
From: Pranjal Shrivastava @ 2026-01-12 8:50 UTC (permalink / raw)
To: Ashish Mhetre
Cc: iommu, Nicolin Chen, Will Deacon, Joerg Roedel, Robin Murphy,
Jason Gunthorpe, Mostafa Saleh, Daniel Mentz, amhetre
On Wed, Dec 24, 2025 at 01:09:32PM +0530, Ashish Mhetre wrote:
>
>
> On 11/18/2025 12:44 AM, Pranjal Shrivastava wrote:
> > Implement pm_runtime and system sleep ops for arm-smmu-v3.
> >
> > The suspend callback configures the SMMU to abort new transactions,
> > disables the main translation unit and then drains the command queue
> > to ensure completion of any in-flight commands.
> >
> > The resume callback restores the MSI configuration and performs a full
> > device reset via `arm_smmu_device_reset` to bring the SMMU back to an
> > operational state. The MSIs are cached during the msi_write and are
> > restored during the resume operation by using the helper.
> >
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
> > ---
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 109 ++++++++++++++++++++
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
> > 2 files changed, 112 insertions(+)
> >
>
> Hi Pranjal, Nic,
>
> I tested these patches on Tegra264 with CMDQV enabled and 16 vcmdqs
> assigned to guest. I found an issue in resume path of CMDQV.
> In tegra241_cmdqv_hw_reset() function, the PROD and CONS pointers for
> VCMDQs are being set to 0 on resume. Because of this, commands from
> VCMDQs we not being consumed post resume and I got CMD_SYNC timeouts.
> We need to restore the prod and cons indices for VCMDQs on resume.
> This is similar to what is done for SMMU's physical CMDQ.
Hi Ashish,
Thanks for testing this with Tegra, I'll include this patch for CMDQV.
> Also, current implementation uses pm_runtime_force_suspend/resume()
> which requires runtime PM to be enabled, but runtime PM is only enabled
> when dev->pm_domain exists. I had to use dedicated system sleep callbacks
> that directly call arm_smmu_runtime_suspend/resume(), bypassing the runtime
> PM dependency to get suspend/resume working on Tegra264.
>
Thanks for pointing this out, I'll revert to the approach I had in v1:
https://lore.kernel.org/all/20250319004254.2547950-4-praan@google.com/
> I got it working by making following changes on top of your series and
> validated that after resume all commands are being consumed and SMMU clients
> are working fine:
>
>
> From 8ead50a7e2fbdfa30fbe2c927c47a440b447c864 Mon Sep 17 00:00:00 2001
> From: Ashish Mhetre<amhetre@nvidia.com>
> Date: Tue, 23 Dec 2025 08:42:31 +0000
> Subject: [PATCH] iommu/tegra241-cmdqv: Restore PROD and CONS after resume
> X-NVConfidentiality: public
>
> PROD and CONS indices for vcmdqs are getting set to 0 after resume.
> Because of this the vcmdq is not consuming commands after resume.
> Fix this by restoring PROD and CONS indices after resume from
> saved pointers.
>
> Signed-off-by: Ashish Mhetre<amhetre@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> index 378104cd395e..00ec684fe3a4 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
> @@ -483,6 +483,8 @@ static int tegra241_vcmdq_hw_init(struct tegra241_vcmdq *vcmdq)
>
> /* Configure and enable VCMDQ */
> writeq_relaxed(vcmdq->cmdq.q.q_base, REG_VCMDQ_PAGE1(vcmdq, BASE));
> + writel_relaxed(vcmdq->cmdq.q.llq.prod, REG_VCMDQ_PAGE0(vcmdq, PROD));
> + writel_relaxed(vcmdq->cmdq.q.llq.cons, REG_VCMDQ_PAGE0(vcmdq, CONS));
>
> ret = vcmdq_write_config(vcmdq, VCMDQ_EN);
> if (ret) {
> --
> 2.25.1
>
> From 51a11812bb40d341e11233129a01ce248957bd25 Mon Sep 17 00:00:00 2001
> From: Ashish Mhetre <amhetre@nvidia.com>
> Date: Tue, 23 Dec 2025 09:30:04 +0000
> Subject: [PATCH] iommu/arm-smmu-v3: Fix system suspend/resume when runtime
> PM
> is not enabled
> X-NVConfidentiality: public The current implementation uses
> pm_runtime_force_suspend() and
> pm_runtime_force_resume() as system sleep callbacks. These generic PM
> helpers are designed to bridge runtime PM with system sleep by forcing
> the device through the runtime PM path.
> However, these helpers only work correctly when runtime PM is enabled
> for the device. In arm_smmu_device_probe(), runtime PM is conditionally
> enabled:
> if (dev->pm_domain) {
> pm_runtime_set_active(dev);
> pm_runtime_enable(dev);
> }
> On platforms where the SMMU does not have an associated power domain,
> pm_runtime_enable() is never called. As a result, when the system
> enters suspend, pm_runtime_force_suspend() effectively becomes a
> no-op, it does not invoke arm_smmu_runtime_suspend(), leaving the
> SMMU hardware in an undefined state during system sleep. Fix this by
> introducing dedicated system sleep callbacks
> (arm_smmu_pm_suspend/resume) that directly invoke the existing runtime
> suspend/resume functions. This ensures the SMMU is properly suspended
> and resumed during system sleep, regardless of whether runtime PM is
> enabled.
> The pm_runtime_suspended() check ensures we don't double-suspend the
> device if it was already suspended via runtime PM during normal
> operation. Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 +++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index b1c9c733ed8d..4707adc63d86 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -5180,9 +5180,35 @@ static int __maybe_unused
> arm_smmu_runtime_resume(struct device *dev)
> return ret;
> }
> +static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
> +{
> + if (pm_runtime_suspended(dev))
> + return 0;
> +
> + return arm_smmu_runtime_resume(dev);
> +}
> +
> +static int __maybe_unused arm_smmu_pm_suspend(struct device *dev)
> +{
> + if (pm_runtime_suspended(dev))
> + return 0;
> +
> + return arm_smmu_runtime_suspend(dev);
> +}
> +
> static const struct dev_pm_ops arm_smmu_pm_ops = {
> - SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
> - pm_runtime_force_resume)
> + SET_SYSTEM_SLEEP_PM_OPS(arm_smmu_pm_suspend,
> + arm_smmu_pm_resume)
> SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
> arm_smmu_runtime_resume, NULL)
> };
> --
> 2.25.1
>
>
>
>
> Please see if these changes make sense and squash to the series if they do.
>
I'll include the CMDQV patch from you as is in my next version and
revert to use the SLEEP_PM_OPS as in my v1. It'd be nice to have the
`Tested-by` from you for the next version too!
Thanks!
Praan
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v4 6/8] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
2026-01-12 8:50 ` Pranjal Shrivastava
@ 2026-01-13 5:06 ` Ashish Mhetre
0 siblings, 0 replies; 19+ messages in thread
From: Ashish Mhetre @ 2026-01-13 5:06 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, Nicolin Chen, Will Deacon, Joerg Roedel, Robin Murphy,
Jason Gunthorpe, Mostafa Saleh, Daniel Mentz
On 1/12/2026 2:20 PM, Pranjal Shrivastava wrote:
> External email: Use caution opening links or attachments
>
>
> On Wed, Dec 24, 2025 at 01:09:32PM +0530, Ashish Mhetre wrote:
>>
>> On 11/18/2025 12:44 AM, Pranjal Shrivastava wrote:
>>> Implement pm_runtime and system sleep ops for arm-smmu-v3.
>>>
>>> The suspend callback configures the SMMU to abort new transactions,
>>> disables the main translation unit and then drains the command queue
>>> to ensure completion of any in-flight commands.
>>>
>>> The resume callback restores the MSI configuration and performs a full
>>> device reset via `arm_smmu_device_reset` to bring the SMMU back to an
>>> operational state. The MSIs are cached during the msi_write and are
>>> restored during the resume operation by using the helper.
>>>
>>> Signed-off-by: Pranjal Shrivastava <praan@google.com>
>>> ---
>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 109 ++++++++++++++++++++
>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +
>>> 2 files changed, 112 insertions(+)
>>>
>> Hi Pranjal, Nic,
>>
>> I tested these patches on Tegra264 with CMDQV enabled and 16 vcmdqs
>> assigned to guest. I found an issue in resume path of CMDQV.
>> In tegra241_cmdqv_hw_reset() function, the PROD and CONS pointers for
>> VCMDQs are being set to 0 on resume. Because of this, commands from
>> VCMDQs we not being consumed post resume and I got CMD_SYNC timeouts.
>> We need to restore the prod and cons indices for VCMDQs on resume.
>> This is similar to what is done for SMMU's physical CMDQ.
> Hi Ashish,
>
> Thanks for testing this with Tegra, I'll include this patch for CMDQV.
>
>> Also, current implementation uses pm_runtime_force_suspend/resume()
>> which requires runtime PM to be enabled, but runtime PM is only enabled
>> when dev->pm_domain exists. I had to use dedicated system sleep callbacks
>> that directly call arm_smmu_runtime_suspend/resume(), bypassing the runtime
>> PM dependency to get suspend/resume working on Tegra264.
>>
> Thanks for pointing this out, I'll revert to the approach I had in v1:
> https://lore.kernel.org/all/20250319004254.2547950-4-praan@google.com/
>
>> I got it working by making following changes on top of your series and
>> validated that after resume all commands are being consumed and SMMU clients
>> are working fine:
>>
>>
>> From 8ead50a7e2fbdfa30fbe2c927c47a440b447c864 Mon Sep 17 00:00:00 2001
>> From: Ashish Mhetre<amhetre@nvidia.com>
>> Date: Tue, 23 Dec 2025 08:42:31 +0000
>> Subject: [PATCH] iommu/tegra241-cmdqv: Restore PROD and CONS after resume
>> X-NVConfidentiality: public
>>
>> PROD and CONS indices for vcmdqs are getting set to 0 after resume.
>> Because of this the vcmdq is not consuming commands after resume.
>> Fix this by restoring PROD and CONS indices after resume from
>> saved pointers.
>>
>> Signed-off-by: Ashish Mhetre<amhetre@nvidia.com>
>> ---
>> drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
>> index 378104cd395e..00ec684fe3a4 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
>> @@ -483,6 +483,8 @@ static int tegra241_vcmdq_hw_init(struct tegra241_vcmdq *vcmdq)
>>
>> /* Configure and enable VCMDQ */
>> writeq_relaxed(vcmdq->cmdq.q.q_base, REG_VCMDQ_PAGE1(vcmdq, BASE));
>> + writel_relaxed(vcmdq->cmdq.q.llq.prod, REG_VCMDQ_PAGE0(vcmdq, PROD));
>> + writel_relaxed(vcmdq->cmdq.q.llq.cons, REG_VCMDQ_PAGE0(vcmdq, CONS));
>>
>> ret = vcmdq_write_config(vcmdq, VCMDQ_EN);
>> if (ret) {
>> --
>> 2.25.1
>>
>> From 51a11812bb40d341e11233129a01ce248957bd25 Mon Sep 17 00:00:00 2001
>> From: Ashish Mhetre <amhetre@nvidia.com>
>> Date: Tue, 23 Dec 2025 09:30:04 +0000
>> Subject: [PATCH] iommu/arm-smmu-v3: Fix system suspend/resume when runtime
>> PM
>> is not enabled
>> X-NVConfidentiality: public The current implementation uses
>> pm_runtime_force_suspend() and
>> pm_runtime_force_resume() as system sleep callbacks. These generic PM
>> helpers are designed to bridge runtime PM with system sleep by forcing
>> the device through the runtime PM path.
>> However, these helpers only work correctly when runtime PM is enabled
>> for the device. In arm_smmu_device_probe(), runtime PM is conditionally
>> enabled:
>> if (dev->pm_domain) {
>> pm_runtime_set_active(dev);
>> pm_runtime_enable(dev);
>> }
>> On platforms where the SMMU does not have an associated power domain,
>> pm_runtime_enable() is never called. As a result, when the system
>> enters suspend, pm_runtime_force_suspend() effectively becomes a
>> no-op, it does not invoke arm_smmu_runtime_suspend(), leaving the
>> SMMU hardware in an undefined state during system sleep. Fix this by
>> introducing dedicated system sleep callbacks
>> (arm_smmu_pm_suspend/resume) that directly invoke the existing runtime
>> suspend/resume functions. This ensures the SMMU is properly suspended
>> and resumed during system sleep, regardless of whether runtime PM is
>> enabled.
>> The pm_runtime_suspended() check ensures we don't double-suspend the
>> device if it was already suspended via runtime PM during normal
>> operation. Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
>> ---
>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 +++++++++++++++++++--
>> 1 file changed, 18 insertions(+), 2 deletions(-)
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index b1c9c733ed8d..4707adc63d86 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -5180,9 +5180,35 @@ static int __maybe_unused
>> arm_smmu_runtime_resume(struct device *dev)
>> return ret;
>> }
>> +static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
>> +{
>> + if (pm_runtime_suspended(dev))
>> + return 0;
>> +
>> + return arm_smmu_runtime_resume(dev);
>> +}
>> +
>> +static int __maybe_unused arm_smmu_pm_suspend(struct device *dev)
>> +{
>> + if (pm_runtime_suspended(dev))
>> + return 0;
>> +
>> + return arm_smmu_runtime_suspend(dev);
>> +}
>> +
>> static const struct dev_pm_ops arm_smmu_pm_ops = {
>> - SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
>> - pm_runtime_force_resume)
>> + SET_SYSTEM_SLEEP_PM_OPS(arm_smmu_pm_suspend,
>> + arm_smmu_pm_resume)
>> SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
>> arm_smmu_runtime_resume, NULL)
>> };
>> --
>> 2.25.1
>>
>>
>>
>>
>> Please see if these changes make sense and squash to the series if they do.
>>
> I'll include the CMDQV patch from you as is in my next version and
> revert to use the SLEEP_PM_OPS as in my v1. It'd be nice to have the
> `Tested-by` from you for the next version too!
>
> Thanks!
> Praan
Sure, I'll test the next version on Tegra and provide feedback. Thanks
Pranjal!
Thanks,
Ashish Mhetre
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2026-01-13 5:06 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-17 19:14 [RFC PATCH v4 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 1/8] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 2/8] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
2025-11-17 19:31 ` Nicolin Chen
2025-11-18 3:48 ` Daniel Mentz
2025-11-20 20:45 ` Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 3/8] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 4/8] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 5/8] iommu/arm-smmu-v3: Add a usage counter for cmdq Pranjal Shrivastava
2025-11-18 6:05 ` Sairaj Kodilkar
2025-11-20 20:48 ` Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 6/8] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
2025-12-24 7:39 ` Ashish Mhetre
2026-01-12 8:50 ` Pranjal Shrivastava
2026-01-13 5:06 ` Ashish Mhetre
2025-11-17 19:14 ` [PATCH v4 7/8] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
2025-11-17 19:14 ` [PATCH v4 8/8] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
2025-11-18 0:14 ` Jason Gunthorpe
2025-11-20 20:26 ` Pranjal Shrivastava
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.