* [RFC PATCH v3 0/8] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops
@ 2025-06-16 20:31 Pranjal Shrivastava
From: Pranjal Shrivastava @ 2025-06-16 20:31 UTC
  To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe,
	Rafael J. Wysocki
  Cc: Nicolin Chen, Mostafa Saleh, Daniel Mentz, iommu,
	Pranjal Shrivastava

As arm-smmu-v3 rapidly finds its way into SoCs designed for hand-held
devices, power management capabilities, similar to its predecessors, are
crucial for these applications. This series introduces power management
support for the arm-smmu-v3 driver.

Design
=======
The arm-smmu-v3 primarily operates with in-memory data structures
through HW registers pointing to these data structures in some fashion.
The proposed design tries to make use of this fact for implementing the
suspend and resume ops.

1. Suspend / Resume
The idea for the "suspend" op is to wait until the cmd queues are flushed
before disabling the SMMU through CR0. Waiting for the other queues,
such as the event queue and the PRI queue, is not required in this
version: by design their bottom-halves hold a pm reference, so by the
time the suspend callback is invoked both queues have already been
drained. To avoid misuse or spurious transactions between SMMU disable
and power-down, the GBPA register is configured to abort all
transactions.
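
A rough user-space sketch of the suspend ordering described above. It is
only a model: `struct smmu_model`, `model_hw_consume_one()` and
`model_suspend()` are hypothetical names, not driver code, and the
relative order of the GBPA write and the CR0 disable is an assumption
(the text above only says that both happen after the cmdq is drained).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the SMMU state touched on suspend. */
struct smmu_model {
	uint32_t cmdq_prod;  /* next slot software will write */
	uint32_t cmdq_cons;  /* next slot hardware will consume */
	bool smmuen;         /* stands in for CR0.SMMUEN */
	bool gbpa_abort;     /* stands in for GBPA.ABORT */
};

/* Model the hardware consuming one outstanding command, if any. */
static void model_hw_consume_one(struct smmu_model *s)
{
	if (s->cmdq_cons != s->cmdq_prod)
		s->cmdq_cons++;
}

/*
 * Suspend ordering: drain the command queue first, then make incoming
 * transactions abort and disable the SMMU. Returns 0 on success, or -1
 * if the queue did not drain within max_polls polls, in which case the
 * SMMU is left enabled.
 */
static int model_suspend(struct smmu_model *s, unsigned int max_polls)
{
	while (s->cmdq_cons != s->cmdq_prod) {
		if (max_polls-- == 0)
			return -1;
		model_hw_consume_one(s);
	}

	s->gbpa_abort = true;  /* assumption: abort is set before disable */
	s->smmuen = false;
	return 0;
}
```

The timeout path matters: if the queue never drains, the model refuses
to disable the SMMU rather than dropping commands on the floor.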

The resume operation uses the `arm_smmu_device_reset` function, which
re-initializes the HW using the SW copies maintained by the driver, for
example the prod/cons indices of the queues and the base addresses of
the queues & tables. arm_smmu_device_reset also clears the TLBs and
config caches.
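
To illustrate why the SW copies are enough, here is a small user-space
model of the resume path. All names (`model_sw_state`, `model_hw_state`,
`model_power_down()`, `model_device_reset()`) are hypothetical and the
register set is heavily simplified; it is a sketch of the idea, not of
the actual arm_smmu_device_reset.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical SW copies that the driver keeps in memory. */
struct model_sw_state {
	uint64_t strtab_base;
	uint64_t cmdq_base;
	uint32_t cmdq_prod;
	uint32_t cmdq_cons;
};

/* Hypothetical register file plus the state cached inside the HW. */
struct model_hw_state {
	uint64_t strtab_base;
	uint64_t cmdq_base;
	uint32_t cmdq_prod;
	uint32_t cmdq_cons;
	unsigned int cached_entries;  /* stands in for TLB + config cache */
	bool enabled;
};

/* Power loss: every register and every cached entry is gone. */
static void model_power_down(struct model_hw_state *hw)
{
	memset(hw, 0, sizeof(*hw));
}

/* Reset: reprogram the HW from the SW copies, then drop cached state. */
static void model_device_reset(struct model_hw_state *hw,
			       const struct model_sw_state *sw)
{
	hw->enabled = false;
	hw->strtab_base = sw->strtab_base;
	hw->cmdq_base = sw->cmdq_base;
	hw->cmdq_prod = sw->cmdq_prod;
	hw->cmdq_cons = sw->cmdq_cons;
	hw->cached_entries = 0;  /* invalidate TLBs and config caches */
	hw->enabled = true;
}
```

Because the reset always invalidates the cached entries, any TLBI or
CFGI that was skipped while the model was powered down is covered by
the next reset.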

2. Interrupt Re-config
a. Wired irqs: The series refactors `arm_smmu_setup_irqs` so that irqs
can be enabled/disabled and their handlers installed separately, which
allows the interrupts to be re-initialized correctly.

b. MSIs: The series caches the msi_msg and restores it through a newly
introduced helper, `arm_smmu_resume_msis()`, which re-configures the
*_IRQ_CFGn registers by writing back the cached msi_msgs.
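
The cache-and-restore idea for MSIs can be sketched in user space as
follows. This is a simplified model under assumptions: the `model_*`
types and functions are hypothetical, the message layout only mirrors
the address_lo/address_hi/data fields of an MSI message, and the real
*_IRQ_CFGn registers carry more than a doorbell and a payload.

```c
#include <stdint.h>

enum model_msi_index {
	MODEL_EVTQ_MSI,
	MODEL_GERROR_MSI,
	MODEL_PRIQ_MSI,
	MODEL_MAX_MSIS,
};

/* Hypothetical stand-in for the message composed by the MSI core. */
struct model_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* Hypothetical stand-in for one group of *_IRQ_CFGn registers. */
struct model_irq_cfg {
	uint64_t doorbell;
	uint32_t data;
};

struct model_smmu {
	struct model_msi_msg cached[MODEL_MAX_MSIS];  /* SW copies */
	struct model_irq_cfg regs[MODEL_MAX_MSIS];    /* lost on power-down */
};

/* Program one register group from a message. */
static void model_program_irq_cfg(struct model_irq_cfg *cfg,
				  const struct model_msi_msg *msg)
{
	cfg->doorbell = ((uint64_t)msg->address_hi << 32) | msg->address_lo;
	cfg->data = msg->data;
}

/* Normal write path: remember the message, then program the HW. */
static void model_write_msi_msg(struct model_smmu *s,
				enum model_msi_index idx,
				const struct model_msi_msg *msg)
{
	s->cached[idx] = *msg;
	model_program_irq_cfg(&s->regs[idx], msg);
}

/* Resume path: write every cached message back to the HW. */
static void model_resume_msis(struct model_smmu *s)
{
	for (int i = 0; i < MODEL_MAX_MSIS; i++)
		model_program_irq_cfg(&s->regs[i], &s->cached[i]);
}
```

Keeping the cache in the write path means resume never has to free and
re-allocate the MSIs just to learn the doorbell address again.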

3. Eliding TLBIs and CFGIs
The existing pm_runtime helpers, such as pm_runtime_get_if_active() and
pm_runtime_get_if_in_use(), are too strict for certain use cases. They
fail if the device is in a transient state like RPM_SUSPENDING, which
can lead to drivers making incorrect assumptions about the dev's state.

These helpers don't suffice for cases where one wishes to elide HW TLB
invalidations if the device is powered off. As discussed in the previous
versions, it is wasteful to wake up the SMMU just to issue a TLBI since
we clear the TLBs and Config cache on resume anyway. The existing APIs
like pm_runtime_get_if_active or pm_runtime_get_if_in_use fail to help
us achieve this. Consider the following sequence of operations:

a. The SMMU is in `RPM_SUSPENDING` state
b. The SMMU driver calls pm_runtime_get_if_active/in_use
c. Relying on the return value of these APIs, the driver elides a HW
   clean-up op like:

if (pm_runtime_get_if_active(dev))
	invalidate_tlb(dev);
else
	// Skip flush, assuming device will fully suspend

d. Now, a client wakes up, causing the SMMU's state to bounce from
   RPM_SUSPENDING to RPM_ACTIVE without invoking any rpm callbacks, so
   the resume callback never runs and the TLBs are not cleared.

e. The SMMU continues to operate with stale TLB entries.

In order to avoid the above situation, this series introduces a new
helper function, pm_runtime_get_if_not_suspended(). The new API
increments the device's runtime PM reference only if its rpm state is
NOT RPM_SUSPENDED, and it cancels any pending autosuspend timers. This
allows us to reliably clear TLBs or the Config cache if the SMMU is NOT
suspended, ensuring correct operation.
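
The race in steps a-e, and how the proposed helper avoids it, can be
reproduced with a tiny user-space state model. Everything here is
hypothetical (`model_dev`, `model_get_if_active()`,
`model_get_if_not_suspended()`, ...): it follows the behaviour described
in this cover letter, not the PM core's actual implementation, and it
ignores locking and autosuspend timers.

```c
#include <stdbool.h>

enum model_rpm_status {
	MODEL_RPM_ACTIVE,
	MODEL_RPM_SUSPENDING,
	MODEL_RPM_SUSPENDED,
};

struct model_dev {
	enum model_rpm_status status;
	int usage_count;
	bool tlb_stale;
};

/* Existing-style helper: take a reference only if fully active. */
static int model_get_if_active(struct model_dev *d)
{
	if (d->status != MODEL_RPM_ACTIVE)
		return 0;
	d->usage_count++;
	return 1;
}

/* Proposed-style helper: take a reference unless fully suspended. */
static int model_get_if_not_suspended(struct model_dev *d)
{
	if (d->status == MODEL_RPM_SUSPENDED)
		return 0;
	d->usage_count++;
	return 1;
}

static void model_put(struct model_dev *d)
{
	d->usage_count--;
}

/* Unmap path: the TLB is stale until an invalidation reaches the HW. */
static void model_unmap(struct model_dev *d,
			int (*try_get)(struct model_dev *))
{
	d->tlb_stale = true;
	if (try_get(d)) {
		d->tlb_stale = false;  /* TLBI issued */
		model_put(d);
	}
	/* else: TLBI elided, relying on resume to clear the TLBs */
}

/* Step d: the suspend is abandoned and no resume callback runs. */
static void model_bounce_to_active(struct model_dev *d)
{
	if (d->status == MODEL_RPM_SUSPENDING)
		d->status = MODEL_RPM_ACTIVE;
}
```

Starting from MODEL_RPM_SUSPENDING, model_unmap() with
model_get_if_active leaves tlb_stale set after the bounce back to
active, while the same sequence with model_get_if_not_suspended issues
the invalidation.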

4. Invoking runtime_pm_get/put
Given that most of the configuration done by arm-smmu-v3 is stored in
memory, the idea in this version of the series is to elide all TLBIs and
CFGIs if the SMMU is suspended and only go ahead with ATC invalidations.

Thus, for most calls, the SMMU driver would make the required changes to
the in-memory data structures, but elide all TLBIs, CFGIs and prefetches.
This is done by introducing another runtime PM helper based on the newly
introduced pm_runtime_get_if_not_suspended.

The driver calls pm_runtime_resume_and_get only where touching the HW
is unavoidable, i.e. for important commands like ATC_INV and for
commands that resume stalled transactions.
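
The resulting policy can be summarised as a small decision function.
This is a user-space sketch only: the `model_*` enums and functions are
hypothetical, the command classes are the ones named above, and a real
implementation would go through the runtime PM helpers instead of a
plain `suspended` flag.

```c
#include <stdbool.h>

enum model_cmd {
	MODEL_CMD_TLBI,
	MODEL_CMD_CFGI,
	MODEL_CMD_PREFETCH,
	MODEL_CMD_ATC_INV,
	MODEL_CMD_RESUME,  /* resumes a stalled transaction */
};

enum model_action {
	MODEL_ISSUED,              /* SMMU was powered, command sent */
	MODEL_ELIDED,              /* SMMU suspended, command skipped */
	MODEL_RESUMED_AND_ISSUED,  /* SMMU woken up, then command sent */
};

struct model_smmu_pm {
	bool suspended;
	unsigned int resume_count;
};

/* Commands whose effect is not covered by the reset done on resume. */
static bool model_cmd_must_reach_hw(enum model_cmd cmd)
{
	return cmd == MODEL_CMD_ATC_INV || cmd == MODEL_CMD_RESUME;
}

static enum model_action model_issue_cmd(struct model_smmu_pm *pm,
					 enum model_cmd cmd)
{
	if (!pm->suspended)
		return MODEL_ISSUED;

	/* TLBIs, CFGIs and prefetches are redundant while suspended. */
	if (!model_cmd_must_reach_hw(cmd))
		return MODEL_ELIDED;

	/* Stands in for a resume-and-get style call. */
	pm->suspended = false;
	pm->resume_count++;
	return MODEL_RESUMED_AND_ISSUED;
}
```

ATC invalidations target caches in the endpoints rather than in the
SMMU, which is why the model never elides them.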

Future Work / Potential Improvements
Ensuring a pm reference is held for user-space controlled IOMMUs.
As per a discussion with Will off-list, it was decided to remove the
additional code for tracking user-owned IOMMUs / insecure attachments
from the driver.

Call for review
Any insights/comments on the proposed changes are appreciated,
especially in areas related to locking, atomic contexts, early resume,
PCIe-related considerations, or any other potential optimizations.

Note: The series has not been tested with MSIs and is only lightly
tested for PCIe clients. The same holds true for the tegra241-cmdqv
changes. Any help in reviewing and testing these parts is much
appreciated.

Changelog:

[v3]
 - Introduced `pm_runtime_get_if_not_suspended` API to avoid races due
   to bouncing RPM states while eliding TLBIs as pointed out by Daniel.
 - Addressed Nicolin's comments regarding msi_resume and CMDQV flush
 - Addressed Daniel's comments about CMDQ locking and draining
 - Addressed issues related to draining the evtq and priq
 - Dropped the code to identify and track user-space attachments

[v2]
 - https://lore.kernel.org/all/20250418233409.3926715-1-praan@google.com/
 - Introduced `arm_smmu_rpm_get_if_active` for eliding TLBIs & CFGIs
 - Updated the rpm helper invocation strategy.
 - Drained all queues in suspend callback (including tegra241-cmdqv)
 - Cache and restore msi_msg instead of freeing and re-allocating on resume
 - Added support to identify and track user-space attachments
 - Fixed the setup_irqs as per Nicolin & Mostafa's suggestions
 - Used pm_runtime_force_suspend/resume instead as per Mostafa's suggestion.
 - Added "Reviewed-by" line from Mostafa on an unchanged patch

[v1]
 - https://lore.kernel.org/all/20250319004254.2547950-1-praan@google.com/


Pranjal Shrivastava (8):
  iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs
  iommu/arm-smmu-v3: Add a helper to drain cmd queues
  iommu/tegra241-cmdqv: Add a helper to drain VCMDQs
  iommu/arm-smmu-v3: Cache and restore MSI config
  pm: runtime: Introduce pm_runtime_get_if_not_suspended()
  iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops
  iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks
  iommu/arm-smmu-v3: Invoke pm_runtime before hw access

 drivers/base/power/runtime.c                  |  41 ++
 .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c     |  11 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 378 ++++++++++++++++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   9 +
 .../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c    |  27 ++
 include/linux/pm_runtime.h                    |   5 +
 6 files changed, 438 insertions(+), 33 deletions(-)

-- 
2.50.0.rc2.692.g299adb8693-goog


