Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/3] iommu/arm-smmu-v3: Tegra264 invalidation workaround
@ 2026-06-01 10:48 Ashish Mhetre
  2026-06-01 10:48 ` [PATCH v3 1/3] iommu/arm-smmu-v3: Factor out CMDQ batch force-sync conditions Ashish Mhetre
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Ashish Mhetre @ 2026-06-01 10:48 UTC (permalink / raw)
  To: will, robin.murphy, joro, jgg, nicolinc
  Cc: linux-arm-kernel, iommu, linux-kernel, linux-tegra, Ashish Mhetre

Nvidia Tegra264 SMMUs are affected by an erratum where a TLB entry can
survive an invalidation that races with concurrent traffic targeting
the same entry. The hardware-recommended software workaround is to
issue every CFGI/TLBI command (each followed by CMD_SYNC) twice. The
second issue must execute only after the first issue's CMD_SYNC has
completed, giving the sequence:

    TLBI/CFGI ... CMD_SYNC TLBI/CFGI ... CMD_SYNC

ATC_INV is not affected and must not be doubled.

This series implements the workaround by hooking the duplication into
arm_smmu_cmdq_issue_cmdlist(), the single chokepoint that every
synchronous CMDQ submission flows through.

Patch 1 is a preparatory refactor that factors the existing batch
force-sync conditions out of arm_smmu_cmdq_batch_add_cmd_p() into a
new arm_smmu_cmdq_batch_force_sync() helper. No functional change.

Patch 2 detects affected instances using the existing
"nvidia,tegra264-smmu" compatible string, exposes the condition via a
new ARM_SMMU_OPT_TLBI_TWICE option bit, and adds a static-inline
arm_smmu_cmd_needs_tlbi_twice() classifier in arm-smmu-v3.h so that
both the in-tree CMDQ path and the iommufd VSMMU path can share a
single predicate.

Patch 3 wires the workaround in. arm_smmu_cmdq_issue_cmdlist() becomes
a thin wrapper that re-issues a synced cmdlist a second time when the
first command needs doubling. The Tegra264 condition is added to
arm_smmu_cmdq_batch_force_sync() so a full batch carrying CFGI/TLBI
commands flushes with sync=true and is then doubled. The iommufd
VSMMU path (arm_vsmmu_cache_invalidate()) is also taught to split the
user-supplied batch at every "needs doubling" / "doesn't need
doubling" transition via a new arm_vsmmu_can_batch_cmd() predicate,
since that path can otherwise mix CFGI/TLBI with ATC_INV in a single
submission.

The series is based on Jason Gunthorpe's "Remove SMMUv3
struct arm_smmu_cmdq_ent" series [1], specifically commit 13428b0bf794
("iommu/arm-smmu-v3: Directly encode TLBI commands") which is the
final patch of that series in linux-next.

[1] https://lore.kernel.org/all/177919957385.1012282.14787407041669291032.b4-ty@kernel.org/

Changes since v2:
- Add new prep patch 1/3 that factors the existing force-sync
  conditions into arm_smmu_cmdq_batch_force_sync() (from Nicolin).
- Move arm_smmu_cmd_needs_tlbi_twice() to arm-smmu-v3.h as static
  inline taking (smmu, cmd*) and folding in the option check.
- Plug the Tegra264 condition into arm_smmu_cmdq_batch_force_sync()
  instead of carrying a separate need_sync in batch_add_cmd_p().
- Fix iommufd batching: arm_vsmmu_cache_invalidate() can mix
  CFGI/TLBI with ATC_INV in one batch. Split at the boundary via a
  new arm_vsmmu_can_batch_cmd() predicate.
- Patch 2 wording: "next patch wires" -> "a subsequent change will
  wire".

v2: https://lore.kernel.org/all/20260529140830.629738-1-amhetre@nvidia.com/


Nicolin Chen (1):
  iommu/arm-smmu-v3: Factor out CMDQ batch force-sync conditions

Ashish Mhetre (2):
  iommu/arm-smmu-v3: Detect Tegra264 erratum
  iommu/arm-smmu-v3: Issue CFGI/TLBI twice on Tegra264

 .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c     | 23 ++++++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 66 +++++++++++++++----
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 40 +++++++++++
 3 files changed, 117 insertions(+), 12 deletions(-)


base-commit: 13428b0bf7947098daf9a1db14a74d33eb1b5079
-- 
2.50.1



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2026-06-02 21:07 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-01 10:48 [PATCH v3 0/3] iommu/arm-smmu-v3: Tegra264 invalidation workaround Ashish Mhetre
2026-06-01 10:48 ` [PATCH v3 1/3] iommu/arm-smmu-v3: Factor out CMDQ batch force-sync conditions Ashish Mhetre
2026-06-02 19:55   ` Will Deacon
2026-06-02 20:08     ` Nicolin Chen
2026-06-02 20:23       ` Will Deacon
2026-06-01 10:48 ` [PATCH v3 2/3] iommu/arm-smmu-v3: Detect Tegra264 erratum Ashish Mhetre
2026-06-02 20:13   ` Will Deacon
2026-06-02 20:31     ` Nicolin Chen
2026-06-02 20:59       ` Will Deacon
2026-06-02 21:06         ` Nicolin Chen
2026-06-01 10:48 ` [PATCH v3 3/3] iommu/arm-smmu-v3: Issue CFGI/TLBI twice on Tegra264 Ashish Mhetre
2026-06-01 18:37   ` Nicolin Chen
2026-06-02 20:22   ` Will Deacon
2026-06-02 20:35     ` Nicolin Chen
2026-06-02 16:31 ` [PATCH v3 0/3] iommu/arm-smmu-v3: Tegra264 invalidation workaround Mostafa Saleh
2026-06-02 18:23   ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox