Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH 0/2] iommu/arm-smmu-v3: Tegra264 invalidation workaround
@ 2026-05-28 10:16 Ashish Mhetre
  2026-05-28 10:16 ` [PATCH 1/2] iommu/arm-smmu-v3: Detect Tegra264 erratum Ashish Mhetre
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Ashish Mhetre @ 2026-05-28 10:16 UTC (permalink / raw)
  To: will, robin.murphy, joro, jgg, nicolinc
  Cc: linux-arm-kernel, iommu, linux-kernel, linux-tegra, Ashish Mhetre

Nvidia Tegra264 SMMUs are affected by an erratum where a TLB entry can
survive an invalidation that races with concurrent traffic targeting
the same entry. The hardware-recommended software workaround is to
issue every CFGI/TLBI command (each followed by CMD_SYNC) twice. The
second issue must execute only after the first issue's CMD_SYNC has
completed, giving the sequence:

    TLBI/CFGI ... CMD_SYNC TLBI/CFGI ... CMD_SYNC

This series implements the workaround by hooking the duplication into
the single chokepoint that every synchronous submission flows through:
arm_smmu_cmdq_issue_cmdlist().

Patch 1 detects affected instances using the existing
"nvidia,tegra264-smmu" compatible string and exposes the condition
via a new ARM_SMMU_OPT_TLBI_TWICE option bit.

Patch 2 wires the option into the CMDQ submission path, which then
re-issues the cmdlist when @sync is true and the first command is a
CFGI/TLBI.
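
As a rough illustration of the command order Patch 2 aims for, the
following user-space C model mimics the wrapper logic. It is only a
sketch: the model_* names, the trace buffer and the three-value opcode
enum are hypothetical stand-ins for illustration, not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical opcode classes; the real driver uses CMDQ_OP_* values. */
enum model_op { MODEL_OP_TLBI, MODEL_OP_CFGI, MODEL_OP_ATC_INV };

/* Stand-in for ARM_SMMU_OPT_TLBI_TWICE. */
#define MODEL_OPT_TLBI_TWICE	(1u << 5)

/* Records the order in which commands would reach the queue. */
static char model_trace[256];

static void model_trace_reset(void)
{
	model_trace[0] = '\0';
}

static void model_trace_add(const char *s)
{
	strncat(model_trace, s, sizeof(model_trace) - strlen(model_trace) - 1);
}

static const char *model_op_name(enum model_op op)
{
	switch (op) {
	case MODEL_OP_TLBI:
		return "TLBI ";
	case MODEL_OP_CFGI:
		return "CFGI ";
	default:
		return "ATC_INV ";
	}
}

/* Models one pass of the renamed __arm_smmu_cmdq_issue_cmdlist(). */
static int model_issue_once(const enum model_op *cmds, int n, bool sync)
{
	for (int i = 0; i < n; i++)
		model_trace_add(model_op_name(cmds[i]));
	if (sync)
		model_trace_add("CMD_SYNC ");
	return 0;
}

/* Only CFGI/TLBI classes are doubled; ATC_INV is left alone. */
static bool model_needs_twice(enum model_op op)
{
	return op == MODEL_OP_TLBI || op == MODEL_OP_CFGI;
}

/* Models the new wrapper: a second pass only for synced CFGI/TLBI lists. */
static int model_issue(unsigned int options, const enum model_op *cmds,
		       int n, bool sync)
{
	int ret;

	assert(n > 0);	/* cmds[0] is inspected below */
	ret = model_issue_once(cmds, n, sync);
	if (!ret && sync && (options & MODEL_OPT_TLBI_TWICE) &&
	    model_needs_twice(cmds[0]))
		ret = model_issue_once(cmds, n, sync);

	return ret;
}
```

With the option set and a synced TLBI list, model_trace ends up holding
two identical command runs, each followed by CMD_SYNC. An ATC_INV list,
an unsynced list, or a device without the option is recorded once.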

Ashish Mhetre (2):
  iommu/arm-smmu-v3: Detect Tegra264 erratum
  iommu/arm-smmu-v3: Issue CFGI/TLBI twice on Tegra264

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 70 +++++++++++++++++++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  8 +++
 2 files changed, 72 insertions(+), 6 deletions(-)


base-commit: f86b1ac9a67321419fec095ecb27584b2f77e339
-- 
2.50.1




* [PATCH 1/2] iommu/arm-smmu-v3: Detect Tegra264 erratum
  2026-05-28 10:16 [PATCH 0/2] iommu/arm-smmu-v3: Tegra264 invalidation workaround Ashish Mhetre
@ 2026-05-28 10:16 ` Ashish Mhetre
  2026-05-28 10:34   ` Robin Murphy
  2026-05-28 10:16 ` [PATCH 2/2] iommu/arm-smmu-v3: Issue CFGI/TLBI twice on Tegra264 Ashish Mhetre
  2026-05-28 18:41 ` [PATCH 0/2] iommu/arm-smmu-v3: Tegra264 invalidation workaround Nicolin Chen
  2 siblings, 1 reply; 7+ messages in thread
From: Ashish Mhetre @ 2026-05-28 10:16 UTC (permalink / raw)
  To: will, robin.murphy, joro, jgg, nicolinc
  Cc: linux-arm-kernel, iommu, linux-kernel, linux-tegra, Ashish Mhetre

The Tegra264 SMMU is affected by an erratum where a TLB entry can survive an
invalidation that races with concurrent traffic targeting the same
entry. The hardware-recommended software workaround is to issue every
CFGI/TLBI command (each followed by CMD_SYNC) twice. The second issue is
guaranteed to evict the entry. ATC_INV is not affected and must not be
doubled.

Add the ARM_SMMU_OPT_TLBI_TWICE option and set it on instances matching
the existing "nvidia,tegra264-smmu" compatible. No callers consume the
option yet; the next patch wires the workaround into the CMDQ issue paths.

Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 ++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9be589d14a3b..88296c0a5337 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -5229,8 +5229,10 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev,
 	if (of_dma_is_coherent(dev->of_node))
 		smmu->features |= ARM_SMMU_FEAT_COHERENCY;
 
-	if (of_device_is_compatible(dev->of_node, "nvidia,tegra264-smmu"))
+	if (of_device_is_compatible(dev->of_node, "nvidia,tegra264-smmu")) {
 		tegra_cmdqv_dt_probe(dev->of_node, smmu);
+		smmu->options |= ARM_SMMU_OPT_TLBI_TWICE;
+	}
 
 	return ret;
 }
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 16353596e08a..08d1abaf31ae 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -928,6 +928,14 @@ struct arm_smmu_device {
 #define ARM_SMMU_OPT_MSIPOLL		(1 << 2)
 #define ARM_SMMU_OPT_CMDQ_FORCE_SYNC	(1 << 3)
 #define ARM_SMMU_OPT_TEGRA241_CMDQV	(1 << 4)
+/*
+ * Tegra264 erratum: a TLB entry can survive an invalidation that races
+ * with concurrent traffic targeting the same entry. The software
+ * workaround is to issue every CFGI/TLBI command twice, each followed
+ * by CMD_SYNC. The second issue is guaranteed to evict the entry.
+ * ATC_INV commands are not affected and must not be doubled.
+ */
+#define ARM_SMMU_OPT_TLBI_TWICE		(1 << 5)
 	u32				options;
 
 	struct arm_smmu_cmdq		cmdq;
-- 
2.50.1




* [PATCH 2/2] iommu/arm-smmu-v3: Issue CFGI/TLBI twice on Tegra264
  2026-05-28 10:16 [PATCH 0/2] iommu/arm-smmu-v3: Tegra264 invalidation workaround Ashish Mhetre
  2026-05-28 10:16 ` [PATCH 1/2] iommu/arm-smmu-v3: Detect Tegra264 erratum Ashish Mhetre
@ 2026-05-28 10:16 ` Ashish Mhetre
  2026-05-28 18:41 ` [PATCH 0/2] iommu/arm-smmu-v3: Tegra264 invalidation workaround Nicolin Chen
  2 siblings, 0 replies; 7+ messages in thread
From: Ashish Mhetre @ 2026-05-28 10:16 UTC (permalink / raw)
  To: will, robin.murphy, joro, jgg, nicolinc
  Cc: linux-arm-kernel, iommu, linux-kernel, linux-tegra, Ashish Mhetre

Apply the workaround for the Tegra264 erratum by issuing every CFGI/TLBI
command twice on affected SMMU instances, with CMD_SYNC after each.
The erratum requires this exact sequencing:

    TLBI/CFGI ... CMD_SYNC TLBI/CFGI ... CMD_SYNC

To get this sequence with minimal surgery, hook the workaround into
arm_smmu_cmdq_issue_cmdlist(). Rename the original function to
__arm_smmu_cmdq_issue_cmdlist() and add a thin wrapper that, on
affected SMMUs and when @sync is true, re-issues the same cmdlist a
second time.

A new arm_smmu_cmd_needs_tlbi_twice() helper classifies which opcodes
need the doubling: CFGI_* and TLBI_*.

For batches that exceed CMDQ_BATCH_ENTRIES commands,
arm_smmu_cmdq_batch_add_cmd_p() normally flushes the full buffer with
sync=false, deferring the SYNC to the eventual batch_submit(). On
affected SMMUs this would leave the first chunk's commands issued
only once, since the WAR hook in arm_smmu_cmdq_issue_cmdlist() only
fires on synced submissions. Force a SYNC on the capacity rollover
when the buffer carries CFGI/TLBI commands so every flushed chunk is
correctly doubled.

Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
---
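The capacity-rollover case described in the commit message can be
modelled in user space as below. This is a sketch under stated
assumptions, not the driver's code: every model_* name is hypothetical,
the batch capacity is shrunk to 4 so the rollover is easy to see, all
commands are assumed to be CFGI/TLBI, and force_sync_on_rollover stands
in for the need_sync change this patch makes.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_BATCH_ENTRIES	4	/* hypothetical; stands in for CMDQ_BATCH_ENTRIES */
#define MODEL_MAX_IDS		16	/* hypothetical bound on command ids */

struct model_batch {
	int ids[MODEL_BATCH_ENTRIES];
	int num;
};

struct model_smmu {
	bool tlbi_twice;		/* stands in for ARM_SMMU_OPT_TLBI_TWICE */
	bool force_sync_on_rollover;	/* models the rollover change */
	int issued[MODEL_MAX_IDS];	/* how often each command id was issued */
};

/* Models the wrapper: a synced list is issued twice on affected devices. */
static void model_issue(struct model_smmu *smmu, const struct model_batch *b,
			bool sync)
{
	int passes = (sync && smmu->tlbi_twice) ? 2 : 1;

	for (int p = 0; p < passes; p++)
		for (int i = 0; i < b->num; i++)
			smmu->issued[b->ids[i]]++;
}

/* Models the batch add path: flush a full buffer, then queue the command. */
static void model_batch_add(struct model_smmu *smmu, struct model_batch *b,
			    int id)
{
	assert(id >= 0 && id < MODEL_MAX_IDS);

	if (b->num == MODEL_BATCH_ENTRIES) {
		bool need_sync = smmu->tlbi_twice &&
				 smmu->force_sync_on_rollover;

		model_issue(smmu, b, need_sync);
		b->num = 0;
	}
	b->ids[b->num++] = id;
}

/* Models the final batch submit, which always carries a SYNC. */
static void model_batch_submit(struct model_smmu *smmu, struct model_batch *b)
{
	model_issue(smmu, b, true);
	b->num = 0;
}
```

Adding six commands and then submitting shows the difference: without
the forced SYNC the first four commands are issued once and only the
last two are doubled, while with it every command is issued twice.
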
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 66 +++++++++++++++++++--
 1 file changed, 61 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 88296c0a5337..38d45f175a2c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -698,10 +698,10 @@ static void arm_smmu_cmdq_write_entries(struct arm_smmu_cmdq *cmdq,
  *   insert their own list of commands then all of the commands from one
  *   CPU will appear before any of the commands from the other CPU.
  */
-int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
-				struct arm_smmu_cmdq *cmdq,
-				struct arm_smmu_cmd *cmds, int n,
-				bool sync)
+static int __arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
+					 struct arm_smmu_cmdq *cmdq,
+					 struct arm_smmu_cmd *cmds, int n,
+					 bool sync)
 {
 	struct arm_smmu_cmd cmd_sync;
 	u32 prod;
@@ -820,6 +820,52 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
 	return ret;
 }
 
+/*
+ * Returns true if @opcode is a CFGI_* or TLBI_* command, i.e. one of the
+ * invalidations covered by Tegra264 erratum (see ARM_SMMU_OPT_TLBI_TWICE).
+ */
+static bool arm_smmu_cmd_needs_tlbi_twice(u8 opcode)
+{
+	switch (opcode) {
+	case CMDQ_OP_CFGI_STE:
+	case CMDQ_OP_CFGI_ALL:
+	case CMDQ_OP_CFGI_CD:
+	case CMDQ_OP_CFGI_CD_ALL:
+	case CMDQ_OP_TLBI_NH_ALL:
+	case CMDQ_OP_TLBI_NH_ASID:
+	case CMDQ_OP_TLBI_NH_VA:
+	case CMDQ_OP_TLBI_NH_VAA:
+	case CMDQ_OP_TLBI_EL2_ALL:
+	case CMDQ_OP_TLBI_EL2_ASID:
+	case CMDQ_OP_TLBI_EL2_VA:
+	case CMDQ_OP_TLBI_S12_VMALL:
+	case CMDQ_OP_TLBI_S2_IPA:
+	case CMDQ_OP_TLBI_NSNH_ALL:
+		return true;
+	default:
+		return false;
+	}
+}
+
+int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
+				struct arm_smmu_cmdq *cmdq,
+				struct arm_smmu_cmd *cmds, int n,
+				bool sync)
+{
+	int ret = __arm_smmu_cmdq_issue_cmdlist(smmu, cmdq, cmds, n, sync);
+
+	/*
+	 * The driver's batch invariants keep a single submission's
+	 * opcode class uniform, so checking the first command is enough.
+	 */
+	if (!ret && sync && (smmu->options & ARM_SMMU_OPT_TLBI_TWICE) &&
+	    arm_smmu_cmd_needs_tlbi_twice(FIELD_GET(CMDQ_0_OP,
+						    cmds[0].data[0])))
+		ret = __arm_smmu_cmdq_issue_cmdlist(smmu, cmdq, cmds, n, sync);
+
+	return ret;
+}
+
 static int arm_smmu_cmdq_issue_cmd_p(struct arm_smmu_device *smmu,
 				     struct arm_smmu_cmd *cmd, bool sync)
 {
@@ -863,8 +909,18 @@ static void arm_smmu_cmdq_batch_add_cmd_p(struct arm_smmu_device *smmu,
 	}
 
 	if (cmds->num == CMDQ_BATCH_ENTRIES) {
+		/*
+		 * Force a SYNC only when the batch carries commands that
+		 * have to be doubled (see ARM_SMMU_OPT_TLBI_TWICE).
+		 * The batch holds a uniform opcode class, so checking
+		 * the first command is sufficient.
+		 */
+		bool need_sync = (smmu->options & ARM_SMMU_OPT_TLBI_TWICE) &&
+				 arm_smmu_cmd_needs_tlbi_twice(FIELD_GET(CMDQ_0_OP,
+									 cmds->cmds[0].data[0]));
+
 		arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmdq, cmds->cmds,
-					    cmds->num, false);
+					    cmds->num, need_sync);
 		arm_smmu_cmdq_batch_init_cmd(smmu, cmds, cmd);
 	}
 
-- 
2.50.1




* Re: [PATCH 1/2] iommu/arm-smmu-v3: Detect Tegra264 erratum
  2026-05-28 10:16 ` [PATCH 1/2] iommu/arm-smmu-v3: Detect Tegra264 erratum Ashish Mhetre
@ 2026-05-28 10:34   ` Robin Murphy
  2026-05-28 16:06     ` Ashish Mhetre
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Murphy @ 2026-05-28 10:34 UTC (permalink / raw)
  To: Ashish Mhetre, will, joro, jgg, nicolinc
  Cc: linux-arm-kernel, iommu, linux-kernel, linux-tegra

On 2026-05-28 11:16 am, Ashish Mhetre wrote:
> Tegra264 SMMU is affected by erratum where a TLB entry can survive an
> invalidation that races with concurrent traffic targeting the same
> entry. The hardware-recommended software workaround is to issue every
> CFGI/TLBI command (each followed by CMD_SYNC) twice. The second issue is
> guaranteed to evict the entry. ATC_INV is not affected and must not be
> doubled.
> 
> Add the ARM_SMMU_OPT_TLBI_TWICE option and set it on instances matching
> the existing "nvidia,tegra264-smmu" compatible. No callers consume the
> option yet, next patch wires the workaround into the CMDQ issue paths.

Can you not detect this implementation from IIDR like for our other 
workarounds? Otherwise what about ACPI?

Thanks,
Robin.

> Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
> ---
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +++-
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 ++++++++
>   2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 9be589d14a3b..88296c0a5337 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -5229,8 +5229,10 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev,
>   	if (of_dma_is_coherent(dev->of_node))
>   		smmu->features |= ARM_SMMU_FEAT_COHERENCY;
>   
> -	if (of_device_is_compatible(dev->of_node, "nvidia,tegra264-smmu"))
> +	if (of_device_is_compatible(dev->of_node, "nvidia,tegra264-smmu")) {
>   		tegra_cmdqv_dt_probe(dev->of_node, smmu);
> +		smmu->options |= ARM_SMMU_OPT_TLBI_TWICE;
> +	}
>   
>   	return ret;
>   }
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 16353596e08a..08d1abaf31ae 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -928,6 +928,14 @@ struct arm_smmu_device {
>   #define ARM_SMMU_OPT_MSIPOLL		(1 << 2)
>   #define ARM_SMMU_OPT_CMDQ_FORCE_SYNC	(1 << 3)
>   #define ARM_SMMU_OPT_TEGRA241_CMDQV	(1 << 4)
> +/*
> + * Tegra264 erratum: a TLB entry can survive an invalidation that races
> + * with concurrent traffic targeting the same entry. The software
> + * workaround is to issue every CFGI/TLBI command twice, each followed
> + * by CMD_SYNC. The second issue is guaranteed to evict the entry.
> + * ATC_INV commands are not affected and must not be doubled.
> + */
> +#define ARM_SMMU_OPT_TLBI_TWICE		(1 << 5)
>   	u32				options;
>   
>   	struct arm_smmu_cmdq		cmdq;




* Re: [PATCH 1/2] iommu/arm-smmu-v3: Detect Tegra264 erratum
  2026-05-28 10:34   ` Robin Murphy
@ 2026-05-28 16:06     ` Ashish Mhetre
  2026-05-28 18:38       ` Nicolin Chen
  0 siblings, 1 reply; 7+ messages in thread
From: Ashish Mhetre @ 2026-05-28 16:06 UTC (permalink / raw)
  To: Robin Murphy, will, joro, jgg, nicolinc
  Cc: linux-arm-kernel, iommu, linux-kernel, linux-tegra



On 5/28/2026 4:04 PM, Robin Murphy wrote:
> On 2026-05-28 11:16 am, Ashish Mhetre wrote:
>> Tegra264 SMMU is affected by erratum where a TLB entry can survive an
>> invalidation that races with concurrent traffic targeting the same
>> entry. The hardware-recommended software workaround is to issue every
>> CFGI/TLBI command (each followed by CMD_SYNC) twice. The second issue is
>> guaranteed to evict the entry. ATC_INV is not affected and must not be
>> doubled.
>>
>> Add the ARM_SMMU_OPT_TLBI_TWICE option and set it on instances matching
>> the existing "nvidia,tegra264-smmu" compatible. No callers consume the
>> option yet, next patch wires the workaround into the CMDQ issue paths.
>
> Can you not detect this implementation from IIDR like for our other
> workarounds? Otherwise what about ACPI? 

Neither IDR nor IIDR flags this Tegra264-specific bug. We cannot
detect it from any HW register, so we have to rely on the Tegra264
device tree.
Regarding ACPI, the bug is in Tegra264 only, and Tegra264 is
device-tree-only. It doesn't support ACPI/IORT as of now.
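
For context on the IIDR question, here is a minimal user-space sketch
of what a register-based match generally looks like. The field
positions follow the SMMUv3 SMMU_IIDR layout (ProductID [31:20],
Variant [19:16], Revision [15:12], Implementer [11:0]); the model_*
names and the product ID used with it are hypothetical. Such a match
does not help here, because the register does not distinguish the
affected Tegra264 instances.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SMMU_IIDR field extraction; the macro names are hypothetical. */
#define MODEL_IIDR_IMPLEMENTER(v)	((uint32_t)(v) & 0xfffu)
#define MODEL_IIDR_REVISION(v)		(((uint32_t)(v) >> 12) & 0xfu)
#define MODEL_IIDR_VARIANT(v)		(((uint32_t)(v) >> 16) & 0xfu)
#define MODEL_IIDR_PRODUCTID(v)		(((uint32_t)(v) >> 20) & 0xfffu)

/* Hypothetical table entry describing one affected implementation. */
struct model_iidr_match {
	uint32_t implementer;
	uint32_t productid;
};

/* True when the IIDR value names the implementation described by @m. */
static bool model_iidr_matches(uint32_t iidr,
			       const struct model_iidr_match *m)
{
	return MODEL_IIDR_IMPLEMENTER(iidr) == m->implementer &&
	       MODEL_IIDR_PRODUCTID(iidr) == m->productid;
}
```

A caller would read the register once at probe time and set an option
bit when model_iidr_matches() returns true, which is the pattern a
compatible-string check replaces in this patch.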

Thanks,
Ashish Mhetre



* Re: [PATCH 1/2] iommu/arm-smmu-v3: Detect Tegra264 erratum
  2026-05-28 16:06     ` Ashish Mhetre
@ 2026-05-28 18:38       ` Nicolin Chen
  0 siblings, 0 replies; 7+ messages in thread
From: Nicolin Chen @ 2026-05-28 18:38 UTC (permalink / raw)
  To: Ashish Mhetre
  Cc: Robin Murphy, will, joro, jgg, linux-arm-kernel, iommu,
	linux-kernel, linux-tegra

On Thu, May 28, 2026 at 09:36:34PM +0530, Ashish Mhetre wrote:
> On 5/28/2026 4:04 PM, Robin Murphy wrote:
> > On 2026-05-28 11:16 am, Ashish Mhetre wrote:
> > > Tegra264 SMMU is affected by erratum where a TLB entry can survive an
> > > invalidation that races with concurrent traffic targeting the same
> > > entry. The hardware-recommended software workaround is to issue every
> > > CFGI/TLBI command (each followed by CMD_SYNC) twice. The second issue is
> > > guaranteed to evict the entry. ATC_INV is not affected and must not be
> > > doubled.
> > > 
> > > Add the ARM_SMMU_OPT_TLBI_TWICE option and set it on instances matching
> > > the existing "nvidia,tegra264-smmu" compatible. No callers consume the
> > > option yet, next patch wires the workaround into the CMDQ issue paths.
> > 
> > Can you not detect this implementation from IIDR like for our other
> > workarounds? Otherwise what about ACPI?
> 
> Neither IDR nor IIDR flags this Tegra264-specific bug. We cannot
> detect it from any HW register, so we have to rely on the Tegra264
> device tree.
> Regarding ACPI, the bug is in Tegra264 only, and Tegra264 is
> device-tree-only. It doesn't support ACPI/IORT as of now.

Let's add a note in the commit message.

Nicolin



* Re: [PATCH 0/2] iommu/arm-smmu-v3: Tegra264 invalidation workaround
  2026-05-28 10:16 [PATCH 0/2] iommu/arm-smmu-v3: Tegra264 invalidation workaround Ashish Mhetre
  2026-05-28 10:16 ` [PATCH 1/2] iommu/arm-smmu-v3: Detect Tegra264 erratum Ashish Mhetre
  2026-05-28 10:16 ` [PATCH 2/2] iommu/arm-smmu-v3: Issue CFGI/TLBI twice on Tegra264 Ashish Mhetre
@ 2026-05-28 18:41 ` Nicolin Chen
  2 siblings, 0 replies; 7+ messages in thread
From: Nicolin Chen @ 2026-05-28 18:41 UTC (permalink / raw)
  To: Ashish Mhetre
  Cc: will, robin.murphy, joro, jgg, linux-arm-kernel, iommu,
	linux-kernel, linux-tegra

On Thu, May 28, 2026 at 10:16:15AM +0000, Ashish Mhetre wrote:
> Nvidia Tegra264 SMMUs are affected by an erratum where a TLB entry can
> survive an invalidation that races with concurrent traffic targeting
> the same entry. The hardware-recommended software workaround is to
> issue every CFGI/TLBI command (each followed by CMD_SYNC) twice. The
> second issue must execute only after the first issue's CMD_SYNC has
> completed, giving the sequence:
> 
>     TLBI/CFGI ... CMD_SYNC TLBI/CFGI ... CMD_SYNC
> 
> This series implements the workaround by hooking the duplication into
> the single chokepoint that every synchronous submission flows through
> arm_smmu_cmdq_issue_cmdlist().
> 
> Patch 1 detects affected instances using the existing
> "nvidia,tegra264-smmu" compatible string and exposes the condition
> via a new ARM_SMMU_OPT_TLBI_TWICE option bit.
> 
> Patch 2 wires the option into the CMDQ submission path which is used to
> re-issue the cmdlist when @sync is true and the first command is a
> CFGI/TLBI.

Which base commit did you format the patches from?

Sashiko failed to apply the series, so it could not run a review:
https://sashiko.dev/#/patchset/20260528101617.4068249-1-amhetre%40nvidia.com

Nicolin


