Devicetree
* [PATCH] arm64: dts: marvell: armada-37xx: mark EIP97 as dma-coherent
@ 2026-05-24 12:44 Aleksander Jan Bajkowski
  2026-05-24 13:04 ` sashiko-bot
  0 siblings, 1 reply; 2+ messages in thread
From: Aleksander Jan Bajkowski @ 2026-05-24 12:44 UTC (permalink / raw)
  To: andrew, gregory.clement, sebastian.hesselbarth, robh, krzk+dt,
	conor+dt, linux-arm-kernel, devicetree, linux-kernel
  Cc: Aleksander Jan Bajkowski

Armada 37xx has a coherent bus, similar to Armada 7k/8k. Cache
synchronization consumes a lot of CPU cycles. Enabling coherent DMA
raises IOPS to up to about 4 times the baseline (+326 % in the best
case below). Some numbers:
					Data length
Algo		MB	   16	  64	 128	 256	1024	1424	4096
DES-ECB		1	+21 %	+5 %	+5 %	+7 %	+7 %	+3 %	+20 %
AES-ECB-128	1	+21 %	+6 %	+6 %	+6 %	+9 %	+8 %	+22 %
AES-CBC-128	1	+21 %	+5 %	+5 %	+5 %	+6 %	+7 %	+23 %
AES-CBC-256	1	+23 %	+7 %	+8 %	+6 %	+11 %	+13 %	+20 %

					Data length
Algo		MB	    16	    64	   256	  512	  1024	 1420	 4096	 8192
AES-GCM-128	1	 +44 %	 +42 %	 +31 %	 +32 %	 +27 %	+30 %	+32 %	+30 %
AES-GCM-128	8	+319 %	+326 %	+163 %	+148 %	 +75 %	+72 %	+74 %	+41 %
AES-GCM-128	4096	+123 %	+128 %	 +90 %	 +83 %	+116 %	+59 %	+38 %	+28 %

					Data length
Algo		MB	   16	   64	  256	 1024	 2048	 4096	 8192
MD5		1	+21 %	+15 %	+29 %	+25 %	+50 %	+16 %	+20 %
SHA1		1	+24 %	+22 %	+27 %	+22 %	+18 %	+20 %	+20 %
SHA256		1	+30 %	+24 %	+25 %	+26 %	+21 %	+41 %	+19 %
SHA512		1	 +4 %	 +3 %	 +8 %	+10 %	+24 %	+10 %	+11 %

Tested on Armada 3720. Platform passes testmgr selftests.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
---
 arch/arm64/boot/dts/marvell/armada-37xx.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi
index 360fc24fdde2..8418777042b8 100644
--- a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi
@@ -437,6 +437,7 @@ crypto: crypto@90000 {
 				interrupt-names = "ring0", "ring1", "ring2",
 						  "ring3", "eip", "mem";
 				clocks = <&nb_periph_clk 15>;
+				dma-coherent;
 			};
 
 			rwtm: mailbox@b0000 {
-- 
2.53.0



* Re: [PATCH] arm64: dts: marvell: armada-37xx: mark EIP97 as dma-coherent
  2026-05-24 12:44 [PATCH] arm64: dts: marvell: armada-37xx: mark EIP97 as dma-coherent Aleksander Jan Bajkowski
@ 2026-05-24 13:04 ` sashiko-bot
  0 siblings, 0 replies; 2+ messages in thread
From: sashiko-bot @ 2026-05-24 13:04 UTC (permalink / raw)
  To: Aleksander Jan Bajkowski; +Cc: conor+dt, robh, devicetree

Thank you for your contribution! Sashiko AI review found 1 potential issue to consider:
- [Critical] Incorrectly marking the crypto engine as `dma-coherent` on the non-coherent Armada 37xx SoC bypasses necessary software cache maintenance, leading to silent data corruption.
--

commit cb353bcdff0077f7b4e65e2e644ffb6462f3946f
Author: Aleksander Jan Bajkowski <olek2@wp.pl>
Subject: arm64: dts: marvell: armada-37xx: mark EIP97 as dma-coherent

This commit marks the crypto engine (EIP97) on the Armada 37xx SoC as
dma-coherent to skip software cache synchronization.

> diff --git a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi
> index 360fc24fdde22..8418777042b85 100644
> --- a/arch/arm64/boot/dts/marvell/armada-37xx.dtsi
> +++ b/arch/arm64/boot/dts/marvell/armada-37xx.dtsi
> @@ -437,6 +437,7 @@ crypto: crypto@90000 {
>  				interrupt-names = "ring0", "ring1", "ring2",
>  						  "ring3", "eip", "mem";
>  				clocks = <&nb_periph_clk 15>;
> +				dma-coherent;

[Severity: Critical]
Could this bypass necessary software cache maintenance on the non-coherent
Armada 37xx SoC?

Unlike the Armada 7k/8k, the Armada 37xx SoC does not appear to support
hardware I/O cache coherency for its peripheral interconnect.

Adding the dma-coherent property to the crypto node instructs the DMA API to
skip software cache maintenance, such as cleaning dirty CPU cache lines to
RAM before DMA transmission and invalidating them before CPU reads.

The observed performance improvement likely comes from bypassing these
essential CPU operations.

Without cache maintenance, could the crypto engine read stale data from
memory, and the CPU read stale data from its caches, resulting in silent
cryptographic data corruption?
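
The two failure directions described above can be illustrated with a toy
model. This is only a sketch under stated assumptions: it models a
write-back CPU cache in front of RAM and a device that sees RAM only, that
is, a non-coherent interconnect. It says nothing about whether the Armada
37xx interconnect actually is non-coherent. All names (ToySystem, roundtrip,
maintain_caches) are hypothetical and are not kernel APIs.

```python
# Toy model only: a write-back CPU cache in front of RAM, and a DMA device
# that reads and writes RAM directly without snooping the cache.
# Every name here is hypothetical; none of this is kernel code.

class ToySystem:
    def __init__(self, size):
        self.ram = bytearray(size)
        self.cache = {}  # addr -> (value, dirty)

    def cpu_write(self, addr, value):
        # Write-back cache: the store lands in the cache, not in RAM.
        self.cache[addr] = (value, True)

    def cpu_read(self, addr):
        # Fill the line from RAM on a miss, then serve reads from the cache.
        if addr not in self.cache:
            self.cache[addr] = (self.ram[addr], False)
        return self.cache[addr][0]

    def clean(self, addr):
        # Write a dirty line back to RAM (done before the device reads).
        if addr in self.cache and self.cache[addr][1]:
            value = self.cache[addr][0]
            self.ram[addr] = value
            self.cache[addr] = (value, False)

    def invalidate(self, addr):
        # Drop the line so the next CPU read refetches from RAM.
        self.cache.pop(addr, None)

    def device_read(self, addr):
        return self.ram[addr]

    def device_write(self, addr, value):
        self.ram[addr] = value


def roundtrip(maintain_caches):
    """CPU passes one byte to the device; the device stores byte + 1 as the result.

    Returns (value the device saw, result the CPU read back).
    """
    system = ToySystem(2)
    src, dst = 0, 1
    system.cpu_read(dst)          # result buffer is already cached (old value 0)
    system.cpu_write(src, 0x41)   # input sits dirty in the cache
    if maintain_caches:
        system.clean(src)
    seen_by_device = system.device_read(src)
    system.device_write(dst, (seen_by_device + 1) & 0xFF)
    if maintain_caches:
        system.invalidate(dst)
    return seen_by_device, system.cpu_read(dst)


print(roundtrip(True))
print(roundtrip(False))
```

In this model, skipping both maintenance steps makes the device consume the
old RAM contents and makes the CPU read back the old cached result. No error
is raised in either case, which is why such corruption would be silent.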

Passing the testmgr selftests does not rule this out: small test payloads
often happen to be clean in the cache or benefit from incidental cache
states during early boot.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260524124456.350659-1-olek2@wp.pl?part=1

