Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH v3 15/17] arm64: dts: rockchip: Add EL2 virtual timer interrupt
From: Marc Zyngier @ 2026-05-23 14:02 UTC
  To: linux-arm-kernel, linux-acpi, linux-kernel, devicetree
  Cc: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <20260523140242.586031-1-maz@kernel.org>

The ARMv8.2 based CPUs used in a number of Rockchip SoCs are missing
the EL2 virtual timer interrupt. Add it.

Acked-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/boot/dts/rockchip/rk356x-base.dtsi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi b/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi
index 64bdd8b7754b5..a5832895bd392 100644
--- a/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk356x-base.dtsi
@@ -195,7 +195,8 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_HIGH>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_HIGH>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_HIGH>;
 		arm,no-tick-in-suspend;
 	};
 
-- 
2.47.3
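
For readers matching the new entry against a GIC manual, the sketch below (not part of the patch) shows how the GIC_PPI cell relates to the interrupt ID. It assumes the usual arm,arch_timer ordering of secure physical, non-secure physical, virtual, EL2 physical and EL2 virtual interrupts; the enum, table and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* PPIs occupy INTIDs 16..31; the DT "GIC_PPI n" cell is relative to 16. */
#define GIC_PPI_BASE 16u

/* Hypothetical names for the five per-CPU timer interrupts, in DT order. */
enum arch_timer_irq {
	TIMER_SEC_PHYS,
	TIMER_NONSEC_PHYS,
	TIMER_VIRT,
	TIMER_HYP_PHYS,
	TIMER_HYP_VIRT,		/* the entry this patch adds */
	TIMER_IRQ_COUNT
};

/* PPI numbers as they appear in the interrupts property above. */
static const uint8_t timer_ppi[TIMER_IRQ_COUNT] = { 13, 14, 11, 10, 12 };

/* Convert a DT PPI number (0..15) to a GIC INTID. */
static unsigned int ppi_to_intid(unsigned int ppi)
{
	assert(ppi < 16);
	return GIC_PPI_BASE + ppi;
}
```

With these assumptions, the added PPI 12 corresponds to INTID 28 and the existing virtual timer PPI 11 to INTID 27.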




* [PATCH v3 12/17] arm64: dts: nvidia: Add EL2 virtual timer interrupt
From: Marc Zyngier @ 2026-05-23 14:02 UTC
  To: linux-arm-kernel, linux-acpi, linux-kernel, devicetree
  Cc: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <20260523140242.586031-1-maz@kernel.org>

The ARMv8.2 based CPUs used in a number of NVIDIA SoCs are missing
the EL2 virtual timer interrupt. Add it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 2 ++
 arch/arm64/boot/dts/nvidia/tegra234.dtsi | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index 849694f751d90..45cc180ac9973 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -3163,6 +3163,8 @@ timer {
 			     <GIC_PPI 11
 				(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
 			     <GIC_PPI 10
+				(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
+			     <GIC_PPI 12
 				(GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>;
 		interrupt-parent = <&gic>;
 		always-on;
diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
index 04a95b6658caa..ab9813f9ba30c 100644
--- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
@@ -5872,7 +5872,8 @@ timer {
 		interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
 			     <GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
 			     <GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
-			     <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>;
+			     <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
+			     <GIC_PPI 12 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>;
 		interrupt-parent = <&gic>;
 		always-on;
 	};
-- 
2.47.3




* [PATCH v3 10/17] arm64: dts: intel: Add EL2 virtual timer interrupt
From: Marc Zyngier @ 2026-05-23 14:02 UTC
  To: linux-arm-kernel, linux-acpi, linux-kernel, devicetree
  Cc: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <20260523140242.586031-1-maz@kernel.org>

The ARMv8.2 based CPUs used in the agilex5 SoC are missing the EL2 virtual
timer interrupt. Add it.

Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/boot/dts/intel/socfpga_agilex5.dtsi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/intel/socfpga_agilex5.dtsi b/arch/arm64/boot/dts/intel/socfpga_agilex5.dtsi
index 02e62d954e949..6db2d48b9bad3 100644
--- a/arch/arm64/boot/dts/intel/socfpga_agilex5.dtsi
+++ b/arch/arm64/boot/dts/intel/socfpga_agilex5.dtsi
@@ -155,7 +155,8 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW>;
 	};
 
 	usbphy0: usbphy {
-- 
2.47.3




* [PATCH v3 00/17] arm64: Use EL2 virtual timer when running VHE
From: Marc Zyngier @ 2026-05-23 14:02 UTC
  To: linux-arm-kernel, linux-acpi, linux-kernel, devicetree
  Cc: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek

This is the third version of the series initially posted at [1],
which

- updates the ACPI GTDT parsing to deal with the v3 layout and the EL2
  virtual timer,
- moves the architected timer driver to use it when running VHE,
- fixes a number of DTs to reflect the reality of the HW.

This results in significant performance uplift in deeper nested virt
scenarios, at no overhead to the host.

Patches based on -rc3, tested on Amlogic SM1, QC X1E, Ampere Altra,
and Apple M2, as well as KVM NV guests.

* From v2 [2]:

  - Add more consistency checks to the GTDT parsing

  - Match the virtual counter when using the KVM PTP backend

  - Drop a number of changes to Qualcomm DTs that are only tangentially
    related and will be posted separately

  - Fix the Realtek Kent platform, which had the GICv3 maintenance
    interrupt advertised as the EL2 virtual timer

  - Collected TBs and RBs, with thanks

* From v1 [1]:

  - Now also using the EL2 virtual counter, which further improves
    things when running at a deeper nesting level

  - Updated consistency checks for the platform timers when finding a
    GTDTv3

  - Collected ABs and RBs, with thanks

[1] https://lore.kernel.org/r/20260507125544.2903406-1-maz@kernel.org
[2] https://lore.kernel.org/r/20260514150945.3917510-1-maz@kernel.org

Marc Zyngier (17):
  ACPI: GTDT: Account for GTDTv3 size when walking the platform timer
    descriptors
  ACPI: GTDT: Parse information related to the EL2 virtual timer
  clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when
    running VHE
  dt-bindings: timer: arm,arch_timer: Fix requirements for interrupt
    description
  arm64: dts: allwinner: Add EL2 virtual timer interrupt
  arm64: dts: amlogic: Add EL2 virtual timer interrupt
  arm64: dts: bst: Add EL2 virtual timer interrupt
  arm64: dts: exynos: Add EL2 virtual timer interrupt
  arm64: dts: freescale: Add EL2 virtual timer interrupt
  arm64: dts: intel: Add EL2 virtual timer interrupt
  arm64: dts: mediatek: Add EL2 virtual timer interrupt
  arm64: dts: nvidia: Add EL2 virtual timer interrupt
  arm64: dts: qcom: Add EL2 virtual timer interrupt
  arm64: dts: realtek: Add EL2 virtual timer interrupt
  arm64: dts: rockchip: Add EL2 virtual timer interrupt
  arm64: dts: sprd: Add EL2 virtual timer interrupt
  arm64: dts: xilinx: Add EL2 virtual timer interrupt

 .../bindings/timer/arm,arch_timer.yaml        | 21 +++----
 .../arm64/boot/dts/allwinner/sun55i-a523.dtsi |  3 +-
 .../boot/dts/amlogic/amlogic-a4-common.dtsi   |  8 ---
 arch/arm64/boot/dts/amlogic/amlogic-a4.dtsi   |  8 +++
 arch/arm64/boot/dts/amlogic/amlogic-a5.dtsi   |  9 +++
 arch/arm64/boot/dts/amlogic/amlogic-s6.dtsi   |  3 +-
 arch/arm64/boot/dts/amlogic/amlogic-s7.dtsi   |  3 +-
 arch/arm64/boot/dts/amlogic/amlogic-s7d.dtsi  |  3 +-
 .../boot/dts/amlogic/meson-g12-common.dtsi    | 13 -----
 arch/arm64/boot/dts/amlogic/meson-g12.dtsi    |  9 +++
 arch/arm64/boot/dts/amlogic/meson-sm1.dtsi    | 10 ++++
 arch/arm64/boot/dts/bst/bstc1200.dtsi         |  3 +-
 arch/arm64/boot/dts/exynos/axis/artpec9.dtsi  |  3 +-
 arch/arm64/boot/dts/exynos/exynos2200.dtsi    |  3 +-
 arch/arm64/boot/dts/exynos/exynos990.dtsi     |  3 +-
 arch/arm64/boot/dts/exynos/exynosautov9.dtsi  |  3 +-
 arch/arm64/boot/dts/exynos/google/gs101.dtsi  |  3 +-
 .../boot/dts/freescale/imx91_93_common.dtsi   |  3 +-
 arch/arm64/boot/dts/freescale/imx94.dtsi      |  3 +-
 arch/arm64/boot/dts/freescale/imx95.dtsi      |  3 +-
 arch/arm64/boot/dts/freescale/imx952.dtsi     |  3 +-
 arch/arm64/boot/dts/freescale/s32n79.dtsi     |  3 +-
 .../arm64/boot/dts/intel/socfpga_agilex5.dtsi |  3 +-
 arch/arm64/boot/dts/mediatek/mt6779.dtsi      |  3 +-
 arch/arm64/boot/dts/mediatek/mt8186.dtsi      |  3 +-
 arch/arm64/boot/dts/mediatek/mt8188.dtsi      |  3 +-
 arch/arm64/boot/dts/mediatek/mt8192.dtsi      |  3 +-
 arch/arm64/boot/dts/mediatek/mt8195.dtsi      |  3 +-
 arch/arm64/boot/dts/nvidia/tegra194.dtsi      |  2 +
 arch/arm64/boot/dts/nvidia/tegra234.dtsi      |  3 +-
 arch/arm64/boot/dts/qcom/eliza.dtsi           |  3 +-
 arch/arm64/boot/dts/qcom/hamoa.dtsi           |  3 +-
 arch/arm64/boot/dts/qcom/kaanapali.dtsi       |  3 +-
 arch/arm64/boot/dts/qcom/kodiak.dtsi          |  3 +-
 arch/arm64/boot/dts/qcom/lemans.dtsi          |  3 +-
 arch/arm64/boot/dts/qcom/monaco.dtsi          |  3 +-
 arch/arm64/boot/dts/qcom/sar2130p.dtsi        |  3 +-
 arch/arm64/boot/dts/qcom/sc8280xp.dtsi        |  3 +-
 arch/arm64/boot/dts/qcom/sm4450.dtsi          |  3 +-
 arch/arm64/boot/dts/qcom/sm8250.dtsi          |  3 +-
 arch/arm64/boot/dts/qcom/sm8350.dtsi          |  3 +-
 arch/arm64/boot/dts/qcom/sm8450.dtsi          |  3 +-
 arch/arm64/boot/dts/qcom/sm8550.dtsi          |  3 +-
 arch/arm64/boot/dts/qcom/sm8650.dtsi          |  3 +-
 arch/arm64/boot/dts/qcom/sm8750.dtsi          |  3 +-
 arch/arm64/boot/dts/realtek/kent.dtsi         |  2 +-
 arch/arm64/boot/dts/realtek/rtd16xx.dtsi      |  3 +-
 arch/arm64/boot/dts/rockchip/rk356x-base.dtsi |  3 +-
 arch/arm64/boot/dts/sprd/sc9863a.dtsi         |  3 +-
 arch/arm64/boot/dts/sprd/ums512.dtsi          |  3 +-
 arch/arm64/boot/dts/sprd/ums9620.dtsi         |  3 +-
 arch/arm64/boot/dts/xilinx/versal-net.dtsi    |  3 +-
 drivers/acpi/arm64/gtdt.c                     | 42 +++++++++++++-
 drivers/clocksource/arm_arch_timer.c          | 55 +++++++++++--------
 54 files changed, 206 insertions(+), 102 deletions(-)

-- 
2.47.3
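
The intent of the driver change can be pictured as a small selection policy. The following is only an illustrative sketch, not the code in arm_arch_timer.c: the enum, the struct and in particular the fallback order after the EL2 virtual timer are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical identifiers for the per-CPU timer interrupts. */
enum timer_ppi { PPI_PHYS, PPI_VIRT, PPI_HYP_PHYS, PPI_HYP_VIRT, PPI_NONE };

/* Hypothetical description of what the firmware tables advertise. */
struct timer_env {
	bool kernel_at_el2;		/* true when the kernel runs VHE */
	bool have_irq[PPI_NONE];	/* interrupt described in DT/GTDT? */
};

/*
 * Assumed policy: a VHE kernel prefers the EL2 virtual timer and falls
 * back to the EL2 physical timer when its interrupt is not described.
 * A kernel running at EL1 uses the virtual, then the physical timer.
 */
static enum timer_ppi select_timer(const struct timer_env *env)
{
	assert(env);

	if (env->kernel_at_el2) {
		if (env->have_irq[PPI_HYP_VIRT])
			return PPI_HYP_VIRT;
		if (env->have_irq[PPI_HYP_PHYS])
			return PPI_HYP_PHYS;
	}
	if (env->have_irq[PPI_VIRT])
		return PPI_VIRT;
	if (env->have_irq[PPI_PHYS])
		return PPI_PHYS;
	return PPI_NONE;
}
```

Under this model, a platform whose DT lacks the fifth interrupt stays on its previous timer, which is why the DT patches in the series matter.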




* [PATCH v3 08/17] arm64: dts: exynos: Add EL2 virtual timer interrupt
From: Marc Zyngier @ 2026-05-23 14:02 UTC
  To: linux-arm-kernel, linux-acpi, linux-kernel, devicetree
  Cc: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <20260523140242.586031-1-maz@kernel.org>

A bunch of Samsung SoCs are missing the EL2 virtual timer interrupt
despite using ARMv8.1+ CPUs. Add the missing interrupt, except for
those broken designs where the interrupt is documented as not being
wired.

Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/boot/dts/exynos/axis/artpec9.dtsi | 3 ++-
 arch/arm64/boot/dts/exynos/exynos2200.dtsi   | 3 ++-
 arch/arm64/boot/dts/exynos/exynos990.dtsi    | 3 ++-
 arch/arm64/boot/dts/exynos/exynosautov9.dtsi | 3 ++-
 arch/arm64/boot/dts/exynos/google/gs101.dtsi | 3 ++-
 5 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/boot/dts/exynos/axis/artpec9.dtsi b/arch/arm64/boot/dts/exynos/axis/artpec9.dtsi
index f8ed43c6e8258..cd46aaf056287 100644
--- a/arch/arm64/boot/dts/exynos/axis/artpec9.dtsi
+++ b/arch/arm64/boot/dts/exynos/axis/artpec9.dtsi
@@ -272,6 +272,7 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW>;
 	};
 };
diff --git a/arch/arm64/boot/dts/exynos/exynos2200.dtsi b/arch/arm64/boot/dts/exynos/exynos2200.dtsi
index 6487ccb58ae76..59662f9bdb98f 100644
--- a/arch/arm64/boot/dts/exynos/exynos2200.dtsi
+++ b/arch/arm64/boot/dts/exynos/exynos2200.dtsi
@@ -1911,7 +1911,8 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW 0>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW 0>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW 0>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW 0>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW 0>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW 0>;
 		/*
 		 * Non-updatable, broken stock Samsung bootloader does not
 		 * configure CNTFRQ_EL0
diff --git a/arch/arm64/boot/dts/exynos/exynos990.dtsi b/arch/arm64/boot/dts/exynos/exynos990.dtsi
index f8e2a31b4b751..2e6fb24a3c928 100644
--- a/arch/arm64/boot/dts/exynos/exynos990.dtsi
+++ b/arch/arm64/boot/dts/exynos/exynos990.dtsi
@@ -405,7 +405,8 @@ timer {
 		interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>,
 			     <GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>,
 			     <GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>,
-			     <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>;
+			     <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>,
+			     <GIC_PPI 12 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>;
 
 		/*
 		 * Non-updatable, broken stock Samsung bootloader does not
diff --git a/arch/arm64/boot/dts/exynos/exynosautov9.dtsi b/arch/arm64/boot/dts/exynos/exynosautov9.dtsi
index 66628cb32776e..2c34a2b30ad02 100644
--- a/arch/arm64/boot/dts/exynos/exynosautov9.dtsi
+++ b/arch/arm64/boot/dts/exynos/exynosautov9.dtsi
@@ -148,7 +148,8 @@ timer {
 		interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>,
 			     <GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>,
 			     <GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>,
-			     <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>;
+			     <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>,
+			     <GIC_PPI 12 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>;
 	};
 
 	fixed-rate-clocks {
diff --git a/arch/arm64/boot/dts/exynos/google/gs101.dtsi b/arch/arm64/boot/dts/exynos/google/gs101.dtsi
index d085f9fb0f62a..86933f22647b7 100644
--- a/arch/arm64/boot/dts/exynos/google/gs101.dtsi
+++ b/arch/arm64/boot/dts/exynos/google/gs101.dtsi
@@ -1856,7 +1856,8 @@ timer {
 		   <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW) 0>,
 		   <GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW) 0>,
 		   <GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW) 0>,
-		   <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW) 0>;
+		   <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW) 0>,
+		   <GIC_PPI 12 (GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW) 0>;
 	};
 };
 
-- 
2.47.3




* [PATCH v3 09/17] arm64: dts: freescale: Add EL2 virtual timer interrupt
From: Marc Zyngier @ 2026-05-23 14:02 UTC
  To: linux-arm-kernel, linux-acpi, linux-kernel, devicetree
  Cc: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <20260523140242.586031-1-maz@kernel.org>

The ARMv8.2 based CPUs used in a number of NXP/FSL SoCs are missing
the EL2 virtual timer interrupt. Add it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/boot/dts/freescale/imx91_93_common.dtsi | 3 ++-
 arch/arm64/boot/dts/freescale/imx94.dtsi           | 3 ++-
 arch/arm64/boot/dts/freescale/imx95.dtsi           | 3 ++-
 arch/arm64/boot/dts/freescale/imx952.dtsi          | 3 ++-
 arch/arm64/boot/dts/freescale/s32n79.dtsi          | 3 ++-
 5 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/imx91_93_common.dtsi b/arch/arm64/boot/dts/freescale/imx91_93_common.dtsi
index 46a5d2df074d5..679b9a6f7160f 100644
--- a/arch/arm64/boot/dts/freescale/imx91_93_common.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx91_93_common.dtsi
@@ -82,7 +82,8 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW>;
 		clock-frequency = <24000000>;
 		arm,no-tick-in-suspend;
 		interrupt-parent = <&gic>;
diff --git a/arch/arm64/boot/dts/freescale/imx94.dtsi b/arch/arm64/boot/dts/freescale/imx94.dtsi
index c460ece6070f8..7431ce293625b 100644
--- a/arch/arm64/boot/dts/freescale/imx94.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx94.dtsi
@@ -147,7 +147,8 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW>;
 		clock-frequency = <24000000>;
 		interrupt-parent = <&gic>;
 		arm,no-tick-in-suspend;
diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi b/arch/arm64/boot/dts/freescale/imx95.dtsi
index 71394871d8dd0..e318048dc755b 100644
--- a/arch/arm64/boot/dts/freescale/imx95.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx95.dtsi
@@ -524,7 +524,8 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW>;
 		clock-frequency = <24000000>;
 		arm,no-tick-in-suspend;
 		interrupt-parent = <&gic>;
diff --git a/arch/arm64/boot/dts/freescale/imx952.dtsi b/arch/arm64/boot/dts/freescale/imx952.dtsi
index b30707837f353..7c65956bc72dc 100644
--- a/arch/arm64/boot/dts/freescale/imx952.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx952.dtsi
@@ -298,7 +298,8 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW>;
 		clock-frequency = <24000000>;
 		arm,no-tick-in-suspend;
 		interrupt-parent = <&gic>;
diff --git a/arch/arm64/boot/dts/freescale/s32n79.dtsi b/arch/arm64/boot/dts/freescale/s32n79.dtsi
index 94ab58783fdc8..fb40abec4c5cd 100644
--- a/arch/arm64/boot/dts/freescale/s32n79.dtsi
+++ b/arch/arm64/boot/dts/freescale/s32n79.dtsi
@@ -357,6 +357,7 @@ timer: timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW>;
 	};
 };
-- 
2.47.3




* [PATCH v3 11/17] arm64: dts: mediatek: Add EL2 virtual timer interrupt
From: Marc Zyngier @ 2026-05-23 14:02 UTC
  To: linux-arm-kernel, linux-acpi, linux-kernel, devicetree
  Cc: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <20260523140242.586031-1-maz@kernel.org>

The ARMv8.1+ based CPUs used in a number of Mediatek SoCs are missing
the EL2 virtual timer interrupt. Add it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/boot/dts/mediatek/mt6779.dtsi | 3 ++-
 arch/arm64/boot/dts/mediatek/mt8186.dtsi | 3 ++-
 arch/arm64/boot/dts/mediatek/mt8188.dtsi | 3 ++-
 arch/arm64/boot/dts/mediatek/mt8192.dtsi | 3 ++-
 arch/arm64/boot/dts/mediatek/mt8195.dtsi | 3 ++-
 5 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/boot/dts/mediatek/mt6779.dtsi b/arch/arm64/boot/dts/mediatek/mt6779.dtsi
index 70f3375916e8c..106df7603d533 100644
--- a/arch/arm64/boot/dts/mediatek/mt6779.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt6779.dtsi
@@ -108,7 +108,8 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW 0>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW 0>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW 0>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW 0>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW 0>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW 0>;
 	};
 
 	soc {
diff --git a/arch/arm64/boot/dts/mediatek/mt8186.dtsi b/arch/arm64/boot/dts/mediatek/mt8186.dtsi
index b91f88ffae0e8..a4621ce370d8e 100644
--- a/arch/arm64/boot/dts/mediatek/mt8186.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8186.dtsi
@@ -815,7 +815,8 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW 0>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW 0>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW 0>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW 0>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW 0>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW 0>;
 	};
 
 	soc {
diff --git a/arch/arm64/boot/dts/mediatek/mt8188.dtsi b/arch/arm64/boot/dts/mediatek/mt8188.dtsi
index 75133794cec38..614e75f46c72d 100644
--- a/arch/arm64/boot/dts/mediatek/mt8188.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8188.dtsi
@@ -918,7 +918,8 @@ timer: timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_HIGH 0>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_HIGH 0>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_HIGH 0>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_HIGH 0>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_HIGH 0>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_HIGH 0>;
 		clock-frequency = <13000000>;
 	};
 
diff --git a/arch/arm64/boot/dts/mediatek/mt8192.dtsi b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
index 9f8f115edd4cc..873c4fae6afc9 100644
--- a/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
@@ -328,7 +328,8 @@ timer: timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_HIGH 0>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_HIGH 0>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_HIGH 0>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_HIGH 0>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_HIGH 0>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_HIGH 0>;
 		clock-frequency = <13000000>;
 	};
 
diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
index c72e34c57629d..3c9a7a08612b9 100644
--- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
@@ -451,7 +451,8 @@ timer: timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_HIGH 0>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_HIGH 0>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_HIGH 0>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_HIGH 0>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_HIGH 0>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_HIGH 0>;
 	};
 
 	soc {
-- 
2.47.3




* [PATCH v3 07/17] arm64: dts: bst: Add EL2 virtual timer interrupt
From: Marc Zyngier @ 2026-05-23 14:02 UTC
  To: linux-arm-kernel, linux-acpi, linux-kernel, devicetree
  Cc: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <20260523140242.586031-1-maz@kernel.org>

The ARMv8.2 based CPUs used in the bst c1200 SoC are missing the EL2
virtual timer interrupt. Add it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/boot/dts/bst/bstc1200.dtsi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/bst/bstc1200.dtsi b/arch/arm64/boot/dts/bst/bstc1200.dtsi
index dd13c6bfc3c89..104ecf76ced10 100644
--- a/arch/arm64/boot/dts/bst/bstc1200.dtsi
+++ b/arch/arm64/boot/dts/bst/bstc1200.dtsi
@@ -92,6 +92,7 @@ timer {
 		interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 14 IRQ_TYPE_LEVEL_LOW>,
 			     <GIC_PPI 11 IRQ_TYPE_LEVEL_LOW>,
-			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>;
+			     <GIC_PPI 10 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_PPI 12 IRQ_TYPE_LEVEL_LOW>;
 	};
 };
-- 
2.47.3




* [PATCH v3 01/17] ACPI: GTDT: Account for GTDTv3 size when walking the platform timer descriptors
From: Marc Zyngier @ 2026-05-23 14:02 UTC
  To: linux-arm-kernel, linux-acpi, linux-kernel, devicetree
  Cc: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <20260523140242.586031-1-maz@kernel.org>

Since ARMv8.1, the architecture has grown an EL2-private virtual
timer. This has been described in ACPI since ACPI v6.3 and revision
3 of the GTDT table.

An additional structure was added in ACPICA, though in a rather
bizarre way, and merged in v5.1 as 8f5a14d053100 ("ACPICA: ACPI 6.3:
add GTDT Revision 3 support").

Finally plug the table parsing in GTDT, and correct the parsing of
the platform timer subtables to account for the expanded size of
the base table. This also comes with some extra sanitisation of
the table, in the unlikely case someone got it wrong...

Suggested-by: Sudeep Holla <sudeep.holla@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 drivers/acpi/arm64/gtdt.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/arm64/gtdt.c b/drivers/acpi/arm64/gtdt.c
index ffc867bac2d60..950d5efdf85ea 100644
--- a/drivers/acpi/arm64/gtdt.c
+++ b/drivers/acpi/arm64/gtdt.c
@@ -34,14 +34,25 @@ struct acpi_gtdt_descriptor {
 	void *platform_timer;
 };
 
+struct gtdt_v3 {
+	struct acpi_table_gtdt	gtdt_v2;
+	struct acpi_gtdt_el2	el2_vtimer;
+};
+
 static struct acpi_gtdt_descriptor acpi_gtdt_desc __initdata;
 
 static __init bool platform_timer_valid(void *platform_timer)
 {
 	struct acpi_gtdt_header *gh = platform_timer;
+	void *platform_timer_begin;
 
-	return (platform_timer >= (void *)(acpi_gtdt_desc.gtdt + 1) &&
-		platform_timer < acpi_gtdt_desc.gtdt_end &&
+	if (acpi_gtdt_desc.gtdt->header.revision >= 3)
+		platform_timer_begin = container_of(acpi_gtdt_desc.gtdt, struct gtdt_v3, gtdt_v2) + 1;
+	else
+		platform_timer_begin = acpi_gtdt_desc.gtdt + 1;
+
+	return (platform_timer >= platform_timer_begin &&
+		platform_timer + sizeof(*gh) <= acpi_gtdt_desc.gtdt_end &&
 		gh->length != 0 &&
 		platform_timer + gh->length <= acpi_gtdt_desc.gtdt_end);
 }
@@ -166,6 +177,13 @@ int __init acpi_gtdt_init(struct acpi_table_header *table,
 	u32 cnt = 0;
 
 	gtdt = container_of(table, struct acpi_table_gtdt, header);
+
+	if ((gtdt->header.revision >= 3 && gtdt->header.length < sizeof(struct gtdt_v3)) ||
+	    (gtdt->header.revision == 2 && gtdt->header.length < sizeof(*gtdt))) {
+		pr_err(FW_BUG "GTDT with invalid size %d\n", gtdt->header.length);
+		return -EINVAL;
+	}
+
 	acpi_gtdt_desc.gtdt = gtdt;
 	acpi_gtdt_desc.gtdt_end = (void *)table + table->length;
 	acpi_gtdt_desc.platform_timer = NULL;
-- 
2.47.3
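
The checks performed by platform_timer_valid() can be modelled outside the kernel. The sketch below is not the kernel code: it works on byte offsets rather than pointers, and the 3-byte subtable header (a type byte followed by a little-endian 16-bit length) is a simplified stand-in. All names are hypothetical. The first_off parameter plays the role of the revision-dependent end of the base table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified subtable header: 1 type byte + 2 length bytes. */
#define SUBTABLE_HDR_SIZE 3u

/*
 * Return true if a subtable starting at byte offset 'off' is usable.
 * 'first_off' is where subtables may begin, i.e. the size of the base
 * table for the revision being parsed.
 */
static bool subtable_valid(const uint8_t *table, size_t table_len,
			   size_t first_off, size_t off)
{
	uint16_t length;

	assert(table && first_off <= table_len);

	/* Must not overlap the base table or start past the end. */
	if (off < first_off || off > table_len)
		return false;

	/* The header must fit before its length field is read. */
	if (table_len - off < SUBTABLE_HDR_SIZE)
		return false;

	length = (uint16_t)(table[off + 1] | (table[off + 2] << 8));

	/* A zero length would never advance; a large one overruns. */
	return length != 0 && length <= table_len - off;
}
```

Passing a larger first_off for a revision 3 table is what prevents the extra EL2 virtual timer fields from being misread as the first platform timer subtable.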




* Re: [PATCH v2 2/3] ASoC: sunxi: sun4i-spdif: Resume device before kcontrol register access
From: Bui Duc Phuc @ 2026-05-23 13:55 UTC
  To: wens
  Cc: broonie, codekipper, jernej.skrabec, lgirdwood, linux-arm-kernel,
	linux-kernel, linux-sound, linux-sunxi, nichen, perex, samuel,
	tiwai
In-Reply-To: <CAGb2v67UmMmM7bQOSf3VxsN9D8s3N8nMMX079_kiMcPU=VszFg@mail.gmail.com>

Hi Chen-Yu,

On Sat, May 23, 2026 at 2:19 AM Chen-Yu Tsai <wens@kernel.org> wrote:
> And when you do add patches due to Sashiko raising an issue, please
> do mention it in the commit message.
>

As mentioned in the v1 discussion, this issue was originally reported
by Sashiko. I'll add the Reported-by tag in the next revision.
v1 link:
https://lore.kernel.org/all/20260513105003.81880-1-phucduc.bui@gmail.com/

> Did you actually reproduce the issue, or did you add the patch simply
> because Sashiko mentioned it?
>
Since I lack Sunxi hardware, I couldn't reproduce it or perform runtime testing.
But I did compile-test the patch.
The patch aims to fix unsafe register accesses that occur before ensuring the
device is runtime-resumed.

> On sunxi, either it will hang the system because the bus transaction
> got ignored, or it won't as something else enabled the clock.
>

If Sunxi's PM design already guarantees safe access here,
feel free to reject the patch.

Best Regards,
Phuc


^ permalink raw reply

* [PATCH v2] arm64: tlbflush: Don't broadcast if mm was only active on local cpu
From: Linu Cherian @ 2026-05-23 13:47 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Ryan Roberts, Kevin Brodsky,
	Anshuman Khandual, Yang Shi, Mark Rutland, Huang Ying
  Cc: linux-arm-kernel, linux-kernel, Linu Cherian

From: Ryan Roberts <ryan.roberts@arm.com>

There are 3 variants of tlb flush that invalidate user mappings:
flush_tlb_mm(), flush_tlb_page() and __flush_tlb_range(). All of these
would previously unconditionally broadcast their tlbis to all cpus in
the inner shareable domain.

But this is a waste of effort if we can prove that the mm for which we
are flushing the mappings has only ever been active on the local cpu. In
that case, it is safe to avoid the broadcast and simply invalidate the
current cpu.

So let's track in mm_context_t::active_cpu whether the mm has never been
active on any cpu, has been active on more than 1 cpu, or has been
active on precisely 1 cpu - and in that case, which one. We update this
when switching context, being careful to ensure that it gets updated
*before* installing the mm's pgtables. On the reader side, we ensure we
read *after* the previous write(s) to the pgtable(s) that necessitated
the tlb flush have completed. This guarantees that if a cpu that is
doing a tlb flush sees its own id in active_cpu, then the old pgtable
entry cannot have been seen by any other cpu and we can flush only the
local cpu.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Huang Ying <ying.huang@linux.alibaba.com>
[linu.cherian@arm.com: Adapted for v7.1 flush tlb API changes]
Signed-off-by: Linu Cherian <linu.cherian@arm.com>
---
Changelog from RFC v1:
- Adapted for v7.1 flush tlb API changes
  No changes in core logic
- Collected Rb and Tb tags
- lat_mmap benchmark showed dsb(ishst) performs better than dsb(ish),
  hence retained dsb(ishst) in flush_tlb_user_pre()


Testing with 7.1-rc4:
+-----------------------+---------------------------------------------------+-------------+
| Benchmark             | Result Class                                      | Improvement |
+=======================+===================================================+=============+
| perf/syscall          | fork (ops/sec)                                    |   (I) 3.25% |
+-----------------------+---------------------------------------------------+-------------+
| pts/memtier-benchmark | Protocol: Redis Clients: 100 Ratio: 1:5 (Ops/sec) |   (I) 2.70% |
|                       | Protocol: Redis Clients: 100 Ratio: 5:1 (Ops/sec) |   (I) 2.13% |
+-----------------------+---------------------------------------------------+-------------+
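
For readers who want to experiment with the active_cpu state machine
described in the commit message, here is a minimal userspace model. It
is a sketch under stated assumptions, not the kernel code: struct
mm_model and the mm_model_*() helpers are hypothetical names, and
sequentially consistent C11 atomics stand in for the dsb()/READ_ONCE()/
cmpxchg_relaxed() combination used in the patch. Only the three states
(none, exactly one cpu, multiple) and their transitions are modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Sentinel values mirroring enum active_cpu in the patch. */
#define ACTIVE_CPU_NONE		UINT_MAX
#define ACTIVE_CPU_MULTIPLE	(UINT_MAX - 1)

/* Hypothetical stand-in for mm_context_t; only one field is modelled. */
struct mm_model {
	_Atomic unsigned int active_cpu;
};

static void mm_model_init(struct mm_model *mm)
{
	atomic_store(&mm->active_cpu, ACTIVE_CPU_NONE);
}

/* Writer side: record that `cpu` is about to start running this mm. */
static void mm_model_switch_to(struct mm_model *mm, unsigned int cpu)
{
	unsigned int active = atomic_load(&mm->active_cpu);

	/* Already recorded as ours, or already shared: nothing to do. */
	if (active == cpu || active == ACTIVE_CPU_MULTIPLE)
		return;

	if (active == ACTIVE_CPU_NONE) {
		unsigned int expected = ACTIVE_CPU_NONE;

		/* First cpu to run the mm claims it with its own id. */
		if (atomic_compare_exchange_strong(&mm->active_cpu,
						   &expected, cpu))
			return;
		/* Lost the race: some other cpu claimed it first. */
	}

	/* A second cpu has run the mm, so flushes must be broadcast. */
	atomic_store(&mm->active_cpu, ACTIVE_CPU_MULTIPLE);
}

/* Reader side: may a flush issued on `self` stay local? */
static bool mm_model_flush_is_local(struct mm_model *mm, unsigned int self)
{
	return atomic_load(&mm->active_cpu) == self;
}
```

Once the value reaches ACTIVE_CPU_MULTIPLE it never goes back in this
model, which matches the "has only ever been active on the local cpu"
wording above.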

 arch/arm64/include/asm/mmu.h         |  12 +++
 arch/arm64/include/asm/mmu_context.h |   2 +
 arch/arm64/include/asm/tlbflush.h    | 127 +++++++++++++++++++++------
 arch/arm64/mm/context.c              |  30 ++++++-
 4 files changed, 141 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 5e1211c540ab..0002101c1f21 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -16,6 +16,17 @@
 #include <linux/refcount.h>
 #include <asm/cpufeature.h>
 
+/*
+ * Sentinel values for mm_context_t::active_cpu. ACTIVE_CPU_NONE indicates the
+ * mm has never been active on any CPU. ACTIVE_CPU_MULTIPLE indicates the mm
+ * has been active on multiple CPUs. Any other value is the ID of the single
+ * CPU that the mm has been active on.
+ */
+enum active_cpu {
+	ACTIVE_CPU_NONE = UINT_MAX,
+	ACTIVE_CPU_MULTIPLE = UINT_MAX - 1,
+};
+
 typedef struct {
 	atomic64_t	id;
 #ifdef CONFIG_COMPAT
@@ -25,6 +36,7 @@ typedef struct {
 	void		*vdso;
 	unsigned long	flags;
 	u8		pkey_allocation_map;
+	unsigned int	active_cpu;
 } mm_context_t;
 
 /*
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 803b68758152..101cae0c7262 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -172,6 +172,8 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 	/* pkey 0 is the default, so always reserve it. */
 	mm->context.pkey_allocation_map = BIT(0);
 
+	WRITE_ONCE(mm->context.active_cpu, ACTIVE_CPU_NONE);
+
 	return 0;
 }
 
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index c0bf5b398041..1f75bce4fa0d 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -164,6 +164,12 @@ static inline void sme_dvmsync_batch(struct arch_tlbflush_unmap_batch *batch)
 
 typedef void (*tlbi_op)(u64 arg);
 
+static __always_inline void vae1(u64 arg)
+{
+	__tlbi(vae1, arg);
+	__tlbi_user(vae1, arg);
+}
+
 static __always_inline void vae1is(u64 arg)
 {
 	__tlbi(vae1is, arg);
@@ -308,6 +314,74 @@ static inline void __tlbi_sync_s1ish_hyp(void)
 	__repeat_tlbi_sync(vale2is, 0);
 }
 
+typedef unsigned __bitwise tlbf_t;
+
+/* No special behaviour. */
+#define TLBF_NONE		((__force tlbf_t)0)
+
+/* Invalidate tlb entries only, leaving the page table walk cache intact. */
+#define TLBF_NOWALKCACHE	((__force tlbf_t)BIT(0))
+
+/* Skip the trailing dsb after issuing tlbi. */
+#define TLBF_NOSYNC		((__force tlbf_t)BIT(1))
+
+/* Suppress tlb notifier callbacks for this flush operation. */
+#define TLBF_NONOTIFY		((__force tlbf_t)BIT(2))
+
+/* Perform the tlbi locally without broadcasting to other CPUs. */
+#define TLBF_NOBROADCAST	((__force tlbf_t)BIT(3))
+
+/*
+ * Determines whether the user tlbi invalidation can be performed only on the
+ * local CPU or whether it needs to be broadcast. (Returns true for local).
+ * Additionally issues appropriate barrier to ensure prior pgtable updates are
+ * visible to the table walker. Must be paired with flush_tlb_user_post().
+ */
+static inline bool flush_tlb_user_pre(struct mm_struct *mm, tlbf_t flags)
+{
+	unsigned int self, active;
+	bool local;
+
+	migrate_disable();
+
+	if (flags & TLBF_NOBROADCAST) {
+		dsb(nshst);
+		return true;
+	}
+
+	self = smp_processor_id();
+
+	/*
+	 * The load of mm->context.active_cpu must not be reordered before the
+	 * store to the pgtable that necessitated this flush. This ensures that
+	 * if the value read is our cpu id, then no other cpu can have seen the
+	 * old pgtable value and therefore does not need this old value to be
+	 * flushed from its tlb. But we don't want to upgrade the dsb(ishst),
+	 * needed to make the pgtable updates visible to the walker, to a
+	 * dsb(ish) by default. So speculatively load without a barrier and if
+	 * it indicates our cpu id, then upgrade the barrier and re-load.
+	 */
+	active = READ_ONCE(mm->context.active_cpu);
+	if (active == self) {
+		dsb(ish);
+		active = READ_ONCE(mm->context.active_cpu);
+	} else {
+		dsb(ishst);
+	}
+
+	local = active == self;
+	if (!local)
+		migrate_enable();
+
+	return local;
+}
+
+static inline void flush_tlb_user_post(bool local)
+{
+	if (local)
+		migrate_enable();
+}
+
 /*
  *	TLB Invalidation
  *	================
@@ -408,12 +482,20 @@ static inline void flush_tlb_all(void)
 static inline void flush_tlb_mm(struct mm_struct *mm)
 {
 	unsigned long asid;
+	bool local;
 
-	dsb(ishst);
+	local = flush_tlb_user_pre(mm, TLBF_NONE);
 	asid = __TLBI_VADDR(0, ASID(mm));
-	__tlbi(aside1is, asid);
-	__tlbi_user(aside1is, asid);
-	__tlbi_sync_s1ish(mm);
+	if (local) {
+		__tlbi(aside1, asid);
+		__tlbi_user(aside1, asid);
+		dsb(nsh);
+	} else {
+		__tlbi(aside1is, asid);
+		__tlbi_user(aside1is, asid);
+		__tlbi_sync_s1ish(mm);
+	}
+	flush_tlb_user_post(local);
 	mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
 }
 
@@ -475,6 +557,12 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
  *    operations can only span an even number of pages. We save this for last to
  *    ensure 64KB start alignment is maintained for the LPA2 case.
  */
+static __always_inline void rvae1(u64 arg)
+{
+	__tlbi(rvae1, arg);
+	__tlbi_user(rvae1, arg);
+}
+
 static __always_inline void rvae1is(u64 arg)
 {
 	__tlbi(rvae1is, arg);
@@ -573,23 +661,6 @@ static inline bool __flush_tlb_range_limit_excess(unsigned long pages,
 	return pages >= (MAX_DVM_OPS * stride) >> PAGE_SHIFT;
 }
 
-typedef unsigned __bitwise tlbf_t;
-
-/* No special behaviour. */
-#define TLBF_NONE		((__force tlbf_t)0)
-
-/* Invalidate tlb entries only, leaving the page table walk cache intact. */
-#define TLBF_NOWALKCACHE	((__force tlbf_t)BIT(0))
-
-/* Skip the trailing dsb after issuing tlbi. */
-#define TLBF_NOSYNC		((__force tlbf_t)BIT(1))
-
-/* Suppress tlb notifier callbacks for this flush operation. */
-#define TLBF_NONOTIFY		((__force tlbf_t)BIT(2))
-
-/* Perform the tlbi locally without broadcasting to other CPUs. */
-#define TLBF_NOBROADCAST	((__force tlbf_t)BIT(3))
-
 static __always_inline void __do_flush_tlb_range(struct vm_area_struct *vma,
 					unsigned long start, unsigned long end,
 					unsigned long stride, int tlb_level,
@@ -597,6 +668,7 @@ static __always_inline void __do_flush_tlb_range(struct vm_area_struct *vma,
 {
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long asid, pages;
+	bool local;
 
 	pages = (end - start) >> PAGE_SHIFT;
 
@@ -605,10 +677,9 @@ static __always_inline void __do_flush_tlb_range(struct vm_area_struct *vma,
 		return;
 	}
 
-	if (!(flags & TLBF_NOBROADCAST))
-		dsb(ishst);
-	else
-		dsb(nshst);
+	local = flush_tlb_user_pre(mm, flags);
+	if (local && !(flags & TLBF_NOBROADCAST))
+		flags |= TLBF_NOBROADCAST;
 
 	asid = ASID(mm);
 
@@ -622,8 +693,8 @@ static __always_inline void __do_flush_tlb_range(struct vm_area_struct *vma,
 					asid, tlb_level);
 		break;
 	case TLBF_NOBROADCAST:
-		/* Combination unused */
-		BUG();
+		__flush_s1_tlb_range_op(vae1, start, pages, stride,
+					asid, tlb_level);
 		break;
 	case TLBF_NOWALKCACHE | TLBF_NOBROADCAST:
 		__flush_s1_tlb_range_op(vale1, start, pages, stride,
@@ -640,6 +711,8 @@ static __always_inline void __do_flush_tlb_range(struct vm_area_struct *vma,
 		else
 			dsb(nsh);
 	}
+
+	flush_tlb_user_post(local);
 }
 
 static inline void __flush_tlb_range(struct vm_area_struct *vma,
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 0f4a28b87469..f34ed78393e0 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -214,9 +214,10 @@ static u64 new_context(struct mm_struct *mm)
 
 void check_and_switch_context(struct mm_struct *mm)
 {
-	unsigned long flags;
-	unsigned int cpu;
+	unsigned int cpu = smp_processor_id();
 	u64 asid, old_active_asid;
+	unsigned int active;
+	unsigned long flags;
 
 	if (system_supports_cnp())
 		cpu_set_reserved_ttbr0();
@@ -251,7 +252,6 @@ void check_and_switch_context(struct mm_struct *mm)
 		atomic64_set(&mm->context.id, asid);
 	}
 
-	cpu = smp_processor_id();
 	if (cpumask_test_and_clear_cpu(cpu, &tlb_flush_pending))
 		local_flush_tlb_all();
 
@@ -262,6 +262,30 @@ void check_and_switch_context(struct mm_struct *mm)
 
 	arm64_apply_bp_hardening();
 
+	/*
+	 * Update mm->context.active_cpu in such a manner that we avoid cmpxchg
+	 * and dsb unless we definitely need it. If initially ACTIVE_CPU_NONE
+	 * then we are the first cpu to run so set it to our id. If initially
+	 * any id other than ours, we are the second cpu to run so set it to
+	 * ACTIVE_CPU_MULTIPLE. If we update the value then we must issue
+	 * dsb(ishst) to ensure stores to mm->context.active_cpu are ordered
+	 * against the TTBR0 write in cpu_switch_mm()/uaccess_enable(); the
+	 * store must be visible to another cpu before this cpu could have
+	 * populated any TLB entries based on the pgtables that will be
+	 * installed.
+	 */
+	active = READ_ONCE(mm->context.active_cpu);
+	if (active != cpu && active != ACTIVE_CPU_MULTIPLE) {
+		if (active == ACTIVE_CPU_NONE)
+			active = cmpxchg_relaxed(&mm->context.active_cpu,
+						 ACTIVE_CPU_NONE, cpu);
+
+		if (active != ACTIVE_CPU_NONE)
+			WRITE_ONCE(mm->context.active_cpu, ACTIVE_CPU_MULTIPLE);
+
+		dsb(ishst);
+	}
+
 	/*
 	 * Defer TTBR0_EL1 setting for user threads to uaccess_enable() when
 	 * emulating PAN.
-- 
2.43.0



^ permalink raw reply related

* [PATCH v1] block: switch numa_node to int in blk_mq_hw_ctx and init_request
From: Mateusz Nowicki @ 2026-05-23 12:52 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Caleb Sander Mateos, Sung-woo Kim, Josef Bacik, Alasdair Kergon,
	Mike Snitzer, Mikulas Patocka, Benjamin Marzinski, Ulf Hansson,
	Richard Weinberger, Zhihao Cheng, Miquel Raynal,
	Vignesh Raghavendra, Sven Peter, Janne Grunau, Neal Gompa,
	Keith Busch, Christoph Hellwig, Sagi Grimberg, Justin Tee,
	Naresh Gottumukkala, Paul Ely, Chaitanya Kulkarni,
	James E.J. Bottomley, Martin K. Petersen, Thomas Fourier, Al Viro,
	Luke Wang, Kees Cook, linux-block, linux-kernel, nbd, dm-devel,
	linux-mmc, linux-mtd, asahi, linux-arm-kernel, linux-nvme,
	linux-scsi

numa_node in blk_mq_hw_ctx and the matching argument of
blk_mq_ops::init_request can be NUMA_NO_NODE (-1).  Declared as
unsigned int, NUMA_NO_NODE becomes UINT_MAX and walks off
nvme_dev::descriptor_pools[] on CONFIG_NUMA=n [1].

Switch the field and the callback prototype to int and update all
in-tree init_request implementations.  No functional change:
cpu_to_node(), kmalloc_node() and blk_alloc_flush_queue() already
take int.

Link: https://lore.kernel.org/linux-nvme/20260522150628.399288-1-mateusz.nowicki@posteo.net/ [1]
Link: https://lore.kernel.org/linux-nvme/20260309062840.2937858-2-iam@sung-woo.kim/
Suggested-by: Caleb Sander Mateos <csander@purestorage.com>
Suggested-by: Sung-woo Kim <iam@sung-woo.kim>
Signed-off-by: Mateusz Nowicki <mateusz.nowicki@posteo.net>
---
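The signedness problem described above can be reproduced in a few lines
of userspace C. This is only an illustration: NR_POOLS and the helper
names are hypothetical, and the array bound is chosen for the example
rather than taken from the nvme driver. It shows that NUMA_NO_NODE (-1)
passed through an unsigned int parameter becomes UINT_MAX and fails any
array bound, while an int parameter can still be compared against
NUMA_NO_NODE.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NUMA_NO_NODE	(-1)	/* same value the kernel uses */
#define NR_POOLS	1	/* hypothetical size of a per-node array */

/* What an "unsigned int numa_node" parameter receives for a given int. */
static unsigned int as_unsigned(int numa_node)
{
	return (unsigned int)numa_node;
}

/* Old prototype style: is the unsigned node usable as an array index? */
static bool index_in_bounds_unsigned(unsigned int numa_node)
{
	return numa_node < NR_POOLS;
}

/* New prototype style: with int, the "no node" case stays detectable. */
static bool is_no_node(int numa_node)
{
	return numa_node == NUMA_NO_NODE;
}
```

With the int prototype a driver can test for NUMA_NO_NODE before using
the value as an index, instead of indexing with UINT_MAX.
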
 block/bsg-lib.c                   | 2 +-
 drivers/block/mtip32xx/mtip32xx.c | 2 +-
 drivers/block/nbd.c               | 2 +-
 drivers/md/dm-rq.c                | 2 +-
 drivers/mmc/core/queue.c          | 2 +-
 drivers/mtd/ubi/block.c           | 2 +-
 drivers/nvme/host/apple.c         | 2 +-
 drivers/nvme/host/fc.c            | 2 +-
 drivers/nvme/host/pci.c           | 2 +-
 drivers/nvme/host/rdma.c          | 2 +-
 drivers/nvme/host/tcp.c           | 2 +-
 drivers/nvme/target/loop.c        | 2 +-
 drivers/scsi/scsi_lib.c           | 2 +-
 include/linux/blk-mq.h            | 4 ++--
 14 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index fdb4b290ca68..895db30a7033 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -299,7 +299,7 @@ static blk_status_t bsg_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 /* called right after the request is allocated for the request_queue */
 static int bsg_init_rq(struct blk_mq_tag_set *set, struct request *req,
-		       unsigned int hctx_idx, unsigned int numa_node)
+		       unsigned int hctx_idx, int numa_node)
 {
 	struct bsg_job *job = blk_mq_rq_to_pdu(req);
 
diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 567192e371a8..8aedba9b5690 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3340,7 +3340,7 @@ static void mtip_free_cmd(struct blk_mq_tag_set *set, struct request *rq,
 }
 
 static int mtip_init_cmd(struct blk_mq_tag_set *set, struct request *rq,
-			 unsigned int hctx_idx, unsigned int numa_node)
+			 unsigned int hctx_idx, int numa_node)
 {
 	struct driver_data *dd = set->driver_data;
 	struct mtip_cmd *cmd = blk_mq_rq_to_pdu(rq);
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index fe63f3c55d0d..e2fe9e3308fc 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1888,7 +1888,7 @@ static void nbd_dbg_close(void)
 #endif
 
 static int nbd_init_request(struct blk_mq_tag_set *set, struct request *rq,
-			    unsigned int hctx_idx, unsigned int numa_node)
+			    unsigned int hctx_idx, int numa_node)
 {
 	struct nbd_cmd *cmd = blk_mq_rq_to_pdu(rq);
 	cmd->nbd = set->driver_data;
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 9703b3ae364e..9a386254d836 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -462,7 +462,7 @@ static void dm_start_request(struct mapped_device *md, struct request *orig)
 }
 
 static int dm_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
-			      unsigned int hctx_idx, unsigned int numa_node)
+			      unsigned int hctx_idx, int numa_node)
 {
 	struct mapped_device *md = set->driver_data;
 	struct dm_rq_target_io *tio = blk_mq_rq_to_pdu(rq);
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 39fcb662c43f..cfa268925c26 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -208,7 +208,7 @@ static unsigned short mmc_get_max_segments(struct mmc_host *host)
 }
 
 static int mmc_mq_init_request(struct blk_mq_tag_set *set, struct request *req,
-			       unsigned int hctx_idx, unsigned int numa_node)
+			       unsigned int hctx_idx, int numa_node)
 {
 	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
 	struct mmc_queue *mq = set->driver_data;
diff --git a/drivers/mtd/ubi/block.c b/drivers/mtd/ubi/block.c
index 8880a783c3bc..29c0d6941a81 100644
--- a/drivers/mtd/ubi/block.c
+++ b/drivers/mtd/ubi/block.c
@@ -312,7 +312,7 @@ static blk_status_t ubiblock_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 static int ubiblock_init_request(struct blk_mq_tag_set *set,
 		struct request *req, unsigned int hctx_idx,
-		unsigned int numa_node)
+		int numa_node)
 {
 	struct ubiblock_pdu *pdu = blk_mq_rq_to_pdu(req);
 
diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index c692fc73babf..97586307ac1a 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -819,7 +819,7 @@ static int apple_nvme_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
 
 static int apple_nvme_init_request(struct blk_mq_tag_set *set,
 				   struct request *req, unsigned int hctx_idx,
-				   unsigned int numa_node)
+				   int numa_node)
 {
 	struct apple_nvme_queue *q = set->driver_data;
 	struct apple_nvme *anv = queue_to_apple_nvme(q);
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index e4f4528fe2a2..1907da499ad2 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2109,7 +2109,7 @@ __nvme_fc_init_request(struct nvme_fc_ctrl *ctrl,
 
 static int
 nvme_fc_init_request(struct blk_mq_tag_set *set, struct request *rq,
-		unsigned int hctx_idx, unsigned int numa_node)
+		unsigned int hctx_idx, int numa_node)
 {
 	struct nvme_fc_ctrl *ctrl = to_fc_ctrl(set->driver_data);
 	struct nvme_fcp_op_w_sgl *op = blk_mq_rq_to_pdu(rq);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 139a10cd687f..afd407df640f 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -660,7 +660,7 @@ static int nvme_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
 
 static int nvme_pci_init_request(struct blk_mq_tag_set *set,
 		struct request *req, unsigned int hctx_idx,
-		unsigned int numa_node)
+		int numa_node)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index f77c960f7632..08459c65c3d5 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -292,7 +292,7 @@ static void nvme_rdma_exit_request(struct blk_mq_tag_set *set,
 
 static int nvme_rdma_init_request(struct blk_mq_tag_set *set,
 		struct request *rq, unsigned int hctx_idx,
-		unsigned int numa_node)
+		int numa_node)
 {
 	struct nvme_rdma_ctrl *ctrl = to_rdma_ctrl(set->driver_data);
 	struct nvme_rdma_request *req = blk_mq_rq_to_pdu(rq);
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 15d36d6a728e..36b3ec50a9fd 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -548,7 +548,7 @@ static void nvme_tcp_exit_request(struct blk_mq_tag_set *set,
 
 static int nvme_tcp_init_request(struct blk_mq_tag_set *set,
 		struct request *rq, unsigned int hctx_idx,
-		unsigned int numa_node)
+		int numa_node)
 {
 	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(set->driver_data);
 	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index d98d0cdc5d6f..ae00bcef2251 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -202,7 +202,7 @@ static int nvme_loop_init_iod(struct nvme_loop_ctrl *ctrl,
 
 static int nvme_loop_init_request(struct blk_mq_tag_set *set,
 		struct request *req, unsigned int hctx_idx,
-		unsigned int numa_node)
+		int numa_node)
 {
 	struct nvme_loop_ctrl *ctrl = to_loop_ctrl(set->driver_data);
 	struct nvme_loop_iod *iod = blk_mq_rq_to_pdu(req);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6e8c7a42603e..67f789bd02e7 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1950,7 +1950,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 }
 
 static int scsi_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
-				unsigned int hctx_idx, unsigned int numa_node)
+				unsigned int hctx_idx, int numa_node)
 {
 	struct Scsi_Host *shost = set->driver_data;
 	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 18a2388ba581..2e7f90048171 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -428,7 +428,7 @@ struct blk_mq_hw_ctx {
 	struct blk_mq_tags	*sched_tags;
 
 	/** @numa_node: NUMA node the storage adapter has been connected to. */
-	unsigned int		numa_node;
+	int			numa_node;
 	/** @queue_num: Index of this hardware queue. */
 	unsigned int		queue_num;
 
@@ -653,7 +653,7 @@ struct blk_mq_ops {
 	 * flush request.
 	 */
 	int (*init_request)(struct blk_mq_tag_set *set, struct request *,
-			    unsigned int, unsigned int);
+			    unsigned int, int);
 	/**
 	 * @exit_request: Ditto for exit/teardown.
 	 */

base-commit: 45255ea1ca096b11b1303c9b54502a28f3a31dd1
-- 
2.53.0



^ permalink raw reply related

* Re: [PATCH 0/3] Add packet-mode ESP offload for Airoha/EIP93
From: Jihong Min @ 2026-05-23 12:24 UTC (permalink / raw)
  To: Christian Marangi, Antoine Tenart, Herbert Xu, David S . Miller,
	Lorenzo Bianconi, Andrew Lunn, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Steffen Klassert
  Cc: linux-kernel, linux-crypto, linux-arm-kernel, linux-mediatek,
	netdev
In-Reply-To: <20260523121522.3023992-1-hurryman2212@gmail.com>



On 5/23/26 21:15, Jihong Min wrote:
> This series adds the missing plumbing for ESP offload engines that
> operate on whole ESP packets instead of only exposing AES/HMAC through
> the crypto API AEAD interface.
> 
> The normal ESP software path can already call into accelerated AEAD
> algorithms, but packet-mode engines such as EIP93 can also generate and
> consume ESP packet framing: padding, pad length, next header and ICV.
> That needs a slightly different XFRM offload contract so the netdev
> driver can hand the skb to a packet backend rather than trying to make
> hardware fit the software trailer layout.
> 
> Patch 1 extends the ESP offload infrastructure for packet engines while
> preserving the existing behavior for drivers that do not opt in.
> Patch 2 exposes an EIP93 ESP packet backend for encapsulation and
> decapsulation.
> Patch 3 wires Airoha Ethernet GDM netdevs and DSA user ports to that
> backend through xfrmdev_ops. ESP GSO and ESP TX checksum offload remain
> disabled.
> 
> Runtime testing was done on a Gemtek W1700K2 running OpenWrt with the
> same changes applied on top of a 6.18.31-based kernel.
> 
> Test parameters:
> 
>   - Static IPv4 transport-mode XFRM SAs between the AP and host.
>   - ESP transform: auth hmac(sha1), enc cbc(aes) with a 128-bit AES key.
>   - iperf3 TCP test, AP as client and host as server:
>         iperf3 -c <host_ip> -P 4 -t 10
>   - The host always used normal Linux XFRM software processing.
>   - With AP ESP offload disabled, the AP also used the Linux XFRM
>     software path; in this setup, EIP93-backed AEAD crypto was still
>     available to that path.
> 
> Network-relevant test setup:
> 
>   - AP: Gemtek W1700K2, Airoha AN7581/EN7581, 4x Arm Cortex-A53 at
>     1.4 GHz, 2 GiB RAM, airoha_eth wan (GDM2) netdev, 10Gb/s full-duplex,
>     MTU 9200, EIP93 crypto and IPsec packet engine present.
>   - Host: AMD Ryzen 9 9950X3D, 16 cores/32 threads, Open vSwitch,
>     MTU 9978, backed by a ConnectX-6 Dx 10Gb/s full-duplex link.
> 
> AP to host iperf3 result:
> 
>   AP offload      Sender          Receiver        Retransmits
>   on              918.2 Mbit/s    913.6 Mbit/s    0
>   off             782.4 Mbit/s    778.6 Mbit/s    3569
> 
> This is a 17.3% receiver-side throughput improvement for the AP TX ESP
> path in this setup, with retransmits eliminated in the offloaded run.
> 
> Jihong Min (3):
>   xfrm: extend ESP offload infrastructure for packet engines
>   crypto: inside-secure: add EIP93 ESP packet backend
>   net: airoha: add EIP93-backed ESP XFRM offload
> 
>  MAINTAINERS                                   |    1 +
>  drivers/crypto/inside-secure/eip93/Kconfig    |   10 +
>  drivers/crypto/inside-secure/eip93/Makefile   |    1 +
>  .../crypto/inside-secure/eip93/eip93-ipsec.c  | 1413 ++++++++++++++++
>  .../crypto/inside-secure/eip93/eip93-main.c   |   69 +-
>  .../crypto/inside-secure/eip93/eip93-main.h   |   38 +-
>  drivers/net/ethernet/airoha/Kconfig           |   11 +
>  drivers/net/ethernet/airoha/Makefile          |    1 +
>  drivers/net/ethernet/airoha/airoha_eth.c      |   51 +-
>  drivers/net/ethernet/airoha/airoha_eth.h      |   69 +
>  drivers/net/ethernet/airoha/airoha_xfrm.c     | 1474 +++++++++++++++++
>  include/crypto/eip93-ipsec.h                  |  132 ++
>  include/linux/netdevice.h                     |    3 +
>  include/net/xfrm.h                            |    8 +-
>  net/ipv4/esp4.c                               |    6 +-
>  net/ipv4/esp4_offload.c                       |   29 +-
>  net/ipv6/esp6.c                               |    6 +-
>  net/ipv6/esp6_offload.c                       |   29 +-
>  18 files changed, 3324 insertions(+), 27 deletions(-)
>  create mode 100644 drivers/crypto/inside-secure/eip93/eip93-ipsec.c
>  create mode 100644 drivers/net/ethernet/airoha/airoha_xfrm.c
>  create mode 100644 include/crypto/eip93-ipsec.h
> 

One note I should have included in the cover letter:

The hardware behavior used by this series was studied from the out-of-tree
IPsec branch of the mtk-eip93 driver:

  https://github.com/vschagen/mtk-eip93/tree/ipsec

That code was useful for understanding the EIP93 packet-mode ESP descriptor
programming and SA record values.

This series is not a direct import of that driver. The implementation was
rewritten around the current upstream driver layout and the Linux XFRM
netdev offload model, with EIP93 exposed as a packet-mode ESP backend used
by the Airoha netdev driver.


Sincerely,
Jihong Min


^ permalink raw reply

* Re: [PATCH v2] net: stmmac: fix RX DMA leak on TX alloc failure
From: Abid Ali @ 2026-05-23 12:17 UTC (permalink / raw)
  To: devnull+dev.taqnialabs.gmail.com
  Cc: alexandre.torgue, andrew+netdev, davem, dev.taqnialabs, edumazet,
	kuba, linux-arm-kernel, linux-kernel, linux-stm32,
	mcoquelin.stm32, netdev, pabeni
In-Reply-To: <20260522-stmmac-rx-desc-cleanup-v2-1-76e78eb471e1@gmail.com>

> 	ret = alloc_dma_tx_desc_resources(priv, dma_conf);
>+	if (ret)
>+		free_dma_rx_desc_resources(priv, dma_conf);
>
> 	return ret;
> }

The sashiko-gemini analysis [1] flagged two issues.

1) Double-free via XDP path:

stmmac_xdp_set_prog() ignores the return of stmmac_xdp_open(), so
if alloc_dma_tx_desc_resources() fails inside that path,
rx_q->buf_pool and rx_q->dma_rx are freed for Rx queues.

The interface stays UP, so a later stmmac_release() calls
free_dma_desc_resources() on the same freed pointers.

Without this patch, the same failure path leaks RX resources
instead. Either way the root cause seems to be stmmac_xdp_set_prog() not
handling errors from stmmac_xdp_open().

The reported issue seems to be valid, but I'm not sure why XDP doesn't handle
a possible error in reinit in the first place.

2) NULL deref on partial queue alloc:

If alloc_dma_rx_desc_resources() fails for queue N,
e.g. rx_q->page_pool = page_pool_create() fails, buf_pool is NULL.
The cleanup free_dma_rx_desc_resources() iterates through all
queues and will hit a NULL pointer deref in:

static void stmmac_free_rx_buffer(struct stmmac_priv *priv,
				  struct stmmac_rx_queue *rx_q,
				  int i)
{
	struct stmmac_rx_buffer *buf = &rx_q->buf_pool[i];

The same could happen without the patch, and similar risk exists for
rx_q->buf_pool, rx_q->dma_rx, and rx_q->dma_erx which are all freed
without guards in __free_dma_rx_desc_resources().

I can add the necessary NULL guards in __free_dma_rx_desc_resources()
for V3 if necessary.
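
To make the idea concrete, here is a rough userspace sketch of the kind
of guard I have in mind. It is not stmmac code: struct rx_queue_model
and both helpers are hypothetical, and plain malloc/free stand in for
the page pool and DMA allocations. Each resource is freed only if it was
allocated, and the pointer is cleared afterwards so a second cleanup
pass over the same queue is harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for one RX queue's resources. */
struct rx_queue_model {
	int *buf_pool;	/* per-descriptor buffer bookkeeping */
	int *dma_rx;	/* descriptor ring */
};

/* Free whatever this queue owns; return how many resources were freed. */
static int rx_queue_model_free(struct rx_queue_model *q)
{
	int freed = 0;

	if (q->buf_pool) {		/* skip if allocation never happened */
		free(q->buf_pool);
		q->buf_pool = NULL;	/* makes a repeated call a no-op */
		freed++;
	}
	if (q->dma_rx) {
		free(q->dma_rx);
		q->dma_rx = NULL;
		freed++;
	}
	return freed;
}

/* Walk every queue, including ones that were only partially set up. */
static int rx_queues_model_free_all(struct rx_queue_model *qs, size_t n)
{
	int freed = 0;

	for (size_t i = 0; i < n; i++)
		freed += rx_queue_model_free(&qs[i]);
	return freed;
}
```

In the real driver the guard matters more than it does here, because the
free path indexes into buf_pool rather than just passing it to free().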

[1] https://sashiko.dev/#/patchset/20260522-stmmac-rx-desc-cleanup-v2-1-76e78eb471e1@gmail.com

- Abid


^ permalink raw reply

* [PATCH 3/3] net: airoha: add EIP93-backed ESP XFRM offload
From: Jihong Min @ 2026-05-23 12:15 UTC (permalink / raw)
  To: Christian Marangi, Antoine Tenart, Herbert Xu, David S . Miller,
	Lorenzo Bianconi, Andrew Lunn, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Steffen Klassert
  Cc: linux-kernel, linux-crypto, linux-arm-kernel, linux-mediatek,
	netdev, Jihong Min
In-Reply-To: <20260523121522.3023992-1-hurryman2212@gmail.com>

Wire Airoha GDM netdevs and DSA user ports to the EIP93 ESP packet
backend through xfrmdev_ops.

Gate netdev feature advertisement on backend capability, add TX and RX
submit paths, preserve opt-out builds, and handle SA lifetime across
feature changes, DSA detach, and EIP93 provider loss.

Assisted-by: Codex:gpt-5.5
Signed-off-by: Jihong Min <hurryman2212@gmail.com>
---
 drivers/net/ethernet/airoha/Kconfig       |   11 +
 drivers/net/ethernet/airoha/Makefile      |    1 +
 drivers/net/ethernet/airoha/airoha_eth.c  |   51 +-
 drivers/net/ethernet/airoha/airoha_eth.h  |   69 +
 drivers/net/ethernet/airoha/airoha_xfrm.c | 1474 +++++++++++++++++++++
 5 files changed, 1605 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/airoha/airoha_xfrm.c

diff --git a/drivers/net/ethernet/airoha/Kconfig b/drivers/net/ethernet/airoha/Kconfig
index ad3ce501e7a5..302534c89fdd 100644
--- a/drivers/net/ethernet/airoha/Kconfig
+++ b/drivers/net/ethernet/airoha/Kconfig
@@ -31,4 +31,15 @@ config NET_AIROHA_FLOW_STATS
 	help
 	  Enable Aiorha flowtable statistic counters.
 
+config NET_AIROHA_XFRM
+	bool "Airoha ESP XFRM offload support"
+	depends on NET_AIROHA
+	default y
+	help
+	  Enable ESP XFRM offload support for Airoha Ethernet netdevs.
+
+	  If unsure, say Y. Say N to opt out of advertising ESP hardware
+	  offload from the Airoha Ethernet driver even when the EIP93 IPsec
+	  packet backend and XFRM offload support are available.
+
 endif #NET_VENDOR_AIROHA
diff --git a/drivers/net/ethernet/airoha/Makefile b/drivers/net/ethernet/airoha/Makefile
index 94468053e34b..15386665bb27 100644
--- a/drivers/net/ethernet/airoha/Makefile
+++ b/drivers/net/ethernet/airoha/Makefile
@@ -5,5 +5,6 @@
 
 obj-$(CONFIG_NET_AIROHA) += airoha-eth.o
 airoha-eth-y := airoha_eth.o airoha_ppe.o
+airoha-eth-$(CONFIG_NET_AIROHA_XFRM) += airoha_xfrm.o
 airoha-eth-$(CONFIG_DEBUG_FS) += airoha_ppe_debugfs.o
 obj-$(CONFIG_NET_AIROHA_NPU) += airoha_npu.o
diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index cecd66251dba..877002c03738 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -684,6 +684,14 @@ static int airoha_qdma_rx_process(struct airoha_queue *q, int budget)
 					     false);
 
 		done++;
+#if IS_ENABLED(CONFIG_NET_AIROHA_XFRM)
+		if (airoha_xfrm_in_active(port) &&
+		    airoha_xfrm_rx_skb(port, q->skb)) {
+			q->skb = NULL;
+			continue;
+		}
+#endif
+
 		napi_gro_receive(&q->napi, q->skb);
 		q->skb = NULL;
 		continue;
@@ -2010,6 +2018,19 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
 	void *data;
 	u16 index;
 	u8 fport;
+#if IS_ENABLED(CONFIG_NET_AIROHA_XFRM)
+	int err;
+
+	if (airoha_xfrm_out_active(port)) {
+		err = airoha_xfrm_encrypt_skb(port, skb);
+		if (err == -EINPROGRESS)
+			return NETDEV_TX_OK;
+		if (err == -EBUSY)
+			return NETDEV_TX_BUSY;
+		if (err)
+			goto error;
+	}
+#endif
 
 	qid = airoha_qdma_get_txq(qdma, skb_get_queue_mapping(skb));
 	tag = airoha_get_dsa_tag(skb, dev);
@@ -2895,6 +2916,8 @@ static const struct net_device_ops airoha_netdev_ops = {
 	.ndo_stop		= airoha_dev_stop,
 	.ndo_change_mtu		= airoha_dev_change_mtu,
 	.ndo_select_queue	= airoha_dev_select_queue,
+	.ndo_fix_features	= airoha_xfrm_fix_features,
+	.ndo_set_features	= airoha_xfrm_set_features,
 	.ndo_start_xmit		= airoha_dev_xmit,
 	.ndo_get_stats64        = airoha_dev_get_stats64,
 	.ndo_set_mac_address	= airoha_dev_set_macaddr,
@@ -3025,6 +3048,7 @@ static int airoha_alloc_gdm_port(struct airoha_eth *eth,
 	/* XXX: Read nbq from DTS */
 	port->nbq = id == AIROHA_GDM3_IDX && airoha_is_7581(eth) ? 4 : 0;
 	eth->ports[p] = port;
+	airoha_xfrm_build_netdev(dev);
 
 	return airoha_metadata_dst_alloc(port);
 }
@@ -3155,6 +3179,7 @@ static int airoha_probe(struct platform_device *pdev)
 
 		if (port->dev->reg_state == NETREG_REGISTERED)
 			unregister_netdev(port->dev);
+		airoha_xfrm_teardown_netdev(port->dev);
 		airoha_metadata_dst_free(port);
 	}
 	airoha_hw_cleanup(eth);
@@ -3180,6 +3205,7 @@ static void airoha_remove(struct platform_device *pdev)
 			continue;
 
 		unregister_netdev(port->dev);
+		airoha_xfrm_teardown_netdev(port->dev);
 		airoha_metadata_dst_free(port);
 	}
 	airoha_hw_cleanup(eth);
@@ -3328,7 +3354,30 @@ static struct platform_driver airoha_driver = {
 		.of_match_table = of_airoha_match,
 	},
 };
-module_platform_driver(airoha_driver);
+
+static int __init airoha_init(void)
+{
+	int err;
+
+	err = airoha_xfrm_register_notifier();
+	if (err)
+		return err;
+
+	err = platform_driver_register(&airoha_driver);
+	if (err)
+		airoha_xfrm_unregister_notifier();
+
+	return err;
+}
+
+static void __exit airoha_exit(void)
+{
+	platform_driver_unregister(&airoha_driver);
+	airoha_xfrm_unregister_notifier();
+}
+
+module_init(airoha_init);
+module_exit(airoha_exit);
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Lorenzo Bianconi <lorenzo@kernel.org>");
diff --git a/drivers/net/ethernet/airoha/airoha_eth.h b/drivers/net/ethernet/airoha/airoha_eth.h
index 4fad3acc3ccf..4fe04c763271 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.h
+++ b/drivers/net/ethernet/airoha/airoha_eth.h
@@ -11,6 +11,8 @@
 #include <linux/etherdevice.h>
 #include <linux/iopoll.h>
 #include <linux/kernel.h>
+#include <linux/kconfig.h>
+#include <linux/jump_label.h>
 #include <linux/netdevice.h>
 #include <linux/reset.h>
 #include <linux/soc/airoha/airoha_offload.h>
@@ -533,6 +535,12 @@ struct airoha_qdma {
 	struct airoha_queue q_rx[AIROHA_NUM_RX_RING];
 };
 
+#if IS_ENABLED(CONFIG_NET_AIROHA_XFRM)
+struct eip93_ipsec;
+DECLARE_STATIC_KEY_FALSE(airoha_xfrm_in_state_key);
+DECLARE_STATIC_KEY_FALSE(airoha_xfrm_out_state_key);
+#endif
+
 struct airoha_gdm_port {
 	struct airoha_qdma *qdma;
 	struct airoha_eth *eth;
@@ -549,6 +557,13 @@ struct airoha_gdm_port {
 	u64 fwd_tx_packets;
 
 	struct metadata_dst *dsa_meta[AIROHA_MAX_DSA_PORTS];
+
+#if IS_ENABLED(CONFIG_NET_AIROHA_XFRM)
+	struct eip93_ipsec *xfrm_ipsec;
+	atomic_t xfrm_state_count;
+	atomic_t xfrm_out_state_count;
+	atomic_t xfrm_in_state_count;
+#endif
 };
 
 #define AIROHA_RXD4_PPE_CPU_REASON	GENMASK(20, 16)
@@ -683,4 +698,58 @@ static inline int airoha_ppe_debugfs_init(struct airoha_ppe *ppe)
 }
 #endif
 
+#if IS_ENABLED(CONFIG_NET_AIROHA_XFRM)
+static inline bool airoha_xfrm_in_active(struct airoha_gdm_port *port)
+{
+	return static_branch_unlikely(&airoha_xfrm_in_state_key) &&
+	       atomic_read(&port->xfrm_in_state_count);
+}
+
+static inline bool airoha_xfrm_out_active(struct airoha_gdm_port *port)
+{
+	return static_branch_unlikely(&airoha_xfrm_out_state_key) &&
+	       atomic_read(&port->xfrm_out_state_count);
+}
+
+void airoha_xfrm_build_netdev(struct net_device *dev);
+void airoha_xfrm_teardown_netdev(struct net_device *dev);
+netdev_features_t airoha_xfrm_fix_features(struct net_device *dev,
+					   netdev_features_t features);
+int airoha_xfrm_set_features(struct net_device *dev,
+			     netdev_features_t features);
+bool airoha_xfrm_rx_skb(struct airoha_gdm_port *port, struct sk_buff *skb);
+int airoha_xfrm_encrypt_skb(struct airoha_gdm_port *port, struct sk_buff *skb);
+int airoha_xfrm_register_notifier(void);
+void airoha_xfrm_unregister_notifier(void);
+#else
+static inline void airoha_xfrm_build_netdev(struct net_device *dev)
+{
+}
+
+static inline void airoha_xfrm_teardown_netdev(struct net_device *dev)
+{
+}
+
+static inline netdev_features_t
+airoha_xfrm_fix_features(struct net_device *dev, netdev_features_t features)
+{
+	return features;
+}
+
+static inline int airoha_xfrm_set_features(struct net_device *dev,
+					   netdev_features_t features)
+{
+	return 0;
+}
+
+static inline int airoha_xfrm_register_notifier(void)
+{
+	return 0;
+}
+
+static inline void airoha_xfrm_unregister_notifier(void)
+{
+}
+#endif
+
 #endif /* AIROHA_ETH_H */
diff --git a/drivers/net/ethernet/airoha/airoha_xfrm.c b/drivers/net/ethernet/airoha/airoha_xfrm.c
new file mode 100644
index 000000000000..58461954d098
--- /dev/null
+++ b/drivers/net/ethernet/airoha/airoha_xfrm.c
@@ -0,0 +1,1474 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2026 Jihong Min <hurryman2212@gmail.com>
+ */
+#include <crypto/eip93-ipsec.h>
+#include <linux/err.h>
+#include <linux/kmod.h>
+#include <linux/rtnetlink.h>
+#include <linux/slab.h>
+#include <linux/udp.h>
+#include <net/dst_metadata.h>
+#include <net/esp.h>
+#include <net/ip.h>
+#include <net/ip6_checksum.h>
+#include <net/ipv6.h>
+#include <net/net_namespace.h>
+#include <net/xfrm.h>
+
+#include "airoha_eth.h"
+
+#if IS_ENABLED(CONFIG_NET_AIROHA_XFRM)
+DEFINE_STATIC_KEY_FALSE(airoha_xfrm_in_state_key);
+DEFINE_STATIC_KEY_FALSE(airoha_xfrm_out_state_key);
+#endif
+
+#if IS_ENABLED(CONFIG_NET_AIROHA_XFRM) &&            \
+	IS_REACHABLE(CONFIG_CRYPTO_DEV_EIP93) &&     \
+	IS_ENABLED(CONFIG_CRYPTO_DEV_EIP93_IPSEC) && \
+	IS_REACHABLE(CONFIG_INET_ESP) &&             \
+	IS_REACHABLE(CONFIG_INET_ESP_OFFLOAD) &&     \
+	IS_ENABLED(CONFIG_XFRM_OFFLOAD)
+#define AIROHA_XFRM_FEATURES \
+	(NETIF_F_HW_ESP | NETIF_F_HW_ESP_TX_CSUM | NETIF_F_GSO_ESP)
+
+struct airoha_xfrm_state {
+	struct airoha_gdm_port *port;
+	struct eip93_ipsec_sa *sa;
+};
+
+static netdev_features_t airoha_xfrm_ipsec_features(struct eip93_ipsec *ipsec)
+{
+	netdev_features_t features = 0;
+	u32 ipsec_features;
+
+	ipsec_features = eip93_ipsec_features(ipsec);
+	if (ipsec_features & EIP93_IPSEC_FEATURE_ESP)
+		features |= NETIF_F_HW_ESP;
+	if (ipsec_features & EIP93_IPSEC_FEATURE_HW_ESP_TX_CSUM)
+		features |= NETIF_F_HW_ESP_TX_CSUM;
+	if (ipsec_features & EIP93_IPSEC_FEATURE_GSO_ESP)
+		features |= NETIF_F_GSO_ESP;
+
+	return features;
+}
+
+static int airoha_xfrm_request_module(struct net_device *dev,
+				      const char *module_name)
+{
+	int err;
+
+	err = request_module("%s", module_name);
+	if (err) {
+		netdev_err(dev, "failed requesting module %s: %d\n",
+			   module_name, err);
+		return err < 0 ? err : -ENOENT;
+	}
+
+	return 0;
+}
+
+static int airoha_xfrm_request_modules(struct net_device *dev)
+{
+	int err;
+
+	if (IS_MODULE(CONFIG_INET_ESP)) {
+		err = airoha_xfrm_request_module(dev, "esp4");
+		if (err)
+			return err;
+	}
+
+	if (IS_MODULE(CONFIG_INET_ESP_OFFLOAD)) {
+		err = airoha_xfrm_request_module(dev, "esp4_offload");
+		if (err)
+			return err;
+	}
+
+#if IS_REACHABLE(CONFIG_INET6_ESP)
+	if (IS_MODULE(CONFIG_INET6_ESP)) {
+		err = airoha_xfrm_request_module(dev, "esp6");
+		if (err)
+			return err;
+	}
+#endif
+
+#if IS_REACHABLE(CONFIG_INET6_ESP_OFFLOAD)
+	if (IS_MODULE(CONFIG_INET6_ESP_OFFLOAD)) {
+		err = airoha_xfrm_request_module(dev, "esp6_offload");
+		if (err)
+			return err;
+	}
+#endif
+
+	if (IS_MODULE(CONFIG_CRYPTO_DEV_EIP93)) {
+		err = airoha_xfrm_request_module(dev, "crypto-hw-eip93");
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int airoha_xfrm_prepare_ipsec(struct net_device *dev)
+{
+	struct airoha_gdm_port *port = netdev_priv(dev);
+	struct eip93_ipsec *ipsec;
+	int err;
+
+	if (port->xfrm_ipsec)
+		return eip93_ipsec_available(port->xfrm_ipsec) ? 0 : -ENODEV;
+
+	err = airoha_xfrm_request_modules(dev);
+	if (err)
+		return err;
+
+	ipsec = eip93_ipsec_get(port->eth->dev);
+	if (IS_ERR(ipsec)) {
+		netdev_dbg(dev,
+			   "EIP93 ESP packet backend is unavailable: %ld\n",
+			   PTR_ERR(ipsec));
+		return PTR_ERR(ipsec);
+	}
+
+	port->xfrm_ipsec = ipsec;
+	netdev_info(dev, "ESP HW offload available via EIP93 packet backend\n");
+
+	return 0;
+}
+
+static bool airoha_xfrm_state_supported(struct xfrm_state *x,
+					struct netlink_ext_ack *extack)
+{
+	if (x->xso.type != XFRM_DEV_OFFLOAD_CRYPTO) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "only XFRM crypto offload is supported");
+		return false;
+	}
+
+	switch (x->xso.dir) {
+	case XFRM_DEV_OFFLOAD_OUT:
+	case XFRM_DEV_OFFLOAD_IN:
+		break;
+	default:
+		NL_SET_ERR_MSG_MOD(extack, "only in/out SAs are supported");
+		return false;
+	}
+
+	switch (x->props.family) {
+	case AF_INET:
+		break;
+#if IS_REACHABLE(CONFIG_INET6_ESP) && IS_REACHABLE(CONFIG_INET6_ESP_OFFLOAD)
+	case AF_INET6:
+		break;
+#endif
+	default:
+		NL_SET_ERR_MSG_MOD(extack,
+				   "only IPv4/IPv6 ESP offload is supported");
+		return false;
+	}
+
+	if (x->outer_mode.family != x->props.family) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "only same-family ESP offload is supported");
+		return false;
+	}
+
+	if (x->id.proto != IPPROTO_ESP) {
+		NL_SET_ERR_MSG_MOD(extack, "only ESP offload is supported");
+		return false;
+	}
+
+	switch (x->props.mode) {
+	case XFRM_MODE_TUNNEL:
+	case XFRM_MODE_TRANSPORT:
+		break;
+	default:
+		NL_SET_ERR_MSG_MOD(extack,
+				   "only tunnel/transport modes are supported");
+		return false;
+	}
+
+	if (x->outer_mode.encap != x->props.mode) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "outer ESP mode does not match state mode");
+		return false;
+	}
+
+	if (x->encap) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "NAT-T is unsupported by EIP93 packet ESP");
+		return false;
+	}
+
+	if (x->tfcpad) {
+		NL_SET_ERR_MSG_MOD(extack, "TFC padding is not supported");
+		return false;
+	}
+
+	if (x->aead) {
+		NL_SET_ERR_MSG_MOD(extack, "AEAD SAs are unsupported");
+		return false;
+	}
+
+	if (!x->ealg || !x->aalg) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "encryption/authentication required");
+		return false;
+	}
+
+	return true;
+}
+
+static const struct xfrmdev_ops airoha_xfrmdev_ops;
+
+#if IS_ENABLED(CONFIG_NET_DSA)
+static struct airoha_gdm_port *airoha_xfrm_dsa_dev_port(struct net_device *dev)
+{
+	struct net_device *conduit;
+	struct dsa_port *dp;
+
+	if (!dsa_user_dev_check(dev))
+		return NULL;
+
+	dp = dsa_port_from_netdev(dev);
+	if (IS_ERR(dp))
+		return NULL;
+
+	conduit = dsa_port_to_conduit(dp);
+	if (!conduit || conduit->xfrmdev_ops != &airoha_xfrmdev_ops)
+		return NULL;
+
+	return netdev_priv(conduit);
+}
+
+static struct net_device *airoha_xfrm_dsa_rx_dev(struct airoha_gdm_port *port,
+						 struct sk_buff *skb)
+{
+	struct metadata_dst *md_dst = skb_metadata_dst(skb);
+	struct dsa_port *cpu_dp = port->dev->dsa_ptr;
+	struct dsa_port *dp;
+	u32 source_port;
+
+	if (!md_dst || md_dst->type != METADATA_HW_PORT_MUX)
+		return port->dev;
+
+	if (!cpu_dp || !cpu_dp->dst)
+		return NULL;
+
+	source_port = md_dst->u.port_info.port_id;
+	list_for_each_entry(dp, &cpu_dp->dst->ports, list) {
+		if (dp->type != DSA_PORT_TYPE_USER ||
+		    dp->index != source_port || dp->cpu_dp != cpu_dp ||
+		    dsa_port_to_conduit(dp) != port->dev || !dp->user)
+			continue;
+
+		return dp->user;
+	}
+
+	return NULL;
+}
+
+static bool airoha_xfrm_dsa_user_matches_port(struct net_device *user,
+					      struct net_device *conduit)
+{
+	struct dsa_port *dp;
+
+	if (!dsa_user_dev_check(user))
+		return false;
+
+	dp = dsa_port_from_netdev(user);
+	if (IS_ERR(dp))
+		return false;
+
+	return dsa_port_to_conduit(dp) == conduit;
+}
+#else
+static struct airoha_gdm_port *airoha_xfrm_dsa_dev_port(struct net_device *dev)
+{
+	return NULL;
+}
+
+static struct net_device *airoha_xfrm_dsa_rx_dev(struct airoha_gdm_port *port,
+						 struct sk_buff *skb)
+{
+	return port->dev;
+}
+#endif
+
+static struct airoha_gdm_port *airoha_xfrm_dev_port(struct net_device *dev)
+{
+	struct airoha_gdm_port *port;
+
+	if (dev->xfrmdev_ops != &airoha_xfrmdev_ops)
+		return NULL;
+
+	port = airoha_xfrm_dsa_dev_port(dev);
+	if (port)
+		return port;
+
+	return netdev_priv(dev);
+}
+
+static netdev_features_t airoha_xfrm_dev_features(struct net_device *dev)
+{
+	struct airoha_gdm_port *port = airoha_xfrm_dev_port(dev);
+
+	if (!port || !port->xfrm_ipsec)
+		return 0;
+
+	return airoha_xfrm_ipsec_features(port->xfrm_ipsec);
+}
+
+static struct net_device *airoha_xfrm_rx_dev(struct airoha_gdm_port *port,
+					     struct sk_buff *skb)
+{
+	if (!netdev_uses_dsa(port->dev))
+		return port->dev;
+
+	return airoha_xfrm_dsa_rx_dev(port, skb);
+}
+
+static void airoha_xfrm_state_advance_esn(struct xfrm_state *x)
+{
+	struct airoha_xfrm_state *state;
+
+	state = (struct airoha_xfrm_state *)x->xso.offload_handle;
+	if (state)
+		eip93_ipsec_state_advance_esn(state->sa, x);
+}
+
+static int airoha_xfrm_state_add(struct net_device *dev, struct xfrm_state *x,
+				 struct netlink_ext_ack *extack)
+{
+	struct airoha_gdm_port *port = airoha_xfrm_dev_port(dev);
+	struct airoha_xfrm_state *state;
+	int err;
+
+	if (!port) {
+		NL_SET_ERR_MSG_MOD(extack, "device lacks Airoha ESP offload");
+		return -EOPNOTSUPP;
+	}
+
+	if (!(dev->features & NETIF_F_HW_ESP)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "ESP HW offload is disabled on device");
+		return -EOPNOTSUPP;
+	}
+
+	if (!port->xfrm_ipsec || !eip93_ipsec_available(port->xfrm_ipsec)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "EIP93 packet backend is unavailable");
+		return -EOPNOTSUPP;
+	}
+
+	if (!airoha_xfrm_state_supported(x, extack))
+		return -EOPNOTSUPP;
+
+	state = kzalloc(sizeof(*state), GFP_KERNEL);
+	if (!state)
+		return -ENOMEM;
+
+	state->port = port;
+	err = eip93_ipsec_state_add(port->xfrm_ipsec, x, extack, &state->sa);
+	if (err) {
+		kfree(state);
+		return err;
+	}
+
+	x->xso.offload_handle = (unsigned long)state;
+	atomic_inc(&port->xfrm_state_count);
+	if (x->xso.dir == XFRM_DEV_OFFLOAD_OUT) {
+		atomic_inc(&port->xfrm_out_state_count);
+		static_branch_inc(&airoha_xfrm_out_state_key);
+	} else {
+		atomic_inc(&port->xfrm_in_state_count);
+		static_branch_inc(&airoha_xfrm_in_state_key);
+	}
+
+	return 0;
+}
+
+static void airoha_xfrm_state_delete(struct net_device *dev,
+				     struct xfrm_state *x)
+{
+	struct airoha_xfrm_state *state;
+	struct airoha_gdm_port *port;
+
+	state = (struct airoha_xfrm_state *)x->xso.offload_handle;
+	if (!state)
+		return;
+
+	port = state->port;
+	x->xso.offload_handle = 0;
+	atomic_dec(&port->xfrm_state_count);
+	if (x->xso.dir == XFRM_DEV_OFFLOAD_OUT) {
+		atomic_dec(&port->xfrm_out_state_count);
+		static_branch_dec(&airoha_xfrm_out_state_key);
+	} else if (x->xso.dir == XFRM_DEV_OFFLOAD_IN) {
+		atomic_dec(&port->xfrm_in_state_count);
+		static_branch_dec(&airoha_xfrm_in_state_key);
+	}
+
+	eip93_ipsec_state_delete(state->sa);
+	kfree(state);
+}
+
+static bool airoha_xfrm_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
+{
+	struct net_device *dev = skb->dev;
+	struct airoha_xfrm_state *state;
+	struct airoha_gdm_port *port;
+
+	if (!dev)
+		return false;
+
+	port = airoha_xfrm_dev_port(dev);
+	if (!port)
+		return false;
+
+	if (unlikely(x->xso.dir != XFRM_DEV_OFFLOAD_OUT ||
+		     x->xso.type != XFRM_DEV_OFFLOAD_CRYPTO ||
+		     !(dev->features & NETIF_F_HW_ESP) || x->xso.dev != dev))
+		return false;
+
+	state = (struct airoha_xfrm_state *)x->xso.offload_handle;
+	if (!state || state->port != port)
+		return false;
+
+	if (unlikely(skb_is_gso(skb)))
+		return false;
+
+	return true;
+}
+
+/*
+ * EIP93 packet-out mode creates ESP padding, trailer and ICV. The generic ESP
+ * xmit path should reserve tailroom only for plain, non-GSO ESP packets.
+ */
+static bool airoha_xfrm_esp_tx_hw_trailer(struct sk_buff *skb,
+					  struct xfrm_state *x)
+{
+	return x->xso.dir == XFRM_DEV_OFFLOAD_OUT &&
+	       x->xso.type == XFRM_DEV_OFFLOAD_CRYPTO && !x->encap &&
+	       !skb_is_gso(skb);
+}
+
+static const struct xfrmdev_ops airoha_xfrmdev_ops = {
+	.xdo_dev_state_add = airoha_xfrm_state_add,
+	.xdo_dev_state_delete = airoha_xfrm_state_delete,
+	.xdo_dev_state_free = airoha_xfrm_state_delete,
+	.xdo_dev_offload_ok = airoha_xfrm_offload_ok,
+	.xdo_dev_esp_tx_hw_trailer = airoha_xfrm_esp_tx_hw_trailer,
+	.xdo_dev_state_advance_esn = airoha_xfrm_state_advance_esn,
+};
+
+void airoha_xfrm_build_netdev(struct net_device *dev)
+{
+	struct airoha_gdm_port *port = netdev_priv(dev);
+	netdev_features_t features;
+
+	atomic_set(&port->xfrm_state_count, 0);
+	atomic_set(&port->xfrm_out_state_count, 0);
+	atomic_set(&port->xfrm_in_state_count, 0);
+	if (airoha_xfrm_prepare_ipsec(dev))
+		return;
+
+	features = airoha_xfrm_ipsec_features(port->xfrm_ipsec);
+	if (!(features & NETIF_F_HW_ESP)) {
+		eip93_ipsec_put(port->xfrm_ipsec);
+		port->xfrm_ipsec = NULL;
+		return;
+	}
+
+	dev->xfrmdev_ops = &airoha_xfrmdev_ops;
+	dev->hw_features |= features;
+	dev->hw_enc_features |= features;
+	dev->gso_partial_features |= features & NETIF_F_GSO_ESP;
+}
+
+void airoha_xfrm_teardown_netdev(struct net_device *dev)
+{
+	struct airoha_gdm_port *port = netdev_priv(dev);
+
+	if (port->xfrm_ipsec) {
+		eip93_ipsec_put(port->xfrm_ipsec);
+		port->xfrm_ipsec = NULL;
+	}
+}
+
+/* Airoha TX checksum/GSO offloads run after EIP93 has encrypted the skb, so
+ * they cannot operate on plaintext ESP payloads or build per-segment ESP data.
+ */
+netdev_features_t airoha_xfrm_fix_features(struct net_device *dev,
+					   netdev_features_t features)
+{
+	netdev_features_t supported = airoha_xfrm_dev_features(dev);
+	netdev_features_t unsupported = AIROHA_XFRM_FEATURES & ~supported;
+
+	if (features & unsupported)
+		features &= ~unsupported;
+
+	if (!(features & NETIF_F_HW_ESP))
+		features &= ~(NETIF_F_HW_ESP_TX_CSUM | NETIF_F_GSO_ESP);
+
+	return features;
+}
+
+int airoha_xfrm_set_features(struct net_device *dev, netdev_features_t features)
+{
+	netdev_features_t changed = (dev->features ^ features) &
+				    AIROHA_XFRM_FEATURES;
+	netdev_features_t requested = features & AIROHA_XFRM_FEATURES;
+	struct airoha_gdm_port *port = netdev_priv(dev);
+	netdev_features_t supported;
+	int err;
+
+	if (!changed)
+		return 0;
+
+	if (requested & NETIF_F_HW_ESP) {
+		err = airoha_xfrm_prepare_ipsec(dev);
+		if (err)
+			return err;
+	}
+
+	supported = airoha_xfrm_dev_features(dev);
+	if (requested & ~supported)
+		return -EOPNOTSUPP;
+
+	if (atomic_read(&port->xfrm_state_count)) {
+		netdev_err(dev, "cannot change ESP features with active SAs\n");
+		return -EBUSY;
+	}
+
+	if (!(features & NETIF_F_HW_ESP))
+		netdev_info(dev, "ESP HW offload disabled\n");
+
+	return 0;
+}
+
+struct airoha_xfrm_rx_info {
+	unsigned short family;
+	int encap_type;
+	int esp_offset;
+	int packet_len;
+	__be32 spi;
+	__be32 seq;
+};
+
+struct airoha_xfrm_rx_ctx {
+	struct sk_buff *skb;
+	struct net_device *dev;
+};
+
+static bool airoha_xfrm_parse_rx_ipv4(struct sk_buff *skb,
+				      struct airoha_xfrm_rx_info *info)
+{
+	struct ip_esp_hdr *esph;
+	struct iphdr *iph;
+	int packet_len;
+	int iphlen;
+
+	if (!pskb_may_pull(skb, sizeof(*iph)))
+		return false;
+
+	iph = ip_hdr(skb);
+	if (iph->version != 4)
+		return false;
+
+	iphlen = iph->ihl * 4;
+	if (iphlen < sizeof(*iph) || !pskb_may_pull(skb, iphlen))
+		return false;
+
+	if (ip_is_fragment(iph))
+		return false;
+
+	packet_len = ntohs(iph->tot_len);
+	if (packet_len < iphlen || packet_len > skb->len)
+		return false;
+
+	switch (iph->protocol) {
+	case IPPROTO_ESP:
+		info->encap_type = 0;
+		info->esp_offset = iphlen;
+		info->packet_len = packet_len;
+		break;
+	case IPPROTO_UDP: {
+		struct udphdr *uh;
+		int udp_len;
+		__be32 marker;
+
+		if (!pskb_may_pull(skb, iphlen + sizeof(*uh) + sizeof(*esph)))
+			return false;
+
+		uh = (struct udphdr *)(skb->data + iphlen);
+		udp_len = ntohs(uh->len);
+		if (udp_len <= sizeof(*uh) + sizeof(*esph) ||
+		    iphlen + udp_len > packet_len)
+			return false;
+
+		memcpy(&marker, skb->data + iphlen + sizeof(*uh),
+		       sizeof(marker));
+		if (!marker)
+			return false;
+
+		info->encap_type = UDP_ENCAP_ESPINUDP;
+		info->esp_offset = iphlen + sizeof(*uh);
+		info->packet_len = iphlen + udp_len;
+		break;
+	}
+	default:
+		return false;
+	}
+
+	if (info->esp_offset + sizeof(*esph) > info->packet_len ||
+	    !pskb_may_pull(skb, info->esp_offset + sizeof(*esph)))
+		return false;
+
+	esph = (struct ip_esp_hdr *)(skb->data + info->esp_offset);
+	info->family = AF_INET;
+	info->spi = esph->spi;
+	info->seq = esph->seq_no;
+
+	return !!info->spi;
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static bool airoha_xfrm_parse_rx_ipv6(struct sk_buff *skb,
+				      struct airoha_xfrm_rx_info *info)
+{
+	struct ip_esp_hdr *esph;
+	struct ipv6hdr *ip6h;
+	__be16 frag_off;
+	int packet_len;
+	int offset;
+	u8 nexthdr;
+
+	if (!pskb_may_pull(skb, sizeof(*ip6h)))
+		return false;
+
+	ip6h = ipv6_hdr(skb);
+	if (ip6h->version != 6)
+		return false;
+
+	if (!ip6h->payload_len)
+		return false;
+
+	packet_len = sizeof(*ip6h) + ntohs(ip6h->payload_len);
+	if (packet_len < sizeof(*ip6h) || packet_len > skb->len)
+		return false;
+
+	nexthdr = ip6h->nexthdr;
+	offset = ipv6_skip_exthdr(skb, sizeof(*ip6h), &nexthdr, &frag_off);
+	if (offset < 0 || frag_off)
+		return false;
+
+	switch (nexthdr) {
+	case NEXTHDR_ESP:
+		info->encap_type = 0;
+		info->esp_offset = offset;
+		info->packet_len = packet_len;
+		break;
+	case NEXTHDR_UDP: {
+		struct udphdr *uh;
+		int udp_len;
+		__be32 marker;
+
+		if (!pskb_may_pull(skb, offset + sizeof(*uh) + sizeof(*esph)))
+			return false;
+
+		uh = (struct udphdr *)(skb->data + offset);
+		udp_len = ntohs(uh->len);
+		if (udp_len <= sizeof(*uh) + sizeof(*esph) ||
+		    offset + udp_len > packet_len)
+			return false;
+
+		memcpy(&marker, skb->data + offset + sizeof(*uh),
+		       sizeof(marker));
+		if (!marker)
+			return false;
+
+		info->encap_type = UDP_ENCAP_ESPINUDP;
+		info->esp_offset = offset + sizeof(*uh);
+		info->packet_len = offset + udp_len;
+		break;
+	}
+	default:
+		return false;
+	}
+
+	if (info->esp_offset + sizeof(*esph) > info->packet_len ||
+	    !pskb_may_pull(skb, info->esp_offset + sizeof(*esph)))
+		return false;
+
+	esph = (struct ip_esp_hdr *)(skb->data + info->esp_offset);
+	info->family = AF_INET6;
+	info->spi = esph->spi;
+	info->seq = esph->seq_no;
+
+	return !!info->spi;
+}
+#else
+static bool airoha_xfrm_parse_rx_ipv6(struct sk_buff *skb,
+				      struct airoha_xfrm_rx_info *info)
+{
+	return false;
+}
+#endif
+
+static bool airoha_xfrm_parse_rx_skb(struct sk_buff *skb,
+				     struct airoha_xfrm_rx_info *info)
+{
+	switch (skb->protocol) {
+	case htons(ETH_P_IP):
+		return airoha_xfrm_parse_rx_ipv4(skb, info);
+	case htons(ETH_P_IPV6):
+		return airoha_xfrm_parse_rx_ipv6(skb, info);
+	default:
+		return false;
+	}
+}
+
+static struct xfrm_state *
+airoha_xfrm_rx_state_lookup(struct airoha_gdm_port *port, struct sk_buff *skb,
+			    const struct airoha_xfrm_rx_info *info)
+{
+	struct airoha_xfrm_state *state;
+	xfrm_address_t daddr = {};
+	struct net_device *dev;
+	struct xfrm_state *x;
+
+	dev = airoha_xfrm_rx_dev(port, skb);
+	if (!dev)
+		return NULL;
+
+	switch (info->family) {
+	case AF_INET:
+		daddr.a4 = ip_hdr(skb)->daddr;
+		break;
+	case AF_INET6:
+		daddr.in6 = ipv6_hdr(skb)->daddr;
+		break;
+	default:
+		return NULL;
+	}
+
+	x = xfrm_input_state_lookup(dev_net(dev), skb->mark, &daddr, info->spi,
+				    IPPROTO_ESP, info->family);
+	if (!x)
+		return NULL;
+
+	if (x->dir && x->dir != XFRM_SA_DIR_IN)
+		goto err_put;
+
+	if (x->xso.dir != XFRM_DEV_OFFLOAD_IN ||
+	    x->xso.type != XFRM_DEV_OFFLOAD_CRYPTO || x->xso.dev != dev ||
+	    !(dev->features & NETIF_F_HW_ESP) || !x->type_offload ||
+	    !x->type_offload->input_tail)
+		goto err_put;
+
+	state = (struct airoha_xfrm_state *)x->xso.offload_handle;
+	if (!state || state->port != port)
+		goto err_put;
+
+	if ((x->encap ? x->encap->encap_type : 0) != info->encap_type)
+		goto err_put;
+
+	return x;
+
+err_put:
+	xfrm_state_put(x);
+	return NULL;
+}
+
+static u32 airoha_xfrm_rx_status(int err, struct xfrm_state *x)
+{
+	if (!err)
+		return CRYPTO_SUCCESS;
+
+	if (err == -EBADMSG) {
+		if (x->props.mode == XFRM_MODE_TUNNEL)
+			return CRYPTO_TUNNEL_ESP_AUTH_FAILED;
+
+		return CRYPTO_TRANSPORT_ESP_AUTH_FAILED;
+	}
+
+	if (err == -EINVAL)
+		return CRYPTO_INVALID_PACKET_SYNTAX;
+
+	return CRYPTO_GENERIC_ERROR;
+}
+
+static int airoha_xfrm_rx_apply_result(struct sk_buff *skb,
+				       struct xfrm_state *x,
+				       struct eip93_ipsec_result result)
+{
+	struct xfrm_offload *xo = xfrm_offload(skb);
+
+	if (!x || !result.packet_len || result.packet_len > skb->len || !xo)
+		return -EINVAL;
+
+	/*
+	 * EIP93 inbound ESP mode removes the ESP pad/trailer/ICV and reports
+	 * the decapsulated outer packet length plus the recovered next-header.
+	 */
+	xo->proto = result.nexthdr;
+	xo->flags |= XFRM_ESP_NO_TRAILER;
+	if (pskb_trim(skb, result.packet_len))
+		return -EINVAL;
+
+	if (x->props.family == AF_INET) {
+		ip_hdr(skb)->tot_len = htons(skb->len);
+		ip_send_check(ip_hdr(skb));
+	} else if (x->props.family == AF_INET6) {
+		int len = skb->len - skb_network_offset(skb) -
+			  sizeof(struct ipv6hdr);
+
+		if (len < 0)
+			return -EINVAL;
+
+		ipv6_hdr(skb)->payload_len = len > IPV6_MAXPLEN ? 0 :
+								      htons(len);
+	}
+
+	return 0;
+}
+
+static void airoha_xfrm_rx_free_ctx(struct airoha_xfrm_rx_ctx *ctx)
+{
+	kfree(ctx);
+}
+
+static void airoha_xfrm_rx_finish(void *data, int err,
+				  struct eip93_ipsec_result result)
+{
+	struct airoha_xfrm_rx_ctx *ctx = data;
+	struct net_device *dev = ctx->dev;
+	struct sk_buff *skb = ctx->skb;
+	struct xfrm_offload *xo;
+	struct xfrm_state *x;
+
+	x = xfrm_input_state(skb);
+	xo = xfrm_offload(skb);
+	if (!err)
+		err = airoha_xfrm_rx_apply_result(skb, x, result);
+	if (xo) {
+		xo->flags |= CRYPTO_DONE;
+		xo->status = airoha_xfrm_rx_status(err, x);
+	}
+
+	airoha_xfrm_rx_free_ctx(ctx);
+	netif_receive_skb(skb);
+	dev_put(dev);
+}
+
+static bool airoha_xfrm_tx_esp_offset(struct sk_buff *skb, struct xfrm_state *x,
+				      unsigned int *esp_offset)
+{
+	u8 *esph = (u8 *)ip_esp_hdr(skb);
+
+	if (x->encap)
+		esph += sizeof(struct udphdr);
+
+	if (esph < skb->data ||
+	    esph + sizeof(struct ip_esp_hdr) > skb_tail_pointer(skb))
+		return false;
+
+	*esp_offset = esph - skb->data;
+
+	return true;
+}
+
+static void airoha_xfrm_tx_update_outer_len(struct sk_buff *skb)
+{
+	struct iphdr *iph = ip_hdr(skb);
+
+	if (iph->version == 4) {
+		iph->tot_len = htons(skb->len - skb_network_offset(skb));
+		ip_send_check(iph);
+	} else if (iph->version == 6) {
+		int len = skb->len - skb_network_offset(skb) -
+			  sizeof(struct ipv6hdr);
+
+		if (len < 0)
+			return;
+
+		ipv6_hdr(skb)->payload_len = len > IPV6_MAXPLEN ? 0 :
+								  htons(len);
+	}
+}
+
+static void airoha_xfrm_tx_udp6_csum(struct sk_buff *skb,
+				     struct xfrm_state *x)
+{
+#if IS_ENABLED(CONFIG_IPV6)
+	struct udphdr *uh;
+	struct ipv6hdr *ip6h;
+	unsigned int offset;
+	__wsum csum;
+	int len;
+
+	if (x->props.family != AF_INET6 || !x->encap ||
+	    x->encap->encap_type != UDP_ENCAP_ESPINUDP)
+		return;
+
+	offset = skb_transport_offset(skb);
+	if (offset + sizeof(*uh) > skb->len)
+		return;
+
+	uh = udp_hdr(skb);
+	ip6h = ipv6_hdr(skb);
+	len = ntohs(uh->len);
+	if (len < sizeof(*uh) || len > skb->len - offset)
+		return;
+
+	uh->check = 0;
+	csum = skb_checksum(skb, offset, len, 0);
+	uh->check = csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr, len,
+				    IPPROTO_UDP, csum);
+	if (!uh->check)
+		uh->check = CSUM_MANGLED_0;
+#endif
+}
+
+static int airoha_xfrm_tx_apply_result(struct sk_buff *skb,
+				       struct xfrm_state *x,
+				       struct eip93_ipsec_result result)
+{
+	unsigned int current_esp_len;
+	unsigned int esp_offset;
+	unsigned int new_len;
+
+	if (!result.packet_len ||
+	    !airoha_xfrm_tx_esp_offset(skb, x, &esp_offset))
+		return -EINVAL;
+
+	current_esp_len = skb->len - esp_offset;
+	if (result.packet_len == current_esp_len)
+		return 0;
+
+	new_len = esp_offset + result.packet_len;
+	if (new_len < esp_offset)
+		return -EINVAL;
+
+	/*
+	 * EIP93 outbound ESP mode reports the generated ESP packet length.
+	 * Reflect it in skb->len before the packet resumes into the Ethernet
+	 * TX path, because generic ESP left hardware-generated trailer bytes
+	 * outside skb->len.
+	 */
+	if (new_len > skb->len) {
+		unsigned int delta = new_len - skb->len;
+
+		if (delta > skb_tailroom(skb))
+			return -ENOMEM;
+		skb_put(skb, delta);
+
+		return 0;
+	}
+
+	return pskb_trim(skb, new_len);
+}
+
+bool airoha_xfrm_rx_skb(struct airoha_gdm_port *port, struct sk_buff *skb)
+{
+	struct airoha_xfrm_rx_info info;
+	struct airoha_xfrm_state *state;
+	struct airoha_xfrm_rx_ctx *ctx;
+	struct sk_buff *trailer;
+	struct xfrm_offload *xo;
+	struct xfrm_state *x;
+	struct sec_path *sp;
+	int err;
+	u32 mark = skb->mark;
+
+	if (!airoha_xfrm_parse_rx_skb(skb, &info))
+		return false;
+
+	x = airoha_xfrm_rx_state_lookup(port, skb, &info);
+	if (!x)
+		return false;
+
+	sp = secpath_set(skb);
+	if (!sp)
+		goto err_put_state;
+
+	if (sp->len == XFRM_MAX_DEPTH) {
+		secpath_reset(skb);
+		goto err_put_state;
+	}
+
+	skb->mark = xfrm_smark_get(mark, x);
+	sp->xvec[sp->len++] = x;
+	sp->olen++;
+	XFRM_SKB_CB(skb)->seq.input.low = info.seq;
+	XFRM_SKB_CB(skb)->seq.input.hi = htonl(xfrm_replay_seqhi(x, info.seq));
+	XFRM_SPI_SKB_CB(skb)->family = info.family;
+	XFRM_SPI_SKB_CB(skb)->seq = info.seq;
+	if (info.family == AF_INET) {
+		XFRM_SPI_SKB_CB(skb)->daddroff = offsetof(struct iphdr, daddr);
+		XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4 = NULL;
+	} else {
+		XFRM_SPI_SKB_CB(skb)->daddroff =
+			offsetof(struct ipv6hdr, daddr);
+		XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 = NULL;
+	}
+
+	xo = xfrm_offload(skb);
+	if (!xo)
+		goto err_reset;
+
+	state = (struct airoha_xfrm_state *)x->xso.offload_handle;
+	if (!state || state->port != port)
+		goto err_reset;
+
+	if (skb_cloned(skb) || skb_is_nonlinear(skb)) {
+		err = skb_cow_data(skb, 0, &trailer);
+		if (err < 0)
+			goto err_reset;
+
+		if (skb_is_nonlinear(skb)) {
+			err = skb_linearize(skb);
+			if (err)
+				goto err_reset;
+		}
+	}
+
+	ctx = kmalloc(sizeof(*ctx), GFP_ATOMIC);
+	if (!ctx)
+		goto err_reset;
+
+	if (!skb->dev)
+		goto err_free_ctx;
+
+	ctx->skb = skb;
+	ctx->dev = skb->dev;
+	skb->ip_summed = CHECKSUM_NONE;
+
+	dev_hold(ctx->dev);
+	err = eip93_ipsec_receive(state->sa, skb, info.packet_len,
+				  airoha_xfrm_rx_finish, ctx);
+	if (err == -EINPROGRESS)
+		return true;
+
+	dev_put(ctx->dev);
+	airoha_xfrm_rx_free_ctx(ctx);
+	skb->mark = mark;
+	secpath_reset(skb);
+
+	return false;
+
+err_free_ctx:
+	airoha_xfrm_rx_free_ctx(ctx);
+err_reset:
+	skb->mark = mark;
+	secpath_reset(skb);
+	return false;
+
+err_put_state:
+	xfrm_state_put(x);
+	return false;
+}
+
+static void airoha_xfrm_tx_done(void *data, int err,
+				struct eip93_ipsec_result result)
+{
+	struct sk_buff *skb = data;
+	struct xfrm_offload *xo = xfrm_offload(skb);
+	struct sec_path *sp = skb_sec_path(skb);
+	struct xfrm_state *x;
+
+	if (!xo || !sp || !sp->len) {
+		kfree_skb(skb);
+		return;
+	}
+
+	x = sp->xvec[sp->len - 1];
+	if (!err)
+		err = airoha_xfrm_tx_apply_result(skb, x, result);
+	if (err) {
+		XFRM_INC_STATS(xs_net(x), LINUX_MIB_XFRMOUTSTATEPROTOERROR);
+		kfree_skb(skb);
+		return;
+	}
+
+	airoha_xfrm_tx_update_outer_len(skb);
+	airoha_xfrm_tx_udp6_csum(skb, x);
+	xo->flags |= CRYPTO_DONE;
+	xo->status = CRYPTO_SUCCESS;
+	skb_push(skb, skb->data - skb_mac_header(skb));
+	secpath_reset(skb);
+	xfrm_dev_resume(skb);
+}
+
+int airoha_xfrm_encrypt_skb(struct airoha_gdm_port *port, struct sk_buff *skb)
+{
+	struct xfrm_offload *xo = xfrm_offload(skb);
+	struct airoha_xfrm_state *state;
+	struct net_device *dev;
+	struct xfrm_state *x;
+	struct sec_path *sp;
+	struct ip_esp_hdr *esph;
+	struct sk_buff *trailer;
+	unsigned int esp_offset;
+	unsigned int tailen;
+	int err;
+
+	if (!xo || !(xo->flags & XFRM_XMIT) || (xo->flags & CRYPTO_DONE))
+		return 0;
+
+	sp = skb_sec_path(skb);
+	if (!sp || !sp->len)
+		return -EINVAL;
+
+	x = sp->xvec[sp->len - 1];
+	dev = x->xso.dev;
+	if (unlikely(x->xso.dir != XFRM_DEV_OFFLOAD_OUT ||
+		     x->xso.type != XFRM_DEV_OFFLOAD_CRYPTO || !dev ||
+		     !(dev->features & NETIF_F_HW_ESP)))
+		return -EOPNOTSUPP;
+
+	state = (struct airoha_xfrm_state *)x->xso.offload_handle;
+	if (!state || state->port != port)
+		return -EOPNOTSUPP;
+
+	if (unlikely(skb_is_gso(skb)))
+		return -EOPNOTSUPP;
+
+	if (unlikely(skb->ip_summed == CHECKSUM_PARTIAL)) {
+		err = skb_checksum_help(skb);
+		if (err)
+			return err;
+	}
+
+	tailen = xo->esp_tx_tailen;
+	if (skb_cloned(skb) || skb_is_nonlinear(skb)) {
+		err = skb_cow_data(skb, tailen, &trailer);
+		if (err < 0)
+			return err;
+
+		if (skb_is_nonlinear(skb)) {
+			err = skb_linearize(skb);
+			if (err)
+				return err;
+		}
+	}
+	/*
+	 * Generic ESP reserves this tailroom before the skb reaches us. Keep a
+	 * small guard here because COW/linearization can replace the skb head.
+	 */
+	if (tailen && skb_tailroom(skb) < tailen) {
+		err = pskb_expand_head(skb, 0, tailen - skb_tailroom(skb),
+				       GFP_ATOMIC);
+		if (err)
+			return err;
+	}
+
+	if (!airoha_xfrm_tx_esp_offset(skb, x, &esp_offset))
+		return -EINVAL;
+
+	esph = (struct ip_esp_hdr *)(skb->data + esp_offset);
+	esph->seq_no = htonl(xo->seq.low);
+
+	return eip93_ipsec_xmit(state->sa, skb, esp_offset, airoha_xfrm_tx_done,
+				skb);
+}
+
+static void airoha_xfrm_flush_dev(struct net_device *dev)
+{
+	xfrm_dev_state_flush(dev_net(dev), dev, true);
+	xfrm_dev_policy_flush(dev_net(dev), dev, true);
+}
+
+static void airoha_xfrm_link_change(struct net_device *dev)
+{
+	struct airoha_gdm_port *port = airoha_xfrm_dev_port(dev);
+
+	if (!port || !(dev->hw_features & NETIF_F_HW_ESP) ||
+	    !atomic_read(&port->xfrm_state_count))
+		return;
+
+	netdev_dbg(dev, "carrier %s, preserving ESP HW offload SAs\n",
+		   netif_carrier_ok(dev) ? "up" : "down");
+}
+
+#if IS_ENABLED(CONFIG_NET_DSA)
+static void airoha_xfrm_dsa_attach_user(struct net_device *conduit,
+					struct net_device *user)
+{
+	netdev_features_t features = airoha_xfrm_dev_features(conduit);
+
+	if (conduit->xfrmdev_ops != &airoha_xfrmdev_ops ||
+	    !airoha_xfrm_dsa_user_matches_port(user, conduit))
+		return;
+
+	if (!(features & NETIF_F_HW_ESP))
+		return;
+
+	if (user->xfrmdev_ops && user->xfrmdev_ops != &airoha_xfrmdev_ops) {
+		netdev_dbg(conduit,
+			   "DSA user %s already has XFRM offload ops\n",
+			   user->name);
+		return;
+	}
+
+	user->xfrmdev_ops = &airoha_xfrmdev_ops;
+	user->hw_features |= features;
+	user->hw_enc_features |= features;
+	user->gso_partial_features |= features & NETIF_F_GSO_ESP;
+	netdev_dbg(user, "ESP HW offload available via %s\n", conduit->name);
+}
+
+static void airoha_xfrm_dsa_detach_user(struct net_device *user)
+{
+	struct airoha_gdm_port *port;
+	bool active = false;
+	bool enabled;
+
+	if (user->xfrmdev_ops != &airoha_xfrmdev_ops ||
+	    !dsa_user_dev_check(user))
+		return;
+
+	enabled = user->features & NETIF_F_HW_ESP;
+	port = airoha_xfrm_dsa_dev_port(user);
+	if (port)
+		active = atomic_read(&port->xfrm_state_count);
+
+	if (active) {
+		netdev_warn(user, "DSA detach with active ESP SAs, flushing\n");
+		airoha_xfrm_flush_dev(user);
+	}
+
+	user->wanted_features &= ~AIROHA_XFRM_FEATURES;
+	user->features &= ~AIROHA_XFRM_FEATURES;
+	user->hw_features &= ~AIROHA_XFRM_FEATURES;
+	user->hw_enc_features &= ~AIROHA_XFRM_FEATURES;
+	user->gso_partial_features &= ~NETIF_F_GSO_ESP;
+	user->xfrmdev_ops = NULL;
+
+	if (active || enabled)
+		netdev_features_change(user);
+}
+
+static void airoha_xfrm_dsa_feature_change(struct net_device *dev)
+{
+	struct airoha_gdm_port *port;
+
+	if (dev->xfrmdev_ops != &airoha_xfrmdev_ops ||
+	    !dsa_user_dev_check(dev) || (dev->features & NETIF_F_HW_ESP))
+		return;
+
+	port = airoha_xfrm_dsa_dev_port(dev);
+	if (port && atomic_read(&port->xfrm_state_count)) {
+		netdev_warn(dev, "DSA feature lost ESP SAs, flushing\n");
+		airoha_xfrm_flush_dev(dev);
+	}
+}
+#endif
+
+static int airoha_xfrm_netdevice_event(struct notifier_block *nb,
+				       unsigned long event, void *ptr)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+
+	switch (event) {
+	case NETDEV_CHANGE:
+		airoha_xfrm_link_change(dev);
+		break;
+#if IS_ENABLED(CONFIG_NET_DSA)
+	case NETDEV_CHANGEUPPER: {
+		struct netdev_notifier_changeupper_info *info = ptr;
+
+		if (info->linking)
+			airoha_xfrm_dsa_attach_user(dev, info->upper_dev);
+		else
+			airoha_xfrm_dsa_detach_user(info->upper_dev);
+		break;
+	}
+	case NETDEV_FEAT_CHANGE:
+		airoha_xfrm_dsa_feature_change(dev);
+		break;
+	case NETDEV_UNREGISTER:
+		airoha_xfrm_dsa_detach_user(dev);
+		break;
+#endif
+	default:
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block airoha_xfrm_netdev_notifier = {
+	.notifier_call = airoha_xfrm_netdevice_event,
+};
+
+static int airoha_xfrm_register_netdev_notifier(void)
+{
+	return register_netdevice_notifier(&airoha_xfrm_netdev_notifier);
+}
+
+static void airoha_xfrm_unregister_netdev_notifier(void)
+{
+	unregister_netdevice_notifier(&airoha_xfrm_netdev_notifier);
+}
+
+static void airoha_xfrm_drop_dev(struct net_device *dev, const char *reason)
+{
+	struct airoha_gdm_port *port = airoha_xfrm_dev_port(dev);
+	bool advertised = dev->hw_features & AIROHA_XFRM_FEATURES;
+	bool enabled = dev->features & NETIF_F_HW_ESP;
+	bool active = false;
+
+	if (port)
+		active = atomic_read(&port->xfrm_state_count);
+
+	if (active) {
+		netdev_warn(dev, "%s, flushing ESP HW offload SAs\n", reason);
+		airoha_xfrm_flush_dev(dev);
+	}
+
+	dev->wanted_features &= ~AIROHA_XFRM_FEATURES;
+	dev->features &= ~AIROHA_XFRM_FEATURES;
+	dev->hw_features &= ~AIROHA_XFRM_FEATURES;
+	dev->hw_enc_features &= ~AIROHA_XFRM_FEATURES;
+	dev->gso_partial_features &= ~NETIF_F_GSO_ESP;
+
+	if (active || enabled || advertised)
+		netdev_features_change(dev);
+}
+
+static void airoha_xfrm_drop_ipsec(struct eip93_ipsec *ipsec,
+				   const char *reason)
+{
+	struct net_device *dev;
+	struct net *net;
+
+	rtnl_lock();
+	for_each_net(net) {
+		for_each_netdev(net, dev) {
+			struct airoha_gdm_port *port;
+
+			port = airoha_xfrm_dev_port(dev);
+			if (!port || port->xfrm_ipsec != ipsec)
+				continue;
+
+			airoha_xfrm_drop_dev(dev, reason);
+		}
+	}
+
+	for_each_net(net) {
+		for_each_netdev(net, dev) {
+			struct airoha_gdm_port *port;
+
+			if (dev->xfrmdev_ops != &airoha_xfrmdev_ops)
+				continue;
+
+			if (airoha_xfrm_dsa_dev_port(dev))
+				continue;
+
+			port = netdev_priv(dev);
+			if (dev == port->dev && port->xfrm_ipsec == ipsec) {
+				eip93_ipsec_put(port->xfrm_ipsec);
+				port->xfrm_ipsec = NULL;
+			}
+		}
+	}
+
+	for_each_net(net) {
+		for_each_netdev(net, dev) {
+			if (dev->xfrmdev_ops == &airoha_xfrmdev_ops &&
+			    !(dev->hw_features & NETIF_F_HW_ESP))
+				dev->xfrmdev_ops = NULL;
+		}
+	}
+	rtnl_unlock();
+}
+
+static int airoha_xfrm_ipsec_event(struct notifier_block *nb,
+				   unsigned long event, void *ptr)
+{
+	switch (event) {
+	case EIP93_IPSEC_EVENT_REMOVE:
+		airoha_xfrm_drop_ipsec(ptr, "EIP93 provider removed");
+		break;
+	case EIP93_IPSEC_EVENT_RESET:
+		airoha_xfrm_drop_ipsec(ptr, "EIP93 provider reset");
+		break;
+	case EIP93_IPSEC_EVENT_DMA_ERROR:
+		airoha_xfrm_drop_ipsec(ptr, "EIP93 DMA error");
+		break;
+	case EIP93_IPSEC_EVENT_CAPABILITY_LOSS:
+		airoha_xfrm_drop_ipsec(ptr, "EIP93 capability loss");
+		break;
+	default:
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block airoha_xfrm_ipsec_notifier = {
+	.notifier_call = airoha_xfrm_ipsec_event,
+};
+
+int airoha_xfrm_register_notifier(void)
+{
+	int err;
+
+	err = airoha_xfrm_register_netdev_notifier();
+	if (err)
+		return err;
+
+	err = eip93_ipsec_register_notifier(&airoha_xfrm_ipsec_notifier);
+	if (err)
+		airoha_xfrm_unregister_netdev_notifier();
+
+	return err;
+}
+
+void airoha_xfrm_unregister_notifier(void)
+{
+	eip93_ipsec_unregister_notifier(&airoha_xfrm_ipsec_notifier);
+	airoha_xfrm_unregister_netdev_notifier();
+}
+#else
+void airoha_xfrm_build_netdev(struct net_device *dev)
+{
+}
+
+void airoha_xfrm_teardown_netdev(struct net_device *dev)
+{
+}
+
+netdev_features_t airoha_xfrm_fix_features(struct net_device *dev,
+					   netdev_features_t features)
+{
+	return features & ~(NETIF_F_HW_ESP_TX_CSUM | NETIF_F_GSO_ESP);
+}
+
+int airoha_xfrm_set_features(struct net_device *dev, netdev_features_t features)
+{
+	return 0;
+}
+
+bool airoha_xfrm_rx_skb(struct airoha_gdm_port *port, struct sk_buff *skb)
+{
+	return false;
+}
+
+int airoha_xfrm_encrypt_skb(struct airoha_gdm_port *port, struct sk_buff *skb)
+{
+	return 0;
+}
+
+int airoha_xfrm_register_notifier(void)
+{
+	return 0;
+}
+
+void airoha_xfrm_unregister_notifier(void)
+{
+}
+
+#endif
-- 
2.53.0




* [PATCH 2/3] crypto: inside-secure: add EIP93 ESP packet backend
From: Jihong Min @ 2026-05-23 12:15 UTC
  To: Christian Marangi, Antoine Tenart, Herbert Xu, David S . Miller,
	Lorenzo Bianconi, Andrew Lunn, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Steffen Klassert
  Cc: linux-kernel, linux-crypto, linux-arm-kernel, linux-mediatek,
	netdev, Jihong Min
In-Reply-To: <20260523121522.3023992-1-hurryman2212@gmail.com>

Expose an EIP93 packet-mode IPsec backend for netdev drivers that need
ESP encapsulation and decapsulation offload without advertising EIP93
itself as a netdev.

Add provider selection, capability reporting, SA lifecycle management,
IPsec request completion, and provider fault notification around the
existing EIP93 descriptor path.
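
The provider lifecycle relies on one admission rule: a new reference is
handed out only while the provider is not marked dead, and both checks
happen under the same lock (see eip93_ipsec_get_ref() and
eip93_ipsec_mark_dead()). As a rough userspace illustration only, the
sketch below models that rule with a C11 atomic_flag standing in for the
spinlock. struct provider and the provider_*() helpers are hypothetical
names, not part of the driver. Unlike the driver, the sketch drops
references under the lock and returns a flag instead of signalling a
completion.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct eip93_ipsec. */
struct provider {
	atomic_flag lock;	/* serializes dead/refcount admission */
	unsigned int refcnt;	/* 0 means teardown may proceed */
	bool dead;		/* set once; no new references afterwards */
};

static void provider_lock(struct provider *p)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&p->lock,
						 memory_order_acquire))
		;
}

static void provider_unlock(struct provider *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

static void provider_init(struct provider *p)
{
	atomic_flag_clear(&p->lock);
	p->refcnt = 1;		/* registration reference */
	p->dead = false;
}

/* Take a reference unless the provider is dead or already drained. */
static bool provider_get(struct provider *p)
{
	bool ok = false;

	provider_lock(p);
	if (!p->dead && p->refcnt > 0) {
		p->refcnt++;
		ok = true;
	}
	provider_unlock(p);

	return ok;
}

/* Drop a reference; returns true when the last reference is gone. */
static bool provider_put(struct provider *p)
{
	bool last;

	provider_lock(p);
	last = (--p->refcnt == 0);
	provider_unlock(p);

	return last;
}

/* Mark the provider dead exactly once; only the first caller wins. */
static bool provider_mark_dead(struct provider *p)
{
	bool marked = false;

	provider_lock(p);
	if (!p->dead) {
		p->dead = true;
		marked = true;
	}
	provider_unlock(p);

	return marked;
}
```

In this model a caller that loses the race against provider_mark_dead()
gets false from provider_get() and must not touch the provider, which is
the behaviour consumers should expect from eip93_ipsec_get() once a
fault event has been raised.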

Assisted-by: Codex:gpt-5.5
Signed-off-by: Jihong Min <hurryman2212@gmail.com>
---
 MAINTAINERS                                   |    1 +
 drivers/crypto/inside-secure/eip93/Kconfig    |   10 +
 drivers/crypto/inside-secure/eip93/Makefile   |    1 +
 .../crypto/inside-secure/eip93/eip93-ipsec.c  | 1413 +++++++++++++++++
 .../crypto/inside-secure/eip93/eip93-main.c   |   69 +-
 .../crypto/inside-secure/eip93/eip93-main.h   |   38 +-
 include/crypto/eip93-ipsec.h                  |  132 ++
 7 files changed, 1643 insertions(+), 21 deletions(-)
 create mode 100644 drivers/crypto/inside-secure/eip93/eip93-ipsec.c
 create mode 100644 include/crypto/eip93-ipsec.h

diff --git a/MAINTAINERS b/MAINTAINERS
index f1e5e4258e7b..08cfede333e8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12743,6 +12743,7 @@ L:	linux-crypto@vger.kernel.org
 S:	Maintained
 F:	Documentation/devicetree/bindings/crypto/inside-secure,safexcel-eip93.yaml
 F:	drivers/crypto/inside-secure/eip93/
+F:	include/crypto/eip93-ipsec.h
 
 INTEGRITY MEASUREMENT ARCHITECTURE (IMA)
 M:	Mimi Zohar <zohar@linux.ibm.com>
diff --git a/drivers/crypto/inside-secure/eip93/Kconfig b/drivers/crypto/inside-secure/eip93/Kconfig
index 29523f6927dd..1a33ab6f04da 100644
--- a/drivers/crypto/inside-secure/eip93/Kconfig
+++ b/drivers/crypto/inside-secure/eip93/Kconfig
@@ -18,3 +18,13 @@ config CRYPTO_DEV_EIP93
 	  CTR crypto. Also provide DES and 3DES ECB and CBC.
 
 	  Also provide AEAD authenc(hmac(x), cipher(y)) for supported algo.
+
+config CRYPTO_DEV_EIP93_IPSEC
+	bool
+	depends on CRYPTO_DEV_EIP93
+	depends on XFRM_OFFLOAD
+	default y
+	help
+	  Build the EIP93 packet-mode IPsec backend, which lets a netdev
+	  driver use EIP93 for ESP encapsulation and decapsulation in
+	  addition to the crypto API transforms.
diff --git a/drivers/crypto/inside-secure/eip93/Makefile b/drivers/crypto/inside-secure/eip93/Makefile
index a3d3d3677cdc..a5bb98370ff0 100644
--- a/drivers/crypto/inside-secure/eip93/Makefile
+++ b/drivers/crypto/inside-secure/eip93/Makefile
@@ -3,3 +3,4 @@ obj-$(CONFIG_CRYPTO_DEV_EIP93) += crypto-hw-eip93.o
 crypto-hw-eip93-y += eip93-main.o eip93-common.o
 crypto-hw-eip93-y += eip93-cipher.o eip93-aead.o
 crypto-hw-eip93-y += eip93-hash.o
+crypto-hw-eip93-$(CONFIG_CRYPTO_DEV_EIP93_IPSEC) += eip93-ipsec.o
diff --git a/drivers/crypto/inside-secure/eip93/eip93-ipsec.c b/drivers/crypto/inside-secure/eip93/eip93-ipsec.c
new file mode 100644
index 000000000000..7338f4c7e24a
--- /dev/null
+++ b/drivers/crypto/inside-secure/eip93/eip93-ipsec.c
@@ -0,0 +1,1413 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2026
+ *
+ * Jihong Min <hurryman2212@gmail.com>
+ */
+
+#include <crypto/aes.h>
+#include <crypto/eip93-ipsec.h>
+#include <crypto/hash.h>
+#include <crypto/hmac.h>
+#include <crypto/sha1.h>
+#include <crypto/sha2.h>
+#include <linux/completion.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/ip.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/netlink.h>
+#include <linux/notifier.h>
+#include <linux/of.h>
+#include <linux/refcount.h>
+#include <linux/skbuff.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/udp.h>
+#include <net/esp.h>
+#include <net/xfrm.h>
+#include <uapi/linux/pfkeyv2.h>
+
+#include "eip93-main.h"
+#include "eip93-regs.h"
+#include "eip93-common.h"
+
+#define EIP93_IPSEC_PAD_ALIGN 2
+#define EIP93_IPSEC_IDR_MIN 0
+#define EIP93_IPSEC_IDR_MAX (EIP93_RING_NUM - 1)
+#define EIP93_IPSEC_DIGEST_WORD_BITS (BITS_PER_BYTE * sizeof(u32))
+#define EIP93_IPSEC_DIGEST_WORDS(bits) ((bits) / EIP93_IPSEC_DIGEST_WORD_BITS)
+#define EIP93_IPSEC_HMAC_STATE_SIZE SHA256_DIGEST_SIZE
+#define EIP93_IPSEC_PRNG_BUF_SIZE 4080
+#define EIP93_IPSEC_PRNG_RESET_MODE 1
+#define EIP93_IPSEC_PRNG_POLL_US 10000
+#define EIP93_IPSEC_PRNG_POLL_STEP_US 10
+
+struct eip93_ipsec {
+	struct eip93_device *eip93;
+	struct list_head node;
+	struct list_head sa_list;
+	struct work_struct fault_work;
+	spinlock_t lock; /* protects dead/refcount admission */
+	refcount_t refcnt;
+	struct completion done;
+	enum eip93_ipsec_event fault_event;
+	u32 algo_flags;
+	bool dead;
+};
+
+struct eip93_ipsec_sa {
+	struct eip93_ipsec *ipsec;
+	struct sa_record *sa_record;
+	dma_addr_t sa_record_base;
+	struct list_head node;
+	struct list_head requests;
+	spinlock_t lock; /* protects dead/refcount admission */
+	refcount_t refcnt;
+	struct completion done;
+	u32 flags;
+	u16 family;
+	u8 authsize;
+	u8 blocksize;
+	u8 ivsize;
+	u8 encap_type;
+	bool esn;
+	bool dead;
+	bool aborting;
+};
+
+struct eip93_ipsec_request {
+	struct eip93_ipsec_sa *sa;
+	struct sk_buff *skb;
+	struct list_head node;
+	refcount_t refcnt;
+	eip93_ipsec_complete_t complete;
+	void *data;
+	dma_addr_t dma;
+	unsigned int dma_len;
+	enum dma_data_direction dma_dir;
+	int idr;
+};
+
+static DEFINE_MUTEX(eip93_ipsec_devices_lock);
+static LIST_HEAD(eip93_ipsec_devices);
+static BLOCKING_NOTIFIER_HEAD(eip93_ipsec_notifier);
+
+static bool eip93_ipsec_get_ref(struct eip93_ipsec *ipsec)
+{
+	bool ret = false;
+
+	spin_lock_bh(&ipsec->lock);
+	if (!ipsec->dead)
+		ret = refcount_inc_not_zero(&ipsec->refcnt);
+	spin_unlock_bh(&ipsec->lock);
+
+	return ret;
+}
+
+void eip93_ipsec_put(struct eip93_ipsec *ipsec)
+{
+	if (ipsec && refcount_dec_and_test(&ipsec->refcnt))
+		complete(&ipsec->done);
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_put);
+
+static bool eip93_ipsec_same_subsystem(struct device *consumer,
+				       struct eip93_ipsec *ipsec)
+{
+	struct device_node *consumer_parent;
+	struct device_node *eip93_parent;
+	struct device_node *consumer_np;
+	struct device_node *eip93_np;
+	bool match;
+
+	consumer_np = dev_of_node(consumer);
+	eip93_np = dev_of_node(ipsec->eip93->dev);
+	if (!consumer_np || !eip93_np)
+		return false;
+
+	consumer_parent = of_get_parent(consumer_np);
+	eip93_parent = of_get_parent(eip93_np);
+	match = consumer_parent && consumer_parent == eip93_parent;
+	of_node_put(consumer_parent);
+	of_node_put(eip93_parent);
+
+	return match;
+}
+
+static bool eip93_ipsec_hw_available(u32 flags)
+{
+	if (!(flags & EIP93_PE_OPTION_AES))
+		return false;
+
+	if (!(flags & (EIP93_PE_OPTION_AES_KEY128 | EIP93_PE_OPTION_AES_KEY192 |
+		       EIP93_PE_OPTION_AES_KEY256)))
+		return false;
+
+	return flags & (EIP93_PE_OPTION_SHA_1 | EIP93_PE_OPTION_SHA_256);
+}
+
+static bool eip93_ipsec_mark_dead(struct eip93_ipsec *ipsec)
+{
+	bool marked = false;
+
+	spin_lock_bh(&ipsec->lock);
+	if (!ipsec->dead) {
+		ipsec->dead = true;
+		marked = true;
+	}
+	spin_unlock_bh(&ipsec->lock);
+
+	return marked;
+}
+
+static bool eip93_ipsec_mark_dead_async(struct eip93_ipsec *ipsec,
+					enum eip93_ipsec_event event)
+{
+	bool marked = false;
+
+	spin_lock_bh(&ipsec->lock);
+	if (!ipsec->dead && refcount_inc_not_zero(&ipsec->refcnt)) {
+		ipsec->dead = true;
+		ipsec->fault_event = event;
+		marked = true;
+	}
+	spin_unlock_bh(&ipsec->lock);
+
+	if (marked)
+		schedule_work(&ipsec->fault_work);
+
+	return marked;
+}
+
+static bool eip93_ipsec_live_hw_available(struct eip93_ipsec *ipsec)
+{
+	u32 flags = readl(ipsec->eip93->base + EIP93_REG_PE_OPTION_1);
+
+	spin_lock_bh(&ipsec->lock);
+	ipsec->algo_flags = flags;
+	spin_unlock_bh(&ipsec->lock);
+
+	return eip93_ipsec_hw_available(flags);
+}
+
+struct eip93_ipsec *eip93_ipsec_get(struct device *consumer)
+{
+	struct eip93_ipsec *ipsec;
+	int err = -ENODEV;
+
+	if (!consumer)
+		return ERR_PTR(-EINVAL);
+
+	mutex_lock(&eip93_ipsec_devices_lock);
+	list_for_each_entry(ipsec, &eip93_ipsec_devices, node) {
+		if (!eip93_ipsec_same_subsystem(consumer, ipsec))
+			continue;
+
+		if (!eip93_ipsec_live_hw_available(ipsec)) {
+			enum eip93_ipsec_event event;
+
+			event = EIP93_IPSEC_EVENT_CAPABILITY_LOSS;
+			eip93_ipsec_mark_dead_async(ipsec, event);
+			err = -EOPNOTSUPP;
+			continue;
+		}
+
+		if (!eip93_ipsec_get_ref(ipsec)) {
+			err = -ENODEV;
+			continue;
+		}
+
+		mutex_unlock(&eip93_ipsec_devices_lock);
+		return ipsec;
+	}
+	mutex_unlock(&eip93_ipsec_devices_lock);
+
+	return ERR_PTR(err);
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_get);
+
+bool eip93_ipsec_available(struct eip93_ipsec *ipsec)
+{
+	bool available;
+
+	if (!ipsec)
+		return false;
+
+	spin_lock_bh(&ipsec->lock);
+	available = !ipsec->dead;
+	spin_unlock_bh(&ipsec->lock);
+	if (!available)
+		return false;
+
+	available = eip93_ipsec_live_hw_available(ipsec);
+	if (!available)
+		eip93_ipsec_mark_dead_async(ipsec,
+					    EIP93_IPSEC_EVENT_CAPABILITY_LOSS);
+
+	return available;
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_available);
+
+u32 eip93_ipsec_features(struct eip93_ipsec *ipsec)
+{
+	if (!eip93_ipsec_available(ipsec))
+		return 0;
+
+	return EIP93_IPSEC_FEATURE_ESP;
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_features);
+
+int eip93_ipsec_register_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&eip93_ipsec_notifier, nb);
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_register_notifier);
+
+void eip93_ipsec_unregister_notifier(struct notifier_block *nb)
+{
+	blocking_notifier_chain_unregister(&eip93_ipsec_notifier, nb);
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_unregister_notifier);
+
+static bool eip93_ipsec_sa_get(struct eip93_ipsec_sa *sa)
+{
+	bool ret = false;
+
+	spin_lock_bh(&sa->ipsec->lock);
+	spin_lock(&sa->lock);
+	if (!sa->ipsec->dead && !sa->dead)
+		ret = refcount_inc_not_zero(&sa->refcnt);
+	spin_unlock(&sa->lock);
+	spin_unlock_bh(&sa->ipsec->lock);
+
+	return ret;
+}
+
+static void eip93_ipsec_sa_put(struct eip93_ipsec_sa *sa)
+{
+	if (refcount_dec_and_test(&sa->refcnt))
+		complete(&sa->done);
+}
+
+static bool eip93_ipsec_request_get(struct eip93_ipsec_request *req)
+{
+	return refcount_inc_not_zero(&req->refcnt);
+}
+
+static void eip93_ipsec_request_put(struct eip93_ipsec_request *req)
+{
+	if (refcount_dec_and_test(&req->refcnt))
+		kfree(req);
+}
+
+static int eip93_ipsec_parse_flags(struct xfrm_state *x, u32 *flags)
+{
+	switch (x->props.ealgo) {
+	case SADB_X_EALG_AESCBC:
+		*flags |= EIP93_ALG_AES | EIP93_MODE_CBC;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	switch (x->props.aalgo) {
+	case SADB_AALG_SHA1HMAC:
+		*flags |= EIP93_HASH_HMAC | EIP93_HASH_SHA1;
+		break;
+	case SADB_X_AALG_SHA2_256HMAC:
+		*flags |= EIP93_HASH_HMAC | EIP93_HASH_SHA256;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	if (x->xso.dir == XFRM_DEV_OFFLOAD_IN)
+		*flags |= EIP93_DECRYPT;
+	else
+		*flags |= EIP93_ENCRYPT;
+
+	return 0;
+}
+
+static unsigned int eip93_ipsec_auth_digest_size(struct xfrm_state *x)
+{
+	switch (x->props.aalgo) {
+	case SADB_AALG_SHA1HMAC:
+		return SHA1_DIGEST_SIZE;
+	case SADB_X_AALG_SHA2_256HMAC:
+		return SHA256_DIGEST_SIZE;
+	default:
+		return 0;
+	}
+}
+
+static int eip93_ipsec_hmac_setkey(u32 flags, const u8 *key,
+				   unsigned int keylen, u8 *dest_ipad,
+				   u8 *dest_opad)
+{
+	u8 *ipad, *opad;
+	struct crypto_shash *tfm;
+	const char *alg_name;
+	unsigned int blocksize;
+	unsigned int digestsize;
+	unsigned int statesize;
+	unsigned int alloc_size;
+	unsigned int i;
+	int err;
+
+	switch (flags & EIP93_HASH_MASK) {
+	case EIP93_HASH_SHA1:
+		alg_name = "sha1";
+		digestsize = SHA1_DIGEST_SIZE;
+		break;
+	case EIP93_HASH_SHA256:
+		alg_name = "sha256";
+		digestsize = SHA256_DIGEST_SIZE;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	tfm = crypto_alloc_shash(alg_name, 0, CRYPTO_ALG_NEED_FALLBACK);
+	if (IS_ERR(tfm))
+		return PTR_ERR(tfm);
+
+	blocksize = crypto_shash_blocksize(tfm);
+	statesize = crypto_shash_statesize(tfm);
+	if (statesize < EIP93_IPSEC_HMAC_STATE_SIZE) {
+		err = -EINVAL;
+		goto free_tfm;
+	}
+
+	alloc_size = 2 * (blocksize + statesize);
+	ipad = kzalloc(alloc_size, GFP_KERNEL);
+	if (!ipad) {
+		err = -ENOMEM;
+		goto free_tfm;
+	}
+	opad = ipad + blocksize + statesize;
+
+	if (keylen > blocksize) {
+		SHASH_DESC_ON_STACK(desc, tfm);
+
+		desc->tfm = tfm;
+		err = crypto_shash_digest(desc, key, keylen, ipad);
+		shash_desc_zero(desc);
+		if (err)
+			goto free_pad;
+
+		keylen = digestsize;
+	} else {
+		memcpy(ipad, key, keylen);
+	}
+
+	memcpy(opad, ipad, blocksize);
+	for (i = 0; i < blocksize; i++) {
+		ipad[i] ^= HMAC_IPAD_VALUE;
+		opad[i] ^= HMAC_OPAD_VALUE;
+	}
+
+	{
+		SHASH_DESC_ON_STACK(desc, tfm);
+
+		desc->tfm = tfm;
+		err = crypto_shash_init(desc) ?:
+		      crypto_shash_update(desc, ipad, blocksize) ?:
+		      crypto_shash_export(desc, ipad) ?:
+		      crypto_shash_init(desc) ?:
+		      crypto_shash_update(desc, opad, blocksize) ?:
+		      crypto_shash_export(desc, opad);
+		shash_desc_zero(desc);
+	}
+	if (err)
+		goto free_pad;
+
+	/*
+	 * EIP93 ESP protocol mode consumes the raw exported HMAC ipad/opad
+	 * state. The crypto API AEAD helper byteswaps this state for its basic
+	 * authenc path, but packet ESP mode, like mtk-eip93, expects the
+	 * exported bytes unmodified in the SA record.
+	 */
+	memcpy(dest_ipad, ipad, EIP93_IPSEC_HMAC_STATE_SIZE);
+	memcpy(dest_opad, opad, EIP93_IPSEC_HMAC_STATE_SIZE);
+
+free_pad:
+	kfree_sensitive(ipad);
+free_tfm:
+	crypto_free_shash(tfm);
+	return err;
+}
+
+static int eip93_ipsec_validate_algo(struct xfrm_state *x,
+				     struct netlink_ext_ack *extack)
+{
+	unsigned int authsize;
+	unsigned int keylen;
+
+	if (x->aead) {
+		NL_SET_ERR_MSG_MOD(extack, "AEAD SAs are unsupported");
+		return -EOPNOTSUPP;
+	}
+
+	if (!x->ealg || !x->aalg) {
+		NL_SET_ERR_MSG_MOD(extack, "encryption/auth required");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->props.ealgo != SADB_X_EALG_AESCBC) {
+		NL_SET_ERR_MSG_MOD(extack, "only AES-CBC is supported");
+		return -EOPNOTSUPP;
+	}
+
+	keylen = x->ealg->alg_key_len / BITS_PER_BYTE;
+	if (keylen != AES_KEYSIZE_128 && keylen != AES_KEYSIZE_192 &&
+	    keylen != AES_KEYSIZE_256) {
+		NL_SET_ERR_MSG_MOD(extack, "unsupported AES-CBC key length");
+		return -EOPNOTSUPP;
+	}
+
+	authsize = eip93_ipsec_auth_digest_size(x);
+	if (!authsize) {
+		NL_SET_ERR_MSG_MOD(extack, "only SHA1/SHA256 HMAC is supported");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->aalg->alg_trunc_len % EIP93_IPSEC_DIGEST_WORD_BITS ||
+	    x->aalg->alg_trunc_len < EIP93_IPSEC_DIGEST_WORD_BITS ||
+	    x->aalg->alg_trunc_len > authsize * BITS_PER_BYTE) {
+		NL_SET_ERR_MSG_MOD(extack, "bad auth truncation length");
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static int eip93_ipsec_validate_state(struct xfrm_state *x,
+				      struct netlink_ext_ack *extack)
+{
+	switch (x->xso.dir) {
+	case XFRM_DEV_OFFLOAD_OUT:
+	case XFRM_DEV_OFFLOAD_IN:
+		break;
+	default:
+		NL_SET_ERR_MSG_MOD(extack, "only in/out SAs are supported");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->xso.type != XFRM_DEV_OFFLOAD_CRYPTO) {
+		NL_SET_ERR_MSG_MOD(extack, "only crypto offload is supported");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->id.proto != IPPROTO_ESP) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "EIP93 packet backend supports ESP only");
+		return -EOPNOTSUPP;
+	}
+
+	switch (x->props.family) {
+	case AF_INET:
+		break;
+#if IS_ENABLED(CONFIG_IPV6)
+	case AF_INET6:
+		break;
+#endif
+	default:
+		NL_SET_ERR_MSG_MOD(extack, "only IPv4/IPv6 is supported");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->outer_mode.family != x->props.family) {
+		NL_SET_ERR_MSG_MOD(extack, "only same-family ESP is supported");
+		return -EOPNOTSUPP;
+	}
+
+	switch (x->props.mode) {
+	case XFRM_MODE_TUNNEL:
+	case XFRM_MODE_TRANSPORT:
+		break;
+	default:
+		NL_SET_ERR_MSG_MOD(extack, "only tunnel/transport mode is supported");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->outer_mode.encap != x->props.mode) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "outer ESP mode does not match state mode");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->encap) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "NAT-T is unsupported by EIP93 packet ESP");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->tfcpad) {
+		NL_SET_ERR_MSG_MOD(extack, "TFC padding is unsupported");
+		return -EOPNOTSUPP;
+	}
+
+	return eip93_ipsec_validate_algo(x, extack);
+}
+
+static int eip93_ipsec_validate_hw(struct xfrm_state *x, u32 flags,
+				   struct netlink_ext_ack *extack)
+{
+	unsigned int keylen = x->ealg->alg_key_len / BITS_PER_BYTE;
+	u32 required;
+
+	if (!(flags & EIP93_PE_OPTION_AES)) {
+		NL_SET_ERR_MSG_MOD(extack, "EIP93 AES engine is unavailable");
+		return -EOPNOTSUPP;
+	}
+
+	switch (keylen) {
+	case AES_KEYSIZE_128:
+		required = EIP93_PE_OPTION_AES_KEY128;
+		break;
+	case AES_KEYSIZE_192:
+		required = EIP93_PE_OPTION_AES_KEY192;
+		break;
+	case AES_KEYSIZE_256:
+		required = EIP93_PE_OPTION_AES_KEY256;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	if (!(flags & required)) {
+		NL_SET_ERR_MSG_MOD(extack, "unsupported AES key length");
+		return -EOPNOTSUPP;
+	}
+
+	switch (x->props.aalgo) {
+	case SADB_AALG_SHA1HMAC:
+		required = EIP93_PE_OPTION_SHA_1;
+		break;
+	case SADB_X_AALG_SHA2_256HMAC:
+		required = EIP93_PE_OPTION_SHA_256;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	if (!(flags & required)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "EIP93 does not support this HMAC hash");
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static void eip93_ipsec_init_sa_record(struct eip93_ipsec_sa *sa,
+				       struct xfrm_state *x)
+{
+	struct sa_record *record = sa->sa_record;
+	unsigned int auth_words;
+	unsigned int enckeylen;
+
+	enckeylen = x->ealg->alg_key_len / BITS_PER_BYTE;
+	auth_words = EIP93_IPSEC_DIGEST_WORDS(x->aalg->alg_trunc_len);
+
+	eip93_set_sa_record(record, enckeylen, sa->flags);
+
+	record->sa_cmd0_word &=
+		~(EIP93_SA_CMD_OPGROUP | EIP93_SA_CMD_OPCODE |
+		  EIP93_SA_CMD_DIGEST_LENGTH | EIP93_SA_CMD_PAD_TYPE |
+		  EIP93_SA_CMD_IV_SOURCE | EIP93_SA_CMD_SAVE_IV);
+	record->sa_cmd0_word |=
+		EIP93_SA_CMD_OP_PROTOCOL | EIP93_SA_CMD_HDR_PROC |
+		EIP93_SA_CMD_PAD_IPSEC | EIP93_SA_CMD_SCPAD |
+		FIELD_PREP(EIP93_SA_CMD_OPCODE,
+			   EIP93_SA_CMD_OPCODE_PROTOCOL_OUT_ESP) |
+		FIELD_PREP(EIP93_SA_CMD_DIGEST_LENGTH, auth_words);
+
+	/*
+	 * ESP packet mode authenticates from the ESP header when the hash
+	 * crypt offset is zero. This is intentionally different from the AEAD
+	 * authenc path, whose AAD starts after the ESP header.
+	 */
+	record->sa_cmd1_word &=
+		~(EIP93_SA_CMD_HASH_CRYPT_OFFSET | EIP93_SA_CMD_BYTE_OFFSET |
+		  EIP93_SA_CMD_COPY_PAD | EIP93_SA_CMD_COPY_HEADER |
+		  EIP93_SA_CMD_COPY_DIGEST | EIP93_SA_CMD_COPY_PAYLOAD |
+		  EIP93_SA_CMD_EN_SEQNUM_CHK);
+	record->sa_cmd1_word |= EIP93_SA_CMD_HMAC | EIP93_SA_CMD_EN_SEQNUM_CHK;
+
+	if (x->xso.dir == XFRM_DEV_OFFLOAD_IN) {
+		record->sa_cmd0_word |= EIP93_SA_CMD_DIRECTION_IN |
+					EIP93_SA_CMD_IV_FROM_INPUT;
+		/*
+		 * Inbound ESP decapsulation keeps the outer header for XFRM and
+		 * lets hardware remove ESP pad/trailer/ICV bytes.
+		 */
+		record->sa_cmd1_word |= EIP93_SA_CMD_COPY_HEADER;
+	} else {
+		record->sa_cmd0_word |= EIP93_SA_CMD_IV_FROM_PRNG;
+		record->sa_cmd1_word |= EIP93_SA_CMD_COPY_DIGEST;
+	}
+
+	record->sa_spi = ntohl(x->id.spi);
+	if (sa->esn && x->replay_esn) {
+		if (x->xso.dir == XFRM_DEV_OFFLOAD_IN)
+			record->sa_seqnum[1] = x->replay_esn->seq_hi;
+		else
+			record->sa_seqnum[1] = x->replay_esn->oseq_hi;
+	} else {
+		record->sa_seqnum[1] = 0;
+	}
+	record->sa_seqnum[0] = 0;
+	record->sa_seqmum_mask[0] = 0xffffffff;
+	record->sa_seqmum_mask[1] = sa->esn ? 0xffffffff : 0;
+}
+
+static int eip93_ipsec_poll_result(struct eip93_device *eip93,
+				   struct eip93_descriptor **rdesc)
+{
+	struct eip93_descriptor *desc;
+	unsigned int i;
+	u32 pe_ctrl_stat;
+	u32 pe_length;
+
+	for (i = 0; i < EIP93_IPSEC_PRNG_POLL_US;
+	     i += EIP93_IPSEC_PRNG_POLL_STEP_US) {
+		if (readl(eip93->base + EIP93_REG_PE_RD_COUNT) &
+		    EIP93_PE_RD_COUNT)
+			break;
+		udelay(EIP93_IPSEC_PRNG_POLL_STEP_US);
+	}
+	if (i >= EIP93_IPSEC_PRNG_POLL_US)
+		return -ETIMEDOUT;
+
+	scoped_guard(spinlock_irqsave, &eip93->ring->read_lock)
+		desc = eip93_get_descriptor(eip93);
+	if (IS_ERR(desc))
+		return PTR_ERR(desc);
+	*rdesc = desc;
+
+	for (i = 0; i < EIP93_IPSEC_PRNG_POLL_US;
+	     i += EIP93_IPSEC_PRNG_POLL_STEP_US) {
+		pe_ctrl_stat = READ_ONCE((*rdesc)->pe_ctrl_stat_word);
+		pe_length = READ_ONCE((*rdesc)->pe_length_word);
+		if (FIELD_GET(EIP93_PE_CTRL_PE_READY_DES_TRING_OWN,
+			      pe_ctrl_stat) == EIP93_PE_CTRL_PE_READY &&
+		    FIELD_GET(EIP93_PE_LENGTH_HOST_PE_READY, pe_length) ==
+			    EIP93_PE_LENGTH_PE_READY)
+			break;
+		udelay(EIP93_IPSEC_PRNG_POLL_STEP_US);
+	}
+
+	writel(1, eip93->base + EIP93_REG_PE_RD_COUNT);
+	writel(EIP93_INT_RDR_THRESH, eip93->base + EIP93_REG_INT_CLR);
+
+	if (i >= EIP93_IPSEC_PRNG_POLL_US)
+		return -ETIMEDOUT;
+
+	return 0;
+}
+
+static int eip93_ipsec_init_prng(struct eip93_device *eip93)
+{
+	static const u32 prng_dt[4] = {};
+	static const u32 prng_key[4] = {
+		0xe0fc631d, 0xcbb9fb9a, 0x869285cb, 0xcbb9fb9a
+	};
+	static const u32 prng_seed[4] = {
+		0x758bac03, 0xf20ab39e, 0xa569f104, 0x95dfaea6
+	};
+	struct eip93_descriptor cdesc = {};
+	struct eip93_descriptor *rdesc;
+	struct sa_record *record;
+	dma_addr_t record_dma;
+	dma_addr_t buf_dma;
+	void *buf;
+	int err;
+
+	record = dma_alloc_coherent(eip93->dev, sizeof(*record), &record_dma,
+				    GFP_KERNEL);
+	if (!record)
+		return -ENOMEM;
+
+	buf = dma_alloc_coherent(eip93->dev, EIP93_IPSEC_PRNG_BUF_SIZE,
+				 &buf_dma, GFP_KERNEL);
+	if (!buf) {
+		err = -ENOMEM;
+		goto free_record;
+	}
+
+	memset(record, 0, sizeof(*record));
+	record->sa_cmd0_word =
+		EIP93_SA_CMD_OP_BASIC |
+		FIELD_PREP(EIP93_SA_CMD_OPCODE,
+			   EIP93_SA_CMD_OPCODE_BASIC_OUT_PRNG) |
+		EIP93_SA_CMD_CIPHER_AES | EIP93_SA_CMD_HASH_SHA1;
+	record->sa_cmd1_word = EIP93_SA_CMD_AES_KEY_128BIT;
+	memcpy(record->sa_key, prng_key, sizeof(prng_key));
+	memcpy(record->sa_i_digest, prng_seed, sizeof(prng_seed));
+	memcpy(record->sa_o_digest, prng_dt, sizeof(prng_dt));
+
+	cdesc.pe_ctrl_stat_word =
+		FIELD_PREP(EIP93_PE_CTRL_PE_READY_DES_TRING_OWN,
+			   EIP93_PE_CTRL_HOST_READY) |
+		FIELD_PREP(EIP93_PE_CTRL_PE_PRNG_MODE,
+			   EIP93_IPSEC_PRNG_RESET_MODE);
+	cdesc.dst_addr = (u32 __force)buf_dma;
+	cdesc.sa_addr = record_dma;
+	cdesc.user_id = FIELD_PREP(EIP93_PE_USER_ID_DESC_FLAGS,
+				   EIP93_DESC_PRNG | EIP93_DESC_LAST);
+	cdesc.pe_length_word =
+		FIELD_PREP(EIP93_PE_LENGTH_HOST_PE_READY,
+			   EIP93_PE_LENGTH_HOST_READY) |
+		FIELD_PREP(EIP93_PE_LENGTH_LENGTH,
+			   EIP93_IPSEC_PRNG_BUF_SIZE);
+
+	/*
+	 * Outbound ESP SAs use IV_FROM_PRNG. Initialize the EIP93 PRNG before
+	 * exposing the packet backend, otherwise the first ESP encrypt can
+	 * fail or emit unusable IV material.
+	 */
+	scoped_guard(spinlock_irqsave, &eip93->ring->write_lock)
+		err = eip93_put_descriptor(eip93, &cdesc);
+	if (err)
+		goto free_buf;
+
+	writel(1, eip93->base + EIP93_REG_PE_CD_COUNT);
+	err = eip93_ipsec_poll_result(eip93, &rdesc);
+	if (err)
+		goto free_buf;
+
+	err = rdesc->pe_ctrl_stat_word &
+	      (EIP93_PE_CTRL_PE_EXT_ERR_CODE | EIP93_PE_CTRL_PE_EXT_ERR |
+	       EIP93_PE_CTRL_PE_SEQNUM_ERR | EIP93_PE_CTRL_PE_PAD_ERR |
+	       EIP93_PE_CTRL_PE_AUTH_ERR);
+	err = eip93_parse_ctrl_stat_err(eip93, err);
+	if (err)
+		dev_err(eip93->dev, "IPsec PRNG init failed: %d\n", err);
+
+free_buf:
+	dma_free_coherent(eip93->dev, EIP93_IPSEC_PRNG_BUF_SIZE, buf, buf_dma);
+free_record:
+	dma_free_coherent(eip93->dev, sizeof(*record), record, record_dma);
+
+	return err;
+}
+
+static int eip93_ipsec_submit(struct eip93_ipsec_request *req,
+			      struct eip93_descriptor *cdesc)
+{
+	struct eip93_device *eip93 = req->sa->ipsec->eip93;
+	struct eip93_ipsec_sa *sa = req->sa;
+	struct eip93_ipsec *ipsec = sa->ipsec;
+	int err;
+
+	spin_lock_bh(&ipsec->lock);
+	if (ipsec->dead) {
+		err = -EOPNOTSUPP;
+		goto unlock_ipsec;
+	}
+
+	scoped_guard(spinlock_bh, &eip93->ring->idr_lock)
+		req->idr = idr_alloc(&eip93->ring->crypto_async_idr, req,
+			EIP93_IPSEC_IDR_MIN, EIP93_IPSEC_IDR_MAX, GFP_ATOMIC);
+	if (req->idr < 0) {
+		err = req->idr == -ENOSPC ? -EBUSY : req->idr;
+		goto unlock_ipsec;
+	}
+
+	spin_lock(&sa->lock);
+	if (sa->dead) {
+		spin_unlock(&sa->lock);
+		err = -EOPNOTSUPP;
+		goto remove_idr;
+	}
+	list_add_tail(&req->node, &sa->requests);
+	spin_unlock(&sa->lock);
+
+	cdesc->user_id =
+		FIELD_PREP(EIP93_PE_USER_ID_CRYPTO_IDR, (u16)req->idr) |
+		FIELD_PREP(EIP93_PE_USER_ID_DESC_FLAGS,
+			   EIP93_DESC_IPSEC | EIP93_DESC_LAST);
+
+	scoped_guard(spinlock_irqsave, &eip93->ring->write_lock)
+		err = eip93_put_descriptor(eip93, cdesc);
+	if (err)
+		goto unlink_request;
+
+	writel(1, eip93->base + EIP93_REG_PE_CD_COUNT);
+	spin_unlock_bh(&ipsec->lock);
+
+	return -EINPROGRESS;
+
+unlink_request:
+	spin_lock(&sa->lock);
+	list_del_init(&req->node);
+	spin_unlock(&sa->lock);
+remove_idr:
+	scoped_guard(spinlock_bh, &eip93->ring->idr_lock)
+		idr_remove(&eip93->ring->crypto_async_idr, req->idr);
+	err = err == -ENOENT ? -EBUSY : err;
+unlock_ipsec:
+	spin_unlock_bh(&ipsec->lock);
+	return err;
+}
+
+static void eip93_ipsec_unlink_request(struct eip93_ipsec_request *req)
+{
+	struct eip93_ipsec_sa *sa = req->sa;
+
+	spin_lock_bh(&sa->lock);
+	if (!list_empty(&req->node))
+		list_del_init(&req->node);
+	spin_unlock_bh(&sa->lock);
+}
+
+static void eip93_ipsec_complete_request(struct eip93_ipsec_request *req,
+					 int err,
+					 struct eip93_ipsec_result result)
+{
+	struct eip93_ipsec_sa *sa = req->sa;
+	eip93_ipsec_complete_t complete = req->complete;
+	void *data = req->data;
+
+	dma_unmap_single(sa->ipsec->eip93->dev, req->dma, req->dma_len,
+			 req->dma_dir);
+	eip93_ipsec_unlink_request(req);
+	eip93_ipsec_sa_put(sa);
+	complete(data, err, result);
+	eip93_ipsec_request_put(req);
+}
+
+static void eip93_ipsec_abort_sa(struct eip93_ipsec_sa *sa, int err)
+{
+	struct eip93_device *eip93 = sa->ipsec->eip93;
+	struct eip93_ipsec_request *req;
+	bool claimed;
+
+	while (true) {
+		spin_lock_bh(&sa->lock);
+		if (list_empty(&sa->requests)) {
+			spin_unlock_bh(&sa->lock);
+			return;
+		}
+
+		req = list_first_entry(&sa->requests,
+				       struct eip93_ipsec_request, node);
+		if (!eip93_ipsec_request_get(req)) {
+			list_del_init(&req->node);
+			spin_unlock_bh(&sa->lock);
+			continue;
+		}
+		list_del_init(&req->node);
+		spin_unlock_bh(&sa->lock);
+
+		claimed = false;
+		scoped_guard(spinlock_bh, &eip93->ring->idr_lock) {
+			if (idr_find(&eip93->ring->crypto_async_idr,
+				     req->idr) == req) {
+				idr_remove(&eip93->ring->crypto_async_idr,
+					   req->idr);
+				claimed = true;
+			}
+		}
+
+		if (claimed) {
+			struct eip93_ipsec_result result = {};
+
+			eip93_ipsec_complete_request(req, err, result);
+		}
+		eip93_ipsec_request_put(req);
+	}
+}
+
+static void eip93_ipsec_abort_requests(struct eip93_ipsec *ipsec, int err)
+{
+	struct eip93_ipsec_sa *sa;
+
+	while (true) {
+		bool found = false;
+
+		spin_lock_bh(&ipsec->lock);
+		list_for_each_entry(sa, &ipsec->sa_list, node) {
+			spin_lock(&sa->lock);
+			if (sa->aborting) {
+				spin_unlock(&sa->lock);
+				continue;
+			}
+
+			sa->aborting = true;
+			found = refcount_inc_not_zero(&sa->refcnt);
+			spin_unlock(&sa->lock);
+			if (found)
+				break;
+		}
+		spin_unlock_bh(&ipsec->lock);
+		if (!found)
+			return;
+
+		eip93_ipsec_abort_sa(sa, err);
+		eip93_ipsec_sa_put(sa);
+	}
+}
+
+static void eip93_ipsec_fault_work(struct work_struct *work)
+{
+	struct eip93_ipsec *ipsec =
+		container_of(work, struct eip93_ipsec, fault_work);
+	enum eip93_ipsec_event event;
+
+	spin_lock_bh(&ipsec->lock);
+	event = ipsec->fault_event;
+	spin_unlock_bh(&ipsec->lock);
+
+	eip93_ipsec_abort_requests(ipsec, -EIO);
+	blocking_notifier_call_chain(&eip93_ipsec_notifier, event, ipsec);
+	eip93_ipsec_put(ipsec);
+}
+
+void eip93_ipsec_handle_result(struct eip93_ipsec_request *req, int err,
+			       u32 pe_ctrl_stat, u32 pe_length)
+{
+	struct eip93_ipsec_result result = {};
+
+	if (!req)
+		return;
+
+	if (err == -EIO || err == -EACCES)
+		eip93_ipsec_mark_dead_async(req->sa->ipsec,
+					    EIP93_IPSEC_EVENT_DMA_ERROR);
+
+	if (!err) {
+		result.packet_len = FIELD_GET(EIP93_PE_LENGTH_LENGTH, pe_length);
+		result.nexthdr = FIELD_GET(EIP93_PE_CTRL_PE_PAD_VALUE,
+					   pe_ctrl_stat);
+	}
+
+	eip93_ipsec_complete_request(req, err, result);
+}
+
+void eip93_ipsec_report_irq(struct eip93_device *eip93, u32 irq_status)
+{
+	struct eip93_ipsec *ipsec = eip93->ipsec;
+
+	if (!ipsec)
+		return;
+
+	if (irq_status & EIP93_INT_HALT) {
+		eip93_ipsec_mark_dead_async(ipsec, EIP93_IPSEC_EVENT_RESET);
+		return;
+	}
+
+	if (irq_status & (EIP93_INT_INTERFACE_ERR | EIP93_INT_RPOC_ERR |
+			  EIP93_INT_PE_RING_ERR))
+		eip93_ipsec_mark_dead_async(ipsec, EIP93_IPSEC_EVENT_DMA_ERROR);
+}
+
+int eip93_ipsec_register(struct eip93_device *eip93)
+{
+	struct eip93_ipsec *ipsec;
+	int err;
+
+	ipsec = kzalloc(sizeof(*ipsec), GFP_KERNEL);
+	if (!ipsec)
+		return -ENOMEM;
+
+	err = eip93_ipsec_init_prng(eip93);
+	if (err) {
+		kfree(ipsec);
+		return err;
+	}
+
+	ipsec->eip93 = eip93;
+	ipsec->algo_flags = readl(eip93->base + EIP93_REG_PE_OPTION_1);
+	ipsec->fault_event = EIP93_IPSEC_EVENT_REMOVE;
+	INIT_WORK(&ipsec->fault_work, eip93_ipsec_fault_work);
+	spin_lock_init(&ipsec->lock);
+	refcount_set(&ipsec->refcnt, 1);
+	init_completion(&ipsec->done);
+	INIT_LIST_HEAD(&ipsec->node);
+	INIT_LIST_HEAD(&ipsec->sa_list);
+
+	mutex_lock(&eip93_ipsec_devices_lock);
+	eip93->ipsec = ipsec;
+	list_add_tail(&ipsec->node, &eip93_ipsec_devices);
+	mutex_unlock(&eip93_ipsec_devices_lock);
+
+	return 0;
+}
+
+void eip93_ipsec_unregister(struct eip93_device *eip93)
+{
+	struct eip93_ipsec *ipsec = eip93->ipsec;
+	bool notify_remove;
+
+	if (!ipsec)
+		return;
+
+	mutex_lock(&eip93_ipsec_devices_lock);
+	notify_remove = eip93_ipsec_mark_dead(ipsec);
+	list_del_init(&ipsec->node);
+	eip93->ipsec = NULL;
+	mutex_unlock(&eip93_ipsec_devices_lock);
+
+	eip93_ipsec_abort_requests(ipsec, -ENODEV);
+	if (notify_remove)
+		blocking_notifier_call_chain(&eip93_ipsec_notifier,
+					     EIP93_IPSEC_EVENT_REMOVE, ipsec);
+
+	eip93_ipsec_put(ipsec);
+	wait_for_completion(&ipsec->done);
+	cancel_work_sync(&ipsec->fault_work);
+	kfree(ipsec);
+}
+
+int eip93_ipsec_state_add(struct eip93_ipsec *ipsec, struct xfrm_state *x,
+			  struct netlink_ext_ack *extack,
+			  struct eip93_ipsec_sa **sa)
+{
+	struct eip93_device *eip93;
+	struct eip93_ipsec_sa *new_sa;
+	unsigned int authkeylen;
+	unsigned int enckeylen;
+	int err;
+
+	if (!ipsec || !eip93_ipsec_get_ref(ipsec)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "EIP93 packet backend is unavailable");
+		return -EOPNOTSUPP;
+	}
+
+	err = eip93_ipsec_validate_state(x, extack);
+	if (err)
+		goto put_ipsec;
+
+	err = eip93_ipsec_validate_hw(x, ipsec->algo_flags, extack);
+	if (err)
+		goto put_ipsec;
+
+	eip93 = ipsec->eip93;
+	new_sa = kzalloc(sizeof(*new_sa), GFP_KERNEL);
+	if (!new_sa) {
+		err = -ENOMEM;
+		goto put_ipsec;
+	}
+
+	new_sa->ipsec = ipsec;
+	new_sa->family = x->props.family;
+	new_sa->ivsize = AES_BLOCK_SIZE;
+	new_sa->authsize = x->aalg->alg_trunc_len / BITS_PER_BYTE;
+	new_sa->blocksize = AES_BLOCK_SIZE;
+	new_sa->encap_type = x->encap ? x->encap->encap_type : 0;
+	new_sa->esn = x->props.flags & XFRM_STATE_ESN;
+	INIT_LIST_HEAD(&new_sa->node);
+	INIT_LIST_HEAD(&new_sa->requests);
+	spin_lock_init(&new_sa->lock);
+	refcount_set(&new_sa->refcnt, 1);
+	init_completion(&new_sa->done);
+
+	err = eip93_ipsec_parse_flags(x, &new_sa->flags);
+	if (err)
+		goto free_sa;
+
+	new_sa->sa_record = kzalloc(sizeof(*new_sa->sa_record), GFP_KERNEL);
+	if (!new_sa->sa_record) {
+		err = -ENOMEM;
+		goto free_sa;
+	}
+
+	eip93_ipsec_init_sa_record(new_sa, x);
+
+	enckeylen = x->ealg->alg_key_len / BITS_PER_BYTE;
+	memcpy(new_sa->sa_record->sa_key, x->ealg->alg_key, enckeylen);
+
+	authkeylen = x->aalg->alg_key_len / BITS_PER_BYTE;
+	err = eip93_ipsec_hmac_setkey(new_sa->flags, x->aalg->alg_key,
+				      authkeylen,
+				      new_sa->sa_record->sa_i_digest,
+				      new_sa->sa_record->sa_o_digest);
+	if (err)
+		goto free_record;
+
+	new_sa->sa_record_base = dma_map_single(eip93->dev, new_sa->sa_record,
+						sizeof(*new_sa->sa_record),
+						DMA_TO_DEVICE);
+	if (dma_mapping_error(eip93->dev, new_sa->sa_record_base)) {
+		err = -ENOMEM;
+		goto free_record;
+	}
+
+	spin_lock_bh(&ipsec->lock);
+	if (ipsec->dead) {
+		spin_unlock_bh(&ipsec->lock);
+		err = -EOPNOTSUPP;
+		goto unmap_record;
+	}
+	list_add_tail(&new_sa->node, &ipsec->sa_list);
+	spin_unlock_bh(&ipsec->lock);
+
+	*sa = new_sa;
+
+	return 0;
+
+unmap_record:
+	dma_unmap_single(eip93->dev, new_sa->sa_record_base,
+			 sizeof(*new_sa->sa_record), DMA_TO_DEVICE);
+free_record:
+	kfree_sensitive(new_sa->sa_record);
+free_sa:
+	kfree(new_sa);
+put_ipsec:
+	eip93_ipsec_put(ipsec);
+	return err;
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_state_add);
+
+void eip93_ipsec_state_delete(struct eip93_ipsec_sa *sa)
+{
+	if (!sa)
+		return;
+
+	spin_lock_bh(&sa->ipsec->lock);
+	spin_lock(&sa->lock);
+	sa->dead = true;
+	list_del_init(&sa->node);
+	spin_unlock(&sa->lock);
+	spin_unlock_bh(&sa->ipsec->lock);
+
+	eip93_ipsec_sa_put(sa);
+	wait_for_completion(&sa->done);
+
+	dma_unmap_single(sa->ipsec->eip93->dev, sa->sa_record_base,
+			 sizeof(*sa->sa_record), DMA_TO_DEVICE);
+	kfree_sensitive(sa->sa_record);
+	eip93_ipsec_put(sa->ipsec);
+	kfree(sa);
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_state_delete);
+
+void eip93_ipsec_state_advance_esn(struct eip93_ipsec_sa *sa,
+				   struct xfrm_state *x)
+{
+	u32 seq_hi = 0;
+
+	if (!sa || !x || !sa->esn || !x->replay_esn)
+		return;
+
+	if (x->xso.dir == XFRM_DEV_OFFLOAD_IN)
+		seq_hi = x->replay_esn->seq_hi;
+	else if (x->xso.dir == XFRM_DEV_OFFLOAD_OUT)
+		seq_hi = x->replay_esn->oseq_hi;
+
+	spin_lock_bh(&sa->lock);
+	if (!sa->dead) {
+		sa->sa_record->sa_seqnum[1] = seq_hi;
+		dma_sync_single_for_device(sa->ipsec->eip93->dev,
+					   sa->sa_record_base,
+					   sizeof(*sa->sa_record),
+					   DMA_TO_DEVICE);
+	}
+	spin_unlock_bh(&sa->lock);
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_state_advance_esn);
+
+int eip93_ipsec_xmit(struct eip93_ipsec_sa *sa, struct sk_buff *skb,
+		     unsigned int esp_offset, eip93_ipsec_complete_t complete,
+		     void *data)
+{
+	struct eip93_descriptor cdesc = {};
+	struct eip93_ipsec_request *req;
+	struct xfrm_offload *xo;
+	unsigned int payload_len;
+	unsigned int crypt_len;
+	unsigned int dma_len;
+	unsigned int tailen;
+	int err;
+
+	if (!sa || !complete || !eip93_ipsec_sa_get(sa))
+		return -EOPNOTSUPP;
+
+	if (skb_is_nonlinear(skb)) {
+		err = -EINVAL;
+		goto put_sa;
+	}
+
+	if (skb->len <= esp_offset + sizeof(struct ip_esp_hdr) + sa->ivsize) {
+		err = -EINVAL;
+		goto put_sa;
+	}
+
+	xo = xfrm_offload(skb);
+	if (!xo) {
+		err = -EINVAL;
+		goto put_sa;
+	}
+
+	tailen = xo->esp_tx_tailen;
+	if (tailen) {
+		payload_len = skb->len - esp_offset - sizeof(struct ip_esp_hdr) -
+			      sa->ivsize;
+		dma_len = skb->len + tailen;
+		if (tailen > skb_tailroom(skb) || dma_len < skb->len) {
+			err = -ENOMEM;
+			goto put_sa;
+		}
+	} else {
+		u8 *trail;
+		u8 padlen;
+
+		if (skb->len <= esp_offset + sizeof(struct ip_esp_hdr) +
+					sa->ivsize + sa->authsize) {
+			err = -EINVAL;
+			goto put_sa;
+		}
+
+		crypt_len = skb->len - esp_offset - sizeof(struct ip_esp_hdr) -
+			    sa->ivsize - sa->authsize;
+		if (crypt_len < 2) {
+			err = -EINVAL;
+			goto put_sa;
+		}
+
+		trail = skb_tail_pointer(skb) - sa->authsize - 2;
+		padlen = trail[0];
+		if (crypt_len < padlen + 2) {
+			err = -EINVAL;
+			goto put_sa;
+		}
+
+		payload_len = crypt_len - padlen - 2;
+		dma_len = skb->len;
+	}
+	if (payload_len > FIELD_MAX(EIP93_PE_LENGTH_LENGTH)) {
+		err = -EINVAL;
+		goto put_sa;
+	}
+
+	req = kmalloc(sizeof(*req), GFP_ATOMIC);
+	if (!req) {
+		err = -ENOMEM;
+		goto put_sa;
+	}
+
+	req->sa = sa;
+	req->skb = skb;
+	INIT_LIST_HEAD(&req->node);
+	refcount_set(&req->refcnt, 1);
+	req->complete = complete;
+	req->data = data;
+	req->dma_len = dma_len;
+	req->dma_dir = DMA_BIDIRECTIONAL;
+	req->dma = dma_map_single(sa->ipsec->eip93->dev, skb->data,
+				  req->dma_len, req->dma_dir);
+	if (dma_mapping_error(sa->ipsec->eip93->dev, req->dma)) {
+		err = -ENOMEM;
+		goto free_req;
+	}
+
+	cdesc.pe_ctrl_stat_word =
+		FIELD_PREP(EIP93_PE_CTRL_PE_READY_DES_TRING_OWN,
+			   EIP93_PE_CTRL_HOST_READY) |
+		FIELD_PREP(EIP93_PE_CTRL_PE_PAD_CTRL_STAT,
+			   EIP93_IPSEC_PAD_ALIGN) |
+		FIELD_PREP(EIP93_PE_CTRL_PE_PAD_VALUE, xo->proto) |
+		EIP93_PE_CTRL_PE_HASH_FINAL;
+	cdesc.src_addr = (u32 __force)req->dma + esp_offset +
+			 sizeof(struct ip_esp_hdr) + sa->ivsize;
+	cdesc.dst_addr = (u32 __force)req->dma + esp_offset;
+	cdesc.sa_addr = sa->sa_record_base;
+	/*
+	 * EIP93 ESP protocol-out mode wants the plaintext payload length. It
+	 * generates ESP padding, next-header and ICV itself when tailroom was
+	 * reserved instead of filled by the generic ESP path.
+	 */
+	cdesc.pe_length_word = FIELD_PREP(EIP93_PE_LENGTH_HOST_PE_READY,
+					  EIP93_PE_LENGTH_HOST_READY) |
+			       FIELD_PREP(EIP93_PE_LENGTH_LENGTH, payload_len);
+
+	err = eip93_ipsec_submit(req, &cdesc);
+	if (err == -EINPROGRESS)
+		return err;
+
+	dma_unmap_single(sa->ipsec->eip93->dev, req->dma, req->dma_len,
+			 req->dma_dir);
+free_req:
+	eip93_ipsec_request_put(req);
+put_sa:
+	eip93_ipsec_sa_put(sa);
+	return err;
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_xmit);
+
+int eip93_ipsec_receive(struct eip93_ipsec_sa *sa, struct sk_buff *skb,
+			unsigned int packet_len,
+			eip93_ipsec_complete_t complete, void *data)
+{
+	struct eip93_descriptor cdesc = {};
+	struct eip93_ipsec_request *req;
+	int err;
+
+	if (!sa || !complete || !eip93_ipsec_sa_get(sa))
+		return -EOPNOTSUPP;
+
+	if (skb_is_nonlinear(skb)) {
+		err = -EINVAL;
+		goto put_sa;
+	}
+
+	req = kmalloc(sizeof(*req), GFP_ATOMIC);
+	if (!req) {
+		err = -ENOMEM;
+		goto put_sa;
+	}
+
+	req->sa = sa;
+	req->skb = skb;
+	INIT_LIST_HEAD(&req->node);
+	refcount_set(&req->refcnt, 1);
+	req->complete = complete;
+	req->data = data;
+	if (!packet_len || packet_len > skb->len ||
+	    packet_len > FIELD_MAX(EIP93_PE_LENGTH_LENGTH)) {
+		err = -EINVAL;
+		goto free_req;
+	}
+
+	req->dma_len = packet_len;
+	req->dma_dir = DMA_BIDIRECTIONAL;
+	req->dma = dma_map_single(sa->ipsec->eip93->dev, skb->data,
+				  req->dma_len, req->dma_dir);
+	if (dma_mapping_error(sa->ipsec->eip93->dev, req->dma)) {
+		err = -ENOMEM;
+		goto free_req;
+	}
+
+	cdesc.pe_ctrl_stat_word =
+		FIELD_PREP(EIP93_PE_CTRL_PE_READY_DES_TRING_OWN,
+			   EIP93_PE_CTRL_HOST_READY) |
+		FIELD_PREP(EIP93_PE_CTRL_PE_PAD_CTRL_STAT,
+			   EIP93_IPSEC_PAD_ALIGN) |
+		EIP93_PE_CTRL_PE_HASH_FINAL;
+	cdesc.src_addr = (u32 __force)req->dma;
+	cdesc.dst_addr = (u32 __force)req->dma;
+	cdesc.sa_addr = sa->sa_record_base;
+	cdesc.pe_length_word = FIELD_PREP(EIP93_PE_LENGTH_HOST_PE_READY,
+					  EIP93_PE_LENGTH_HOST_READY) |
+			       FIELD_PREP(EIP93_PE_LENGTH_LENGTH, req->dma_len);
+
+	err = eip93_ipsec_submit(req, &cdesc);
+	if (err == -EINPROGRESS)
+		return err;
+
+	dma_unmap_single(sa->ipsec->eip93->dev, req->dma, req->dma_len,
+			 req->dma_dir);
+free_req:
+	eip93_ipsec_request_put(req);
+put_sa:
+	eip93_ipsec_sa_put(sa);
+	return err;
+}
+EXPORT_SYMBOL_GPL(eip93_ipsec_receive);
diff --git a/drivers/crypto/inside-secure/eip93/eip93-main.c b/drivers/crypto/inside-secure/eip93/eip93-main.c
index 7dccfdeb7b11..1505e33d62bf 100644
--- a/drivers/crypto/inside-secure/eip93/eip93-main.c
+++ b/drivers/crypto/inside-secure/eip93/eip93-main.c
@@ -185,7 +185,9 @@ static int eip93_register_algs(struct eip93_device *eip93, u32 supported_algo_fl
 
 static void eip93_handle_result_descriptor(struct eip93_device *eip93)
 {
-	struct crypto_async_request *async;
+	struct crypto_async_request *async = NULL;
+	struct eip93_ipsec_request *ipsec = NULL;
+	void *request;
 	struct eip93_descriptor *rdesc;
 	u16 desc_flags, crypto_idr;
 	bool last_entry;
@@ -224,11 +226,11 @@ static void eip93_handle_result_descriptor(struct eip93_device *eip93)
 			 FIELD_GET(EIP93_PE_LENGTH_HOST_PE_READY, pe_length) !=
 			 EIP93_PE_LENGTH_PE_READY);
 
-		err = rdesc->pe_ctrl_stat_word & (EIP93_PE_CTRL_PE_EXT_ERR_CODE |
-						  EIP93_PE_CTRL_PE_EXT_ERR |
-						  EIP93_PE_CTRL_PE_SEQNUM_ERR |
-						  EIP93_PE_CTRL_PE_PAD_ERR |
-						  EIP93_PE_CTRL_PE_AUTH_ERR);
+		err = pe_ctrl_stat & (EIP93_PE_CTRL_PE_EXT_ERR_CODE |
+				      EIP93_PE_CTRL_PE_EXT_ERR |
+				      EIP93_PE_CTRL_PE_SEQNUM_ERR |
+				      EIP93_PE_CTRL_PE_PAD_ERR |
+				      EIP93_PE_CTRL_PE_AUTH_ERR);
 
 		desc_flags = FIELD_GET(EIP93_PE_USER_ID_DESC_FLAGS, rdesc->user_id);
 		crypto_idr = FIELD_GET(EIP93_PE_USER_ID_CRYPTO_IDR, rdesc->user_id);
@@ -248,23 +250,37 @@ static void eip93_handle_result_descriptor(struct eip93_device *eip93)
 	if (!last_entry)
 		goto get_more;
 
-	/* Get crypto async ref only for last descriptor */
+	/* Get request ref only for last descriptor */
 	scoped_guard(spinlock_bh, &eip93->ring->idr_lock) {
-		async = idr_find(&eip93->ring->crypto_async_idr, crypto_idr);
+		request = idr_find(&eip93->ring->crypto_async_idr, crypto_idr);
 		idr_remove(&eip93->ring->crypto_async_idr, crypto_idr);
 	}
+	if (!request) {
+		dev_warn_ratelimited(eip93->dev, "missing request id %u\n",
+				     crypto_idr);
+		goto get_more;
+	}
 
 	/* Parse error in ctrl stat word */
 	err = eip93_parse_ctrl_stat_err(eip93, err);
 
+	if (desc_flags & EIP93_DESC_IPSEC) {
+		ipsec = request;
+		eip93_ipsec_handle_result(ipsec, err, pe_ctrl_stat, pe_length);
+		goto get_more;
+	}
+
+	async = request;
+
 	if (desc_flags & EIP93_DESC_SKCIPHER)
 		eip93_skcipher_handle_result(async, err);
-
-	if (desc_flags & EIP93_DESC_AEAD)
+	else if (desc_flags & EIP93_DESC_AEAD)
 		eip93_aead_handle_result(async, err);
-
-	if (desc_flags & EIP93_DESC_HASH)
+	else if (desc_flags & EIP93_DESC_HASH)
 		eip93_hash_handle_result(async, err);
+	else
+		dev_warn_ratelimited(eip93->dev, "unknown descriptor flags %#x\n",
+				     desc_flags);
 
 	goto get_more;
 }
@@ -279,21 +295,26 @@ static void eip93_done_task(unsigned long data)
 static irqreturn_t eip93_irq_handler(int irq, void *data)
 {
 	struct eip93_device *eip93 = data;
+	bool handled = false;
 	u32 irq_status;
 
 	irq_status = readl(eip93->base + EIP93_REG_INT_MASK_STAT);
 	if (FIELD_GET(EIP93_INT_RDR_THRESH, irq_status)) {
 		eip93_irq_disable(eip93, EIP93_INT_RDR_THRESH);
 		tasklet_schedule(&eip93->ring->done_task);
-		return IRQ_HANDLED;
+		irq_status &= ~EIP93_INT_RDR_THRESH;
+		handled = true;
 	}
 
-	/* Ignore errors in AUTO mode, handled by the RDR */
+	if (!irq_status)
+		return handled ? IRQ_HANDLED : IRQ_NONE;
+
+	eip93_ipsec_report_irq(eip93, irq_status);
+
 	eip93_irq_clear(eip93, irq_status);
-	if (irq_status)
-		eip93_irq_disable(eip93, irq_status);
+	eip93_irq_disable(eip93, irq_status);
 
-	return IRQ_NONE;
+	return IRQ_HANDLED;
 }
 
 static void eip93_initialize(struct eip93_device *eip93, u32 supported_algo_flags)
@@ -455,15 +476,24 @@ static int eip93_crypto_probe(struct platform_device *pdev)
 
 	eip93_initialize(eip93, algo_flags);
 
-	/* Init finished, enable RDR interrupt */
-	eip93_irq_enable(eip93, EIP93_INT_RDR_THRESH);
+	ret = eip93_ipsec_register(eip93);
+	if (ret) {
+		eip93_cleanup(eip93);
+		return ret;
+	}
 
 	ret = eip93_register_algs(eip93, algo_flags);
 	if (ret) {
+		eip93_ipsec_unregister(eip93);
 		eip93_cleanup(eip93);
 		return ret;
 	}
 
+	/* Init finished, enable RDR and fatal error interrupts */
+	eip93_irq_enable(eip93, EIP93_INT_RDR_THRESH | EIP93_INT_INTERFACE_ERR |
+			 EIP93_INT_RPOC_ERR | EIP93_INT_PE_RING_ERR |
+			 EIP93_INT_HALT);
+
 	ver = readl(eip93->base + EIP93_REG_PE_REVISION);
 	/* EIP_EIP_NO:MAJOR_HW_REV:MINOR_HW_REV:HW_PATCH,PE(ALGO_FLAGS) */
 	dev_info(eip93->dev, "EIP%lu:%lx:%lx:%lx,PE(0x%x:0x%x)\n",
@@ -484,6 +514,7 @@ static void eip93_crypto_remove(struct platform_device *pdev)
 
 	algo_flags = readl(eip93->base + EIP93_REG_PE_OPTION_1);
 
+	eip93_ipsec_unregister(eip93);
 	eip93_unregister_algs(algo_flags, ARRAY_SIZE(eip93_algs));
 	eip93_cleanup(eip93);
 }
diff --git a/drivers/crypto/inside-secure/eip93/eip93-main.h b/drivers/crypto/inside-secure/eip93/eip93-main.h
index 990c2401b7ce..ca1bda5b2ac0 100644
--- a/drivers/crypto/inside-secure/eip93/eip93-main.h
+++ b/drivers/crypto/inside-secure/eip93/eip93-main.h
@@ -13,6 +13,7 @@
 #include <crypto/internal/skcipher.h>
 #include <linux/bitfield.h>
 #include <linux/interrupt.h>
+#include <linux/kconfig.h>
 
 #define EIP93_RING_BUSY_DELAY		500
 
@@ -92,6 +93,8 @@
 						    EIP93_HASH_SHA224 | \
 						    EIP93_HASH_SHA256))
 
+struct eip93_ipsec;
+
 /**
  * struct eip93_device - crypto engine device structure
  */
@@ -101,6 +104,7 @@ struct eip93_device {
 	struct clk		*clk;
 	int			irq;
 	struct eip93_ring		*ring;
+	struct eip93_ipsec	*ipsec;
 };
 
 struct eip93_desc_ring {
@@ -124,8 +128,8 @@ struct eip93_ring {
 	/* command/result rings */
 	struct eip93_desc_ring		cdr;
 	struct eip93_desc_ring		rdr;
-	spinlock_t			write_lock;
-	spinlock_t			read_lock;
+	spinlock_t			write_lock; /* command descriptor enqueue */
+	spinlock_t			read_lock; /* result descriptor dequeue */
 	/* aync idr */
 	spinlock_t			idr_lock;
 	struct idr			crypto_async_idr;
@@ -148,4 +152,34 @@ struct eip93_alg_template {
 	} alg;
 };
 
+struct eip93_ipsec_request;
+
+#if IS_ENABLED(CONFIG_CRYPTO_DEV_EIP93_IPSEC)
+int eip93_ipsec_register(struct eip93_device *eip93);
+void eip93_ipsec_unregister(struct eip93_device *eip93);
+void eip93_ipsec_handle_result(struct eip93_ipsec_request *req, int err,
+			       u32 pe_ctrl_stat, u32 pe_length);
+void eip93_ipsec_report_irq(struct eip93_device *eip93, u32 irq_status);
+#else
+static inline int eip93_ipsec_register(struct eip93_device *eip93)
+{
+	return 0;
+}
+
+static inline void eip93_ipsec_unregister(struct eip93_device *eip93)
+{
+}
+
+static inline void eip93_ipsec_handle_result(struct eip93_ipsec_request *req,
+					     int err, u32 pe_ctrl_stat,
+					     u32 pe_length)
+{
+}
+
+static inline void eip93_ipsec_report_irq(struct eip93_device *eip93,
+					  u32 irq_status)
+{
+}
+#endif
+
 #endif /* _EIP93_MAIN_H_ */
diff --git a/include/crypto/eip93-ipsec.h b/include/crypto/eip93-ipsec.h
new file mode 100644
index 000000000000..bc0ba8f4f84e
--- /dev/null
+++ b/include/crypto/eip93-ipsec.h
@@ -0,0 +1,132 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * EIP93 IPsec offload API
+ *
+ * Copyright (c) 2026 Jihong Min <hurryman2212@gmail.com>
+ */
+#ifndef _CRYPTO_EIP93_IPSEC_H
+#define _CRYPTO_EIP93_IPSEC_H
+
+#include <linux/bits.h>
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/kconfig.h>
+#include <linux/types.h>
+
+struct device;
+struct netlink_ext_ack;
+struct notifier_block;
+struct sk_buff;
+struct xfrm_state;
+
+struct eip93_ipsec;
+struct eip93_ipsec_sa;
+
+struct eip93_ipsec_result {
+	unsigned int packet_len;
+	u8 nexthdr;
+};
+
+enum eip93_ipsec_feature {
+	EIP93_IPSEC_FEATURE_ESP = BIT(0),
+	EIP93_IPSEC_FEATURE_GSO_ESP = BIT(1),
+	EIP93_IPSEC_FEATURE_HW_ESP_TX_CSUM = BIT(2),
+};
+
+enum eip93_ipsec_event {
+	EIP93_IPSEC_EVENT_REMOVE,
+	EIP93_IPSEC_EVENT_RESET,
+	EIP93_IPSEC_EVENT_DMA_ERROR,
+	EIP93_IPSEC_EVENT_CAPABILITY_LOSS,
+};
+
+typedef void (*eip93_ipsec_complete_t)(void *data, int err,
+				       struct eip93_ipsec_result result);
+
+#if IS_REACHABLE(CONFIG_CRYPTO_DEV_EIP93) && \
+	IS_ENABLED(CONFIG_CRYPTO_DEV_EIP93_IPSEC)
+struct eip93_ipsec *eip93_ipsec_get(struct device *consumer);
+void eip93_ipsec_put(struct eip93_ipsec *ipsec);
+bool eip93_ipsec_available(struct eip93_ipsec *ipsec);
+u32 eip93_ipsec_features(struct eip93_ipsec *ipsec);
+int eip93_ipsec_register_notifier(struct notifier_block *nb);
+void eip93_ipsec_unregister_notifier(struct notifier_block *nb);
+int eip93_ipsec_state_add(struct eip93_ipsec *ipsec, struct xfrm_state *x,
+			  struct netlink_ext_ack *extack,
+			  struct eip93_ipsec_sa **sa);
+void eip93_ipsec_state_delete(struct eip93_ipsec_sa *sa);
+void eip93_ipsec_state_advance_esn(struct eip93_ipsec_sa *sa,
+				   struct xfrm_state *x);
+int eip93_ipsec_xmit(struct eip93_ipsec_sa *sa, struct sk_buff *skb,
+		     unsigned int esp_offset, eip93_ipsec_complete_t complete,
+		     void *data);
+int eip93_ipsec_receive(struct eip93_ipsec_sa *sa, struct sk_buff *skb,
+			unsigned int packet_len,
+			eip93_ipsec_complete_t complete, void *data);
+#else
+static inline struct eip93_ipsec *eip93_ipsec_get(struct device *consumer)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+
+static inline void eip93_ipsec_put(struct eip93_ipsec *ipsec)
+{
+}
+
+static inline bool eip93_ipsec_available(struct eip93_ipsec *ipsec)
+{
+	return false;
+}
+
+static inline u32 eip93_ipsec_features(struct eip93_ipsec *ipsec)
+{
+	return 0;
+}
+
+static inline int eip93_ipsec_register_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+
+static inline void eip93_ipsec_unregister_notifier(struct notifier_block *nb)
+{
+}
+
+static inline int eip93_ipsec_state_add(struct eip93_ipsec *ipsec,
+					struct xfrm_state *x,
+					struct netlink_ext_ack *extack,
+					struct eip93_ipsec_sa **sa)
+{
+	if (sa)
+		*sa = NULL;
+
+	return -EOPNOTSUPP;
+}
+
+static inline void eip93_ipsec_state_delete(struct eip93_ipsec_sa *sa)
+{
+}
+
+static inline void eip93_ipsec_state_advance_esn(struct eip93_ipsec_sa *sa,
+						 struct xfrm_state *x)
+{
+}
+
+static inline int eip93_ipsec_xmit(struct eip93_ipsec_sa *sa,
+				   struct sk_buff *skb, unsigned int esp_offset,
+				   eip93_ipsec_complete_t complete, void *data)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int eip93_ipsec_receive(struct eip93_ipsec_sa *sa,
+				      struct sk_buff *skb,
+				      unsigned int packet_len,
+				      eip93_ipsec_complete_t complete,
+				      void *data)
+{
+	return -EOPNOTSUPP;
+}
+#endif
+
+#endif /* _CRYPTO_EIP93_IPSEC_H */
-- 
2.53.0




* [PATCH 1/3] xfrm: extend ESP offload infrastructure for packet engines
From: Jihong Min @ 2026-05-23 12:15 UTC (permalink / raw)
  To: Christian Marangi, Antoine Tenart, Herbert Xu, David S . Miller,
	Lorenzo Bianconi, Andrew Lunn, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Steffen Klassert
  Cc: linux-kernel, linux-crypto, linux-arm-kernel, linux-mediatek,
	netdev, Jihong Min
In-Reply-To: <20260523121522.3023992-1-hurryman2212@gmail.com>

Some ESP offload engines process whole ESP packets instead of consuming
the trailer that the generic software path builds. On output they
generate the ESP padding, next-header and ICV bytes in hardware; on input
they can return an already-trimmed packet together with the recovered
next-header value.

Add an xfrmdev_ops callback, xdo_dev_esp_tx_hw_trailer(), that lets
drivers opt into hardware-generated ESP TX trailers. Carry the reserved
ESP TX tail length in xfrm_offload (esp_tx_tailen), and add the
XFRM_ESP_NO_TRAILER flag so that ESP input skips software trailer removal
when hardware has already done it.

This keeps the default ESP offload behavior unchanged for existing devices
while providing the infrastructure needed by packet-mode ESP engines.

Assisted-by: Codex:gpt-5.5
Signed-off-by: Jihong Min <hurryman2212@gmail.com>
---
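For reference, the tail length reserved on the hw_trailer path follows
the existing esp_xmit() arithmetic. Below is a minimal standalone sketch
of that computation, assuming a power-of-two cipher block size.
esp_tail_len() and align_up() are hypothetical helpers for illustration
only, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned int align_up(unsigned int x, unsigned int a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical helper mirroring the arithmetic in esp_xmit():
 *   clen   = ALIGN(len + 2 + tfclen, blksize)
 *   plen   = clen - len - tfclen   (padding + pad-length + next-header)
 *   tailen = tfclen + plen + alen
 * Returns the tail length, or -1 if it does not fit in a u16
 * (the patch rejects that case with -EINVAL).
 */
static int esp_tail_len(unsigned int len, unsigned int tfclen,
			unsigned int blksize, unsigned int alen)
{
	unsigned int clen, plen, tailen;

	assert(blksize && !(blksize & (blksize - 1)));

	clen = align_up(len + 2 + tfclen, blksize);
	plen = clen - len - tfclen;
	tailen = tfclen + plen + alen;

	return tailen > UINT16_MAX ? -1 : (int)tailen;
}
```

With a 100-byte payload, no TFC padding, a 16-byte cipher block and a
12-byte truncated ICV, this gives clen = 112, plen = 12 and a 24-byte
tail, which is the value the driver would find in xo->esp_tx_tailen.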
 include/linux/netdevice.h |  3 +++
 include/net/xfrm.h        |  8 +++++++-
 net/ipv4/esp4.c           |  6 +++++-
 net/ipv4/esp4_offload.c   | 29 ++++++++++++++++++++++++++++-
 net/ipv6/esp6.c           |  6 +++++-
 net/ipv6/esp6_offload.c   | 29 ++++++++++++++++++++++++++++-
 6 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0e1e581efc5a..b6ff04c3df78 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1043,6 +1043,9 @@ struct xfrmdev_ops {
 				      struct xfrm_state *x);
 	bool	(*xdo_dev_offload_ok) (struct sk_buff *skb,
 				       struct xfrm_state *x);
+	/* Return true when the device generates the ESP trailer/ICV itself. */
+	bool	(*xdo_dev_esp_tx_hw_trailer)(struct sk_buff *skb,
+					     struct xfrm_state *x);
 	void	(*xdo_dev_state_advance_esn) (struct xfrm_state *x);
 	void	(*xdo_dev_state_update_stats) (struct xfrm_state *x);
 	int	(*xdo_dev_policy_add) (struct xfrm_policy *x, struct netlink_ext_ack *extack);
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 10d3edde6b2f..160069901e0a 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1141,7 +1141,7 @@ struct xfrm_offload {
 #define	CRYPTO_FALLBACK		8
 #define	XFRM_GSO_SEGMENT	16
 #define	XFRM_GRO		32
-/* 64 is free */
+#define	XFRM_ESP_NO_TRAILER	64
 #define	XFRM_DEV_RESUME		128
 #define	XFRM_XMIT		256
 
@@ -1158,6 +1158,12 @@ struct xfrm_offload {
 	/* Used to keep whole l2 header for transport mode GRO */
 	__u16			orig_mac_len;
 
+	/*
+	 * ESP packet engines can reserve tailroom in the generic ESP path and
+	 * generate padding, next-header and ICV bytes during device TX.
+	 */
+	__u16			esp_tx_tailen;
+
 	__u8			proto;
 	__u8			inner_ipproto;
 };
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 6a5febbdbee4..f21c8f2e60f7 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -720,7 +720,11 @@ int esp_input_done2(struct sk_buff *skb, int err)
 	if (unlikely(err))
 		goto out;
 
-	err = esp_remove_trailer(skb);
+	/* Hardware ESP decap may already have removed pad/trailer/ICV. */
+	if (xo && (xo->flags & XFRM_ESP_NO_TRAILER))
+		err = xo->proto;
+	else
+		err = esp_remove_trailer(skb);
 	if (unlikely(err < 0))
 		goto out;
 
diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c
index abd77162f5e7..f00fff98b69f 100644
--- a/net/ipv4/esp4_offload.c
+++ b/net/ipv4/esp4_offload.c
@@ -270,8 +270,10 @@ static int esp_xmit(struct xfrm_state *x, struct sk_buff *skb,  netdev_features_
 	struct xfrm_offload *xo;
 	struct ip_esp_hdr *esph;
 	struct crypto_aead *aead;
+	struct sk_buff *trailer;
 	struct esp_info esp;
 	bool hw_offload = true;
+	bool hw_trailer = false;
 	__u32 seq;
 	int encap_type = 0;
 
@@ -281,6 +283,7 @@ static int esp_xmit(struct xfrm_state *x, struct sk_buff *skb,  netdev_features_
 
 	if (!xo)
 		return -EINVAL;
+	xo->esp_tx_tailen = 0;
 
 	if ((!(features & NETIF_F_HW_ESP) &&
 	     !(skb->dev->gso_partial_features & NETIF_F_HW_ESP)) ||
@@ -303,13 +306,37 @@ static int esp_xmit(struct xfrm_state *x, struct sk_buff *skb,  netdev_features_
 	esp.clen = ALIGN(skb->len + 2 + esp.tfclen, blksize);
 	esp.plen = esp.clen - skb->len - esp.tfclen;
 	esp.tailen = esp.tfclen + esp.plen + alen;
+	if (esp.tailen > U16_MAX)
+		return -EINVAL;
 
 	esp.esph = ip_esp_hdr(skb);
 
 	if (x->encap)
 		encap_type = x->encap->encap_type;
 
-	if (!hw_offload || !skb_is_gso(skb) || (hw_offload && encap_type == UDP_ENCAP_ESPINUDP)) {
+	if (hw_offload && !skb_is_gso(skb) && !encap_type && x->xso.dev &&
+	    x->xso.dev->xfrmdev_ops &&
+	    x->xso.dev->xfrmdev_ops->xdo_dev_esp_tx_hw_trailer)
+		hw_trailer =
+			x->xso.dev->xfrmdev_ops->xdo_dev_esp_tx_hw_trailer(skb, x);
+
+	if (hw_trailer) {
+		int esph_offset;
+
+		/*
+		 * The device packet engine will write ESP padding, next-header
+		 * and ICV bytes. Keep skb->len unchanged here, but make sure the
+		 * later DMA writer owns enough linear tailroom.
+		 */
+		esph_offset = (unsigned char *)esp.esph - skb_transport_header(skb);
+		esp.nfrags = skb_cow_data(skb, esp.tailen, &trailer);
+		if (esp.nfrags < 0)
+			return esp.nfrags;
+		esp.esph = (struct ip_esp_hdr *)(skb_transport_header(skb) +
+						 esph_offset);
+		xo->esp_tx_tailen = esp.tailen;
+	} else if (!hw_offload || !skb_is_gso(skb) ||
+		   (hw_offload && encap_type == UDP_ENCAP_ESPINUDP)) {
 		esp.nfrags = esp_output_head(x, skb, &esp);
 		if (esp.nfrags < 0)
 			return esp.nfrags;
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 9c06c5a1419d..730588f8eaba 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -751,7 +751,11 @@ int esp6_input_done2(struct sk_buff *skb, int err)
 	if (unlikely(err))
 		goto out;
 
-	err = esp_remove_trailer(skb);
+	/* Hardware ESP decapsulation can already remove pad/trailer/ICV. */
+	if (xo && (xo->flags & XFRM_ESP_NO_TRAILER))
+		err = xo->proto;
+	else
+		err = esp_remove_trailer(skb);
 	if (unlikely(err < 0))
 		goto out;
 
diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c
index 22895521a57d..d124493da40b 100644
--- a/net/ipv6/esp6_offload.c
+++ b/net/ipv6/esp6_offload.c
@@ -308,8 +308,10 @@ static int esp6_xmit(struct xfrm_state *x, struct sk_buff *skb,  netdev_features
 	int blksize;
 	struct xfrm_offload *xo;
 	struct crypto_aead *aead;
+	struct sk_buff *trailer;
 	struct esp_info esp;
 	bool hw_offload = true;
+	bool hw_trailer = false;
 	__u32 seq;
 
 	esp.inplace = true;
@@ -318,6 +320,7 @@ static int esp6_xmit(struct xfrm_state *x, struct sk_buff *skb,  netdev_features
 
 	if (!xo)
 		return -EINVAL;
+	xo->esp_tx_tailen = 0;
 
 	if (!(features & NETIF_F_HW_ESP) || x->xso.dev != skb->dev) {
 		xo->flags |= CRYPTO_FALLBACK;
@@ -338,8 +341,32 @@ static int esp6_xmit(struct xfrm_state *x, struct sk_buff *skb,  netdev_features
 	esp.clen = ALIGN(skb->len + 2 + esp.tfclen, blksize);
 	esp.plen = esp.clen - skb->len - esp.tfclen;
 	esp.tailen = esp.tfclen + esp.plen + alen;
+	if (esp.tailen > U16_MAX)
+		return -EINVAL;
 
-	if (!hw_offload || !skb_is_gso(skb)) {
+	if (hw_offload && !skb_is_gso(skb) && !x->encap && x->xso.dev &&
+	    x->xso.dev->xfrmdev_ops &&
+	    x->xso.dev->xfrmdev_ops->xdo_dev_esp_tx_hw_trailer)
+		hw_trailer =
+			x->xso.dev->xfrmdev_ops->xdo_dev_esp_tx_hw_trailer(skb, x);
+
+	if (hw_trailer) {
+		int esph_offset;
+
+		/*
+		 * The device packet engine will write ESP padding, next-header
+		 * and ICV bytes. Keep skb->len unchanged here, but make sure the
+		 * later DMA writer owns enough linear tailroom.
+		 */
+		esp.esph = ip_esp_hdr(skb);
+		esph_offset = (unsigned char *)esp.esph - skb_transport_header(skb);
+		esp.nfrags = skb_cow_data(skb, esp.tailen, &trailer);
+		if (esp.nfrags < 0)
+			return esp.nfrags;
+		esp.esph = (struct ip_esp_hdr *)(skb_transport_header(skb) +
+						 esph_offset);
+		xo->esp_tx_tailen = esp.tailen;
+	} else if (!hw_offload || !skb_is_gso(skb)) {
 		esp.nfrags = esp6_output_head(x, skb, &esp);
 		if (esp.nfrags < 0)
 			return esp.nfrags;
-- 
2.53.0



^ permalink raw reply related

* [PATCH 0/3] Add packet-mode ESP offload for Airoha/EIP93
From: Jihong Min @ 2026-05-23 12:15 UTC (permalink / raw)
  To: Christian Marangi, Antoine Tenart, Herbert Xu, David S . Miller,
	Lorenzo Bianconi, Andrew Lunn, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Steffen Klassert
  Cc: linux-kernel, linux-crypto, linux-arm-kernel, linux-mediatek,
	netdev, Jihong Min

This series adds the missing plumbing for ESP offload engines that
operate on whole ESP packets instead of only exposing AES/HMAC through
the crypto API AEAD interface.

The normal ESP software path can already call into accelerated AEAD
algorithms, but packet-mode engines such as EIP93 can also generate and
consume ESP packet framing: padding, pad length, next header and ICV.
That needs a slightly different XFRM offload contract so the netdev
driver can hand the skb to a packet backend rather than trying to make
hardware fit the software trailer layout.
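
As a rough illustration of the framing bytes involved, the sketch below
mirrors the esp.clen / esp.plen / esp.tailen arithmetic from patch 1 in
standalone C. The names esp_tail, esp_tail_len and ALIGN_UP are
hypothetical and exist only for this example; the block size and ICV
length are caller-supplied assumptions, not values taken from the series.

```c
#include <stddef.h>
#include <stdint.h>

/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical container for the three lengths computed in esp_xmit(). */
struct esp_tail {
	size_t clen;   /* payload + TFC padding + trailer, block aligned */
	size_t plen;   /* padding plus the pad-length and next-header bytes */
	size_t tailen; /* everything appended after the payload, ICV included */
};

/*
 * Compute the ESP tail lengths for one packet.
 * Returns 0 on success, -1 if tailen does not fit in a 16-bit field
 * (the patch stores it in a __u16 and rejects larger values).
 */
static int esp_tail_len(size_t payload_len, size_t tfclen, size_t blksize,
			size_t icv_len, struct esp_tail *out)
{
	/* "+ 2" accounts for the pad-length and next-header bytes. */
	out->clen = ALIGN_UP(payload_len + 2 + tfclen, blksize);
	out->plen = out->clen - payload_len - tfclen;
	out->tailen = tfclen + out->plen + icv_len;

	return out->tailen > UINT16_MAX ? -1 : 0;
}
```

With an assumed 100-byte payload, a 16-byte block size and a 12-byte ICV,
this yields clen = 112, plen = 12 and tailen = 24, which is the tailroom a
packet engine would need to own before writing the trailer itself.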

Patch 1 extends the ESP offload infrastructure for packet engines while
preserving the existing behavior for drivers that do not opt in.
Patch 2 exposes an EIP93 ESP packet backend for encapsulation and
decapsulation.
Patch 3 wires Airoha Ethernet GDM netdevs and DSA user ports to that
backend through xfrmdev_ops. ESP GSO and ESP TX checksum offload remain
disabled.

Runtime testing was done on a Gemtek W1700K2 running OpenWrt with the
same changes applied on top of a 6.18.31-based kernel.

Test parameters:

  - Static IPv4 transport-mode XFRM SAs between the AP and host.
  - ESP transform: auth hmac(sha1), enc cbc(aes) with a 128-bit AES key.
  - iperf3 TCP test, AP as client and host as server:
        iperf3 -c <host_ip> -P 4 -t 10
  - The host always used normal Linux XFRM software processing.
  - With AP ESP offload disabled, the AP also used the Linux XFRM
    software path; in this setup, EIP93-backed AEAD crypto was still
    available to that path.

Network-relevant test setup:

  - AP: Gemtek W1700K2, Airoha AN7581/EN7581, 4x Arm Cortex-A53 at
    1.4 GHz, 2 GiB RAM, airoha_eth wan (GDM2) netdev, 10Gb/s full-duplex,
    MTU 9200, EIP93 crypto and IPsec packet engine present.
  - Host: AMD Ryzen 9 9950X3D, 16 cores/32 threads, Open vSwitch,
    MTU 9978, backed by a ConnectX-6 Dx 10Gb/s full-duplex link.

AP to host iperf3 result:

  AP offload      Sender          Receiver        Retransmits
  on              918.2 Mbit/s    913.6 Mbit/s    0
  off             782.4 Mbit/s    778.6 Mbit/s    3569

This is a 17.3% receiver-side throughput improvement for the AP TX ESP
path in this setup, with retransmits eliminated in the offloaded run.
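
For reference, the percentage above follows from the receiver column of
the table. A minimal sketch of the calculation, using a hypothetical
helper named gain_percent:

```c
/* Relative gain of `with` over `without`, expressed in percent. */
static double gain_percent(double with, double without)
{
	return (with - without) / without * 100.0;
}
```

gain_percent(913.6, 778.6) comes out at roughly 17.3, and the same
calculation on the sender column (918.2 vs 782.4) gives roughly 17.4.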

Jihong Min (3):
  xfrm: extend ESP offload infrastructure for packet engines
  crypto: inside-secure: add EIP93 ESP packet backend
  net: airoha: add EIP93-backed ESP XFRM offload

 MAINTAINERS                                   |    1 +
 drivers/crypto/inside-secure/eip93/Kconfig    |   10 +
 drivers/crypto/inside-secure/eip93/Makefile   |    1 +
 .../crypto/inside-secure/eip93/eip93-ipsec.c  | 1413 ++++++++++++++++
 .../crypto/inside-secure/eip93/eip93-main.c   |   69 +-
 .../crypto/inside-secure/eip93/eip93-main.h   |   38 +-
 drivers/net/ethernet/airoha/Kconfig           |   11 +
 drivers/net/ethernet/airoha/Makefile          |    1 +
 drivers/net/ethernet/airoha/airoha_eth.c      |   51 +-
 drivers/net/ethernet/airoha/airoha_eth.h      |   69 +
 drivers/net/ethernet/airoha/airoha_xfrm.c     | 1474 +++++++++++++++++
 include/crypto/eip93-ipsec.h                  |  132 ++
 include/linux/netdevice.h                     |    3 +
 include/net/xfrm.h                            |    8 +-
 net/ipv4/esp4.c                               |    6 +-
 net/ipv4/esp4_offload.c                       |   29 +-
 net/ipv6/esp6.c                               |    6 +-
 net/ipv6/esp6_offload.c                       |   29 +-
 18 files changed, 3324 insertions(+), 27 deletions(-)
 create mode 100644 drivers/crypto/inside-secure/eip93/eip93-ipsec.c
 create mode 100644 drivers/net/ethernet/airoha/airoha_xfrm.c
 create mode 100644 include/crypto/eip93-ipsec.h

-- 
2.53.0


^ permalink raw reply

* Re: [PATCH v2 3/3] ASoC: sunxi: sun4i-spdif: Reorder clock enable sequence
From: Bui Duc Phuc @ 2026-05-23 12:11 UTC (permalink / raw)
  To: wens
  Cc: broonie, codekipper, jernej.skrabec, lgirdwood, linux-arm-kernel,
	linux-kernel, linux-sound, linux-sunxi, nichen, perex, samuel,
	tiwai
In-Reply-To: <CAGb2v67aXgg8BmrtVVBi+_n32OBcQnQgEUe3XdRp6Jj=aEr8fg@mail.gmail.com>

Hi Chen-yu,

Thanks for your feedback.

On Sat, May 23, 2026 at 2:20 AM Chen-Yu Tsai <wens@kernel.org> wrote:
> > Enable the APB bus clock before the SPDIF module clock
> > during runtime resume, as register accesses depend on the
> > bus clock being enabled first.
>
> That does not even matter here. Access will only happen once the runtime
> PM callbacks return.
>

I understand your point that sun4i-spdif doesn't immediately access
registers within the current runtime_resume path, so the order might
not trigger a failure right now.

However, if we look at the peer driver for the same Sunxi SoC family,
sun4i-i2s.c:
Links:
https://elixir.bootlin.com/linux/v7.0-rc5/source/sound/soc/sunxi/sun4i-i2s.c#L1296
In sun4i_i2s_runtime_resume(), the sequence is strictly enforced as:

1. Enable bus clock
2. Access and restore/sync I2S registers
3. Enable module clock

Since both IP blocks belong to the same Sunxi platform and share similar
bus/module clock relationships, shouldn't we maintain architectural
consistency across these drivers?

Enforcing the "bus clock before module clock" order keeps the dependency
ordering aligned with the actual hardware roles, where the bus clock is
required for register access while the module clock drives the functional
audio path.
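
To make the ordering concrete, here is a small standalone model of it.
This is only a sketch: mock_clk and the mock_* helpers are hypothetical
stand-ins written for this example and are not the kernel clk API or the
actual sun4i driver code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a clock; not the kernel's struct clk. */
struct mock_clk {
	bool enabled;
	bool fail_on_enable; /* lets a caller simulate an enable failure */
};

static int mock_clk_enable(struct mock_clk *clk)
{
	if (clk->fail_on_enable)
		return -1;
	clk->enabled = true;
	return 0;
}

static void mock_clk_disable(struct mock_clk *clk)
{
	clk->enabled = false;
}

/* Resume: bus (APB) clock first, then the module clock. */
static int mock_runtime_resume(struct mock_clk *bus, struct mock_clk *mod)
{
	int ret = mock_clk_enable(bus);

	if (ret)
		return ret;

	ret = mock_clk_enable(mod);
	if (ret)
		mock_clk_disable(bus); /* unwind in reverse order */

	return ret;
}

/* Suspend mirrors resume: module clock off first, bus clock last. */
static void mock_runtime_suspend(struct mock_clk *bus, struct mock_clk *mod)
{
	mock_clk_disable(mod);
	mock_clk_disable(bus);
}
```

In this model the bus clock is always running whenever the module clock
is, so any register access added to the resume path later would already
have the bus clock it depends on.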

Wouldn't keeping this order also make the runtime PM behavior more
consistent and easier to follow across the Sunxi audio drivers?

Best Regards,
Phuc


^ permalink raw reply

* Re: [PATCH] irqchip/gic-v4: Harden against bogus command line
From: Marc Zyngier @ 2026-05-23  9:53 UTC (permalink / raw)
  To: Mostafa Saleh; +Cc: linux-arm-kernel, linux-kernel, tglx
In-Reply-To: <20260521130503.4103369-1-smostafa@google.com>

On Thu, 21 May 2026 14:05:03 +0100,
Mostafa Saleh <smostafa@google.com> wrote:
> 
> When accidentally setting “kvm-arm.vgic_v4_enable=1” on the wrong
> setup that has no MSI controller device tree node (it exists but
> not used) and GICv4, it caused a panic as “gic_domain” is NULL and
> the kernel attempted to access its ops.

When you say "that has no MSI controller device tree node", does it
mean that the ITS has not been probed at all?

>
> Originally, I hit this on an older kernel, but was able to reproduce
> it on upstream with Qemu by hacking this unreasonable setup.
> 
> [   33.145536] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
> [   33.145658] Mem abort info:
> [   33.145751]   ESR = 0x0000000096000006
> ...
> [   33.154057] CPU: 1 UID: 0 PID: 295 Comm: lkvm-static Not tainted 7.1.0-rc4-ge3f15ad3970e #5 PREEMPT
> [   33.156922] Hardware name: linux,dummy-virt (DT)
> [   33.158780] pstate: 81402005 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> [   33.160340] pc : __irq_domain_instantiate+0x1d4/0x578
> [   33.162602] lr : __irq_domain_instantiate+0x1cc/0x578
> 
> Add a hardening check to avoid the NULL access, and fail the VM
> creation in that case.
> 
> Signed-off-by: Mostafa Saleh <smostafa@google.com>
> ---
>  drivers/irqchip/irq-gic-v4.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/irqchip/irq-gic-v4.c b/drivers/irqchip/irq-gic-v4.c
> index 8455b4a5fbb0..7e39f7eae85f 100644
> --- a/drivers/irqchip/irq-gic-v4.c
> +++ b/drivers/irqchip/irq-gic-v4.c
> @@ -159,6 +159,9 @@ int its_alloc_vcpu_irqs(struct its_vm *vm)
>  {
>  	int vpe_base_irq, i;
>  
> +	if (!gic_domain)
> +		return -EINVAL;
> +
>  	vm->fwnode = irq_domain_alloc_named_id_fwnode("GICv4-vpe",
>  						      task_pid_nr(current));
>  	if (!vm->fwnode)

I think this check is a good few levels too late. If you want to fix
this, I'd rather make sure that kvm_vgic_global_state.has_gicv4 is
reliable and covers this case. Which means making sure that
gic_kvm_info::has_v4 is itself reliable.

If my above understanding is correct, I'd expect the following
(untested) hack to help.

Thanks,

	M.

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 291d7668cc8da..e6b9fee1b6786 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -5838,6 +5838,7 @@ int __init its_init(struct fwnode_handle *handle, struct rdists *rdists,
 
 	if (list_empty(&its_nodes)) {
 		pr_warn("ITS: No ITS available, not enabling LPIs\n");
+		rdists->has_vlpis = false;
 		return -ENXIO;
 	}
 

-- 
Without deviation from the norm, progress is not possible.


^ permalink raw reply related

* [PATCH] pinctrl: imx1: fix device_node leak in dt_is_flat_functions()
From: Felix Gu @ 2026-05-23 10:27 UTC (permalink / raw)
  To: Dong Aisheng, Fabio Estevam, Frank Li, Jacky Bai,
	Pengutronix Kernel Team, NXP S32 Linux Team, Linus Walleij,
	Sascha Hauer
  Cc: linux-gpio, imx, linux-arm-kernel, linux-kernel, Felix Gu

for_each_child_of_node() holds a reference on the iterator node that
must be released on early return. imx1_pinctrl_dt_is_flat_functions()
has two early return paths inside the loop that skip this cleanup.

Replace both loops with the scoped variant so that the reference is
automatically dropped when the iterator goes out of scope.
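
The scope-based release can be illustrated outside the kernel with the
compiler's cleanup attribute (GCC and Clang). The sketch below is a
simplified model under that assumption: mock_node, mock_node_get(),
mock_node_put() and any_child_has_pins() are hypothetical names for this
example and do not reproduce the OF API or its iterator macros.

```c
/* Hypothetical refcounted node; stands in for struct device_node. */
struct mock_node {
	int refcount;
	int has_pins;
};

static struct mock_node *mock_node_get(struct mock_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void mock_node_put(struct mock_node *n)
{
	if (n)
		n->refcount--;
}

/* Cleanup handler: called with a pointer to the variable leaving scope. */
static void mock_node_put_cleanup(struct mock_node **np)
{
	mock_node_put(*np);
}

/* Returns 1 as soon as a child with pins is found, 0 otherwise. */
static int any_child_has_pins(struct mock_node *children, int count)
{
	for (int i = 0; i < count; i++) {
		/* The reference is dropped whenever `child` goes out of scope. */
		struct mock_node *child
			__attribute__((cleanup(mock_node_put_cleanup))) =
			mock_node_get(&children[i]);

		if (child->has_pins)
			return 1; /* early return, reference still released */
	}

	return 0;
}
```

After any_child_has_pins() returns, every child's refcount is back at its
starting value regardless of which path was taken, which is the property
the early returns in the non-scoped loop were missing.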

Fixes: 63d2059cd665 ("pinctrl: imx1: Allow parsing DT without function nodes")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
---
 drivers/pinctrl/freescale/pinctrl-imx1-core.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/pinctrl/freescale/pinctrl-imx1-core.c b/drivers/pinctrl/freescale/pinctrl-imx1-core.c
index b7bd4ef9c0db..4a6bdaefa42f 100644
--- a/drivers/pinctrl/freescale/pinctrl-imx1-core.c
+++ b/drivers/pinctrl/freescale/pinctrl-imx1-core.c
@@ -547,14 +547,11 @@ static int imx1_pinctrl_parse_functions(struct device_node *np,
  */
 static bool imx1_pinctrl_dt_is_flat_functions(struct device_node *np)
 {
-	struct device_node *function_np;
-	struct device_node *pinctrl_np;
-
-	for_each_child_of_node(np, function_np) {
+	for_each_child_of_node_scoped(np, function_np) {
 		if (of_property_present(function_np, "fsl,pins"))
 			return true;
 
-		for_each_child_of_node(function_np, pinctrl_np) {
+		for_each_child_of_node_scoped(function_np, pinctrl_np) {
 			if (of_property_present(pinctrl_np, "fsl,pins"))
 				return false;
 		}

---
base-commit: c1ecb239fa3456529a32255359fc78b69eb9d847
change-id: 20260523-pinctrl-imx-b198f8391abf

Best regards,
--  
Felix Gu <ustc.gu@gmail.com>



^ permalink raw reply related

* [PATCH RESEND] arm64: dts: mediatek: add LED and key support on Xiaomi AX3000T
From: Aleksander Jan Bajkowski @ 2026-05-23 10:18 UTC (permalink / raw)
  To: robh, krzk+dt, conor+dt, matthias.bgg, angelogioacchino.delregno,
	devicetree, linux-kernel, linux-arm-kernel, linux-mediatek
  Cc: Aleksander Jan Bajkowski

This patch adds support for keys and LEDs on the Xiaomi AX3000T.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
---
 .../dts/mediatek/mt7981b-xiaomi-ax3000t.dts   | 36 +++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt7981b-xiaomi-ax3000t.dts b/arch/arm64/boot/dts/mediatek/mt7981b-xiaomi-ax3000t.dts
index a314c3e05e50..db399cb3ead7 100644
--- a/arch/arm64/boot/dts/mediatek/mt7981b-xiaomi-ax3000t.dts
+++ b/arch/arm64/boot/dts/mediatek/mt7981b-xiaomi-ax3000t.dts
@@ -1,6 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0-only OR MIT
 
 /dts-v1/;
+#include <dt-bindings/input/input.h>
+#include <dt-bindings/gpio/gpio.h>
+#include <dt-bindings/leds/common.h>
 
 #include "mt7981b.dtsi"
 
@@ -12,4 +15,37 @@ memory@40000000 {
 		reg = <0 0x40000000 0 0x10000000>;
 		device_type = "memory";
 	};
+
+	keys {
+		compatible = "gpio-keys";
+
+		key-mesh {
+			label = "MESH";
+			gpios = <&pio 0 GPIO_ACTIVE_LOW>;
+			linux,code = <BTN_9>;
+			linux,input-type = <EV_SW>;
+		};
+
+		key-reset {
+			label = "RESET";
+			gpios = <&pio 1 GPIO_ACTIVE_LOW>;
+			linux,code = <KEY_RESTART>;
+		};
+	};
+
+	leds {
+		compatible = "gpio-leds";
+
+		led-0 {
+			color = <LED_COLOR_ID_BLUE>;
+			function = LED_FUNCTION_STATUS;
+			gpios = <&pio 9 GPIO_ACTIVE_LOW>;
+		};
+
+		led-1 {
+			color = <LED_COLOR_ID_YELLOW>;
+			function = LED_FUNCTION_STATUS;
+			gpios = <&pio 10 GPIO_ACTIVE_LOW>;
+		};
+	};
 };
-- 
2.53.0



^ permalink raw reply related

* Re: [PATCH] KVM: arm64: Preserve all guest ZCR_EL2.LEN values
From: Marc Zyngier @ 2026-05-23  8:47 UTC (permalink / raw)
  To: Mark Brown
  Cc: Oliver Upton, Joey Gouly, Steffen Eiden, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	kvmarm, linux-kernel
In-Reply-To: <20260522-kvm-arm64-fix-zcr-len-nv-v1-1-ec254e9078cf@kernel.org>

On Fri, 22 May 2026 19:00:04 +0100,
Mark Brown <broonie@kernel.org> wrote:
> 
> Since b3d29a823099 ("KVM: arm64: nv: Handle ZCR_EL2 traps") when guests
> write to ZCR_EL2 we have clamped the value of ZCR_EL2.LEN to be at most
> that configuring the maximum guest VL.

That's not strictly true. This is only clamped when accessed as
ZCR_EL2. A VHE guest will happily use the ZCR_EL1 accessor for the
same register, and not see the truncation. This has ripple effects
down the line, where the full value will be used at load time.

> This is not the behaviour the
> architecture documents for ZCR_EL2.LEN, the expectation is that all bits
> will be read as written. Further, writing values larger than the largest
> available vector length is part of the documented procedure for enumerating
> the supported vector lengths so we expect to see this happen in practice.
> 
> The reasoning for the current behaviour is not specifically articulated, my
> best guess is that it is intended to ensure that the guest can not see an
> effective VL greater than the maximum that has been configured. This can
> instead be achieved by configuring ZCR_EL2 when loading guest state:
> 
>  - When running at EL0 or EL1 configure ZCR_EL2.LEN to the minimum of the
>    guest ZCR_EL2.LEN and vcpu_sve_max_vq(vcpu)-1.

This is not EL0 or EL1. This is when in a nested context (i.e. running
a L2 guest), as EL0 exists for L1 as well.

>  - When running at EL2 configure the maximum VL for the guest in
>    ZCR_EL2.LEN like we do for non-nested guests and load the guest
>    ZCR_EL2 into ZCR_EL1.
> 
> This will ensure that the guest sees both the ZCR_EL2.LEN value which it
> wrote and the effective VL resulting from the values it has configured
> in ZCR_ELx.LEN.
> 
> Currently all other bits in ZCR_EL2 are either RES0 or RAZ/WI, values
> written are sanitised based on this.

Only for the direct writes to ZCR_EL2, as they are trapping. I don't
see any sanitisation for writes using the ZCR_EL1 accessor, which is
the common case. This needs fixing at the same time.
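
For readers following along, the split under discussion (keep LEN as
written, clamp only when loading state for a nested context) can be
modelled in a few lines of standalone C. This is a sketch only:
mock_vcpu, mock_write_zcr_el2() and mock_hw_zcr_el2() are hypothetical
names, the single nested_ctxt flag simplifies the condition debated
further down, and none of this is the KVM code itself.

```c
#include <stdint.h>

#define MOCK_ZCR_LEN_MASK 0xfULL /* LEN occupies bits [3:0] */

/* Hypothetical, heavily reduced vcpu state for this example. */
struct mock_vcpu {
	uint64_t zcr_el2;    /* LEN value as written by the guest */
	unsigned int max_vq; /* max vector length in 128-bit quadwords, >= 1 */
};

/* Write side: keep LEN exactly as written, drop all other bits. */
static void mock_write_zcr_el2(struct mock_vcpu *vcpu, uint64_t val)
{
	vcpu->zcr_el2 = val & MOCK_ZCR_LEN_MASK;
}

/* Load side: LEN value to program into the hardware register. */
static uint64_t mock_hw_zcr_el2(const struct mock_vcpu *vcpu, int nested_ctxt)
{
	uint64_t len = vcpu->max_vq - 1;

	/* In a nested context the guest hypervisor may restrict it further. */
	if (nested_ctxt && vcpu->zcr_el2 < len)
		len = vcpu->zcr_el2;

	return len;
}
```

With max_vq = 4, a guest write of 0xf reads back as 0xf while the value
loaded for a nested context is 3; a guest write of 1 is loaded as 1.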

> 
> Fixes: b3d29a823099 ("KVM: arm64: nv: Handle ZCR_EL2 traps")
> Signed-off-by: Mark Brown <broonie@kernel.org>

Given the nature of the bug, this needs a Cc: stable.

> ---
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 8 ++++----
>  arch/arm64/kvm/sys_regs.c               | 6 +-----
>  2 files changed, 5 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index bf0eb5e43427..fd277cb70967 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -501,11 +501,11 @@ static inline void fpsimd_lazy_switch_to_guest(struct kvm_vcpu *vcpu)
>  		return;
>  
>  	if (vcpu_has_sve(vcpu)) {
> +		zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;
> +
>  		/* A guest hypervisor may restrict the effective max VL. */
> -		if (is_nested_ctxt(vcpu))
> -			zcr_el2 = __vcpu_sys_reg(vcpu, ZCR_EL2);
> -		else
> -			zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;
> +		if (is_nested_ctxt(vcpu) && !is_hyp_ctxt(vcpu))
> +			zcr_el2 = min(zcr_el2, __vcpu_sys_reg(vcpu, ZCR_EL2));

Why the change in the condition guarding this? Given the definition of
is_nested_ctxt(), this seems unnecessary.

>
>  		write_sysreg_el2(zcr_el2, SYS_ZCR);
>  
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 148fc3400ea8..c4d3bbae2d14 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -2862,8 +2862,6 @@ static bool access_zcr_el2(struct kvm_vcpu *vcpu,
>  			   struct sys_reg_params *p,
>  			   const struct sys_reg_desc *r)
>  {
> -	unsigned int vq;
> -
>  	if (guest_hyp_sve_traps_enabled(vcpu)) {
>  		kvm_inject_nested_sve_trap(vcpu);
>  		return false;
> @@ -2874,9 +2872,7 @@ static bool access_zcr_el2(struct kvm_vcpu *vcpu,
>  		return true;
>  	}
>  
> -	vq = SYS_FIELD_GET(ZCR_ELx, LEN, p->regval) + 1;
> -	vq = min(vq, vcpu_sve_max_vq(vcpu));
> -	__vcpu_assign_sys_reg(vcpu, ZCR_EL2, vq - 1);
> +	__vcpu_assign_sys_reg(vcpu, ZCR_EL2, p->regval & ZCR_ELx_LEN);

Once you have added the full ZCR_EL2 sanitisation, this masking can go.

>  	return true;
>  }

Thanks,

	M.

-- 
Jazz isn't dead. It just smells funny.


^ permalink raw reply

* Re: [PATCH 5/5] pinctrl: pinctrl-scmi: Log number of pins, groups, functions
From: Linus Walleij @ 2026-05-23  9:20 UTC (permalink / raw)
  To: Alex Tran
  Cc: Jyoti Bhayana, Jonathan Cameron, David Lechner, Nuno Sá,
	Andy Shevchenko, Sudeep Holla, Cristian Marussi,
	Rafael J. Wysocki, Philipp Zabel, Viresh Kumar, Guenter Roeck,
	linux-iio, linux-kernel, arm-scmi, linux-arm-kernel, linux-gpio,
	linux-pm, linux-hwmon
In-Reply-To: <CAD++jL=6ikpC-BqVqP1Ev5HC37fw=K_n6rP96AxKi0jdVcyvmw@mail.gmail.com>

On Sat, May 23, 2026 at 11:18 AM Linus Walleij <linusw@kernel.org> wrote:
> On Wed, May 13, 2026 at 6:44 PM Alex Tran <alex.tran@oss.qualcomm.com> wrote:
>
> > The SCMI pinctrl driver does not currently log the number of pins,
> > groups, and functions discovered from firmware. This information is
> > useful for confirming the firmware exposed pinctrl resources during
> > debugging.
> >
> > Log these counts after a successful probe to align with the existing
> > SCMI client driver logging pattern.
> >
> > Signed-off-by: Alex Tran <alex.tran@oss.qualcomm.com>
>
> Other kernel maintainers want a minimalist dmesg, but not me,
> so I just applied this.
>
> If someone is upset about the noise they can send a patch
> changing it to dev_dbg().

Ah scratch that, Andy made a fair point that it is available
in debugfs anyway so I dropped the patch.

Yours,
Linus Walleij


^ permalink raw reply

* Re: [PATCH 5/5] pinctrl: pinctrl-scmi: Log number of pins, groups, functions
From: Linus Walleij @ 2026-05-23  9:18 UTC (permalink / raw)
  To: Alex Tran
  Cc: Jyoti Bhayana, Jonathan Cameron, David Lechner, Nuno Sá,
	Andy Shevchenko, Sudeep Holla, Cristian Marussi,
	Rafael J. Wysocki, Philipp Zabel, Viresh Kumar, Guenter Roeck,
	linux-iio, linux-kernel, arm-scmi, linux-arm-kernel, linux-gpio,
	linux-pm, linux-hwmon
In-Reply-To: <20260513-scmi-client-probe-log-v1-5-00b47b1be009@oss.qualcomm.com>

On Wed, May 13, 2026 at 6:44 PM Alex Tran <alex.tran@oss.qualcomm.com> wrote:

> The SCMI pinctrl driver does not currently log the number of pins,
> groups, and functions discovered from firmware. This information is
> useful for confirming the firmware exposed pinctrl resources during
> debugging.
>
> Log these counts after a successful probe to align with the existing
> SCMI client driver logging pattern.
>
> Signed-off-by: Alex Tran <alex.tran@oss.qualcomm.com>

Other kernel maintainers want a minimalist dmesg, but not me,
so I just applied this.

If someone is upset about the noise they can send a patch
changing it to dev_dbg().

Yours,
Linus Walleij


^ permalink raw reply

