Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH RFC 2/4] printk: deprecate boot_delay in favour of printk_delay
From: Andrew Murray @ 2026-06-14 11:45 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jonathan Corbet, Shuah Khan, Russell King, Florian Fainelli,
	Broadcom internal kernel review list, Ray Jui, Scott Branden,
	Steven Rostedt, John Ogness, Sergey Senozhatsky, Andrew Morton,
	Sebastian Andrzej Siewior, Clark Williams, Randy Dunlap,
	Linus Torvalds, linux-doc, linux-kernel, linux-arm-kernel,
	linux-rpi-kernel, linux-rt-devel
In-Reply-To: <aibMr16r55xE26rU@pathway.suse.cz>

On Mon, 8 Jun 2026 at 15:07, Petr Mladek <pmladek@suse.com> wrote:
>
> On Mon 2026-06-01 00:17:38, Andrew Murray wrote:
> > The boot_delay (BOOT_PRINTK_DELAY) kernel parameter and printk_delay sysctl
> > are two distinct mechanisms for providing similar functionality which add a
> > delay prior to each printed printk message.
> >
> > boot_delay provides a kernel parameter for delaying printk output from
> > kernel start through to boot (SYSTEM_RUNNING), whereas printk_delay is
> > configurable only via sysctl and thus is only used post boot.
> >
> > Let's deprecate the boot_delay feature in favour of printk_delay. In order
> > to preserve functionality, we'll also extend printk_delay such that it can
> > additionally configured via a kernel parameter.
>
> I would make it clear and say: "via an early kernel parameter".
>
> Note that there are also kernel parameters which can be modified at runtime
> via /sys/module/kernel/paramters/<parameter>

OK thanks, I will update.


>
> Also I would make it clear that this changes the behavior, for
> example:
>
> <proposal>
> Behavior change:
>
> The delay enabled by both "boot_delay" and "printk_delay" continues
> working even in SYSTEM_RUNNING state. It must be explicitly stopped
> by setting printk_delay=0 via sysctl.
>
> The delay is skipped when the message is suppressed in all system
> states. It used to skipped only for the boot_delay.
> </proposal>

Yes, I'm happy to make that clearer.


>
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -1339,11 +1327,34 @@ static void boot_delay_msec(int level)
> >       }
> >  }
> >  #else
> > -static inline void boot_delay_msec(int level)
> > +static inline void __init printk_delay_calculate(void)
> > +{
> > +}
> > +
> > +static inline void early_boot_delay_msec(void)
> >  {
>
> It would be nice to print a warning that the early boot delay
> does not work, something like:
>
>         pr_warn_once("Early boot delay does not work without CONFIG_GENERIC_CALIBRATE_DELAY enabled.\n");
>
> >  }
> >  #endif
> >
> > +static int __init printk_delay_setup(char *str)
> > +{
> > +     get_option(&str, &printk_delay_msec);
> > +     if (printk_delay_msec > 10 * 1000)
> > +             printk_delay_msec = 0;
>
> Sashiko AI warns that this code accepts negative values.
> It might cause long delays, see
> https://sashiko.dev/#/patchset/20260601-deprecate_boot_delay-v1-0-c34c187142a6%40thegoodpenguin.co.uk
>
> The problem has already been there even before. But it would be nice
> to fix it.

Thanks for pointing out Sashiko, I hadn't seen its review on my
patches. Are authors expected to get emails from it, as I didn't?

In any case, it's a good spot, so I'll address.


>
> > +
> > +     printk_delay_calculate();
> > +
> > +     return 0;
> > +}
> > +early_param("printk_delay", printk_delay_setup);
> > +
> > +static int __init boot_delay_setup(char *str)
> > +{
> > +     pr_warn("boot_delay will soon be deprecated, please use printk_delay instead");
> > +     return printk_delay_setup(str);
> > +}
> > +early_param("boot_delay", boot_delay_setup);
> > +
> >  static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
> >  module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
>
> Otherwise, it looks good to me.
>
> Best Regards,
> Petr

Thanks,

Andrew Murray


^ permalink raw reply

* Re: [PATCH RFC 1/4] printk: remove BOOT_PRINTK_DELAY config option
From: Andrew Murray @ 2026-06-14 11:41 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Jonathan Corbet, Shuah Khan, Russell King, Florian Fainelli,
	Broadcom internal kernel review list, Ray Jui, Scott Branden,
	Steven Rostedt, John Ogness, Sergey Senozhatsky, Andrew Morton,
	Sebastian Andrzej Siewior, Clark Williams, Randy Dunlap,
	Linus Torvalds, linux-doc, linux-kernel, linux-arm-kernel,
	linux-rpi-kernel, linux-rt-devel
In-Reply-To: <aibCBGjVk4yqtYyT@pathway.suse.cz>

On Mon, 8 Jun 2026 at 14:22, Petr Mladek <pmladek@suse.com> wrote:
>
> On Mon 2026-06-01 00:17:37, Andrew Murray wrote:
> > The boot_delay (BOOT_PRINTK_DELAY) kernel parameter and printk_delay sysctl
> > are two distinct mechanisms for providing similar functionality which add a
> > delay prior to each printed printk message.
> >
> > In preparation of combining them into a single configurable feature, let's
> > first remove the kconfig option BOOT_PRINTK_DELAY.
> >
> > Signed-off-by: Andrew Murray <amurray@thegoodpenguin.co.uk>
>
> The option allowed to reduce a bit the vmlinux size when people were
> not interested into the functionality. I am not sure if it is worth
> it though. I am personally fine with this change.

I hadn't considered that need.

I'm happy to add this back in, but it would only make sense if this
option covered both boot_delay and printk_delay. That would change the
meaning of this existing Kconfig option, and would also allow the
removal of the printk_delay sysctl, I'm not sure if userspace assumes
this will always be there (probably not).

I'll leave this as is, unless there are objections.

Thanks,

Andrew Murray

>
> Reviewed-by: Petr Mladek <pmladek@suse.com>
>
> Best Regards,
> Petr


^ permalink raw reply

* Re: [PATCH v2] arm64: tlbflush: Don't broadcast if mm was only active on local cpu
From: Will Deacon @ 2026-06-14 11:33 UTC (permalink / raw)
  To: Linu Cherian
  Cc: Catalin Marinas, Ryan Roberts, Kevin Brodsky, Anshuman Khandual,
	Yang Shi, Mark Rutland, Huang Ying, linux-arm-kernel,
	linux-kernel
In-Reply-To: <ai6KzFgfMAxqplcr@willie-the-truck>

On Sun, Jun 14, 2026 at 12:04:44PM +0100, Will Deacon wrote:
> Can you simplify the 'if' condition here?
> 
> 	if (active == ACTIVE_CPU_NONE) {
> 		if (!try_cmpxchg_relaxed(...))
> 			WRITE_ONCE(...);
> 
> 		dsb(ishst);
> 	}
> 
> (as an aside, maybe we should implement arch_try_cmpxchg{,_relaxed} so
>  we could drop the READ_ONCE() here as well?)

Mulling this over a little more, we probably can't drop the READ_ONCE()
even if we optimised our try_cmpxchg() implementation, as it would
prevent us from eliding the DSB on the fast path.

The rest of my comments (including the refactoring above) stand, however.

Will


^ permalink raw reply

* Re: [PATCH v2] arm64: tlbflush: Don't broadcast if mm was only active on local cpu
From: Will Deacon @ 2026-06-14 11:04 UTC (permalink / raw)
  To: Linu Cherian
  Cc: Catalin Marinas, Ryan Roberts, Kevin Brodsky, Anshuman Khandual,
	Yang Shi, Mark Rutland, Huang Ying, linux-arm-kernel,
	linux-kernel
In-Reply-To: <20260523134710.3827956-1-linu.cherian@arm.com>

On Sat, May 23, 2026 at 07:17:10PM +0530, Linu Cherian wrote:
> From: Ryan Roberts <ryan.roberts@arm.com>
> 
> There are 3 variants of tlb flush that invalidate user mappings:
> flush_tlb_mm(), flush_tlb_page() and __flush_tlb_range(). All of these
> would previously unconditionally broadcast their tlbis to all cpus in
> the inner shareable domain.
> 
> But this is a waste of effort if we can prove that the mm for which we
> are flushing the mappings has only ever been active on the local cpu. In
> that case, it is safe to avoid the broadcast and simply invalidate the
> current cpu.
> 
> So let's track in mm_context_t::active_cpu either the mm has never been
> active on any cpu, has been active on more than 1 cpu, or has been
> active on precisely 1 cpu - and in that case, which one. We update this
> when switching context, being careful to ensure that it gets updated
> *before* installing the mm's pgtables. On the reader side, we ensure we
> read *after* the previous write(s) to the pgtable(s) that necessitated
> the tlb flush have completed. This guarrantees that if a cpu that is
> doing a tlb flush sees it's own id in active_cpu, then the old pgtable
> entry cannot have been seen by any other cpu and we can flush only the
> local cpu.
> 
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Tested-by: Huang Ying <ying.huang@linux.alibaba.com>
> [linu.cherian@arm.com: Adapted for v7.1 flush tlb API changes]
> Signed-off-by: Linu Cherian <linu.cherian@arm.com>
> ---
> Changelog from RFC v1:
> - Adapted for v7.1 flush tlb API changes
>   No changes in core logic
> - Collected Rb and Tb tags
> - lat_mmap benchmark showed dsb(ishst) performs better than dsb(ish),
>   hence retained dsb(ishst) in flush_tlb_user_pre	
> 
> 
> Testing with 7.1-rc4 :
> +-----------------------+---------------------------------------------------+-------------+
> | Benchmark             | Result Class                                      |  Improvement|  
> +=======================+===================================================+=============+
> | perf/syscall          | fork (ops/sec)                                    |   (I) 3.25% |
> +-----------------------+---------------------------------------------------+-------------+
> | pts/memtier-benchmark | Protocol: Redis Clients: 100 Ratio: 1:5 (Ops/sec) |   (I) 2.70% |
> | 			| Protocol: Redis Clients: 100 Ratio: 5:1 (Ops/sec) |   (I) 2.13% |
> +-----------------------+---------------------------------------------------+-------------+

I think we need a much more comprehensive set of benchmarks before we can
begin to consider a change like this.

>  arch/arm64/include/asm/mmu.h         |  12 +++
>  arch/arm64/include/asm/mmu_context.h |   2 +
>  arch/arm64/include/asm/tlbflush.h    | 127 +++++++++++++++++++++------
>  arch/arm64/mm/context.c              |  30 ++++++-
>  4 files changed, 141 insertions(+), 30 deletions(-)

Doesn't this break BTM/SVM with the SMMU? I think that's a non-starter
even if you can provide some more compelling numbers.

> +static inline bool flush_tlb_user_pre(struct mm_struct *mm, tlbf_t flags)
> +{
> +	unsigned int self, active;
> +	bool local;
> +
> +	migrate_disable();
> +
> +	if (flags & TLBF_NOBROADCAST) {
> +		dsb(nshst);
> +		return true;
> +	}

Why does the NOBROADCAST case need migration disabled? It didn't before...

> +
> +	self = smp_processor_id();
> +
> +	/*
> +	 * The load of mm->context.active_cpu must not be reordered before the
> +	 * store to the pgtable that necessitated this flush. This ensures that
> +	 * if the value read is our cpu id, then no other cpu can have seen the
> +	 * old pgtable value and therefore does not need this old value to be
> +	 * flushed from its tlb. But we don't want to upgrade the dsb(ishst),
> +	 * needed to make the pgtable updates visible to the walker, to a
> +	 * dsb(ish) by default. So speculatively load without a barrier and if
> +	 * it indicates our cpu id, then upgrade the barrier and re-load.
> +	 */
> +	active = READ_ONCE(mm->context.active_cpu);
> +	if (active == self) {
> +		dsb(ish);
> +		active = READ_ONCE(mm->context.active_cpu);
> +	} else {
> +		dsb(ishst);
> +	}

Why can't you just do:

	dsb(ishst);
	active = READ_ONCE(mm->context.active_cpu);

?

> +
> +	local = active == self;
> +	if (!local)
> +		migrate_enable();
> +
> +	return local;
> +}
> +
> +static inline void flush_tlb_user_post(bool local)
> +{
> +	if (local)
> +		migrate_enable();
> +}

I was under the impression that disabling/enabling migration was an
expensive thing to do, so I'd really want to see some more numbers to
justify this (including from inside a VM) and allow us to consider the
trade-offs properly. It's also not at all clear to me that it's safe
from such a low-level TLB invalidation helper.

> +
>  /*
>   *	TLB Invalidation
>   *	================
> @@ -408,12 +482,20 @@ static inline void flush_tlb_all(void)
>  static inline void flush_tlb_mm(struct mm_struct *mm)
>  {
>  	unsigned long asid;
> +	bool local;
>  
> -	dsb(ishst);
> +	local = flush_tlb_user_pre(mm, TLBF_NONE);
>  	asid = __TLBI_VADDR(0, ASID(mm));
> -	__tlbi(aside1is, asid);
> -	__tlbi_user(aside1is, asid);
> -	__tlbi_sync_s1ish(mm);
> +	if (local) {
> +		__tlbi(aside1, asid);
> +		__tlbi_user(aside1, asid);
> +		dsb(nsh);
> +	} else {
> +		__tlbi(aside1is, asid);
> +		__tlbi_user(aside1is, asid);
> +		__tlbi_sync_s1ish(mm);
> +	}
> +	flush_tlb_user_post(local);

I think you've changed this since Ryan's original patch, but why are you
only calling __tlbi_sync_s1ish() for the !local case? Doesn't that break
the erratum workaround when running as a VM if the vCPU is migrated?

> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
> index 0f4a28b87469..f34ed78393e0 100644
> --- a/arch/arm64/mm/context.c
> +++ b/arch/arm64/mm/context.c
> @@ -214,9 +214,10 @@ static u64 new_context(struct mm_struct *mm)
>  
>  void check_and_switch_context(struct mm_struct *mm)
>  {
> -	unsigned long flags;
> -	unsigned int cpu;
> +	unsigned int cpu = smp_processor_id();
>  	u64 asid, old_active_asid;
> +	unsigned int active;
> +	unsigned long flags;
>  
>  	if (system_supports_cnp())
>  		cpu_set_reserved_ttbr0();
> @@ -251,7 +252,6 @@ void check_and_switch_context(struct mm_struct *mm)
>  		atomic64_set(&mm->context.id, asid);
>  	}
>  
> -	cpu = smp_processor_id();
>  	if (cpumask_test_and_clear_cpu(cpu, &tlb_flush_pending))
>  		local_flush_tlb_all();
>  
> @@ -262,6 +262,30 @@ void check_and_switch_context(struct mm_struct *mm)
>  
>  	arm64_apply_bp_hardening();
>  
> +	/*
> +	 * Update mm->context.active_cpu in such a manner that we avoid cmpxchg
> +	 * and dsb unless we definitely need it. If initially ACTIVE_CPU_NONE
> +	 * then we are the first cpu to run so set it to our id. If initially
> +	 * any id other than ours, we are the second cpu to run so set it to
> +	 * ACTIVE_CPU_MULTIPLE. If we update the value then we must issue
> +	 * dsb(ishst) to ensure stores to mm->context.active_cpu are ordered
> +	 * against the TTBR0 write in cpu_switch_mm()/uaccess_enable(); the
> +	 * store must be visible to another cpu before this cpu could have
> +	 * populated any TLB entries based on the pgtables that will be
> +	 * installed.
> +	 */
> +	active = READ_ONCE(mm->context.active_cpu);
> +	if (active != cpu && active != ACTIVE_CPU_MULTIPLE) {
> +		if (active == ACTIVE_CPU_NONE)
> +			active = cmpxchg_relaxed(&mm->context.active_cpu,
> +						 ACTIVE_CPU_NONE, cpu);
> +
> +		if (active != ACTIVE_CPU_NONE)
> +			WRITE_ONCE(mm->context.active_cpu, ACTIVE_CPU_MULTIPLE);
> +
> +		dsb(ishst);
> +	}
> +

Can you simplify the 'if' condition here?

	if (active == ACTIVE_CPU_NONE) {
		if (!try_cmpxchg_relaxed(...))
			WRITE_ONCE(...);

		dsb(ishst);
	}

(as an aside, maybe we should implement arch_try_cmpxchg{,_relaxed} so
 we could drop the READ_ONCE() here as well?)

Will


^ permalink raw reply

* Re: [PATCH 1/8] mm: Add ptep_try_set() for lockless empty-slot installs
From: Will Deacon @ 2026-06-14  9:28 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Vernet, Andrea Righi, Changwoo Min, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau,
	Kumar Kartikeya Dwivedi, Peter Zijlstra, Catalin Marinas,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Andrew Morton, David Hildenbrand, Mike Rapoport, Emil Tsalapatis,
	sched-ext, bpf, x86, linux-arm-kernel, linux-mm, linux-kernel
In-Reply-To: <20260522172219.1423324-2-tj@kernel.org>

On Fri, May 22, 2026 at 07:22:12AM -1000, Tejun Heo wrote:
> Add ptep_try_set(ptep, new_pte): atomically set *ptep to new_pte iff it is
> currently pte_none(). Returns true on success, false if the slot was already
> populated or the arch has no implementation.
> 
> The intended caller is the upcoming bpf_arena kernel-side fault recovery
> path. The install runs from a page fault that can be nested under locks
> held by the faulting kernel caller (e.g. a BPF program holding
> raw_res_spin_lock_irqsave on its arena's spinlock), so trylock-and-retry
> would A-A deadlock. Lock-free cmpxchg is the only viable option, which
> constrains this helper to special kernel page tables where concurrent
> writers cooperate via atomic accessors.
> 
> The generic version in <linux/pgtable.h> returns false. x86 and arm64
> override with try_cmpxchg-based implementations on the underlying pteval.
> Other architectures get the false stub - the callers there already fall
> through to oops.
> 
> v2: Rename to ptep_try_set(). Tighten kerneldoc. (David, Alexei)
> v3: Note that strict-zero cmpxchg is narrower than pte_none(). (Andrea)
> 
> Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reviewed-by: Andrea Righi <arighi@nvidia.com>
> Cc: David Hildenbrand <david@kernel.org>
> ---
>  arch/arm64/include/asm/pgtable.h | 12 ++++++++++++
>  arch/x86/include/asm/pgtable.h   | 12 ++++++++++++
>  include/linux/pgtable.h          | 25 +++++++++++++++++++++++++
>  3 files changed, 49 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 9029b81ccbe8..28bada97d443 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1830,6 +1830,18 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>  	return __ptep_get_and_clear(mm, addr, ptep);
>  }
>  
> +/*
> + * Note: strictly-zero compare is narrower than pte_none(), but the gap is
> + * harmless: a fresh kernel PTE has no software bits set.
> + */

This comment really confused me :/

What is a "fresh" kernel PTE and why do you specifically call out "software
bits" if the CAS requires all 64 bits to be 0? Why is that narrower than
pte_none() given that pte_none() for arm64 is:

#define pte_none(pte)           (!pte_val(pte))

Will


^ permalink raw reply

* Re: [PATCH 1/2] arm64: tlbflush: Don't broadcast if mm was only active on local cpu
From: Will Deacon @ 2026-06-14  9:44 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: sk, linux-arm-kernel, linux-kernel, Ryan Roberts, Andrew Morton,
	David Hildenbrand, Anshuman Khandual, Mike Rapoport, Dev Jain,
	Kevin Brodsky, Marc Zyngier, Oliver Upton, cl, Huang Ying,
	Linu Cherian
In-Reply-To: <airUWY4jFgxWvQ4s@arm.com>

On Thu, Jun 11, 2026 at 04:29:29PM +0100, Catalin Marinas wrote:
> On Tue, Jun 09, 2026 at 02:34:32PM -0700, sk@gentwo.org wrote:
> > From: Ryan Roberts <ryan.roberts@arm.com>
> > 
> > There are 3 variants of tlb flush that invalidate user mappings:
> > flush_tlb_mm(), flush_tlb_page() and __flush_tlb_range(). All of these
> > would previously unconditionally broadcast their tlbis to all cpus in
> > the inner shareable domain.
> > 
> > But this is a waste of effort if we can prove that the mm for which we
> > are flushing the mappings has only ever been active on the local cpu. In
> > that case, it is safe to avoid the broadcast and simply invalidate the
> > current cpu.
> > 
> > So let's track in mm_context_t::active_cpu either the mm has never been
> > active on any cpu, has been active on more than 1 cpu, or has been
> > active on precisely 1 cpu - and in that case, which one. We update this
> > when switching context, being careful to ensure that it gets updated
> > *before* installing the mm's pgtables. On the reader side, we ensure we
> > read *after* the previous write(s) to the pgtable(s) that necessitated
> > the tlb flush have completed. This guarrantees that if a cpu that is
> > doing a tlb flush sees it's own id in active_cpu, then the old pgtable
> > entry cannot have been seen by any other cpu and we can flush only the
> > local cpu.
> > 
> > Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> > Tested-by: Huang Ying <ying.huang@linux.alibaba.com>
> > [linu.cherian@arm.com: Adapted for v7.1 flush tlb API changes]
> > Signed-off-by: Linu Cherian <linu.cherian@arm.com>
> 
> Nit: if you repost someone's patch, please add your signed-off-by.

I have a feeling this patch is horribly broken, so I'll reply on the
original.

Will


^ permalink raw reply

* Re: [PATCH v2 0/7] KVM: arm64: Forward FFA_NOTIFICATION* calls to TrustZone
From: Will Deacon @ 2026-06-14  9:29 UTC (permalink / raw)
  To: Vincent Donnefort
  Cc: Sebastian Ene, catalin.marinas, maz, oupton, joey.gouly, korneld,
	kvmarm, linux-arm-kernel, linux-kernel, android-kvm,
	mrigendra.chaubey, perlarsen, suzuki.poulose, yuzenghui
In-Reply-To: <ailtCAcEaJIgZ5Ap@google.com>

On Wed, Jun 10, 2026 at 02:56:24PM +0100, Vincent Donnefort wrote:
> On Wed, Jun 10, 2026 at 01:23:04PM +0100, Will Deacon wrote:
> > On Wed, Jun 10, 2026 at 01:15:44PM +0100, Vincent Donnefort wrote:
> > > On Wed, Jun 10, 2026 at 11:15:14AM +0100, Will Deacon wrote:
> > > > On Wed, Jun 10, 2026 at 10:26:59AM +0100, Vincent Donnefort wrote:
> > > > > On Mon, Jun 08, 2026 at 04:55:42PM +0000, Sebastian Ene wrote:
> > > > > > Remove the FFA_NOTIFICATION* calls from the blocklist used by the pKVM
> > > > > > FF-A proxy. This restriction was preventing the use of asynchronous
> > > > > > signaling mechanisms defined by the Arm FF-A specification to
> > > > > > communicate with the secure services.
> > > > > > While these calls are markes as optional, there is no reason why the
> > > > > > hypervisor proxy would block them because:
> > > > > > 
> > > > > > 1. Host is the Sole Non-Secure Endpoint: The Host operates as the
> > > > > >    only Non-Secure VM ID (VM ID 0) recognized by the Secure World.
> > > > > >    Because all forwarded notifications are inherently attributed to
> > > > > >    the Host by the SPMC, there is no risk of VM ID spoofing
> > > > > >    originating from the Normal World.
> > > > > > 
> > > > > > 2. No Memory Pointers or Addresses: The FFA_NOTIFICATION_* ABIs
> > > > > >    operate strictly via register-based parameters, passing only
> > > > > >    VM IDs, VCPU IDs, flags, and bitmaps. Because these calls do
> > > > > >    not contain memory addresses, offsets, or pointers, forwarding
> > > > > >    them doesn't pose a risk of memory-based confused deputy attack
> > > > > >    (e.g., tricking the SPMC into overwriting protected memory).
> > > > > > 
> > > > > > While the pKVM proxy behaves as a relayer, it doesn't currently have its
> > > > > > own FF-A ID(only the host has the ID 0). The behavior of the setup
> > > > > > flow is covered by the spec in the: '10.9 Notification support without
> > > > > > a Hypervisor'.
> > > > > 
> > > > > As it is only a relayer. Is it really important to check SBZ arguments and
> > > > > fields on behalf of Trustzone? It doesn't feel it brings any security. If the
> > > > > host passes broken arguments, I don't believe this puts pKVM at risk. Does it? 
> > > > 
> > > > I think the problem would be if an update to FF-A allocated some of the
> > > > currently SBZ bits to implement some functionality that we would want
> > > > to filter at EL2.
> > > 
> > > I suppose that would bump the FF-A version and the proxy would reject it?
> > 
> > Maybe? I don't think they'd _have_ to bump the version number.
> > 
> > > If we really want to check for those arguments to be 0:
> > > 
> > >  * Shouldn't we extend this check to other FF-A invocations?
> > 
> > yes, that's what the diff was doing in the reply here:
> > 
> > https://lore.kernel.org/all/af3fW468-f1KXCrC@google.com/
> > 
> > but, as I said here:
> > 
> > https://lore.kernel.org/all/ahmxiFXXTupafbXw@willie-the-truck/
> > 
> > I don't particularly like the table-driven indirection (the checks
> > should just be inlined).
> 
> Ha, sorry I'm late to the party. 
> 
> Perhaps this series should start with adding ffa_check_unused_args_sbz() to the
> existing allowed FF-A invocations?

Yes, that part now seems to be missing.

Seb, please can you respin with that included?

Will


^ permalink raw reply

* Re: [PATCH] net: airoha: Fix skb->priority underflow in airoha_dev_select_queue()
From: Lorenzo Bianconi @ 2026-06-14  8:09 UTC (permalink / raw)
  To: Wayen.Yan
  Cc: netdev, horms, pabeni, kuba, edumazet, andrew+netdev,
	angelogioacchino.delregno, matthias.bgg, linux-arm-kernel,
	linux-mediatek
In-Reply-To: <6a2de8c5.2c570c9e.53b1a.0e1b@mx.google.com>

[-- Attachment #1: Type: text/plain, Size: 1707 bytes --]

> In airoha_dev_select_queue(), the expression:
> 
>   queue = (skb->priority - 1) % AIROHA_NUM_QOS_QUEUES;
> 
> implicitly converts to unsigned arithmetic: when skb->priority is 0
> (the default for unclassified traffic), (0u - 1u) wraps to UINT_MAX,
> and UINT_MAX % 8 = 7, routing default best-effort packets to the
> highest-priority QoS queue. This causes QoS inversion where the
> majority of traffic on a PON gateway starves actual high-priority
> flows (VoIP, gaming, etc.).
> 
> Fix by guarding the subtraction: when priority is 0, map to queue 0
> (lowest priority), otherwise apply the original (priority - 1) % 8
> mapping.
> 
> Fixes: 2b288b81560b ("net: airoha: Introduce ndo_select_queue callback")
> Signed-off-by: Wayen <win847@gmail.com>

Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>

> ---
>  drivers/net/ethernet/airoha/airoha_eth.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index 31cdb11cd7..d476ef83c3 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> @@ -1933,7 +1933,7 @@ static u16 airoha_dev_select_queue(struct net_device *dev, struct sk_buff *skb,
>  	 */
>  	channel = netdev_uses_dsa(dev) ? skb_get_queue_mapping(skb) : port->id;
>  	channel = channel % AIROHA_NUM_QOS_CHANNELS;
> -	queue = (skb->priority - 1) % AIROHA_NUM_QOS_QUEUES; /* QoS queue */
> +	queue = skb->priority ? (skb->priority - 1) % AIROHA_NUM_QOS_QUEUES : 0;
>  	queue = channel * AIROHA_NUM_QOS_QUEUES + queue;
>  
>  	return queue < dev->num_tx_queues ? queue : 0;
> -- 
> 2.51.0
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply

* Re: [PATCH] net: airoha: Remove dead MT7996 NPU firmware declarations
From: Lorenzo Bianconi @ 2026-06-14  8:16 UTC (permalink / raw)
  To: Wayen.Yan
  Cc: netdev, horms, pabeni, kuba, edumazet, andrew+netdev,
	angelogioacchino.delregno, matthias.bgg, linux-arm-kernel,
	linux-mediatek
In-Reply-To: <6a2dea77.01c4f138.336eeb.a256@mx.google.com>

[-- Attachment #1: Type: text/plain, Size: 2985 bytes --]

> Remove the NPU_EN7581_7996_FIRMWARE_DATA/RV32 #define macros and
> their corresponding MODULE_FIRMWARE() declarations. Neither the
> en7581_npu_soc_data nor the an7583_npu_soc_data references these
> firmware names, and no firmware loading path in the driver ever
> requests them. The only references are the #define lines themselves
> and the MODULE_FIRMWARE() declarations below.
> 
> Keeping dead MODULE_FIRMWARE entries causes modprobe/udev to attempt
> pre-loading non-existent firmware files, generating kernel log noise
> and misleading distributors about which firmware files to package.
> 
> Fixes: 23290c7bc190 ("net: airoha: Introduce Airoha NPU support")
> Signed-off-by: Wayen <win847@gmail.com>

Please drop this patch since EN7581_7996 firmware is defined via dts
for 7581:

commit 3847173525e307ebcd23bd4863da943ea78b0057
Author: Lorenzo Bianconi <lorenzo@kernel.org>
Date:   Tue Jan 20 11:17:18 2026 +0100

    net: airoha: npu: Add the capability to read firmware names from dts
    
    Introduce the capability to read the firmware binary names from device-tree
    using the firmware-name property if available.
    This patch is needed because NPU firmware binaries are board specific since
    they depend on the MediaTek WiFi chip used on the board (e.g. MT7996 or
    MT7992) and the WiFi chip version info is not available in the NPU driver.
    This is a preliminary patch to enable MT76 NPU offloading if the Airoha SoC
    is equipped with MT7996 (Eagle) WiFi chipset.

https://github.com/openwrt/openwrt/blob/main/target/linux/airoha/dts/an7581-npu-mt7996.dtsi

and here these macros are used to notify userspace for firmware loading.

Regards,
Lorenzo

> ---
>  drivers/net/ethernet/airoha/airoha_npu.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/airoha/airoha_npu.c b/drivers/net/ethernet/airoha/airoha_npu.c
> index 17dbdc8325..93095f3894 100644
> --- a/drivers/net/ethernet/airoha/airoha_npu.c
> +++ b/drivers/net/ethernet/airoha/airoha_npu.c
> @@ -16,8 +16,6 @@
>  
>  #define NPU_EN7581_FIRMWARE_DATA		"airoha/en7581_npu_data.bin"
>  #define NPU_EN7581_FIRMWARE_RV32		"airoha/en7581_npu_rv32.bin"
> -#define NPU_EN7581_7996_FIRMWARE_DATA		"airoha/en7581_MT7996_npu_data.bin"
> -#define NPU_EN7581_7996_FIRMWARE_RV32		"airoha/en7581_MT7996_npu_rv32.bin"
>  #define NPU_AN7583_FIRMWARE_DATA		"airoha/an7583_npu_data.bin"
>  #define NPU_AN7583_FIRMWARE_RV32		"airoha/an7583_npu_rv32.bin"
>  #define NPU_EN7581_FIRMWARE_RV32_MAX_SIZE	0x200000
> @@ -822,8 +820,6 @@ module_platform_driver(airoha_npu_driver);
>  
>  MODULE_FIRMWARE(NPU_EN7581_FIRMWARE_DATA);
>  MODULE_FIRMWARE(NPU_EN7581_FIRMWARE_RV32);
> -MODULE_FIRMWARE(NPU_EN7581_7996_FIRMWARE_DATA);
> -MODULE_FIRMWARE(NPU_EN7581_7996_FIRMWARE_RV32);
>  MODULE_FIRMWARE(NPU_AN7583_FIRMWARE_DATA);
>  MODULE_FIRMWARE(NPU_AN7583_FIRMWARE_RV32);
>  MODULE_LICENSE("GPL");
> -- 
> 2.51.0
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply

* Re:Re: [PATCH v3] net: stmmac: fix fatal bus error on resume by reinitializing RX buffers
From: Ding Hui @ 2026-06-14  6:14 UTC (permalink / raw)
  To: kuba
  Cc: alexandre.torgue, andrew+netdev, davem, dinghui1111, dinghui,
	edumazet, j.raczynski, linux-arm-kernel, linux-kernel,
	linux-stm32, liuxuanjun, maxime.chevallier, mcoquelin.stm32,
	netdev, pabeni, rmk+kernel, xiasanbo, yangchen11
In-Reply-To: <20260608193059.78e05dce@kernel.org>

At 2026-06-09 10:30:59, "Jakub Kicinski" <kuba@kernel.org> wrote:
>On Thu,  4 Jun 2026 22:45:54 +0800 Ding Hui wrote:
>> +/**
>> + * stmmac_reinit_rx_descriptors - re-program RX descriptor buffer addresses
>> + *				   after stmmac_clear_descriptors()
>> + * @priv: driver private structure
>> + * @dma_conf: structure holding the dma data
>> + * @queue: RX queue index
>
>nit:
>
>kernel-doc script says:
>
>Warning: drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:1733 No description found for return value of 'stmmac_reinit_rx_descriptors'
>
>You need a Returns: statement in this kdoc
>-- 
>pw-bot: cr

Sorry for late reply. I will update a new version for it. Thanks.



^ permalink raw reply

* Re:Re: [PATCH v3] net: stmmac: fix fatal bus error on resume by reinitializing RX buffers
From: Ding Hui @ 2026-06-14  6:02 UTC (permalink / raw)
  To: j.raczynski
  Cc: alexandre.torgue, andrew+netdev, davem, dinghui1111, dinghui,
	edumazet, kuba, linux-arm-kernel, linux-kernel, linux-stm32,
	liuxuanjun, maxime.chevallier, mcoquelin.stm32, netdev, pabeni,
	rmk+kernel, xiasanbo, yangchen11
In-Reply-To: <aiaORbb0lZVxDg8L@AMDC4622.eu.corp.samsungelectronics.net>

At 2026-06-08 17:41:25, "Jakub Raczynski" <j.raczynski@samsung.com> wrote:
>On Thu, Jun 04, 2026 at 10:45:54PM +0800, Ding Hui wrote:
>> From: Ding Hui <dinghui@lixiang.com>
>> +	for (queue = 0; queue < priv->plat->rx_queues_to_use; queue++) {
>> +		ret = stmmac_reinit_rx_descriptors(priv, &priv->dma_conf,
>> +						   queue);
>> +		if (ret) {
>> +			netdev_err(priv->dev,
>> +				   "%s: rx desc reinit failed on queue %u\n",
>> +				   __func__, queue);
>> +			mutex_unlock(&priv->lock);
>> +			rtnl_unlock();
>> +			return ret;
>> +		}
>> +	}
>
>This is not directly related to the patch, but rather stmmac_resume() itself,
>but doesn't this return and hw_setup one leave bunch of descriptor memory
>hanging and effectively leaked?
>
>> +
>>  	ret = stmmac_hw_setup(ndev);
>>  	if (ret < 0) {
>>  		netdev_err(priv->dev, "%s: Hw setup failed\n", __func__);
>> -- 
>

You are right that both error paths leave the descriptor rings and RX
buffers allocated without an explicit cleanup. However, I prefer to call
it a memory "hanging" but not "leaked":

The memory is not permanently leaked. All RX buffers allocated in the
error path are stored in dma_conf->rx_queue[q].buf_pool[].page (or
.xdp for XSK queues), and the DMA descriptor rings themselves remain
reachable via priv->dma_conf. When the user eventually brings the
interface down, stmmac_release() -> free_dma_desc_resources() will
free everything correctly.

Maybe I should submit a follow-up patch that adds proper cleanup to
stmmac_resume()'s error paths (calling free_dma_desc_resources() and
marking the device as not running), if that would be welcome. I'd
prefer to keep it separate from this fix to keep the scope clean.

>Other than that, I don't see any obvious issues.
>

Thanks for the review.



^ permalink raw reply

* [RFC PATCH net-next 7/7] net: airoha: add SOE XFRM packet offload support
From: Jihong Min @ 2026-06-14  4:00 UTC (permalink / raw)
  To: netdev, Lorenzo Bianconi
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Simon Horman, Herbert Xu, Steffen Klassert,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, devicetree,
	Matthias Brugger, AngeloGioacchino Del Regno, linux-arm-kernel,
	linux-mediatek, Christian Marangi, Felix Fietkau, linux-kernel,
	Jihong Min
In-Reply-To: <20260614040032.1567994-1-hurryman2212@gmail.com>

Add the EN7581 Secure Offload Engine provider. The provider programs ESP
SAs, exposes NETIF_F_HW_ESP through xfrmdev_ops, submits encrypt and
decrypt packets through the QDMA SOE path, and handles SOE completion
delivery.

Mirror the XFRM ops to DSA user devices whose CPU conduit is an Airoha
netdev so packet offload remains available through switch ports.

Signed-off-by: Jihong Min <hurryman2212@gmail.com>
---
 drivers/net/ethernet/airoha/Kconfig      |   13 +
 drivers/net/ethernet/airoha/Makefile     |    1 +
 drivers/net/ethernet/airoha/airoha_soe.c | 1896 ++++++++++++++++++++++
 3 files changed, 1910 insertions(+)
 create mode 100644 drivers/net/ethernet/airoha/airoha_soe.c

diff --git a/drivers/net/ethernet/airoha/Kconfig b/drivers/net/ethernet/airoha/Kconfig
index ad3ce501e7a5..a20e9dd0bfde 100644
--- a/drivers/net/ethernet/airoha/Kconfig
+++ b/drivers/net/ethernet/airoha/Kconfig
@@ -31,4 +31,17 @@ config NET_AIROHA_FLOW_STATS
 	help
 	  Enable Aiorha flowtable statistic counters.
 
+config NET_AIROHA_SOE
+	bool "Airoha SOE ESP offload support"
+	depends on NET_AIROHA
+	depends on INET
+	select XFRM
+	select XFRM_OFFLOAD
+	help
+	  Enable support for the Airoha Secure Offload Engine used by
+	  the Ethernet driver for ESP packet offload. This option only
+	  adds the provider and netdev plumbing; ESP offload is still
+	  advertised at runtime only when the SOE block and required
+	  packet offload path are available.
+
 endif #NET_VENDOR_AIROHA
diff --git a/drivers/net/ethernet/airoha/Makefile b/drivers/net/ethernet/airoha/Makefile
index 94468053e34b..b68b8f614b0e 100644
--- a/drivers/net/ethernet/airoha/Makefile
+++ b/drivers/net/ethernet/airoha/Makefile
@@ -6,4 +6,5 @@
 obj-$(CONFIG_NET_AIROHA) += airoha-eth.o
 airoha-eth-y := airoha_eth.o airoha_ppe.o
 airoha-eth-$(CONFIG_DEBUG_FS) += airoha_ppe_debugfs.o
+airoha-eth-$(CONFIG_NET_AIROHA_SOE) += airoha_soe.o
 obj-$(CONFIG_NET_AIROHA_NPU) += airoha_npu.o
diff --git a/drivers/net/ethernet/airoha/airoha_soe.c b/drivers/net/ethernet/airoha/airoha_soe.c
new file mode 100644
index 000000000000..3a240ed44d7f
--- /dev/null
+++ b/drivers/net/ethernet/airoha/airoha_soe.c
@@ -0,0 +1,1896 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Airoha Secure Offload Engine (SOE) provider for the Ethernet driver.
+ *
+ * This file owns the EN7581 SOE packet-offload glue used by airoha_eth:
+ * xfrm state programming, hop-descriptor TX metadata, SOE RX completion
+ * decoding, and DSA proxy netdev binding. The SOE block is reached through
+ * the FE/QDMA packet fabric and is initialized by the Ethernet driver rather
+ * than by a separate platform driver.
+ */
+
+#include <linux/atomic.h>
+#include <linux/bitfield.h>
+#include <linux/completion.h>
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+#include <linux/iopoll.h>
+#include <linux/io.h>
+#include <linux/ipv6.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/moduleparam.h>
+#include <linux/netdevice.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/rcupdate.h>
+#include <linux/refcount.h>
+#include <linux/skbuff.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/string.h>
+#include <linux/udp.h>
+#include <linux/unaligned.h>
+
+#include <net/dst.h>
+#include <net/esp.h>
+#include <net/gso.h>
+#include <net/ip.h>
+#include <net/net_namespace.h>
+#include <net/xfrm.h>
+
+#include "airoha_eth.h"
+#include "airoha_regs.h"
+#include "airoha_soe.h"
+
+#define AIROHA_SOE_NUM_SA 32
+#define AIROHA_SOE_QDMA_HOP_DESC_LEN 32
+#define AIROHA_SOE_KEY_WORDS 8
+#define AIROHA_SOE_ADDR_WORDS 4
+#define AIROHA_SOE_SA_TIMEOUT_US 1000
+#define AIROHA_SOE_SA_FREE_TIMEOUT HZ
+#define AIROHA_SOE_HOP_DESC0_ENCRYPT 0xffffff81ULL
+#define AIROHA_SOE_HOP_DESC0_DECRYPT 0xffffff82ULL
+#define AIROHA_SOE_HOP_DESC1 0x1ff00000000ULL
+#define AIROHA_SOE_QDMA_TX_RING 2
+#define AIROHA_SOE_TXMSG2_DEFAULT 0xff00ffff
+
+/* This is the packet/IPsec SOE window at 0x1fbfa000. EN7581 E2 exposes
+ * this register block for packet processing, not standalone crypto offload.
+ */
+#define AIROHA_SOE_GLB_CFG 0x000
+#define AIROHA_SOE_GLB_CFG_ENC_EN BIT(0)
+#define AIROHA_SOE_GLB_CFG_DEC_EN BIT(1)
+#define AIROHA_SOE_CONT_ICV_CTRL 0x004
+#define AIROHA_SOE_INT_EN 0x020
+#define AIROHA_SOE_INT_STS 0x024
+#define AIROHA_SOE_INT_ALL GENMASK(15, 0)
+#define AIROHA_SOE_CNT_CLR 0x04c
+#define AIROHA_SOE_CNT_CLR_ALL BIT(0)
+#define AIROHA_SOE_SA_CTRL 0x100
+#define AIROHA_SOE_SA_DONE 0x104
+#define AIROHA_SOE_SA_CMD 0x110
+#define AIROHA_SOE_BCNT_THSHD_32_SOFT 0x114
+#define AIROHA_SOE_BCNT_THSHD_64_SOFT 0x118
+#define AIROHA_SOE_SA_SPI 0x11c
+#define AIROHA_SOE_SA_UDP_PORT 0x120
+#define AIROHA_SOE_SA_ENC_KEY(n) (0x124 + (n) * 4)
+#define AIROHA_SOE_SA_HMAC_KEY(n) (0x144 + (n) * 4)
+#define AIROHA_SOE_SA_SRC_ADDR(n) (0x164 + (n) * 4)
+#define AIROHA_SOE_SA_DST_ADDR(n) (0x174 + (n) * 4)
+#define AIROHA_SOE_ICV_OK_LO_CNT 0x184
+#define AIROHA_SOE_ICV_OK_HI_CNT 0x188
+#define AIROHA_SOE_ICV_FAIL_LO_CNT 0x18c
+#define AIROHA_SOE_ICV_FAIL_HI_CNT 0x190
+#define AIROHA_SOE_CON_ICV_FAIL_CNT 0x194
+#define AIROHA_SOE_SEQ_NUM_LO 0x198
+#define AIROHA_SOE_SEQ_NUM_HI 0x19c
+#define AIROHA_SOE_BCNT_LO 0x1a0
+#define AIROHA_SOE_BCNT_HI 0x1a4
+#define AIROHA_SOE_FLOW_LAB_DSCP 0x1a8
+#define AIROHA_SOE_BCNT_80 0x1ac
+#define AIROHA_SOE_BCNT_THSHD_80 0x1b0
+#define AIROHA_SOE_BCNT_THSHD_32_HARD 0x1b4
+#define AIROHA_SOE_BCNT_THSHD_64_HARD 0x1b8
+#define AIROHA_SOE_SEQ_THSHD_32_SOFT 0x1bc
+#define AIROHA_SOE_SEQ_THSHD_64_SOFT 0x1c0
+#define AIROHA_SOE_SEQ_THSHD_32_HARD 0x1c4
+#define AIROHA_SOE_SEQ_THSHD_64_HARD 0x1c8
+#define AIROHA_SOE_SA_CTRL_WR BIT(0)
+#define AIROHA_SOE_SA_CTRL_IDX GENMASK(15, 8)
+#define AIROHA_SOE_SA_DONE_W1C BIT(0)
+
+#define AIROHA_SOE_SA_CMD_ENC BIT(0)
+#define AIROHA_SOE_SA_CMD_CIPHER GENMASK(3, 1)
+#define AIROHA_SOE_SA_CMD_HASH GENMASK(6, 4)
+#define AIROHA_SOE_SA_CMD_AES_KEY_LEN GENMASK(8, 7)
+#define AIROHA_SOE_SA_CMD_ESN_EN BIT(9)
+#define AIROHA_SOE_SA_CMD_OUT_IPV6 BIT(10)
+#define AIROHA_SOE_SA_CMD_ESP_MODE BIT(11) /* 0=tunnel, 1=transport */
+#define AIROHA_SOE_SA_CMD_NAT_EN BIT(12)
+#define AIROHA_SOE_SA_CMD_ANTI_RPLY_EN BIT(13)
+#define AIROHA_SOE_SA_CMD_ANTI_RPLY_WDW GENMASK(15, 14)
+#define AIROHA_SOE_SA_CMD_SN_ERR_DROP BIT(16)
+#define AIROHA_SOE_SA_CMD_PAD_ERR_DROP BIT(17)
+#define AIROHA_SOE_SA_CMD_ICV_ERR_DROP BIT(18)
+#define AIROHA_SOE_SA_CMD_GCM_ICV_LEN GENMASK(25, 24)
+#define AIROHA_SOE_SA_CMD_DEC_UDP_PARSER_EN BIT(29)
+#define AIROHA_SOE_SA_CMD_VLD BIT(31)
+
+#define AIROHA_SOE_CIPHER_AES_CBC 1
+#define AIROHA_SOE_CIPHER_AES_GCM 2
+#define AIROHA_SOE_HASH_HMAC_SHA1_96 1
+#define AIROHA_SOE_HASH_HMAC_SHA256_128 2
+#define AIROHA_SOE_AES_KEY_128 0
+#define AIROHA_SOE_AES_KEY_192 1
+#define AIROHA_SOE_AES_KEY_256 2
+#define AIROHA_SOE_QDMA_QUEUE_ENCRYPT 8
+#define AIROHA_SOE_QDMA_QUEUE_DECRYPT 9
+#define AIROHA_SOE_NATT_PORT 4500
+#define AIROHA_SOE_HOP_FLAG_ENCRYPTED 3
+#define AIROHA_SOE_HOP_FLAG_DECRYPTED 4
+#define AIROHA_SOE_HOP_FLAG_ERROR_BASE 5
+#define AIROHA_SOE_HOP_INFO_ENCRYPT 2
+#define AIROHA_SOE_HOP_INFO_DECRYPT 3
+
+static unsigned int airoha_soe_rx_trace_packets;
+module_param_named(soe_rx_trace_packets, airoha_soe_rx_trace_packets, uint,
+		   0600);
+MODULE_PARM_DESC(soe_rx_trace_packets,
+		 "Number of SOE RX completion IPv4 headers to log");
+
+enum airoha_soe_ctx_dir {
+	AIROHA_SOE_CTX_OUT,
+	AIROHA_SOE_CTX_IN,
+};
+
+struct airoha_soe_ctx {
+	struct list_head list;
+	enum airoha_soe_ctx_dir dir;
+	union {
+		struct dst_entry *dst;
+		struct {
+			struct xfrm_state *x;
+			struct airoha_gdm_dev *gdm_dev;
+			struct net_device *dev;
+			__be32 saddr;
+			__be16 sport;
+			u16 foe_hash;
+			u32 foe_reason;
+			u8 sa_index;
+			bool foe_valid;
+			u32 mark;
+		} rx;
+	};
+};
+
+struct airoha_soe_sa {
+	struct airoha_soe *soe;
+	unsigned int index;
+	u32 cmd;
+	u32 spi;
+
+	spinlock_t lock; /* Protects in-flight context queues and dead. */
+	struct list_head tx_queue;
+	struct list_head rx_queue;
+	struct completion idle;
+	unsigned int inflight;
+	bool dead;
+};
+
+struct airoha_soe_xfrm_state {
+	struct airoha_gdm_dev *dev;
+	struct airoha_soe *soe;
+	struct airoha_soe_sa *sa;
+	bool counted;
+};
+
+struct airoha_soe_sa_cfg {
+	u32 cmd;
+	u32 spi;
+	u32 udp_port;
+	u32 enc_key[AIROHA_SOE_KEY_WORDS];
+	u32 hmac_key[AIROHA_SOE_KEY_WORDS];
+	u32 src_addr[AIROHA_SOE_ADDR_WORDS];
+	u32 dst_addr[AIROHA_SOE_ADDR_WORDS];
+	u64 soft_byte_limit;
+	u64 hard_byte_limit;
+	u64 soft_packet_limit;
+	u64 hard_packet_limit;
+};
+
+struct airoha_soe_rx_info {
+	int packet_len;
+	bool encap;
+	__be16 sport;
+	__be16 dport;
+	__be32 spi;
+};
+
+struct airoha_soe {
+	struct device *dev;
+	void __iomem *base;
+
+	/* Serialize SA table programming and software slot ownership. */
+	struct mutex sa_lock;
+	unsigned long sa_map;
+	struct airoha_soe_sa __rcu *sa[AIROHA_SOE_NUM_SA];
+	atomic_t pending_rx;
+
+	spinlock_t state_lock; /* Protects dead against concurrent users. */
+	refcount_t refcnt;
+	struct completion released;
+	bool dead;
+};
+
+static const struct xfrmdev_ops airoha_soe_xfrmdev_ops;
+static const struct xfrmdev_ops airoha_soe_dsa_xfrmdev_ops;
+
+static struct airoha_soe *airoha_soe_get_ref(struct airoha_soe *soe)
+{
+	unsigned long flags;
+	bool alive;
+
+	if (!soe)
+		return NULL;
+
+	spin_lock_irqsave(&soe->state_lock, flags);
+	alive = !soe->dead && refcount_inc_not_zero(&soe->refcnt);
+	spin_unlock_irqrestore(&soe->state_lock, flags);
+
+	return alive ? soe : NULL;
+}
+
+static void airoha_soe_put_ref(struct airoha_soe *soe)
+{
+	if (soe && refcount_dec_and_test(&soe->refcnt))
+		complete(&soe->released);
+}
+
+bool airoha_soe_available(struct airoha_soe *soe)
+{
+	unsigned long flags;
+	bool available;
+
+	if (!soe)
+		return false;
+
+	spin_lock_irqsave(&soe->state_lock, flags);
+	available = !soe->dead;
+	spin_unlock_irqrestore(&soe->state_lock, flags);
+
+	return available;
+}
+
+u32 airoha_soe_features(struct airoha_soe *soe)
+{
+	return airoha_soe_available(soe) ? AIROHA_SOE_FEATURE_ESP : 0;
+}
+
+static u64 airoha_soe_limit(u64 limit)
+{
+	return limit == XFRM_INF ? U64_MAX : limit;
+}
+
+static int airoha_soe_wait_sa_done(struct airoha_soe *soe)
+{
+	u32 done;
+	int err;
+
+	err = readl_poll_timeout(soe->base + AIROHA_SOE_SA_DONE, done,
+				 done & AIROHA_SOE_SA_DONE_W1C, 1,
+				 AIROHA_SOE_SA_TIMEOUT_US);
+	writel(0, soe->base + AIROHA_SOE_SA_CTRL);
+	writel(AIROHA_SOE_SA_DONE_W1C, soe->base + AIROHA_SOE_SA_DONE);
+
+	return err;
+}
+
+static int airoha_soe_commit_sa(struct airoha_soe *soe, unsigned int index)
+{
+	u32 ctrl;
+
+	/* SA registers are a single staging window committed by index. */
+	writel(AIROHA_SOE_SA_DONE_W1C, soe->base + AIROHA_SOE_SA_DONE);
+	ctrl = FIELD_PREP(AIROHA_SOE_SA_CTRL_IDX, index) |
+	       AIROHA_SOE_SA_CTRL_WR;
+	writel(ctrl, soe->base + AIROHA_SOE_SA_CTRL);
+
+	return airoha_soe_wait_sa_done(soe);
+}
+
+static void airoha_soe_write_key(void __iomem *base, u32 reg, const u32 *key)
+{
+	unsigned int i;
+
+	for (i = 0; i < AIROHA_SOE_KEY_WORDS; i++)
+		writel(key[i], base + reg + i * sizeof(u32));
+}
+
+static void airoha_soe_write_addr(void __iomem *base, u32 reg, const u32 *addr)
+{
+	unsigned int i;
+
+	for (i = 0; i < AIROHA_SOE_ADDR_WORDS; i++)
+		writel(addr[i], base + reg + i * sizeof(u32));
+}
+
+static int airoha_soe_program_sa_locked(struct airoha_soe *soe,
+					unsigned int index,
+					const struct airoha_soe_sa_cfg *cfg)
+{
+	void __iomem *base = soe->base;
+
+	writel(cfg->cmd | AIROHA_SOE_SA_CMD_VLD, base + AIROHA_SOE_SA_CMD);
+	writel(lower_32_bits(cfg->soft_byte_limit),
+	       base + AIROHA_SOE_BCNT_THSHD_32_SOFT);
+	writel(upper_32_bits(cfg->soft_byte_limit),
+	       base + AIROHA_SOE_BCNT_THSHD_64_SOFT);
+	writel(cfg->spi, base + AIROHA_SOE_SA_SPI);
+	writel(cfg->udp_port, base + AIROHA_SOE_SA_UDP_PORT);
+	airoha_soe_write_key(base, AIROHA_SOE_SA_ENC_KEY(0), cfg->enc_key);
+	airoha_soe_write_key(base, AIROHA_SOE_SA_HMAC_KEY(0), cfg->hmac_key);
+	airoha_soe_write_addr(base, AIROHA_SOE_SA_SRC_ADDR(0), cfg->src_addr);
+	airoha_soe_write_addr(base, AIROHA_SOE_SA_DST_ADDR(0), cfg->dst_addr);
+
+	writel(0, base + AIROHA_SOE_ICV_OK_LO_CNT);
+	writel(0, base + AIROHA_SOE_ICV_OK_HI_CNT);
+	writel(0, base + AIROHA_SOE_ICV_FAIL_LO_CNT);
+	writel(0, base + AIROHA_SOE_ICV_FAIL_HI_CNT);
+	writel(0, base + AIROHA_SOE_CON_ICV_FAIL_CNT);
+	writel(0, base + AIROHA_SOE_SEQ_NUM_LO);
+	writel(0, base + AIROHA_SOE_SEQ_NUM_HI);
+	writel(0, base + AIROHA_SOE_BCNT_LO);
+	writel(0, base + AIROHA_SOE_BCNT_HI);
+	writel(0, base + AIROHA_SOE_FLOW_LAB_DSCP);
+	writel(0, base + AIROHA_SOE_BCNT_80);
+	writel(0xffffffff, base + AIROHA_SOE_BCNT_THSHD_80);
+	writel(lower_32_bits(cfg->hard_byte_limit),
+	       base + AIROHA_SOE_BCNT_THSHD_32_HARD);
+	writel(upper_32_bits(cfg->hard_byte_limit),
+	       base + AIROHA_SOE_BCNT_THSHD_64_HARD);
+	writel(lower_32_bits(cfg->soft_packet_limit),
+	       base + AIROHA_SOE_SEQ_THSHD_32_SOFT);
+	writel(upper_32_bits(cfg->soft_packet_limit),
+	       base + AIROHA_SOE_SEQ_THSHD_64_SOFT);
+	writel(lower_32_bits(cfg->hard_packet_limit),
+	       base + AIROHA_SOE_SEQ_THSHD_32_HARD);
+	writel(upper_32_bits(cfg->hard_packet_limit),
+	       base + AIROHA_SOE_SEQ_THSHD_64_HARD);
+
+	return airoha_soe_commit_sa(soe, index);
+}
+
+static int airoha_soe_clear_sa_locked(struct airoha_soe *soe,
+				      unsigned int index)
+{
+	struct airoha_soe_sa_cfg cfg = {};
+
+	return airoha_soe_program_sa_locked(soe, index, &cfg);
+}
+
+static void airoha_soe_copy_words(u32 *dst, const u8 *src, unsigned int bits)
+{
+	unsigned int words = bits / (BITS_PER_BYTE * sizeof(u32));
+	unsigned int i;
+
+	for (i = 0; i < words && i < AIROHA_SOE_KEY_WORDS; i++)
+		dst[i] = get_unaligned_be32(src + i * sizeof(u32));
+}
+
+static int airoha_soe_aes_key_len(unsigned int bits,
+				  struct netlink_ext_ack *extack, u32 *val)
+{
+	switch (bits) {
+	case 128:
+		*val = AIROHA_SOE_AES_KEY_128;
+		return 0;
+	case 192:
+		*val = AIROHA_SOE_AES_KEY_192;
+		return 0;
+	case 256:
+		*val = AIROHA_SOE_AES_KEY_256;
+		return 0;
+	default:
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SOE supports AES-128/192/256 keys only");
+		return -EOPNOTSUPP;
+	}
+}
+
+static int airoha_soe_build_algo(struct xfrm_state *x,
+				 struct airoha_soe_sa_cfg *cfg,
+				 struct netlink_ext_ack *extack)
+{
+	u32 key_len;
+	u32 field;
+	int err;
+
+	if (x->aead) {
+		if (strcmp(x->aead->alg_name, "rfc4106(gcm(aes))")) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "SOE supports rfc4106(gcm(aes)) AEAD only");
+			return -EOPNOTSUPP;
+		}
+
+		if (x->aead->alg_key_len < 32) {
+			NL_SET_ERR_MSG_MOD(extack, "invalid AEAD key length");
+			return -EINVAL;
+		}
+
+		key_len = x->aead->alg_key_len - 32;
+		err = airoha_soe_aes_key_len(key_len, extack, &field);
+		if (err)
+			return err;
+
+		cfg->cmd |= FIELD_PREP(AIROHA_SOE_SA_CMD_CIPHER,
+				       AIROHA_SOE_CIPHER_AES_GCM);
+		cfg->cmd |= FIELD_PREP(AIROHA_SOE_SA_CMD_AES_KEY_LEN, field);
+		switch (x->aead->alg_icv_len) {
+		case 64:
+			field = 0;
+			break;
+		case 96:
+			field = 1;
+			break;
+		case 128:
+			field = 2;
+			break;
+		default:
+			NL_SET_ERR_MSG_MOD(extack,
+					   "SOE supports 64/96/128-bit GCM ICV only");
+			return -EOPNOTSUPP;
+		}
+		cfg->cmd |= FIELD_PREP(AIROHA_SOE_SA_CMD_GCM_ICV_LEN, field);
+		airoha_soe_copy_words(cfg->enc_key, x->aead->alg_key, key_len);
+		cfg->hmac_key[0] =
+			get_unaligned_be32(x->aead->alg_key + key_len / 8);
+		return 0;
+	}
+
+	if (!x->ealg || strcmp(x->ealg->alg_name, "cbc(aes)")) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SOE supports cbc(aes) encryption only");
+		return -EOPNOTSUPP;
+	}
+
+	err = airoha_soe_aes_key_len(x->ealg->alg_key_len, extack, &field);
+	if (err)
+		return err;
+
+	cfg->cmd |=
+		FIELD_PREP(AIROHA_SOE_SA_CMD_CIPHER, AIROHA_SOE_CIPHER_AES_CBC);
+	cfg->cmd |= FIELD_PREP(AIROHA_SOE_SA_CMD_AES_KEY_LEN, field);
+	airoha_soe_copy_words(cfg->enc_key, x->ealg->alg_key,
+			      x->ealg->alg_key_len);
+
+	if (!x->aalg) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SOE CBC mode requires HMAC authentication");
+		return -EOPNOTSUPP;
+	}
+
+	if (!strcmp(x->aalg->alg_name, "hmac(sha1)")) {
+		if (x->aalg->alg_key_len != 160 ||
+		    x->aalg->alg_trunc_len != 96) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "SOE supports HMAC-SHA1-96 only");
+			return -EOPNOTSUPP;
+		}
+		field = AIROHA_SOE_HASH_HMAC_SHA1_96;
+	} else if (!strcmp(x->aalg->alg_name, "hmac(sha256)")) {
+		if (x->aalg->alg_key_len != 256 ||
+		    x->aalg->alg_trunc_len != 128) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "SOE supports HMAC-SHA256-128 only");
+			return -EOPNOTSUPP;
+		}
+		field = AIROHA_SOE_HASH_HMAC_SHA256_128;
+	} else {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SOE supports HMAC-SHA1/SHA256 only");
+		return -EOPNOTSUPP;
+	}
+
+	cfg->cmd |= FIELD_PREP(AIROHA_SOE_SA_CMD_HASH, field);
+	airoha_soe_copy_words(cfg->hmac_key, x->aalg->alg_key,
+			      x->aalg->alg_key_len);
+
+	return 0;
+}
+
+static int airoha_soe_build_replay(struct xfrm_state *x,
+				   struct airoha_soe_sa_cfg *cfg,
+				   struct netlink_ext_ack *extack)
+{
+	u32 window;
+
+	if ((x->props.flags & XFRM_STATE_ESN) ||
+	    x->repl_mode == XFRM_REPLAY_MODE_ESN) {
+		NL_SET_ERR_MSG_MOD(extack, "SOE ESN is not supported yet");
+		return -EOPNOTSUPP;
+	}
+
+	window = x->replay_esn ? x->replay_esn->replay_window :
+				 x->props.replay_window;
+	if (!window)
+		return 0;
+
+	cfg->cmd |= AIROHA_SOE_SA_CMD_ANTI_RPLY_EN;
+	cfg->cmd |= FIELD_PREP(AIROHA_SOE_SA_CMD_ANTI_RPLY_WDW,
+			       min_t(u32, (window - 1) / 64, 3));
+
+	return 0;
+}
+
+static int airoha_soe_build_sa(struct xfrm_state *x,
+			       struct airoha_soe_sa_cfg *cfg,
+			       struct netlink_ext_ack *extack)
+{
+	int err;
+
+	if (x->xso.type != XFRM_DEV_OFFLOAD_PACKET) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SOE supports XFRM packet offload only");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->xso.dir != XFRM_DEV_OFFLOAD_OUT &&
+	    x->xso.dir != XFRM_DEV_OFFLOAD_IN) {
+		NL_SET_ERR_MSG_MOD(extack, "SOE supports in/out SAs only");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->id.proto != IPPROTO_ESP) {
+		NL_SET_ERR_MSG_MOD(extack, "SOE supports ESP only");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->props.family != AF_INET || x->outer_mode.family != AF_INET) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SOE bring-up supports IPv4 outer tunnel only");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->props.mode != XFRM_MODE_TUNNEL) {
+		NL_SET_ERR_MSG_MOD(extack, "SOE supports tunnel mode only");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->encap && x->encap->encap_type != UDP_ENCAP_ESPINUDP) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SOE supports native ESP or UDP_ENCAP_ESPINUDP");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->tfcpad) {
+		NL_SET_ERR_MSG_MOD(extack, "SOE does not support TFC padding");
+		return -EOPNOTSUPP;
+	}
+
+	cfg->cmd = AIROHA_SOE_SA_CMD_SN_ERR_DROP |
+		   AIROHA_SOE_SA_CMD_PAD_ERR_DROP |
+		   AIROHA_SOE_SA_CMD_ICV_ERR_DROP;
+	if (x->xso.dir == XFRM_DEV_OFFLOAD_OUT) {
+		cfg->cmd |= AIROHA_SOE_SA_CMD_ENC;
+		if (x->encap)
+			cfg->cmd |= AIROHA_SOE_SA_CMD_NAT_EN;
+		cfg->src_addr[0] = be32_to_cpu(x->props.saddr.a4);
+		cfg->dst_addr[0] = be32_to_cpu(x->id.daddr.a4);
+	} else if (x->encap) {
+		/* RX submit passes the full UDP/4500 packet to SOE. Ask the
+		 * decrypt parser to consume the UDP header before ESP decap.
+		 */
+		cfg->cmd |= AIROHA_SOE_SA_CMD_DEC_UDP_PARSER_EN;
+	}
+
+	err = airoha_soe_build_algo(x, cfg, extack);
+	if (err)
+		return err;
+
+	err = airoha_soe_build_replay(x, cfg, extack);
+	if (err)
+		return err;
+
+	cfg->spi = be32_to_cpu(x->id.spi);
+	if (x->encap) {
+		/* The NAT-T port word stores dport above sport. */
+		cfg->udp_port = (u32)ntohs(x->encap->encap_dport) << 16 |
+				ntohs(x->encap->encap_sport);
+	}
+	cfg->soft_byte_limit = airoha_soe_limit(x->lft.soft_byte_limit);
+	cfg->hard_byte_limit = airoha_soe_limit(x->lft.hard_byte_limit);
+	cfg->soft_packet_limit = airoha_soe_limit(x->lft.soft_packet_limit);
+	cfg->hard_packet_limit = airoha_soe_limit(x->lft.hard_packet_limit);
+
+	return 0;
+}
+
+static int airoha_soe_alloc_sa(struct airoha_soe *soe, struct xfrm_state *x,
+			       struct netlink_ext_ack *extack,
+			       struct airoha_soe_sa **sa)
+{
+	struct airoha_soe_sa_cfg cfg = {};
+	struct airoha_soe_sa *new_sa;
+	unsigned int i;
+	int err;
+
+	if (!soe || !sa || !airoha_soe_available(soe)) {
+		NL_SET_ERR_MSG_MOD(extack, "SOE provider is unavailable");
+		return -ENODEV;
+	}
+
+	err = airoha_soe_build_sa(x, &cfg, extack);
+	if (err)
+		return err;
+
+	new_sa = kzalloc_obj(*new_sa, GFP_KERNEL);
+	if (!new_sa)
+		return -ENOMEM;
+
+	mutex_lock(&soe->sa_lock);
+	for (i = 0; i < AIROHA_SOE_NUM_SA; i++) {
+		if (!(soe->sa_map & BIT(i)))
+			break;
+	}
+	if (i == AIROHA_SOE_NUM_SA) {
+		mutex_unlock(&soe->sa_lock);
+		kfree(new_sa);
+		return -ENOSPC;
+	}
+
+	err = airoha_soe_program_sa_locked(soe, i, &cfg);
+	if (err) {
+		mutex_unlock(&soe->sa_lock);
+		kfree(new_sa);
+		return err;
+	}
+
+	new_sa->soe = soe;
+	new_sa->index = i;
+	new_sa->cmd = cfg.cmd;
+	new_sa->spi = cfg.spi;
+	spin_lock_init(&new_sa->lock);
+	INIT_LIST_HEAD(&new_sa->tx_queue);
+	INIT_LIST_HEAD(&new_sa->rx_queue);
+	init_completion(&new_sa->idle);
+	rcu_assign_pointer(soe->sa[i], new_sa);
+	soe->sa_map |= BIT(i);
+	mutex_unlock(&soe->sa_lock);
+
+	*sa = new_sa;
+	return 0;
+}
+
+static void airoha_soe_mark_sa_dead(struct airoha_soe_sa *sa)
+{
+	if (!sa)
+		return;
+
+	spin_lock_bh(&sa->lock);
+	sa->dead = true;
+	if (!sa->inflight)
+		complete(&sa->idle);
+	spin_unlock_bh(&sa->lock);
+}
+
+static void airoha_soe_free_ctx(struct airoha_soe_ctx *ctx)
+{
+	if (!ctx)
+		return;
+
+	if (ctx->dir == AIROHA_SOE_CTX_OUT)
+		dst_release(ctx->dst);
+	else
+		xfrm_state_put(ctx->rx.x);
+	kfree(ctx);
+}
+
+static void airoha_soe_purge_ctx_list(struct list_head *head)
+{
+	struct airoha_soe_ctx *ctx, *tmp;
+
+	list_for_each_entry_safe(ctx, tmp, head, list) {
+		list_del(&ctx->list);
+		airoha_soe_free_ctx(ctx);
+	}
+}
+
+static void airoha_soe_forget_rx_ctx_list(struct airoha_soe_sa *sa)
+{
+	if (!list_empty(&sa->rx_queue))
+		atomic_sub((int)list_count_nodes(&sa->rx_queue),
+			   &sa->soe->pending_rx);
+}
+
+static void airoha_soe_abort_sa(struct airoha_soe_sa *sa)
+{
+	LIST_HEAD(rx_queue);
+	LIST_HEAD(tx_queue);
+
+	if (!sa)
+		return;
+
+	spin_lock_bh(&sa->lock);
+	sa->dead = true;
+	airoha_soe_forget_rx_ctx_list(sa);
+	list_splice_init(&sa->tx_queue, &tx_queue);
+	list_splice_init(&sa->rx_queue, &rx_queue);
+	sa->inflight = 0;
+	complete(&sa->idle);
+	spin_unlock_bh(&sa->lock);
+
+	airoha_soe_purge_ctx_list(&tx_queue);
+	airoha_soe_purge_ctx_list(&rx_queue);
+}
+
+static void airoha_soe_free_sa(struct airoha_soe_sa *sa)
+{
+	LIST_HEAD(rx_queue);
+	LIST_HEAD(tx_queue);
+	struct airoha_soe *soe;
+
+	if (!sa)
+		return;
+
+	soe = sa->soe;
+	airoha_soe_mark_sa_dead(sa);
+	if (!wait_for_completion_timeout(&sa->idle, AIROHA_SOE_SA_FREE_TIMEOUT))
+		dev_warn(soe->dev,
+			 "timed out waiting for SOE SA%u in-flight packets\n",
+			 sa->index);
+
+	mutex_lock(&soe->sa_lock);
+	if (sa->index < AIROHA_SOE_NUM_SA &&
+	    rcu_access_pointer(soe->sa[sa->index]) == sa) {
+		airoha_soe_clear_sa_locked(soe, sa->index);
+		RCU_INIT_POINTER(soe->sa[sa->index], NULL);
+		soe->sa_map &= ~BIT(sa->index);
+	}
+	mutex_unlock(&soe->sa_lock);
+	synchronize_rcu();
+
+	spin_lock_bh(&sa->lock);
+	airoha_soe_forget_rx_ctx_list(sa);
+	list_splice_init(&sa->tx_queue, &tx_queue);
+	list_splice_init(&sa->rx_queue, &rx_queue);
+	spin_unlock_bh(&sa->lock);
+	airoha_soe_purge_ctx_list(&tx_queue);
+	airoha_soe_purge_ctx_list(&rx_queue);
+
+	kfree(sa);
+}
+
+static struct airoha_soe_ctx *airoha_soe_pop_ctx(struct airoha_soe_sa *sa,
+						 enum airoha_soe_ctx_dir dir)
+{
+	struct list_head *head;
+	struct airoha_soe_ctx *ctx = NULL;
+
+	head = dir == AIROHA_SOE_CTX_OUT ? &sa->tx_queue : &sa->rx_queue;
+
+	spin_lock_bh(&sa->lock);
+	if (!list_empty(head)) {
+		ctx = list_first_entry(head, struct airoha_soe_ctx, list);
+		list_del(&ctx->list);
+		if (dir == AIROHA_SOE_CTX_IN)
+			atomic_dec(&sa->soe->pending_rx);
+	}
+
+	if (ctx && !WARN_ON_ONCE(!sa->inflight)) {
+		sa->inflight--;
+		if (sa->dead && !sa->inflight)
+			complete(&sa->idle);
+	}
+	spin_unlock_bh(&sa->lock);
+
+	return ctx;
+}
+
+static int airoha_soe_prepare_ip_headers(struct sk_buff *skb)
+{
+	unsigned int hdr_len;
+
+	if (!pskb_may_pull(skb, 1))
+		return -EINVAL;
+
+	switch (skb->data[0] & 0xf0) {
+	case 0x40:
+		hdr_len = sizeof(struct iphdr);
+		skb->protocol = htons(ETH_P_IP);
+		break;
+	case 0x60:
+		hdr_len = sizeof(struct ipv6hdr);
+		skb->protocol = htons(ETH_P_IPV6);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (!pskb_may_pull(skb, hdr_len))
+		return -EINVAL;
+
+	skb_reset_network_header(skb);
+	skb_set_transport_header(skb, hdr_len);
+
+	return 0;
+}
+
+static void airoha_soe_trace_rx_complete(struct sk_buff *skb,
+					 const struct airoha_soe_ctx *ctx,
+					 const struct xfrm_state *x)
+{
+	unsigned int trace = READ_ONCE(airoha_soe_rx_trace_packets);
+	const struct iphdr *iph;
+
+	if (!trace || skb->protocol != htons(ETH_P_IP))
+		return;
+
+	iph = ip_hdr(skb);
+	pr_info("airoha_eth: SOE RX complete dev=%s saddr=%pI4 daddr=%pI4 proto=%u len=%u mark=0x%x spi=0x%08x natt=%u foe=%u hash=0x%04x sa=%u\n",
+		ctx->rx.dev->name, &iph->saddr, &iph->daddr, iph->protocol,
+		ntohs(iph->tot_len), skb->mark, ntohl(x->id.spi),
+		x->encap ? 1 : 0, ctx->rx.foe_valid, ctx->rx.foe_hash,
+		ctx->rx.sa_index);
+	WRITE_ONCE(airoha_soe_rx_trace_packets, trace - 1);
+}
+
+static int airoha_soe_push_l2_header(struct sk_buff *skb)
+{
+	static const u8 ipv4_l2_header[ETH_HLEN] = {
+		0x00, 0x0c, 0xe7, 0x20, 0x21, 0x12, 0x00,
+		0x0c, 0xe7, 0x20, 0x22, 0x62, 0x08, 0x00,
+	};
+	static const u8 ipv6_l2_header[ETH_HLEN] = {
+		0x00, 0x0c, 0xe7, 0x20, 0x21, 0x12, 0x00,
+		0x0c, 0xe7, 0x20, 0x22, 0x62, 0x86, 0xdd,
+	};
+	const u8 *l2_header;
+	int err;
+
+	err = airoha_soe_prepare_ip_headers(skb);
+	if (err)
+		return err;
+
+	if (skb->protocol == htons(ETH_P_IP))
+		l2_header = ipv4_l2_header;
+	else
+		l2_header = ipv6_l2_header;
+
+	/* TDMA/SOE port 7 expects an Ethernet-looking frame before the SOE hop. */
+	memcpy(skb_push(skb, ETH_HLEN), l2_header, ETH_HLEN);
+
+	return 0;
+}
+
+static void airoha_soe_push_hop_desc(struct sk_buff *skb, unsigned int sa_index,
+				     bool encrypt, int foe_idx)
+{
+	u32 hop_direction = encrypt ? AIROHA_SOE_HOP_INFO_ENCRYPT :
+				      AIROHA_SOE_HOP_INFO_DECRYPT;
+	u64 desc3 = ((u64)(u16)((hop_direction << 4) | 0x80) << 48) |
+		    ((u64)(sa_index & 0x3f) << 40) | 0x05dc0000ULL;
+	u64 desc2 = 0;
+	__le64 desc[4] = {};
+
+	if (foe_idx >= 0)
+		desc2 = (u64)(foe_idx & 0xffff) << 32;
+
+	desc[0] = cpu_to_le64(encrypt ? AIROHA_SOE_HOP_DESC0_ENCRYPT :
+					AIROHA_SOE_HOP_DESC0_DECRYPT);
+	desc[1] = cpu_to_le64(AIROHA_SOE_HOP_DESC1);
+	desc[2] = cpu_to_le64(desc2);
+	desc[3] = cpu_to_le64(desc3);
+	((u8 *)desc)[28] = sa_index;
+
+	/* The FE/QDMA hop descriptor is consumed by PSE port 7 before SOE. */
+	memcpy(skb_push(skb, AIROHA_SOE_QDMA_HOP_DESC_LEN), desc, sizeof(desc));
+}
+
+static int airoha_soe_submit_skb(struct airoha_soe_sa *sa,
+				 struct airoha_gdm_dev *dev,
+				 struct sk_buff *skb,
+				 struct airoha_soe_ctx *ctx)
+{
+	struct net_device *netdev = netdev_from_priv(dev);
+	u32 queue = ctx->dir == AIROHA_SOE_CTX_OUT ?
+			    AIROHA_SOE_QDMA_QUEUE_ENCRYPT :
+			    AIROHA_SOE_QDMA_QUEUE_DECRYPT;
+	bool encrypt = ctx->dir == AIROHA_SOE_CTX_OUT;
+	unsigned int headroom = AIROHA_SOE_QDMA_HOP_DESC_LEN + ETH_HLEN;
+	struct list_head *head;
+	u32 msg0, msg1;
+	int foe_idx = -1;
+	int err;
+
+	if (skb->ip_summed == CHECKSUM_PARTIAL) {
+		err = skb_checksum_help(skb);
+		if (err)
+			return err;
+	}
+
+	err = skb_cow_head(skb, headroom);
+	if (err)
+		return err;
+
+	err = airoha_soe_push_l2_header(skb);
+	if (err)
+		return err;
+
+	msg0 = FIELD_PREP(QDMA_ETH_TXMSG_SOE_SA_MASK, sa->index & 0x3f);
+	msg1 = FIELD_PREP(QDMA_ETH_TXMSG_METER_MASK, 0x7f) |
+	       FIELD_PREP(QDMA_ETH_TXMSG_FPORT_MASK, 7) |
+	       FIELD_PREP(QDMA_ETH_TXMSG_NBOQ_MASK, queue) |
+	       QDMA_ETH_TXMSG_HOP_MASK |
+	       FIELD_PREP(QDMA_ETH_TXMSG_ACNT_G1_MASK, 0x1f) |
+	       FIELD_PREP(QDMA_ETH_TXMSG_ACNT_G0_MASK, 0x3f);
+
+	if (ctx->dir == AIROHA_SOE_CTX_IN && ctx->rx.foe_valid &&
+	    ctx->rx.foe_hash != AIROHA_RXD4_FOE_ENTRY)
+		foe_idx = ctx->rx.foe_hash;
+
+	airoha_soe_push_hop_desc(skb, sa->index, encrypt, foe_idx);
+
+	skb->dev = netdev;
+	skb_set_queue_mapping(skb, AIROHA_SOE_QDMA_TX_RING);
+
+	if (!dev->soe_xmit_skb)
+		return -ENODEV;
+
+	head = ctx->dir == AIROHA_SOE_CTX_OUT ? &sa->tx_queue : &sa->rx_queue;
+	spin_lock_bh(&sa->lock);
+	if (sa->dead) {
+		spin_unlock_bh(&sa->lock);
+		return -ENOENT;
+	}
+
+	/* Completion descriptors carry only SA/hop flags, so keep skb context here. */
+	list_add_tail(&ctx->list, head);
+	sa->inflight++;
+	if (ctx->dir == AIROHA_SOE_CTX_IN)
+		atomic_inc(&sa->soe->pending_rx);
+	reinit_completion(&sa->idle);
+
+	err = dev->soe_xmit_skb(dev, skb, msg0, msg1,
+				AIROHA_SOE_TXMSG2_DEFAULT);
+	if (err) {
+		list_del(&ctx->list);
+		if (ctx->dir == AIROHA_SOE_CTX_IN)
+			atomic_dec(&sa->soe->pending_rx);
+		sa->inflight--;
+		if (sa->dead && !sa->inflight)
+			complete(&sa->idle);
+	}
+	spin_unlock_bh(&sa->lock);
+
+	return err;
+}
+
+int airoha_soe_xmit(struct airoha_soe_sa *sa, struct airoha_gdm_dev *dev,
+		    struct sk_buff *skb, struct xfrm_state *x)
+{
+	struct airoha_soe_ctx *ctx;
+	struct dst_entry *path;
+	struct dst_entry *dst;
+	int err;
+
+	if (!sa || !dev || !skb || !x || x->xso.dir != XFRM_DEV_OFFLOAD_OUT)
+		return -EINVAL;
+
+	if (skb_is_gso(skb))
+		return -EOPNOTSUPP;
+
+	dst = skb_dst(skb);
+	if (!dst)
+		return -EHOSTUNREACH;
+
+	path = xfrm_dst_path(dst);
+	if (!path)
+		return -EHOSTUNREACH;
+
+	ctx = kzalloc_obj(*ctx, GFP_ATOMIC);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->dir = AIROHA_SOE_CTX_OUT;
+	dst_hold(path);
+	ctx->dst = path;
+
+	err = airoha_soe_submit_skb(sa, dev, skb, ctx);
+	if (err) {
+		airoha_soe_free_ctx(ctx);
+		return err;
+	}
+
+	return 0;
+}
+
+static bool airoha_soe_rx_parse_ipv4(struct sk_buff *skb,
+				     struct airoha_soe_rx_info *info)
+{
+	struct ip_esp_hdr *esph;
+	struct udphdr *uh;
+	struct iphdr *iph;
+	int iphlen;
+	int udp_len;
+	int packet_len;
+
+	if (skb->protocol != htons(ETH_P_IP)) {
+		if (!pskb_may_pull(skb, 1) || (skb->data[0] >> 4) != 4)
+			return false;
+
+		skb->protocol = htons(ETH_P_IP);
+	}
+
+	if (!pskb_may_pull(skb, sizeof(*iph)))
+		return false;
+
+	iph = ip_hdr(skb);
+	if (iph->version != 4 || ip_is_fragment(iph))
+		return false;
+
+	iphlen = iph->ihl * 4;
+	packet_len = ntohs(iph->tot_len);
+	if (iphlen < sizeof(*iph) || packet_len > skb->len)
+		return false;
+
+	if (iph->protocol == IPPROTO_ESP) {
+		if (packet_len <= iphlen + sizeof(*esph) ||
+		    !pskb_may_pull(skb, iphlen + sizeof(*esph)))
+			return false;
+
+		esph = (struct ip_esp_hdr *)(skb->data + iphlen);
+		if (!esph->spi)
+			return false;
+
+		info->packet_len = packet_len;
+		info->encap = false;
+		info->sport = 0;
+		info->dport = 0;
+		info->spi = esph->spi;
+
+		return true;
+	}
+
+	if (iph->protocol != IPPROTO_UDP ||
+	    !pskb_may_pull(skb, iphlen + sizeof(*uh) + sizeof(*esph)))
+		return false;
+
+	uh = (struct udphdr *)(skb->data + iphlen);
+	udp_len = ntohs(uh->len);
+	if (uh->dest != htons(AIROHA_SOE_NATT_PORT) ||
+	    udp_len <= sizeof(*uh) + sizeof(*esph) ||
+	    iphlen + udp_len != packet_len || packet_len > skb->len)
+		return false;
+
+	esph = (struct ip_esp_hdr *)(skb->data + iphlen + sizeof(*uh));
+	if (!esph->spi)
+		return false;
+
+	info->packet_len = packet_len;
+	info->encap = true;
+	info->sport = uh->source;
+	info->dport = uh->dest;
+	info->spi = esph->spi;
+
+	return true;
+}
+
+/* Plain ESP/NAT-T first arrives as normal RX, then is bounced to SOE decrypt. */
+bool airoha_soe_rx_plain_skb(struct airoha_gdm_dev *dev, struct sk_buff *skb,
+			     struct net_device *rx_dev, u16 foe_hash,
+			     u32 foe_reason, bool foe_valid)
+{
+	struct airoha_soe_xfrm_state *state;
+	struct airoha_soe_rx_info info = {};
+	struct airoha_soe_ctx *ctx;
+	xfrm_address_t daddr = {};
+	struct xfrm_state *x;
+	int err;
+
+	if (!dev || !skb || !rx_dev)
+		return false;
+
+	if (!dev->eth->soe || !(rx_dev->features & NETIF_F_HW_ESP))
+		return false;
+
+	if (!atomic_read(&dev->soe_xfrm_state_count))
+		return false;
+
+	/* The packet is still in the driver RX path after eth_type_trans(). */
+	skb_reset_network_header(skb);
+	if (!airoha_soe_rx_parse_ipv4(skb, &info))
+		return false;
+
+	if (skb->len != info.packet_len && pskb_trim(skb, info.packet_len))
+		return false;
+
+	daddr.a4 = ip_hdr(skb)->daddr;
+	x = xfrm_input_state_lookup(dev_net(rx_dev), skb->mark, &daddr,
+				    info.spi, IPPROTO_ESP, AF_INET);
+	if (!x)
+		return false;
+
+	if (x->xso.dir != XFRM_DEV_OFFLOAD_IN)
+		goto put_state;
+	if (x->xso.type != XFRM_DEV_OFFLOAD_PACKET)
+		goto put_state;
+	if (x->xso.dev != rx_dev)
+		goto put_state;
+	if ((info.encap &&
+	     (!x->encap || x->encap->encap_type != UDP_ENCAP_ESPINUDP)) ||
+	    (!info.encap && x->encap))
+		goto put_state;
+
+	if (info.encap && info.dport != x->encap->encap_dport)
+		goto put_state;
+
+	state = (struct airoha_soe_xfrm_state *)x->xso.offload_handle;
+	if (!state || state->dev != dev || !state->sa)
+		goto put_state;
+
+	ctx = kzalloc_obj(*ctx, GFP_ATOMIC);
+	if (!ctx)
+		goto put_state;
+
+	ctx->dir = AIROHA_SOE_CTX_IN;
+	ctx->rx.x = x;
+	ctx->rx.gdm_dev = dev;
+	ctx->rx.dev = rx_dev;
+	ctx->rx.saddr = ip_hdr(skb)->saddr;
+	ctx->rx.sport = info.sport;
+	ctx->rx.foe_hash = foe_hash;
+	ctx->rx.foe_reason = foe_reason;
+	ctx->rx.sa_index = state->sa->index;
+	ctx->rx.foe_valid = foe_valid;
+	ctx->rx.mark = skb->mark;
+
+		err = airoha_soe_submit_skb(state->sa, dev, skb, ctx);
+	if (err) {
+		airoha_soe_free_ctx(ctx);
+		goto drop_state;
+	}
+
+	return true;
+
+drop_state:
+	kfree_skb(skb);
+	return true;
+put_state:
+	xfrm_state_put(x);
+	return false;
+}
+
+static bool airoha_soe_complete_out(struct sk_buff *skb,
+				    struct airoha_soe_ctx *ctx)
+{
+	struct dst_entry *dst = ctx->dst;
+	struct net *net;
+	int err;
+
+	ctx->dst = NULL;
+	if (!pskb_may_pull(skb, ETH_HLEN + 1))
+		goto drop;
+	skb_pull(skb, ETH_HLEN);
+
+	err = airoha_soe_prepare_ip_headers(skb);
+	if (err)
+		goto drop;
+
+	/* Re-enter dst_output() with the original dst after hardware ESP encode. */
+	skb->protocol = htons(ETH_P_IP);
+	skb_dst_drop(skb);
+	skb_dst_set(skb, dst);
+	skb->ignore_df = 1;
+	net = dev_net(dst->dev);
+	kfree(ctx);
+	dst_output(net, NULL, skb);
+
+	return true;
+
+drop:
+	dst_release(dst);
+	kfree(ctx);
+	kfree_skb(skb);
+	return true;
+}
+
+static bool airoha_soe_complete_in(struct sk_buff *skb,
+				   struct airoha_soe_ctx *ctx)
+{
+	struct xfrm_state *x = ctx->rx.x;
+	struct net_device *rx_dev = ctx->rx.dev;
+	struct xfrm_offload *xo;
+	struct sec_path *sp;
+	int err;
+
+	if (!pskb_may_pull(skb, ETH_HLEN + 1))
+		goto drop;
+	skb_pull(skb, ETH_HLEN);
+
+	err = airoha_soe_prepare_ip_headers(skb);
+	if (err)
+		goto drop;
+
+	skb->dev = rx_dev;
+	skb->mark = ctx->rx.mark;
+	skb->ip_summed = CHECKSUM_NONE;
+	skb_reset_mac_header(skb);
+	skb_reset_mac_len(skb);
+	skb->pkt_type = PACKET_HOST;
+	skb->encapsulation = 0;
+	skb_dst_drop(skb);
+
+	if (x->encap && x->encap->encap_type == UDP_ENCAP_ESPINUDP &&
+	    (ctx->rx.saddr != x->props.saddr.a4 ||
+	     ctx->rx.sport != x->encap->encap_sport)) {
+		xfrm_address_t ipaddr = {
+			.a4 = ctx->rx.saddr,
+		};
+
+		km_new_mapping(x, &ipaddr, ctx->rx.sport);
+	}
+
+	/* Tell xfrm_input() equivalent consumers that hardware already decrypted. */
+	sp = secpath_set(skb);
+	if (!sp)
+		goto drop;
+
+	if (sp->len == XFRM_MAX_DEPTH) {
+		secpath_reset(skb);
+		goto drop;
+	}
+
+	sp->xvec[sp->len++] = x;
+	sp->olen++;
+	ctx->rx.x = NULL;
+	xo = xfrm_offload(skb);
+	if (!xo) {
+		secpath_reset(skb);
+		goto drop;
+	}
+
+	xo->flags = CRYPTO_DONE;
+	xo->status = CRYPTO_SUCCESS;
+
+	airoha_soe_trace_rx_complete(skb, ctx, x);
+
+	/* SOE decrypt completion reaches the CPU before the routed plaintext
+	 * packet has selected its final egress port. Preserve the original FOE
+	 * hash and SA hop until the Ethernet xmit path can bind that decrypt
+	 * entry with the completed L2/PSE descriptor.
+	 */
+	if (ctx->rx.foe_valid)
+		airoha_ppe_soe_mark_skb(&ctx->rx.gdm_dev->eth->ppe->dev, skb,
+					ctx->rx.foe_hash, ctx->rx.sa_index,
+					AIROHA_SOE_HOP_INFO_DECRYPT);
+
+	kfree(ctx);
+	netif_rx(skb);
+
+	return true;
+
+drop:
+	airoha_soe_free_ctx(ctx);
+	kfree_skb(skb);
+	return true;
+}
+
+bool airoha_soe_rx_skb(struct airoha_soe *soe, struct sk_buff *skb,
+		       unsigned int sa_index, u32 hop_flags)
+{
+	struct airoha_soe_ctx *ctx;
+	struct airoha_soe_sa *sa;
+
+	if (!soe || !skb || sa_index >= AIROHA_SOE_NUM_SA)
+		return false;
+
+	rcu_read_lock();
+	sa = rcu_dereference(soe->sa[sa_index]);
+	if (!sa) {
+		rcu_read_unlock();
+		return false;
+	}
+
+	if (hop_flags >= AIROHA_SOE_HOP_FLAG_ERROR_BASE) {
+		ctx = airoha_soe_pop_ctx(sa, AIROHA_SOE_CTX_OUT);
+		if (!ctx)
+			ctx = airoha_soe_pop_ctx(sa, AIROHA_SOE_CTX_IN);
+		rcu_read_unlock();
+		airoha_soe_free_ctx(ctx);
+		kfree_skb(skb);
+		return true;
+	}
+
+	if (hop_flags == AIROHA_SOE_HOP_FLAG_ENCRYPTED) {
+		ctx = airoha_soe_pop_ctx(sa, AIROHA_SOE_CTX_OUT);
+		rcu_read_unlock();
+		if (!ctx) {
+			kfree_skb(skb);
+			return true;
+		}
+		return airoha_soe_complete_out(skb, ctx);
+	}
+
+	if (hop_flags == AIROHA_SOE_HOP_FLAG_DECRYPTED) {
+		ctx = airoha_soe_pop_ctx(sa, AIROHA_SOE_CTX_IN);
+		rcu_read_unlock();
+		if (!ctx) {
+			kfree_skb(skb);
+			return true;
+		}
+		return airoha_soe_complete_in(skb, ctx);
+	}
+
+	rcu_read_unlock();
+	return false;
+}
+
+bool airoha_soe_has_pending_rx(struct airoha_soe *soe)
+{
+	if (!soe)
+		return false;
+
+	return !!atomic_read(&soe->pending_rx);
+}
+
+int airoha_soe_xfrm_ppe_info(const struct dst_entry *dst, u8 *sa_index, u8 *hop)
+{
+	struct airoha_soe_xfrm_state *state;
+	struct net_device *netdev;
+	struct xfrm_state *x;
+
+	if (!dst || !sa_index || !hop)
+		return -EINVAL;
+
+	x = dst_xfrm(dst);
+	if (!x || x->xso.type != XFRM_DEV_OFFLOAD_PACKET)
+		return -EOPNOTSUPP;
+
+	state = (struct airoha_soe_xfrm_state *)x->xso.offload_handle;
+	if (!state || !state->sa)
+		return -ENODEV;
+
+	if (!state->dev)
+		return -ENODEV;
+
+	netdev = netdev_from_priv(state->dev);
+	if (netdev != x->xso.dev || !(netdev->features & NETIF_F_HW_ESP))
+		return -ENODEV;
+
+	switch (x->xso.dir) {
+	case XFRM_DEV_OFFLOAD_OUT:
+		*hop = AIROHA_SOE_HOP_INFO_ENCRYPT;
+		break;
+	case XFRM_DEV_OFFLOAD_IN:
+		*hop = AIROHA_SOE_HOP_INFO_DECRYPT;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	*sa_index = state->sa->index;
+
+	return 0;
+}
+
+static int airoha_soe_xfrm_state_add(struct net_device *dev,
+				     struct xfrm_state *x,
+				     struct netlink_ext_ack *extack)
+{
+	struct airoha_soe_xfrm_state *state;
+	struct airoha_gdm_dev *gdm_dev;
+	struct airoha_soe *soe;
+	gfp_t gfp;
+	int err;
+
+	if (dev->xfrmdev_ops != &airoha_soe_xfrmdev_ops ||
+	    !(dev->features & NETIF_F_HW_ESP))
+		return -EOPNOTSUPP;
+
+	gdm_dev = netdev_priv(dev);
+	soe = airoha_soe_get_ref(gdm_dev->eth->soe);
+	if (!soe)
+		return -ENODEV;
+
+	gfp = (x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ) ? GFP_ATOMIC :
+							   GFP_KERNEL;
+	state = kzalloc_obj(*state, gfp);
+	if (!state) {
+		airoha_soe_put_ref(soe);
+		return -ENOMEM;
+	}
+
+	state->dev = gdm_dev;
+	state->soe = soe;
+
+	if (x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ)
+		goto out;
+
+	err = airoha_soe_alloc_sa(soe, x, extack, &state->sa);
+	if (err)
+		goto err_free;
+
+	atomic_inc(&gdm_dev->soe_xfrm_state_count);
+	state->counted = true;
+out:
+	x->xso.offload_handle = (unsigned long)state;
+	return 0;
+
+err_free:
+	kfree(state);
+	airoha_soe_put_ref(soe);
+	return err;
+}
+
+static void airoha_soe_xfrm_state_delete(struct net_device *dev,
+					 struct xfrm_state *x)
+{
+	struct airoha_soe_xfrm_state *state;
+
+	state = (struct airoha_soe_xfrm_state *)x->xso.offload_handle;
+	if (state && state->sa) {
+		airoha_ppe_soe_flush_sa(state->dev->eth->ppe, state->sa->index);
+		airoha_soe_abort_sa(state->sa);
+	}
+}
+
+static void airoha_soe_xfrm_state_free(struct net_device *dev,
+				       struct xfrm_state *x)
+{
+	struct airoha_soe_xfrm_state *state;
+
+	state = (struct airoha_soe_xfrm_state *)xchg(&x->xso.offload_handle, 0);
+	if (!state)
+		return;
+
+	if (state->sa) {
+		airoha_ppe_soe_flush_sa(state->dev->eth->ppe,
+					state->sa->index);
+		airoha_soe_free_sa(state->sa);
+	}
+	if (state->counted)
+		atomic_dec(&state->dev->soe_xfrm_state_count);
+	airoha_soe_put_ref(state->soe);
+	kfree(state);
+}
+
+static bool airoha_soe_xfrm_offload_ok(struct sk_buff *skb,
+				       struct xfrm_state *x)
+{
+	struct airoha_soe_xfrm_state *state;
+	struct net_device *dev = x->xso.dev;
+
+	if (!dev || !(dev->features & NETIF_F_HW_ESP))
+		return false;
+
+	if (x->xso.type != XFRM_DEV_OFFLOAD_PACKET ||
+	    x->xso.dir != XFRM_DEV_OFFLOAD_OUT)
+		return false;
+
+	state = (struct airoha_soe_xfrm_state *)x->xso.offload_handle;
+
+	return state && state->sa;
+}
+
+static int airoha_soe_xfrm_packet_xmit_gso(struct sk_buff *skb,
+					   struct xfrm_state *x,
+					   struct airoha_soe_xfrm_state *state)
+{
+	struct sk_buff *segs, *nskb;
+	int err;
+
+	segs = skb_gso_segment(skb, 0);
+	if (IS_ERR(segs)) {
+		XFRM_INC_STATS(xs_net(x), LINUX_MIB_XFRMOUTERROR);
+		kfree_skb(skb);
+		return PTR_ERR(segs);
+	}
+
+	consume_skb(skb);
+
+	skb_list_walk_safe(segs, skb, nskb) {
+		skb_mark_not_on_list(skb);
+		err = airoha_soe_xmit(state->sa, state->dev, skb, x);
+		if (err) {
+			XFRM_INC_STATS(xs_net(x), LINUX_MIB_XFRMOUTERROR);
+			kfree_skb(skb);
+			kfree_skb_list(nskb);
+			return err;
+		}
+	}
+
+	return 0;
+}
+
+static int airoha_soe_xfrm_packet_xmit(struct sk_buff *skb,
+				       struct xfrm_state *x)
+{
+	struct airoha_soe_xfrm_state *state;
+	struct net_device *netdev;
+	int err = -EHOSTUNREACH;
+
+	state = (struct airoha_soe_xfrm_state *)x->xso.offload_handle;
+	if (!state || !state->sa || !state->dev)
+		goto drop;
+
+	netdev = netdev_from_priv(state->dev);
+	if (netdev->xfrmdev_ops != &airoha_soe_xfrmdev_ops ||
+	    !(netdev->features & NETIF_F_HW_ESP))
+		goto drop;
+
+	if (skb_is_gso(skb))
+		return airoha_soe_xfrm_packet_xmit_gso(skb, x, state);
+
+	err = airoha_soe_xmit(state->sa, state->dev, skb, x);
+	if (err)
+		goto drop;
+
+	return 0;
+
+drop:
+	XFRM_INC_STATS(xs_net(x), LINUX_MIB_XFRMOUTERROR);
+	kfree_skb(skb);
+	return err;
+}
+
+static int airoha_soe_xfrm_policy_add(struct xfrm_policy *x,
+				      struct netlink_ext_ack *extack)
+{
+	if (x->xdo.type != XFRM_DEV_OFFLOAD_PACKET) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SOE supports XFRM packet policies only");
+		return -EOPNOTSUPP;
+	}
+
+	if (xfrm_policy_id2dir(x->index) >= XFRM_POLICY_MAX) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SOE does not offload socket policies");
+		return -EOPNOTSUPP;
+	}
+
+	if (x->xfrm_nr != 1 ||
+	    x->xfrm_vec[0].id.proto != IPPROTO_ESP) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SOE offloads ESP policies only");
+		return -EOPNOTSUPP;
+	}
+
+	if (!x->xdo.dev || !(x->xdo.dev->features & NETIF_F_HW_ESP)) {
+		NL_SET_ERR_MSG_MOD(extack, "SOE ESP offload is disabled");
+		return -EOPNOTSUPP;
+	}
+
+	switch (x->xdo.dir) {
+	case XFRM_DEV_OFFLOAD_IN:
+	case XFRM_DEV_OFFLOAD_OUT:
+		return 0;
+	default:
+		NL_SET_ERR_MSG_MOD(extack, "SOE supports in/out policies only");
+		return -EOPNOTSUPP;
+	}
+}
+
+static struct net_device *airoha_soe_dsa_conduit_get(struct net_device *dev)
+{
+	struct net_device *conduit;
+	struct dsa_port *dp;
+
+	if (!dsa_user_dev_check(dev))
+		return NULL;
+
+	/* DSA users expose XFRM, but SOE is attached to their CPU conduit. */
+	dp = dsa_port_from_netdev(dev);
+	if (IS_ERR(dp) || !dp->cpu_dp)
+		return NULL;
+
+	conduit = dsa_port_to_conduit(dp);
+	if (!conduit || conduit->xfrmdev_ops != &airoha_soe_xfrmdev_ops)
+		return NULL;
+
+	dev_hold(conduit);
+
+	return conduit;
+}
+
+static int airoha_soe_dsa_xfrm_state_add(struct net_device *dev,
+					 struct xfrm_state *x,
+					 struct netlink_ext_ack *extack)
+{
+	struct net_device *conduit;
+	int err;
+
+	conduit = airoha_soe_dsa_conduit_get(dev);
+	if (!conduit) {
+		NL_SET_ERR_MSG_MOD(extack, "SOE DSA conduit is unavailable");
+		return -EOPNOTSUPP;
+	}
+
+	err = airoha_soe_xfrm_state_add(conduit, x, extack);
+	dev_put(conduit);
+
+	return err;
+}
+
+static void airoha_soe_dsa_xfrm_state_delete(struct net_device *dev,
+					     struct xfrm_state *x)
+{
+	airoha_soe_xfrm_state_delete(dev, x);
+}
+
+static void airoha_soe_dsa_xfrm_state_free(struct net_device *dev,
+					   struct xfrm_state *x)
+{
+	airoha_soe_xfrm_state_free(dev, x);
+}
+
+static bool airoha_soe_dsa_xfrm_offload_ok(struct sk_buff *skb,
+					   struct xfrm_state *x)
+{
+	return airoha_soe_xfrm_offload_ok(skb, x);
+}
+
+static int airoha_soe_dsa_xfrm_policy_add(struct xfrm_policy *x,
+					  struct netlink_ext_ack *extack)
+{
+	return airoha_soe_xfrm_policy_add(x, extack);
+}
+
+static int airoha_soe_dsa_xfrm_packet_xmit(struct sk_buff *skb,
+					   struct xfrm_state *x)
+{
+	return airoha_soe_xfrm_packet_xmit(skb, x);
+}
+
+static const struct xfrmdev_ops airoha_soe_xfrmdev_ops = {
+	.xdo_dev_state_add = airoha_soe_xfrm_state_add,
+	.xdo_dev_state_delete = airoha_soe_xfrm_state_delete,
+	.xdo_dev_state_free = airoha_soe_xfrm_state_free,
+	.xdo_dev_offload_ok = airoha_soe_xfrm_offload_ok,
+	.xdo_dev_policy_add = airoha_soe_xfrm_policy_add,
+	.xdo_dev_packet_xmit = airoha_soe_xfrm_packet_xmit,
+};
+
+static const struct xfrmdev_ops airoha_soe_dsa_xfrmdev_ops = {
+	.xdo_dev_state_add = airoha_soe_dsa_xfrm_state_add,
+	.xdo_dev_state_delete = airoha_soe_dsa_xfrm_state_delete,
+	.xdo_dev_state_free = airoha_soe_dsa_xfrm_state_free,
+	.xdo_dev_offload_ok = airoha_soe_dsa_xfrm_offload_ok,
+	.xdo_dev_policy_add = airoha_soe_dsa_xfrm_policy_add,
+	.xdo_dev_packet_xmit = airoha_soe_dsa_xfrm_packet_xmit,
+};
+
+static void airoha_soe_dsa_proxy_enable(struct net_device *dev)
+{
+	struct net_device *conduit;
+
+	conduit = airoha_soe_dsa_conduit_get(dev);
+	if (!conduit)
+		return;
+
+	if (dev->xfrmdev_ops && dev->xfrmdev_ops != &airoha_soe_dsa_xfrmdev_ops)
+		goto out;
+
+	/* Mirror ESP capability onto DSA users while programming SAs on the conduit. */
+	dev->xfrmdev_ops = &airoha_soe_dsa_xfrmdev_ops;
+	dev->hw_features |= NETIF_F_HW_ESP;
+	dev->hw_enc_features |= NETIF_F_HW_ESP;
+	dev->wanted_features |= NETIF_F_HW_ESP;
+
+	conduit->wanted_features |= NETIF_F_HW_ESP;
+	netdev_update_features(conduit);
+	netdev_update_features(dev);
+out:
+	dev_put(conduit);
+}
+
+static void airoha_soe_dsa_proxy_clear(struct net_device *dev)
+{
+	if (dev->xfrmdev_ops != &airoha_soe_dsa_xfrmdev_ops)
+		return;
+
+	dev->wanted_features &= ~NETIF_F_HW_ESP;
+	dev->hw_features &= ~NETIF_F_HW_ESP;
+	dev->hw_enc_features &= ~NETIF_F_HW_ESP;
+	netdev_update_features(dev);
+	dev->xfrmdev_ops = NULL;
+}
+
+static void airoha_soe_dsa_proxy_scan(bool enable)
+{
+	struct net_device *dev;
+
+	for_each_netdev(&init_net, dev) {
+		if (enable)
+			airoha_soe_dsa_proxy_enable(dev);
+		else
+			airoha_soe_dsa_proxy_clear(dev);
+	}
+}
+
+static int airoha_soe_netdev_event(struct notifier_block *nb,
+				   unsigned long event, void *ptr)
+{
+	switch (event) {
+	case NETDEV_REGISTER:
+	case NETDEV_CHANGENAME:
+		airoha_soe_dsa_proxy_scan(true);
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block airoha_soe_netdev_notifier = {
+	.notifier_call = airoha_soe_netdev_event,
+};
+
+void airoha_soe_build_netdev(struct net_device *netdev,
+			     airoha_soe_xmit_skb_t xmit_skb)
+{
+	struct airoha_gdm_dev *dev = netdev_priv(netdev);
+
+	atomic_set(&dev->soe_xfrm_state_count, 0);
+	dev->soe_xmit_skb = xmit_skb;
+
+	if (!xmit_skb ||
+	    !(airoha_soe_features(dev->eth->soe) & AIROHA_SOE_FEATURE_ESP))
+		return;
+
+	netdev->xfrmdev_ops = &airoha_soe_xfrmdev_ops;
+	netdev->hw_features |= NETIF_F_HW_ESP;
+	netdev->hw_enc_features |= NETIF_F_HW_ESP;
+}
+
+void airoha_soe_teardown_netdev(struct net_device *netdev)
+{
+	struct airoha_gdm_dev *dev = netdev_priv(netdev);
+
+	if (netdev->xfrmdev_ops == &airoha_soe_xfrmdev_ops)
+		netdev->xfrmdev_ops = NULL;
+	dev->soe_xmit_skb = NULL;
+}
+
+int airoha_soe_set_features(struct net_device *netdev,
+			    netdev_features_t features)
+{
+	netdev_features_t changed = (netdev->features ^ features) &
+				    NETIF_F_HW_ESP;
+	struct airoha_gdm_dev *dev = netdev_priv(netdev);
+
+	if (!changed)
+		return 0;
+
+	if ((features & NETIF_F_HW_ESP) &&
+	    !(airoha_soe_features(dev->eth->soe) & AIROHA_SOE_FEATURE_ESP))
+		return -EOPNOTSUPP;
+
+	if (atomic_read(&dev->soe_xfrm_state_count)) {
+		netdev_err(netdev,
+			   "cannot change ESP features with active SAs\n");
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static struct device_node *airoha_soe_find_node(struct airoha_eth *eth)
+{
+	struct device_node *parent, *np;
+
+	if (!eth->dev->of_node)
+		return NULL;
+
+	parent = of_get_parent(eth->dev->of_node);
+	if (!parent)
+		return NULL;
+
+	/* SOE is a sibling DT node; Ethernet owns the provider lifetime. */
+	for_each_child_of_node(parent, np) {
+		if (!of_device_is_available(np) ||
+		    !of_device_is_compatible(np, "airoha,en7581-soe"))
+			continue;
+
+		of_node_put(parent);
+		return np;
+	}
+
+	of_node_put(parent);
+
+	return NULL;
+}
+
+int airoha_soe_init(struct airoha_eth *eth)
+{
+	struct device *dev = eth->dev;
+	struct device_node *np;
+	struct resource res;
+	struct airoha_soe *soe;
+	void __iomem *base;
+	int err;
+
+	np = airoha_soe_find_node(eth);
+	if (!np)
+		return 0;
+
+	err = of_address_to_resource(np, 0, &res);
+	if (err)
+		goto put_node;
+
+	base = devm_ioremap_resource(dev, &res);
+	if (IS_ERR(base)) {
+		err = PTR_ERR(base);
+		goto put_node;
+	}
+
+	soe = devm_kzalloc(dev, sizeof(*soe), GFP_KERNEL);
+	if (!soe) {
+		err = -ENOMEM;
+		goto put_node;
+	}
+
+	soe->dev = dev;
+	soe->base = base;
+	mutex_init(&soe->sa_lock);
+	spin_lock_init(&soe->state_lock);
+	refcount_set(&soe->refcnt, 1);
+	init_completion(&soe->released);
+
+	/* Enable the packet engines; reset leaves SOE present but idle. */
+	writel(AIROHA_SOE_INT_ALL, base + AIROHA_SOE_INT_STS);
+	writel(AIROHA_SOE_CNT_CLR_ALL, base + AIROHA_SOE_CNT_CLR);
+	writel(AIROHA_SOE_INT_ALL, base + AIROHA_SOE_INT_EN);
+	writel(AIROHA_SOE_GLB_CFG_ENC_EN | AIROHA_SOE_GLB_CFG_DEC_EN,
+	       base + AIROHA_SOE_GLB_CFG);
+
+	err = register_netdevice_notifier(&airoha_soe_netdev_notifier);
+	if (err)
+		goto disable_soe;
+
+	eth->soe = soe;
+
+	rtnl_lock();
+	airoha_soe_dsa_proxy_scan(true);
+	rtnl_unlock();
+
+	of_node_put(np);
+
+	return 0;
+
+disable_soe:
+	writel(0, base + AIROHA_SOE_GLB_CFG);
+	writel(0, base + AIROHA_SOE_INT_EN);
+	writel(0xffffffff, base + AIROHA_SOE_INT_STS);
+put_node:
+	of_node_put(np);
+
+	return err;
+}
+
+void airoha_soe_deinit(struct airoha_eth *eth)
+{
+	struct airoha_soe *soe = eth->soe;
+	unsigned long flags;
+
+	if (!soe)
+		return;
+
+	eth->soe = NULL;
+
+	spin_lock_irqsave(&soe->state_lock, flags);
+	soe->dead = true;
+	spin_unlock_irqrestore(&soe->state_lock, flags);
+
+	rtnl_lock();
+	airoha_soe_dsa_proxy_scan(false);
+	rtnl_unlock();
+	unregister_netdevice_notifier(&airoha_soe_netdev_notifier);
+
+	airoha_soe_put_ref(soe);
+	wait_for_completion(&soe->released);
+
+	writel(0, soe->base + AIROHA_SOE_GLB_CFG);
+	writel(0, soe->base + AIROHA_SOE_INT_EN);
+	writel(0xffffffff, soe->base + AIROHA_SOE_INT_STS);
+}
-- 
2.53.0


^ permalink raw reply related

* [RFC PATCH net-next 6/7] net: airoha: add PPE support for SOE flows
From: Jihong Min @ 2026-06-14  4:00 UTC (permalink / raw)
  To: netdev, Lorenzo Bianconi
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Simon Horman, Herbert Xu, Steffen Klassert,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, devicetree,
	Matthias Brugger, AngeloGioacchino Del Regno, linux-arm-kernel,
	linux-mediatek, Christian Marangi, Felix Fietkau, linux-kernel,
	Jihong Min
In-Reply-To: <20260614040032.1567994-1-hurryman2212@gmail.com>

Add PPE metadata handling for SOE flows so decrypted packets can carry
their original FOE/SA context until the normal egress path is known, and
so XFRM flowtable entries can be programmed with SOE SA and hop
information.

Signed-off-by: Jihong Min <hurryman2212@gmail.com>
---
 drivers/net/ethernet/airoha/airoha_ppe.c  | 606 +++++++++++++++++++++-
 include/linux/soc/airoha/airoha_offload.h |   5 +
 2 files changed, 585 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/airoha/airoha_ppe.c b/drivers/net/ethernet/airoha/airoha_ppe.c
index 91bcc55a6ac6..faa5f04d4c7b 100644
--- a/drivers/net/ethernet/airoha/airoha_ppe.c
+++ b/drivers/net/ethernet/airoha/airoha_ppe.c
@@ -6,18 +6,77 @@
 
 #include <linux/ip.h>
 #include <linux/ipv6.h>
+#include <linux/kstrtox.h>
+#include <linux/moduleparam.h>
 #include <linux/of_platform.h>
 #include <linux/platform_device.h>
 #include <linux/rhashtable.h>
+#include <linux/sysfs.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
 #include <net/ipv6.h>
+#include <net/netfilter/nf_flow_table.h>
 #include <net/pkt_cls.h>
+#include <net/route.h>
 
 #include "airoha_regs.h"
 #include "airoha_eth.h"
+#include "airoha_soe.h"
 
 static DEFINE_MUTEX(flow_offload_mutex);
 static DEFINE_SPINLOCK(ppe_lock);
 
+#define AIROHA_FOE_HOP0			GENMASK(31, 29)
+#define AIROHA_FOE_HOP1			GENMASK(28, 26)
+#define AIROHA_FOE_HOP2			GENMASK(25, 23)
+#define AIROHA_FOE_HOP3			GENMASK(22, 20)
+#define AIROHA_FOE_HOP_MASK						\
+	(AIROHA_FOE_HOP0 | AIROHA_FOE_HOP1 | AIROHA_FOE_HOP2 |		\
+	 AIROHA_FOE_HOP3)
+#define AIROHA_PPE_SOE_DEFAULT_TUNNEL_MTU	1500
+#define AIROHA_PPE_SOE_MAGIC_IPSEC		0x72a1
+#define AIROHA_PPE_SOE_PORT_AG			0x3f
+#define AIROHA_PPE_SOE_CHANNEL			2
+#define AIROHA_PPE_SOE_META_TIMEOUT_MS		1000
+#define AIROHA_PPE_SOE_MAGIC_GDM4		0x729a
+#define AIROHA_PPE_SOE_MARK_MAGIC		0x5e000000
+#define AIROHA_PPE_SOE_MARK_MAGIC_MASK		0xff000000
+#define AIROHA_PPE_SOE_MARK_HASH_MASK		0x00ffff00
+#define AIROHA_PPE_DEFAULT_BIND_RATE		0x1e
+#define AIROHA_PPE_SOE_BIND_DELAY_PACKETS	2
+#define AIROHA_PPE_FORCE_COMMIT_PROBE_WINDOW	0
+
+struct airoha_ppe_soe_tuple_info {
+	unsigned int tunnel_mtu;
+	u8 sa_index;
+	u8 hop;
+};
+
+static unsigned int airoha_ppe_bind_rate = AIROHA_PPE_DEFAULT_BIND_RATE;
+static unsigned int airoha_ppe_soe_inline_bind_delay_packets =
+	AIROHA_PPE_SOE_BIND_DELAY_PACKETS;
+static unsigned int airoha_ppe_soe_inline_force_commit_probe_window =
+	AIROHA_PPE_FORCE_COMMIT_PROBE_WINDOW;
+static struct airoha_ppe *airoha_ppe_active;
+
+static int airoha_ppe_set_bind_rate(const char *val,
+				    const struct kernel_param *kp);
+static int airoha_ppe_get_bind_rate(char *buf, const struct kernel_param *kp);
+
+module_param_named(soe_inline_force_commit_probe_window,
+		   airoha_ppe_soe_inline_force_commit_probe_window, uint, 0600);
+MODULE_PARM_DESC(soe_inline_force_commit_probe_window,
+		 "Adjacent FOE slots searched before force-commit");
+module_param_named(soe_inline_bind_delay_packets,
+		   airoha_ppe_soe_inline_bind_delay_packets, uint, 0600);
+MODULE_PARM_DESC(soe_inline_bind_delay_packets,
+		 "CPU-visible SOE decrypt packets before binding FOE entry");
+module_param_call(ppe_bind_rate, airoha_ppe_set_bind_rate,
+		  airoha_ppe_get_bind_rate, NULL, 0600);
+__MODULE_PARM_TYPE(ppe_bind_rate, "uint");
+MODULE_PARM_DESC(ppe_bind_rate,
+		 "PPE bind-rate threshold for L2B and bind fields");
+
 static const struct rhashtable_params airoha_flow_table_params = {
 	.head_offset = offsetof(struct airoha_flow_table_entry, node),
 	.key_offset = offsetof(struct airoha_flow_table_entry, cookie),
@@ -78,6 +137,17 @@ bool airoha_ppe_is_enabled(struct airoha_eth *eth, int index)
 	return airoha_fe_rr(eth, REG_PPE_GLO_CFG(index)) & PPE_GLO_CFG_EN_MASK;
 }
 
+static void airoha_ppe_apply_bind_rate(struct airoha_eth *eth, int ppe_idx)
+{
+	u32 rate = min_t(u32, READ_ONCE(airoha_ppe_bind_rate),
+			 FIELD_MAX(PPE_BIND_RATE_BIND_MASK));
+
+	airoha_fe_rmw(eth, REG_PPE_BIND_RATE(ppe_idx),
+		      PPE_BIND_RATE_L2B_BIND_MASK | PPE_BIND_RATE_BIND_MASK,
+		      FIELD_PREP(PPE_BIND_RATE_L2B_BIND_MASK, rate) |
+			      FIELD_PREP(PPE_BIND_RATE_BIND_MASK, rate));
+}
+
 static u32 airoha_ppe_get_timestamp(struct airoha_ppe *ppe)
 {
 	return airoha_fe_get(ppe->eth, REG_FE_FOE_TS,
@@ -157,15 +227,14 @@ static void airoha_ppe_hw_init(struct airoha_ppe *ppe)
 			      FIELD_PREP(PPE_DRAM_TB_NUM_ENTRY_MASK,
 					 dram_num_entries));
 
-		airoha_fe_rmw(eth, REG_PPE_BIND_RATE(i),
-			      PPE_BIND_RATE_L2B_BIND_MASK |
-			      PPE_BIND_RATE_BIND_MASK,
-			      FIELD_PREP(PPE_BIND_RATE_L2B_BIND_MASK, 0x1e) |
-			      FIELD_PREP(PPE_BIND_RATE_BIND_MASK, 0x1e));
+		airoha_ppe_apply_bind_rate(eth, i);
 
 		airoha_fe_wr(eth, REG_PPE_HASH_SEED(i), PPE_HASH_SEED);
 		airoha_fe_clear(eth, REG_PPE_PPE_FLOW_CFG(i),
 				PPE_FLOW_CFG_IP6_6RD_MASK);
+		airoha_fe_set(eth, REG_PPE_PPE_FLOW_CFG(i),
+			      PPE_FLOW_CFG_IP4_IPSEC_MASK |
+				      PPE_FLOW_CFG_IP6_IPSEC_MASK);
 
 		for (p = 0; p < ARRAY_SIZE(eth->ports); p++)
 			airoha_fe_rmw(eth, REG_PPE_MTU(i, p),
@@ -509,6 +578,162 @@ static int airoha_ppe_foe_entry_set_ipv6_tuple(struct airoha_foe_entry *hwe,
 	return 0;
 }
 
+static int airoha_ppe_soe_fill_inner_ipv4_data(struct sk_buff *skb,
+					       struct airoha_flow_data *data,
+					       int *type, int *l4proto)
+{
+	unsigned int ip_offset = ETH_HLEN;
+	union {
+		struct tcphdr tcp;
+		struct udphdr udp;
+	} ports;
+	struct iphdr iph_buf, *iph;
+	unsigned int l4_offset;
+	struct udphdr *udp;
+	struct tcphdr *tcp;
+
+	if (skb_headlen(skb) < ETH_HLEN)
+		return -EINVAL;
+
+	memcpy(&data->eth, skb->data, ETH_HLEN);
+	if (data->eth.h_proto != htons(ETH_P_IP))
+		return -EAFNOSUPPORT;
+
+	iph = skb_header_pointer(skb, ip_offset, sizeof(iph_buf), &iph_buf);
+	if (!iph || iph->ihl < 5 || iph->version != 4)
+		return -EINVAL;
+
+	l4_offset = ip_offset + iph->ihl * 4;
+	data->v4.src_addr = iph->saddr;
+	data->v4.dst_addr = iph->daddr;
+	*l4proto = iph->protocol;
+
+	switch (iph->protocol) {
+	case IPPROTO_TCP:
+		tcp = skb_header_pointer(skb, l4_offset, sizeof(ports.tcp),
+					 &ports.tcp);
+		if (!tcp)
+			return -EINVAL;
+
+		data->src_port = tcp->source;
+		data->dst_port = tcp->dest;
+		*type = PPE_PKT_TYPE_IPV4_HNAPT;
+		break;
+	case IPPROTO_UDP:
+		udp = skb_header_pointer(skb, l4_offset, sizeof(ports.udp),
+					 &ports.udp);
+		if (!udp)
+			return -EINVAL;
+
+		data->src_port = udp->source;
+		data->dst_port = udp->dest;
+		*type = PPE_PKT_TYPE_IPV4_HNAPT;
+		break;
+	default:
+		*type = PPE_PKT_TYPE_IPV4_ROUTE;
+		break;
+	}
+
+	return 0;
+}
+
+static int airoha_ppe_foe_entry_set_soe_fields(struct airoha_foe_entry *hwe,
+					       u8 sa_index, u8 hop,
+					       unsigned int tunnel_mtu)
+{
+	int type;
+
+	if (hop > FIELD_MAX(AIROHA_FOE_HOP0))
+		return -ERANGE;
+
+	type = FIELD_GET(AIROHA_FOE_IB1_BIND_PACKET_TYPE, hwe->ib1);
+	switch (type) {
+	case PPE_PKT_TYPE_IPV4_HNAPT:
+	case PPE_PKT_TYPE_IPV4_ROUTE:
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	tunnel_mtu = min_t(unsigned int, tunnel_mtu,
+			   FIELD_MAX(AIROHA_FOE_TUNNEL_MTU));
+
+	/* SOE FOE entries store the hop selector and SA index here. */
+	hwe->ipv4.rsv[0] &= ~AIROHA_FOE_HOP_MASK;
+	hwe->ipv4.rsv[0] |= FIELD_PREP(AIROHA_FOE_HOP0, hop);
+
+	hwe->ipv4.data &= ~(AIROHA_FOE_ACTDP | AIROHA_FOE_TUNNEL_ID);
+	hwe->ipv4.data |= AIROHA_FOE_TUNNEL |
+			  FIELD_PREP(AIROHA_FOE_ACTDP, sa_index);
+
+	hwe->ipv4.l2.meter &= ~AIROHA_FOE_TUNNEL_MTU;
+	hwe->ipv4.l2.meter |= FIELD_PREP(AIROHA_FOE_TUNNEL_MTU, tunnel_mtu);
+
+	hwe->ib1 &= ~AIROHA_FOE_IB1_BIND_TUNNEL_DECAP;
+
+	return 0;
+}
+
+static int
+airoha_ppe_soe_get_tuple_info(const struct flow_offload_tuple *tuple,
+			      struct airoha_ppe_soe_tuple_info *info)
+{
+	int err;
+
+	if (!tuple || !info)
+		return -EINVAL;
+	if (tuple->xmit_type != FLOW_OFFLOAD_XMIT_XFRM)
+		return -EOPNOTSUPP;
+
+	err = airoha_soe_xfrm_ppe_info(tuple->dst_cache, &info->sa_index,
+				       &info->hop);
+	if (err)
+		return err;
+
+	info->tunnel_mtu = tuple->mtu ? tuple->mtu :
+				      AIROHA_PPE_SOE_DEFAULT_TUNNEL_MTU;
+
+	return 0;
+}
+
+static int
+airoha_ppe_foe_entry_set_soe_info(struct airoha_foe_entry *hwe,
+				  const struct flow_offload_tuple *tuple)
+{
+	struct airoha_ppe_soe_tuple_info info;
+	int err;
+
+	if (!tuple)
+		return 0;
+	if (tuple->xmit_type != FLOW_OFFLOAD_XMIT_XFRM)
+		return 0;
+
+	err = airoha_ppe_soe_get_tuple_info(tuple, &info);
+	if (err)
+		return err;
+
+	err = airoha_ppe_foe_entry_set_soe_fields(hwe, info.sa_index,
+						  info.hop, info.tunnel_mtu);
+	if (err)
+		return err;
+
+	/* XFRM packet-offload entries are not plain Ethernet/IP entries:
+	 * the PPE must tag them as SOE/IPsec work and submit them through the
+	 * SOE-facing channel/port aggregation path. Without these fields the
+	 * entry can still become BND, but traffic falls back to the slow path
+	 * instead of the inline encrypt/decrypt datapath.
+	 */
+	hwe->ipv4.l2.common.etype = AIROHA_PPE_SOE_MAGIC_IPSEC;
+	hwe->ipv4.data &= ~AIROHA_FOE_CHANNEL;
+	hwe->ipv4.data |= FIELD_PREP(AIROHA_FOE_CHANNEL,
+				     AIROHA_PPE_SOE_CHANNEL);
+	hwe->ipv4.ib2 &= ~AIROHA_FOE_IB2_PORT_AG;
+	hwe->ipv4.ib2 |= FIELD_PREP(AIROHA_FOE_IB2_PORT_AG,
+				    AIROHA_PPE_SOE_PORT_AG);
+
+	return 0;
+}
+
 static u32 airoha_ppe_foe_get_entry_hash(struct airoha_ppe *ppe,
 					 struct airoha_foe_entry *hwe)
 {
@@ -633,6 +858,9 @@ static void airoha_ppe_foe_flow_stats_update(struct airoha_ppe *ppe,
 		meter = &hwe->ipv4.l2.meter;
 	}
 
+	if (*data & AIROHA_FOE_TUNNEL)
+		return;
+
 	pse_port = FIELD_GET(AIROHA_FOE_IB2_PSE_PORT, *ib2);
 	if (pse_port == FE_PSE_PORT_CDM4)
 		return;
@@ -868,16 +1096,129 @@ airoha_ppe_foe_commit_subflow_entry(struct airoha_ppe *ppe,
 	return 0;
 }
 
-static void airoha_ppe_foe_insert_entry(struct airoha_ppe *ppe,
-					struct sk_buff *skb,
-					u32 hash, bool rx_wlan)
+static void airoha_ppe_soe_meta_store(struct airoha_ppe_soe_meta *meta,
+				      u32 key_hash, u16 foe_hash, u8 sa_index,
+				      u8 hop)
 {
-	struct airoha_flow_table_entry *e;
-	struct airoha_foe_bridge br = {};
-	struct airoha_foe_entry *hwe;
-	bool commit_done = false;
-	struct hlist_node *n;
-	u32 index, state;
+	u8 seen = 1;
+
+	if (READ_ONCE(meta->valid) &&
+	    READ_ONCE(meta->key_hash) == key_hash &&
+	    READ_ONCE(meta->foe_hash) == foe_hash &&
+	    READ_ONCE(meta->sa_index) == sa_index &&
+	    READ_ONCE(meta->hop) == hop &&
+	    time_before_eq(jiffies, READ_ONCE(meta->expires)))
+		seen = min_t(u8, READ_ONCE(meta->seen) + 1, U8_MAX);
+
+	WRITE_ONCE(meta->key_hash, key_hash);
+	WRITE_ONCE(meta->foe_hash, foe_hash);
+	WRITE_ONCE(meta->sa_index, sa_index);
+	WRITE_ONCE(meta->hop, hop);
+	WRITE_ONCE(meta->seen, seen);
+	WRITE_ONCE(meta->expires,
+		   jiffies + msecs_to_jiffies(AIROHA_PPE_SOE_META_TIMEOUT_MS));
+	WRITE_ONCE(meta->valid, 1);
+}
+
+void airoha_ppe_soe_mark_skb(struct airoha_ppe_dev *dev, struct sk_buff *skb,
+			     u16 hash, u8 sa_index, u8 hop)
+{
+	struct airoha_ppe *ppe;
+	u32 ppe_hash_mask;
+
+	if (!dev || !skb)
+		return;
+
+	ppe = dev->priv;
+	if (!ppe || !ppe->soe_meta)
+		return;
+
+	ppe_hash_mask = airoha_ppe_get_total_num_entries(ppe) - 1;
+	if (hash > ppe_hash_mask)
+		return;
+
+	/* SOE decrypt completion is CPU-visible before normal routing has
+	 * selected the plaintext egress netdev. Keep the original encrypted FOE
+	 * hash and SA hop briefly on the skb so airoha_dev_xmit() can finish
+	 * the PPE entry once the final egress descriptor is known.
+	 */
+	airoha_ppe_soe_meta_store(&ppe->soe_meta[hash], hash, hash, sa_index,
+				  hop);
+	ppe->foe_check_time[hash] = 0;
+
+	skb->mark &= ~(AIROHA_PPE_SOE_MARK_MAGIC_MASK |
+		       AIROHA_PPE_SOE_MARK_HASH_MASK);
+	skb->mark |= AIROHA_PPE_SOE_MARK_MAGIC |
+		     FIELD_PREP(AIROHA_PPE_SOE_MARK_HASH_MASK, hash);
+}
+
+bool airoha_ppe_soe_skb_marked(struct sk_buff *skb)
+{
+	return skb && ((skb->mark & AIROHA_PPE_SOE_MARK_MAGIC_MASK) ==
+		       AIROHA_PPE_SOE_MARK_MAGIC);
+}
+
+void airoha_ppe_soe_xmit_skb(struct airoha_ppe_dev *dev, struct sk_buff *skb,
+			     struct net_device *netdev)
+{
+	struct airoha_foe_entry entry, tmpl, *hwe;
+	struct airoha_flow_data data = {};
+	struct airoha_ppe_soe_meta *meta;
+	u32 ppe_hash_mask, key_hash;
+	struct airoha_gdm_dev *gdm;
+	struct airoha_ppe *ppe;
+	unsigned long expires;
+	u16 hash;
+	int err, l4proto, type;
+	u8 sa_index, hop;
+	u8 seen;
+
+	if (!dev || !skb || !netdev)
+		return;
+
+	if ((skb->mark & AIROHA_PPE_SOE_MARK_MAGIC_MASK) !=
+	    AIROHA_PPE_SOE_MARK_MAGIC)
+		return;
+
+	ppe = dev->priv;
+	if (!ppe || !ppe->soe_meta)
+		goto clear_mark;
+
+	ppe_hash_mask = airoha_ppe_get_total_num_entries(ppe) - 1;
+	key_hash = FIELD_GET(AIROHA_PPE_SOE_MARK_HASH_MASK, skb->mark);
+	if (key_hash > ppe_hash_mask)
+		goto clear_mark;
+
+	meta = &ppe->soe_meta[key_hash];
+	if (!READ_ONCE(meta->valid))
+		goto clear_mark;
+
+	if (READ_ONCE(meta->key_hash) != key_hash)
+		goto clear_mark;
+
+	expires = READ_ONCE(meta->expires);
+	if (time_after(jiffies, expires)) {
+		WRITE_ONCE(meta->valid, 0);
+		goto clear_mark;
+	}
+
+	hash = READ_ONCE(meta->foe_hash);
+	if (hash > ppe_hash_mask) {
+		WRITE_ONCE(meta->valid, 0);
+		goto clear_mark;
+	}
+
+	seen = READ_ONCE(meta->seen);
+	if (seen <= READ_ONCE(airoha_ppe_soe_inline_bind_delay_packets))
+		goto clear_mark;
+
+	err = airoha_ppe_soe_fill_inner_ipv4_data(skb, &data, &type, &l4proto);
+	if (err)
+		goto clear_mark;
+
+	sa_index = READ_ONCE(meta->sa_index);
+	hop = READ_ONCE(meta->hop);
+	WRITE_ONCE(meta->valid, 0);
 
 	spin_lock_bh(&ppe_lock);
 
@@ -885,13 +1226,120 @@ static void airoha_ppe_foe_insert_entry(struct airoha_ppe *ppe,
 	if (!hwe)
 		goto unlock;
 
-	state = FIELD_GET(AIROHA_FOE_IB1_BIND_STATE, hwe->ib1);
-	if (state == AIROHA_FOE_STATE_BIND)
+	switch (FIELD_GET(AIROHA_FOE_IB1_BIND_PACKET_TYPE, hwe->ib1)) {
+	case PPE_PKT_TYPE_IPV4_HNAPT:
+	case PPE_PKT_TYPE_IPV4_ROUTE:
+		break;
+	default:
 		goto unlock;
+	}
 
-	index = airoha_ppe_foe_get_entry_hash(ppe, hwe);
-	hlist_for_each_entry_safe(e, n, &ppe->foe_flow[index], list) {
+	err = airoha_ppe_foe_entry_prepare(ppe->eth, &tmpl, netdev, type,
+					   &data, l4proto);
+	if (err)
+		goto unlock;
+
+	memcpy(&entry, hwe, sizeof(entry));
+	entry.ib1 &= ~(AIROHA_FOE_IB1_BIND_STATE |
+		       AIROHA_FOE_IB1_BIND_KEEPALIVE |
+		       AIROHA_FOE_IB1_BIND_TIMESTAMP);
+	entry.ib1 |= FIELD_PREP(AIROHA_FOE_IB1_BIND_STATE,
+				AIROHA_FOE_STATE_BIND) |
+		     AIROHA_FOE_IB1_BIND_TTL;
+	entry.ib1 = (entry.ib1 & (AIROHA_FOE_IB1_BIND_PACKET_TYPE |
+				  AIROHA_FOE_IB1_BIND_UDP)) |
+		    (tmpl.ib1 & ~(AIROHA_FOE_IB1_BIND_PACKET_TYPE |
+				  AIROHA_FOE_IB1_BIND_UDP));
+	entry.ipv4.ib2 = tmpl.ipv4.ib2;
+	entry.ipv4.data = tmpl.ipv4.data;
+	memcpy(&entry.ipv4.l2, &tmpl.ipv4.l2, sizeof(entry.ipv4.l2));
+
+	gdm = netdev_priv(netdev);
+	if (gdm->port && gdm->port->id == AIROHA_GDM4_IDX)
+		entry.ipv4.l2.common.etype = AIROHA_PPE_SOE_MAGIC_GDM4;
+
+	if (FIELD_GET(AIROHA_FOE_IB1_BIND_PACKET_TYPE, entry.ib1) ==
+	    PPE_PKT_TYPE_IPV4_HNAPT)
+		memcpy(&entry.ipv4.new_tuple, &entry.ipv4.orig_tuple,
+		       sizeof(entry.ipv4.new_tuple));
+
+	/* Commit the original decrypt entry only after the normal transmit path
+	 * has provided the final plaintext egress descriptor. Binding it at SOE
+	 * RX completion would miss this device-specific L2/PSE state.
+	 */
+	err = airoha_ppe_foe_entry_set_soe_fields(&entry, sa_index, hop,
+						  AIROHA_PPE_SOE_DEFAULT_TUNNEL_MTU);
+	if (!err)
+		airoha_ppe_foe_commit_entry(ppe, &entry, hash, false);
+
+unlock:
+	spin_unlock_bh(&ppe_lock);
+clear_mark:
+	skb->mark &= ~(AIROHA_PPE_SOE_MARK_MAGIC_MASK |
+		       AIROHA_PPE_SOE_MARK_HASH_MASK);
+}
+
+void airoha_ppe_soe_flush_sa(struct airoha_ppe *ppe, u8 sa_index)
+{
+	u32 num_entries, hash;
+
+	if (!ppe)
+		return;
+
+	num_entries = airoha_ppe_get_total_num_entries(ppe);
+
+	spin_lock_bh(&ppe_lock);
+	for (hash = 0; hash < num_entries; hash++) {
+		struct airoha_foe_entry *hwe;
+		u32 state, type;
+
+		hwe = airoha_ppe_foe_get_entry_locked(ppe, hash);
+		if (!hwe)
+			continue;
+
+		state = FIELD_GET(AIROHA_FOE_IB1_BIND_STATE, hwe->ib1);
+		if (state != AIROHA_FOE_STATE_BIND)
+			continue;
+
+		type = FIELD_GET(AIROHA_FOE_IB1_BIND_PACKET_TYPE, hwe->ib1);
+		if (type != PPE_PKT_TYPE_IPV4_HNAPT &&
+		    type != PPE_PKT_TYPE_IPV4_ROUTE)
+			continue;
+
+		if (!(hwe->ipv4.data & AIROHA_FOE_TUNNEL))
+			continue;
+
+		if (FIELD_GET(AIROHA_FOE_ACTDP, hwe->ipv4.data) != sa_index)
+			continue;
+
+		/* NAT-T data and IKE control both use UDP/4500. A stale SOE
+		 * bound entry can otherwise keep sending later IKE_AUTH packets
+		 * to the SOE path after the SA has been deleted.
+		 */
+		hwe->ib1 &= ~AIROHA_FOE_IB1_BIND_STATE;
+		hwe->ib1 |= FIELD_PREP(AIROHA_FOE_IB1_BIND_STATE,
+				       AIROHA_FOE_STATE_INVALID);
+		airoha_ppe_foe_commit_entry(ppe, hwe, hash, false);
+	}
+	spin_unlock_bh(&ppe_lock);
+}
+
+static bool airoha_ppe_foe_try_flow_commit_bucket(struct airoha_ppe *ppe,
+						  struct airoha_foe_entry *hwe,
+						  u32 hash, u32 probe_index,
+						  bool rx_wlan,
+						  bool allow_l2_subflow)
+{
+	struct airoha_flow_table_entry *e;
+	struct hlist_node *n;
+	bool commit_done = false;
+	u32 state;
+
+	hlist_for_each_entry_safe(e, n, &ppe->foe_flow[probe_index], list) {
 		if (e->type == FLOW_TYPE_L2_SUBFLOW) {
+			if (!allow_l2_subflow)
+				continue;
+
 			state = FIELD_GET(AIROHA_FOE_IB1_BIND_STATE, hwe->ib1);
 			if (state != AIROHA_FOE_STATE_BIND) {
 				e->hash = 0xffff;
@@ -908,6 +1356,51 @@ static void airoha_ppe_foe_insert_entry(struct airoha_ppe *ppe,
 		e->hash = hash;
 	}
 
+	return commit_done;
+}
+
+static void airoha_ppe_foe_insert_entry(struct airoha_ppe *ppe,
+					struct sk_buff *skb,
+					u32 hash, bool rx_wlan)
+{
+	struct airoha_flow_table_entry *e;
+	struct airoha_foe_bridge br = {};
+	struct airoha_foe_entry *hwe;
+	bool commit_done = false;
+	u32 index, mask, state, window;
+	unsigned int i;
+
+	spin_lock_bh(&ppe_lock);
+
+	hwe = airoha_ppe_foe_get_entry_locked(ppe, hash);
+	if (!hwe)
+		goto unlock;
+
+	state = FIELD_GET(AIROHA_FOE_IB1_BIND_STATE, hwe->ib1);
+	if (state == AIROHA_FOE_STATE_BIND)
+		goto unlock;
+
+	index = airoha_ppe_foe_get_entry_hash(ppe, hwe);
+	commit_done =
+		airoha_ppe_foe_try_flow_commit_bucket(ppe, hwe, hash, index,
+						      rx_wlan, true);
+
+	mask = airoha_ppe_get_total_num_entries(ppe) - 1;
+	window = min_t(u32,
+		       READ_ONCE(airoha_ppe_soe_inline_force_commit_probe_window),
+		       mask);
+	for (i = 1; !commit_done && i <= window; i++) {
+		u32 candidates[2] = { (index + i) & mask, (index - i) & mask };
+		unsigned int j;
+
+		for (j = 0; !commit_done && j < ARRAY_SIZE(candidates); j++) {
+			commit_done =
+				airoha_ppe_foe_try_flow_commit_bucket(ppe, hwe,
+								      hash, candidates[j],
+								      rx_wlan, false);
+		}
+	}
+
 	if (commit_done)
 		goto unlock;
 
@@ -940,8 +1433,9 @@ airoha_ppe_foe_l2_flow_commit_entry(struct airoha_ppe *ppe,
 				       airoha_l2_flow_table_params);
 }
 
-static int airoha_ppe_foe_flow_commit_entry(struct airoha_ppe *ppe,
-					    struct airoha_flow_table_entry *e)
+static int
+airoha_ppe_foe_flow_commit_entry(struct airoha_ppe *ppe,
+				 struct airoha_flow_table_entry *e)
 {
 	int type = FIELD_GET(AIROHA_FOE_IB1_BIND_PACKET_TYPE, e->data.ib1);
 	u32 hash;
@@ -1057,6 +1551,7 @@ static int airoha_ppe_entry_idle_time(struct airoha_ppe *ppe,
 static int airoha_ppe_flow_offload_replace(struct airoha_eth *eth,
 					   struct flow_cls_offload *f)
 {
+	const struct flow_offload_tuple *tuple = (const void *)f->cookie;
 	struct flow_rule *rule = flow_cls_offload_flow_rule(f);
 	struct airoha_flow_table_entry *e;
 	struct airoha_flow_data data = {};
@@ -1183,7 +1678,9 @@ static int airoha_ppe_flow_offload_replace(struct airoha_eth *eth,
 		flow_rule_match_ipv4_addrs(rule, &addrs);
 		data.v4.src_addr = addrs.key->src;
 		data.v4.dst_addr = addrs.key->dst;
-		airoha_ppe_foe_entry_set_ipv4_tuple(&hwe, &data, false);
+		err = airoha_ppe_foe_entry_set_ipv4_tuple(&hwe, &data, false);
+		if (err)
+			return err;
 	}
 
 	if (addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
@@ -1228,6 +1725,10 @@ static int airoha_ppe_flow_offload_replace(struct airoha_eth *eth,
 			return err;
 	}
 
+	err = airoha_ppe_foe_entry_set_soe_info(&hwe, tuple);
+	if (err)
+		return err;
+
 	e = kzalloc_obj(*e);
 	if (!e)
 		return -ENOMEM;
@@ -1350,16 +1851,26 @@ static int airoha_ppe_flow_offload_cmd(struct airoha_eth *eth,
 	return -EOPNOTSUPP;
 }
 
-static int airoha_ppe_flush_sram_entries(struct airoha_ppe *ppe)
+static int airoha_ppe_flush_entries(struct airoha_ppe *ppe)
 {
+	u32 ppe_num_entries = airoha_ppe_get_total_num_entries(ppe);
 	u32 sram_num_entries = airoha_ppe_get_total_sram_num_entries(ppe);
 	struct airoha_foe_entry *hwe = ppe->foe;
 	int i, err = 0;
 
+	memset(hwe, 0, ppe_num_entries * sizeof(*hwe));
+	if (ppe->foe_stats) {
+		u32 ppe_num_stats_entries =
+			airoha_ppe_get_total_num_stats_entries(ppe);
+
+		memset(ppe->foe_stats, 0,
+		       ppe_num_stats_entries * sizeof(*ppe->foe_stats));
+	}
+	dma_wmb();
+
 	for (i = 0; i < sram_num_entries; i++) {
 		int err;
 
-		memset(&hwe[i], 0, sizeof(*hwe));
 		err = airoha_ppe_foe_commit_sram_entry(ppe, i);
 		if (err)
 			break;
@@ -1368,6 +1879,37 @@ static int airoha_ppe_flush_sram_entries(struct airoha_ppe *ppe)
 	return err;
 }
 
+static int airoha_ppe_set_bind_rate(const char *val,
+				    const struct kernel_param *kp)
+{
+	struct airoha_ppe *ppe;
+	unsigned long rate;
+	int err, i;
+
+	err = kstrtoul(val, 0, &rate);
+	if (err)
+		return err;
+	if (rate > FIELD_MAX(PPE_BIND_RATE_BIND_MASK))
+		return -ERANGE;
+
+	WRITE_ONCE(airoha_ppe_bind_rate, (unsigned int)rate);
+
+	mutex_lock(&flow_offload_mutex);
+	ppe = READ_ONCE(airoha_ppe_active);
+	if (ppe) {
+		for (i = 0; i < ppe->eth->soc->num_ppe; i++)
+			airoha_ppe_apply_bind_rate(ppe->eth, i);
+	}
+	mutex_unlock(&flow_offload_mutex);
+
+	return 0;
+}
+
+static int airoha_ppe_get_bind_rate(char *buf, const struct kernel_param *kp)
+{
+	return sysfs_emit(buf, "%u\n", READ_ONCE(airoha_ppe_bind_rate));
+}
+
 static struct airoha_npu *airoha_ppe_npu_get(struct airoha_eth *eth)
 {
 	struct airoha_npu *npu = airoha_npu_get(eth->dev);
@@ -1601,12 +2143,20 @@ int airoha_ppe_init(struct airoha_eth *eth)
 			return -ENOMEM;
 	}
 
-	ppe->foe_check_time = devm_kzalloc(eth->dev, ppe_num_entries,
-					   GFP_KERNEL);
+	ppe->foe_check_time =
+		devm_kzalloc(eth->dev,
+			     ppe_num_entries * sizeof(*ppe->foe_check_time),
+			     GFP_KERNEL);
 	if (!ppe->foe_check_time)
 		return -ENOMEM;
 
-	err = airoha_ppe_flush_sram_entries(ppe);
+	ppe->soe_meta = devm_kzalloc(eth->dev,
+				     ppe_num_entries * sizeof(*ppe->soe_meta),
+				     GFP_KERNEL);
+	if (!ppe->soe_meta)
+		return -ENOMEM;
+
+	err = airoha_ppe_flush_entries(ppe);
 	if (err)
 		return err;
 
@@ -1622,6 +2172,8 @@ int airoha_ppe_init(struct airoha_eth *eth)
 	if (err)
 		goto error_l2_flow_table_destroy;
 
+	WRITE_ONCE(airoha_ppe_active, ppe);
+
 	return 0;
 
 error_l2_flow_table_destroy:
@@ -1636,6 +2188,8 @@ void airoha_ppe_deinit(struct airoha_eth *eth)
 {
 	struct airoha_npu *npu;
 
+	WRITE_ONCE(airoha_ppe_active, NULL);
+
 	mutex_lock(&flow_offload_mutex);
 
 	npu = rcu_replace_pointer(eth->npu, NULL,
diff --git a/include/linux/soc/airoha/airoha_offload.h b/include/linux/soc/airoha/airoha_offload.h
index 7589fccfeef6..120dbd274c89 100644
--- a/include/linux/soc/airoha/airoha_offload.h
+++ b/include/linux/soc/airoha/airoha_offload.h
@@ -11,7 +11,12 @@
 #include <linux/workqueue.h>
 
 enum {
+	PPE_CPU_REASON_UN_HIT = 0x0d,
+	PPE_CPU_REASON_HIT_UNBIND = 0x0e,
 	PPE_CPU_REASON_HIT_UNBIND_RATE_REACHED = 0x0f,
+	PPE_CPU_REASON_HIT_BIND_FORCE_CPU = 0x16,
+	PPE_CPU_REASON_HIT_BIND_EXCEED_MTU = 0x1c,
+	PPE_CPU_REASON_NOT_THROUGH_PPE = 0x1e,
 };
 
 struct airoha_ppe_dev {
-- 
2.53.0


^ permalink raw reply related

* [PULL] PCI: meson: Fix PERST# timing by asserting reset before LTSSM enable
From: gowtham @ 2026-06-14  4:26 UTC (permalink / raw)
  To: yue.wang, lpieralisi, kwilczynski, mani
  Cc: robh, bhelgaas, neil.armstrong, khilman, jbrunet,
	martin.blumenstingl, linux-pci, linux-amlogic, linux-arm-kernel,
	linux-kernel

The following changes since commit
bb532bfaf7919c7c98caab81864e9ce2646e11e3:

  Linux 7.0.11 (2026-06-01 17:54:55 +0200)

are available in the Git repository at:

  https://github.com/GowthamKudupudi/linux.git
  tags/meson-pcie-warm-reset-linux-7.0.y

for you to fetch changes up to 852811b11795ee389ea6a953ed0db69b76722469:

  PCI: meson: Fix PERST# timing by asserting reset before LTSSM enable
  (2026-06-14 09:41:01 +0530)

----------------------------------------------------------------
PCI: meson: Fix PERST# timing by asserting reset before LTSSM enable

On warm reboot, the PCIe controller's LTSSM starts link training
immediately if PERST# is already deasserted from the previous boot.
The driver then pulses PERST# for only 500us, which is too short to
properly reset the endpoint device that has already started training.

Fix by moving the PERST# assert/deassert pulse BEFORE enabling LTSSM,
so the endpoint gets a clean reset cycle before link training begins.

This was found on Amlogic G12B (A311D) with NVMe on an M.2 slot.
Cold boot worked because POR held PERST# low; warm reboot did not.
The fix was confirmed on a Banana Pi CM4 with Waveshare IO base board.

----------------------------------------------------------------
Gowtham Kudupudi (1):
      PCI: meson: Fix PERST# timing by asserting reset before LTSSM
enable

 drivers/pci/controller/dwc/pci-meson.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


^ permalink raw reply

* Re: [RFC PATCH net-next 0/7] net: airoha: add EN7581 SOE ESP packet offload
From: Jihong Min @ 2026-06-14  4:18 UTC (permalink / raw)
  To: netdev, Lorenzo Bianconi
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Simon Horman, Herbert Xu, Steffen Klassert,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, devicetree,
	Matthias Brugger, AngeloGioacchino Del Regno, linux-arm-kernel,
	linux-mediatek, Christian Marangi, Felix Fietkau, linux-kernel
In-Reply-To: <20260614040032.1567994-1-hurryman2212@gmail.com>



On 6/14/26 13:00, Jihong Min wrote:
> Add Secure Offload Engine (SOE) support for the Airoha EN7581 Ethernet
> driver. SOE provides inline ESP packet offload for native ESP and NAT-T
> traffic, with the Ethernet/QDMA path used to submit packets to the SOE
> block and the PPE path used to bind eligible ESP flows. NETIF_F_GSO_ESP
> and NETIF_F_HW_ESP_TX_CSUM are intentionally left out for now and will be
> revisited separately for feasibility.
> 
> This is posted as RFC because the code was originally developed and tested
> against an OpenWrt 6.18 Airoha tree, not against the current upstream
> net-next driver. The original OpenWrt commit used as the source for this
> RFC is available at:
> https://github.com/hurryman2212/OpenW1700k-test/commit/7c1b5e662f7790b3d23ed143beadc1dcbf6d15f7
> 
> The SOE part is intentionally linked into the airoha Ethernet module
> instead of being exposed as an independent crypto or platform driver. The
> user-visible ESP offload control is a netdev capability: xfrmdev_ops and
> NETIF_F_HW_ESP live on the target netdev, and the feature can be controlled
> through the usual netdev feature path. SOE also shares the FE/QDMA/PPE
> datapath, private queues, DSA conduit handling and netdev lifetime owned by
> airoha_eth.
> 
> Patch 1 adds xdo_dev_packet_xmit() because the existing XFRM packet
> offload transmit path does not provide a hook for hardware whose ESP engine
> is reached through device-specific packet forwarding. SOE needs to consume
> the skb, add a hardware hop descriptor, steer it to a private QDMA path and
> return the final transmit status. Drivers that do not implement the
> optional callback keep the existing XFRM output behavior.
> 
> Jihong Min (7):
>   xfrm: allow packet offload drivers to own transmit
>   dt-bindings: net: airoha: add EN7581 SOE
>   arm64: dts: airoha: add EN7581 SOE node
>   net: airoha: add SOE registers and driver state
>   net: airoha: add QDMA support for SOE packets
>   net: airoha: add PPE support for SOE flows
>   net: airoha: add SOE XFRM packet offload support
> 
>  .../bindings/net/airoha,en7581-soe.yaml       |   48 +
>  MAINTAINERS                                   |    1 +
>  arch/arm64/boot/dts/airoha/en7581.dtsi        |    6 +
>  drivers/net/ethernet/airoha/Kconfig           |   13 +
>  drivers/net/ethernet/airoha/Makefile          |    1 +
>  drivers/net/ethernet/airoha/airoha_eth.c      |  668 +++++-
>  drivers/net/ethernet/airoha/airoha_eth.h      |   40 +
>  drivers/net/ethernet/airoha/airoha_ppe.c      |  606 +++++-
>  drivers/net/ethernet/airoha/airoha_regs.h     |   16 +
>  drivers/net/ethernet/airoha/airoha_soe.c      | 1896 +++++++++++++++++
>  drivers/net/ethernet/airoha/airoha_soe.h      |  126 ++
>  include/linux/netdevice.h                     |    8 +
>  include/linux/soc/airoha/airoha_offload.h     |    5 +
>  net/xfrm/xfrm_output.c                        |   11 +
>  14 files changed, 3342 insertions(+), 103 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/net/airoha,en7581-soe.yaml
>  create mode 100644 drivers/net/ethernet/airoha/airoha_soe.c
>  create mode 100644 drivers/net/ethernet/airoha/airoha_soe.h
> 

I noticed, after posting this RFC, that I forgot to include the
following trailer while preparing the latest patch series:

Assisted-by: Codex:gpt-5.5

These patches were written and tested with AI assistance, although I've
reviewed the resulting code and test results. I'll include the trailer
properly in future revisions or submissions. Sorry.


Sincerely,
Jihong Min


^ permalink raw reply

* [RFC PATCH net-next 5/7] net: airoha: add QDMA support for SOE packets
From: Jihong Min @ 2026-06-14  4:00 UTC (permalink / raw)
  To: netdev, Lorenzo Bianconi
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Simon Horman, Herbert Xu, Steffen Klassert,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, devicetree,
	Matthias Brugger, AngeloGioacchino Del Regno, linux-arm-kernel,
	linux-mediatek, Christian Marangi, Felix Fietkau, linux-kernel,
	Jihong Min
In-Reply-To: <20260614040032.1567994-1-hurryman2212@gmail.com>

Add QDMA RX/TX plumbing for SOE packets, including the SOE RX ring,
coherent RX slots, SOE completion decoding, and the private QDMA submit
helper used by the SOE provider. Wire the Ethernet netdev feature and
lifetime hooks through the SOE stubs.

Signed-off-by: Jihong Min <hurryman2212@gmail.com>
---
 drivers/net/ethernet/airoha/airoha_eth.c | 668 ++++++++++++++++++++---
 1 file changed, 591 insertions(+), 77 deletions(-)

diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index 5f1a118875fb..42fd30c12ed7 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -6,6 +6,7 @@
 #include <linux/of.h>
 #include <linux/of_net.h>
 #include <linux/of_reserved_mem.h>
+#include <linux/moduleparam.h>
 #include <linux/platform_device.h>
 #include <linux/tcp.h>
 #include <linux/u64_stats_sync.h>
@@ -16,6 +17,67 @@
 
 #include "airoha_regs.h"
 #include "airoha_eth.h"
+#include "airoha_soe.h"
+
+/* QDMA1 uses a different RX IRQ bank layout than QDMA0 on EN7581. */
+#define AIROHA_QDMA_WAN_RX_IRQ0_BANK_PIN_MASK 0x0000839f
+#define AIROHA_QDMA_WAN_RX_IRQ1_BANK_PIN_MASK 0x7f800400
+#define AIROHA_QDMA_WAN_RX_IRQ2_BANK_PIN_MASK 0x00000000
+#define AIROHA_QDMA_WAN_RX_IRQ3_BANK_PIN_MASK 0x00000040
+
+static unsigned int airoha_qdma0_rx_irq_bank_mask[AIROHA_MAX_NUM_IRQ_BANKS] = {
+	RX_IRQ0_BANK_PIN_MASK,
+	RX_IRQ1_BANK_PIN_MASK,
+	RX_IRQ2_BANK_PIN_MASK,
+	RX_IRQ3_BANK_PIN_MASK,
+};
+
+static unsigned int airoha_qdma1_rx_irq_bank_mask[AIROHA_MAX_NUM_IRQ_BANKS] = {
+	AIROHA_QDMA_WAN_RX_IRQ0_BANK_PIN_MASK,
+	AIROHA_QDMA_WAN_RX_IRQ1_BANK_PIN_MASK,
+	AIROHA_QDMA_WAN_RX_IRQ2_BANK_PIN_MASK,
+	AIROHA_QDMA_WAN_RX_IRQ3_BANK_PIN_MASK,
+};
+
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+#define AIROHA_SOE_RX_RING 10
+#define AIROHA_SOE_RX_ALLOC_NDESC 2048
+#define AIROHA_SOE_RX_NDESC_DEFAULT 512
+#define AIROHA_SOE_RX_BUF_SIZE 4096
+#define AIROHA_SOE_RX_BUF_SIZE_MIN 128
+#define AIROHA_SOE_RX_BUF_SIZE_MAX 16384
+#define AIROHA_SOE_DECRYPT_FOE_REASON_MASK \
+	(BIT(PPE_CPU_REASON_UN_HIT) | BIT(PPE_CPU_REASON_HIT_UNBIND) | \
+	 BIT(PPE_CPU_REASON_HIT_UNBIND_RATE_REACHED))
+
+static int airoha_soe_rx_poll_desc_budget = AIROHA_SOE_RX_NDESC_DEFAULT;
+static int airoha_soe_rx_max_frame_descs = 128;
+static int airoha_soe_probe_rx_ndesc = AIROHA_SOE_RX_NDESC_DEFAULT;
+static int airoha_soe_probe_rx_buf_size = AIROHA_SOE_RX_BUF_SIZE;
+static bool airoha_soe_probe_rx_coherent = true;
+static int airoha_soe_probe_rx_scatter = 1;
+
+module_param_named(soe_rx_poll_desc_budget, airoha_soe_rx_poll_desc_budget, int,
+		   0600);
+MODULE_PARM_DESC(soe_rx_poll_desc_budget,
+		 "Maximum SOE RX descriptors consumed per poll");
+module_param_named(soe_rx_max_frame_descs, airoha_soe_rx_max_frame_descs, int,
+		   0600);
+MODULE_PARM_DESC(soe_rx_max_frame_descs,
+		 "Maximum descriptors per SOE RX frame before dropping the chain");
+module_param_named(soe_probe_rx_ndesc, airoha_soe_probe_rx_ndesc, int, 0600);
+MODULE_PARM_DESC(soe_probe_rx_ndesc, "SOE RX descriptor count");
+module_param_named(soe_probe_rx_buf_size, airoha_soe_probe_rx_buf_size, int,
+		   0600);
+MODULE_PARM_DESC(soe_probe_rx_buf_size, "SOE RX buffer size");
+module_param_named(soe_probe_rx_coherent, airoha_soe_probe_rx_coherent, bool,
+		   0600);
+MODULE_PARM_DESC(soe_probe_rx_coherent, "Use coherent SOE RX buffers");
+module_param_named(soe_probe_rx_scatter, airoha_soe_probe_rx_scatter, int,
+		   0600);
+MODULE_PARM_DESC(soe_probe_rx_scatter,
+		 "SOE RX scatter mode: 0 disabled, 1 enabled");
+#endif
 
 u32 airoha_rr(void __iomem *base, u32 offset)
 {
@@ -71,6 +133,100 @@ static void airoha_qdma_irq_disable(struct airoha_irq_bank *irq_bank,
 	airoha_qdma_set_irqmask(irq_bank, index, mask, 0);
 }
 
+static unsigned int *airoha_qdma_rx_irq_bank_masks(struct airoha_qdma *qdma)
+{
+	struct airoha_eth *eth = qdma->eth;
+	int id = qdma - &eth->qdma[0];
+
+	return id ? airoha_qdma1_rx_irq_bank_mask :
+		    airoha_qdma0_rx_irq_bank_mask;
+}
+
+static u32 airoha_qdma_rx_irq_bank_mask(struct airoha_qdma *qdma, int bank)
+{
+	unsigned int *masks = airoha_qdma_rx_irq_bank_masks(qdma);
+
+	if (bank < 0 || bank >= AIROHA_MAX_NUM_IRQ_BANKS)
+		return 0;
+
+	return READ_ONCE(masks[bank]);
+}
+
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+static u32 airoha_qdma_rx_irq_all_bank_mask(struct airoha_qdma *qdma)
+{
+	u32 mask = 0;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(qdma->irq_banks); i++)
+		mask |= airoha_qdma_rx_irq_bank_mask(qdma, i);
+
+	return mask;
+}
+#endif
+
+static void airoha_qdma_apply_rx_irq_bank_masks(struct airoha_qdma *qdma)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(qdma->irq_banks); i++) {
+		struct airoha_irq_bank *irq_bank = &qdma->irq_banks[i];
+		u32 mask = airoha_qdma_rx_irq_bank_mask(qdma, i);
+
+		airoha_qdma_set_irqmask(irq_bank, QDMA_INT_REG_IDX0,
+					RX_COHERENT_LOW_INT_MASK,
+					INT_RX0_MASK(mask));
+		airoha_qdma_set_irqmask(irq_bank, QDMA_INT_REG_IDX1,
+					RX_NO_CPU_DSCP_LOW_INT_MASK |
+						RX_DONE_LOW_INT_MASK,
+					INT_RX1_MASK(mask));
+		airoha_qdma_set_irqmask(irq_bank, QDMA_INT_REG_IDX2,
+					RX_NO_CPU_DSCP_HIGH_INT_MASK |
+						RX_DONE_HIGH_INT_MASK,
+					INT_RX2_MASK(mask));
+		airoha_qdma_set_irqmask(irq_bank, QDMA_INT_REG_IDX3,
+					RX_COHERENT_HIGH_INT_MASK,
+					INT_RX3_MASK(mask));
+	}
+}
+
+static void airoha_qdma_set_rx_done_irq(struct airoha_qdma *qdma, int qid,
+					bool enable)
+{
+	int i, intr_reg;
+	u32 mask;
+
+	intr_reg = qid < RX_DONE_HIGH_OFFSET ? QDMA_INT_REG_IDX1 :
+					       QDMA_INT_REG_IDX2;
+	mask = BIT(qid % RX_DONE_HIGH_OFFSET);
+
+	for (i = 0; i < ARRAY_SIZE(qdma->irq_banks); i++) {
+		if (!(BIT(qid) & airoha_qdma_rx_irq_bank_mask(qdma, i)))
+			continue;
+
+		if (enable)
+			airoha_qdma_irq_enable(&qdma->irq_banks[i], intr_reg,
+					       mask);
+		else
+			airoha_qdma_irq_disable(&qdma->irq_banks[i], intr_reg,
+						mask);
+	}
+
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	if (qid != AIROHA_SOE_RX_RING)
+		return;
+
+	if (BIT(qid) & airoha_qdma_rx_irq_all_bank_mask(qdma))
+		return;
+
+	/* SOE RX10 may not be covered by the shared bank mask; use bank 0. */
+	if (enable)
+		airoha_qdma_irq_enable(&qdma->irq_banks[0], intr_reg, mask);
+	else
+		airoha_qdma_irq_disable(&qdma->irq_banks[0], intr_reg, mask);
+#endif
+}
+
 static int airoha_set_macaddr(struct airoha_gdm_dev *dev, const u8 *addr)
 {
 	u8 ref_addr[ETH_ALEN] __aligned(2);
@@ -532,6 +688,11 @@ static int airoha_fe_init(struct airoha_eth *eth)
 	airoha_fe_wr(eth, REG_FE_CDM1_OQ_MAP2, BIT(4));
 	airoha_fe_wr(eth, REG_FE_CDM1_OQ_MAP3, BIT(28));
 
+	/* SOE/TDMA output depends on PSE shared-buffer flow control instead
+	 * of leaving port-local sharing disabled.
+	 */
+	airoha_fe_clear(eth, REG_PSE_FC_CFG, PSE_TDMA_SHARE_BUF_DIS_MASK);
+
 	airoha_fe_vip_setup(eth);
 	airoha_fe_pse_ports_init(eth);
 
@@ -597,20 +758,29 @@ static int airoha_qdma_fill_rx_queue(struct airoha_queue *q)
 		int offset;
 		u32 val;
 
-		page = page_pool_dev_alloc_frag(q->page_pool, &offset,
-						q->buf_size);
-		if (!page)
-			break;
+		if (q->rx_coherent) {
+			/* Coherent slots avoid page_pool recycling for SOE frames. */
+			offset = q->head * q->buf_size;
+			e->buf = q->rx_coherent_buf + offset;
+			e->dma_addr = q->rx_coherent_dma + offset;
+			e->dma_len = q->buf_size;
+		} else {
+			page = page_pool_dev_alloc_frag(q->page_pool, &offset,
+							q->buf_size);
+			if (!page)
+				break;
+
+			offset += AIROHA_RX_HEADROOM;
+			e->buf = page_address(page) + offset;
+			e->dma_addr = page_pool_get_dma_addr(page) + offset;
+			e->dma_len =
+				SKB_WITH_OVERHEAD(AIROHA_RX_LEN(q->buf_size));
+		}
 
 		q->head = (q->head + 1) % q->ndesc;
 		q->queued++;
 		nframes++;
 
-		offset += AIROHA_RX_HEADROOM;
-		e->buf = page_address(page) + offset;
-		e->dma_addr = page_pool_get_dma_addr(page) + offset;
-		e->dma_len = SKB_WITH_OVERHEAD(AIROHA_RX_LEN(q->buf_size));
-
 		val = FIELD_PREP(QDMA_DESC_LEN_MASK, e->dma_len);
 		WRITE_ONCE(desc->ctrl, cpu_to_le32(val));
 		WRITE_ONCE(desc->addr, cpu_to_le32(e->dma_addr));
@@ -652,92 +822,210 @@ airoha_qdma_get_gdm_dev(struct airoha_eth *eth, struct airoha_qdma_desc *desc)
 	return port->devs[d] ? port->devs[d] : ERR_PTR(-ENODEV);
 }
 
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+static bool airoha_qdma_rx_is_soe(u32 msg0)
+{
+	u32 hop_flags = FIELD_GET(QDMA_ETH_RXMSG_HOP_FLAGS_MASK, msg0);
+
+	return hop_flags >= 3 && hop_flags <= 7;
+}
+#endif
+
 static int airoha_qdma_rx_process(struct airoha_queue *q, int budget)
 {
-	enum dma_data_direction dir = page_pool_get_dma_dir(q->page_pool);
+	enum dma_data_direction dir;
 	struct airoha_qdma *qdma = q->qdma;
 	struct airoha_eth *eth = qdma->eth;
 	int qid = q - &qdma->q_rx[0];
+	int desc_budget = q->ndesc;
+	int desc_done = 0;
 	int done = 0;
 
-	while (done < budget) {
+	dir = q->rx_coherent ? DMA_FROM_DEVICE :
+			       page_pool_get_dma_dir(q->page_pool);
+
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	if (airoha_soe_rx_poll_desc_budget > 0)
+		desc_budget = min(airoha_soe_rx_poll_desc_budget, q->ndesc);
+#endif
+
+	while (q->queued > 0 && done < budget && desc_done < desc_budget) {
 		struct airoha_queue_entry *e = &q->entry[q->tail];
 		struct airoha_qdma_desc *desc = &q->desc[q->tail];
-		u32 hash, reason, msg1, desc_ctrl;
-		struct airoha_gdm_dev *dev;
+		u32 hash, reason, msg0, msg1, msg2, desc_ctrl;
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+		/* Scattered SOE completions only tag the first descriptor. */
+		bool partial_soe = q->skb && !q->skb->dev;
+#endif
+		struct airoha_gdm_dev *dev = NULL;
+		struct net_device *rx_dev = NULL;
 		struct net_device *netdev;
+		bool soe_pkt = false;
 		int data_len, len;
-		struct page *page;
+		struct page *page = NULL;
 
 		desc_ctrl = le32_to_cpu(READ_ONCE(desc->ctrl));
 		if (!(desc_ctrl & QDMA_DESC_DONE_MASK))
 			break;
 
 		dma_rmb();
+		desc_done++;
 
 		q->tail = (q->tail + 1) % q->ndesc;
 		q->queued--;
 
-		dma_sync_single_for_cpu(eth->dev, e->dma_addr, e->dma_len,
-					dir);
+		if (!q->rx_coherent)
+			dma_sync_single_for_cpu(eth->dev, e->dma_addr,
+						e->dma_len, dir);
+
+		if (!q->rx_coherent)
+			page = virt_to_head_page(e->buf);
+
+		if (q->rx_drop_chain) {
+			if (!FIELD_GET(QDMA_DESC_MORE_MASK, desc_ctrl)) {
+				q->rx_drop_chain = false;
+				q->rx_frame_descs = 0;
+				done++;
+			}
+			if (!q->rx_coherent)
+				page_pool_put_full_page(q->page_pool, page,
+							true);
+			continue;
+		}
 
-		page = virt_to_head_page(e->buf);
 		len = FIELD_GET(QDMA_DESC_LEN_MASK, desc_ctrl);
-		data_len = q->skb ? AIROHA_RX_LEN(q->buf_size) : e->dma_len;
+		data_len = q->skb && !q->rx_coherent ?
+				   AIROHA_RX_LEN(q->buf_size) :
+				   e->dma_len;
 		if (!len || data_len < len)
 			goto free_frag;
 
-		dev = airoha_qdma_get_gdm_dev(eth, desc);
-		if (IS_ERR(dev))
-			goto free_frag;
+		msg0 = le32_to_cpu(READ_ONCE(desc->msg0));
+		msg1 = le32_to_cpu(READ_ONCE(desc->msg1));
+		msg2 = le32_to_cpu(READ_ONCE(desc->msg2));
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+		soe_pkt = partial_soe || airoha_qdma_rx_is_soe(msg0);
+#endif
+		if (!soe_pkt) {
+			dev = airoha_qdma_get_gdm_dev(eth, desc);
+			if (IS_ERR(dev))
+				goto free_frag;
+			netdev = netdev_from_priv(dev);
+			rx_dev = netdev;
+		}
 
-		netdev = netdev_from_priv(dev);
 		if (!q->skb) { /* first buffer */
-			q->skb = napi_build_skb(e->buf - AIROHA_RX_HEADROOM,
-						q->buf_size);
+			if (q->rx_coherent) {
+				q->skb = napi_alloc_skb(&q->napi, len);
+				if (q->skb)
+					skb_put_data(q->skb, e->buf, len);
+			} else {
+				void *buf = e->buf - AIROHA_RX_HEADROOM;
+
+				q->skb = napi_build_skb(buf, q->buf_size);
+			}
 			if (!q->skb)
 				goto free_frag;
 
-			skb_reserve(q->skb, AIROHA_RX_HEADROOM);
-			__skb_put(q->skb, len);
-			skb_mark_for_recycle(q->skb);
-			q->skb->dev = netdev;
-			q->skb->protocol = eth_type_trans(q->skb, netdev);
-			q->skb->ip_summed = CHECKSUM_UNNECESSARY;
+			q->rx_drop_chain = false;
+			q->rx_frame_descs = 1;
+			if (!q->rx_coherent) {
+				skb_reserve(q->skb, AIROHA_RX_HEADROOM);
+				__skb_put(q->skb, len);
+				skb_mark_for_recycle(q->skb);
+			}
+			q->skb->dev = soe_pkt ? NULL : netdev;
+			q->skb->ip_summed = soe_pkt ? CHECKSUM_NONE :
+						      CHECKSUM_UNNECESSARY;
 			skb_record_rx_queue(q->skb, qid);
+			if (soe_pkt) {
+				q->soe_rx_msg0 = msg0;
+				q->soe_rx_msg2 = msg2;
+			}
+			if (!soe_pkt)
+				q->skb->protocol = eth_type_trans(q->skb,
+								  netdev);
 		} else { /* scattered frame */
-			struct skb_shared_info *shinfo = skb_shinfo(q->skb);
-			int nr_frags = shinfo->nr_frags;
-
-			if (nr_frags >= ARRAY_SIZE(shinfo->frags))
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+			if (soe_pkt && airoha_soe_rx_max_frame_descs > 0 &&
+			    q->rx_frame_descs >=
+				    airoha_soe_rx_max_frame_descs) {
+				q->rx_drop_chain =
+					FIELD_GET(QDMA_DESC_MORE_MASK, desc_ctrl);
 				goto free_frag;
-
-			skb_add_rx_frag(q->skb, nr_frags, page,
-					e->buf - page_address(page), len,
-					q->buf_size);
+			}
+#endif
+			q->rx_frame_descs++;
+			if (q->rx_coherent) {
+				if (skb_tailroom(q->skb) < len) {
+					unsigned int needed;
+
+					needed = len - skb_tailroom(q->skb);
+					if (pskb_expand_head(q->skb, 0, needed,
+							     GFP_ATOMIC))
+						goto free_frag;
+				}
+				skb_put_data(q->skb, e->buf, len);
+			} else {
+				struct skb_shared_info *shinfo =
+					skb_shinfo(q->skb);
+				int nr_frags = shinfo->nr_frags;
+
+				if (nr_frags >= ARRAY_SIZE(shinfo->frags))
+					goto free_frag;
+
+				skb_add_rx_frag(q->skb, nr_frags, page,
+						e->buf - page_address(page),
+						len, q->buf_size);
+			}
 		}
 
 		if (FIELD_GET(QDMA_DESC_MORE_MASK, desc_ctrl))
 			continue;
 
+		q->rx_drop_chain = false;
+		q->rx_frame_descs = 0;
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+		if (soe_pkt) {
+			u32 hop_flags = FIELD_GET(QDMA_ETH_RXMSG_HOP_FLAGS_MASK,
+						  q->soe_rx_msg0);
+			u32 sa_index = FIELD_GET(QDMA_ETH_RXMSG_SW_UDF_MASK,
+						 q->soe_rx_msg2);
+
+			done++;
+			if (!airoha_soe_rx_skb(eth->soe, q->skb, sa_index,
+					       hop_flags))
+				dev_kfree_skb(q->skb);
+			q->skb = NULL;
+			continue;
+		}
+#endif
+
 		if (netdev_uses_dsa(netdev)) {
 			struct airoha_gdm_port *port = dev->port;
+			struct dsa_port *cpu_dp = netdev->dsa_ptr;
+			u32 sptag = FIELD_GET(QDMA_ETH_RXMSG_SPTAG, msg0);
 
 			/* PPE module requires untagged packets to work
 			 * properly and it provides DSA port index via the
 			 * DMA descriptor. Report DSA tag to the DSA stack
 			 * via skb dst info.
 			 */
-			u32 msg0 = le32_to_cpu(READ_ONCE(desc->msg0));
-			u32 sptag = FIELD_GET(QDMA_ETH_RXMSG_SPTAG, msg0);
-
 			if (sptag < ARRAY_SIZE(port->dsa_meta) &&
 			    port->dsa_meta[sptag])
 				skb_dst_set_noref(q->skb,
 						  &port->dsa_meta[sptag]->dst);
+
+			if (cpu_dp && cpu_dp->ds) {
+				struct dsa_port *dp =
+					dsa_to_port(cpu_dp->ds, sptag);
+
+				if (dp && dsa_port_is_user(dp) &&
+				    dp->cpu_dp == cpu_dp && dp->user)
+					rx_dev = dp->user;
+			}
 		}
 
-		msg1 = le32_to_cpu(READ_ONCE(desc->msg1));
 		hash = FIELD_GET(AIROHA_RXD4_FOE_ENTRY, msg1);
 		if (hash != AIROHA_RXD4_FOE_ENTRY)
 			skb_set_hash(q->skb, jhash_1word(hash, 0),
@@ -749,18 +1037,54 @@ static int airoha_qdma_rx_process(struct airoha_queue *q, int budget)
 					     false);
 
 		done++;
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+		/* Native ESP/NAT-T packets enter through the normal RX path
+		 * first. If they match an inbound packet-offload SA, consume
+		 * the encrypted skb here and resubmit it to SOE before GRO
+		 * takes ownership of the packet. SOE decrypts the original
+		 * ESP/NAT-T packet; after GRO or the normal RX stack processes
+		 * the skb, it is no longer a suitable hardware decrypt
+		 * candidate. Keep the RX FOE hash/reason so the decrypt
+		 * completion can later bind the PPE flow after egress is known.
+		 */
+		if (hash != AIROHA_RXD4_FOE_ENTRY) {
+			bool foe_valid = false;
+
+			if (reason < 32)
+				foe_valid = AIROHA_SOE_DECRYPT_FOE_REASON_MASK &
+					    BIT(reason);
+			if (airoha_soe_rx_plain_skb(dev, q->skb, rx_dev, hash,
+						    reason, foe_valid)) {
+				q->skb = NULL;
+				continue;
+			}
+		} else if (airoha_soe_rx_plain_skb(dev, q->skb, rx_dev, hash,
+						   reason, false)) {
+			q->skb = NULL;
+			continue;
+		}
+#endif
 		napi_gro_receive(&q->napi, q->skb);
 		q->skb = NULL;
 		continue;
 free_frag:
+		q->rx_drop_chain = !!FIELD_GET(QDMA_DESC_MORE_MASK, desc_ctrl);
+		q->rx_frame_descs = 0;
+		done++;
 		if (q->skb) {
 			dev_kfree_skb(q->skb);
 			q->skb = NULL;
 		}
-		page_pool_put_full_page(q->page_pool, page, true);
+		if (!q->rx_coherent)
+			page_pool_put_full_page(q->page_pool, page, true);
 	}
 	airoha_qdma_fill_rx_queue(q);
 
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	if (desc_done && !done)
+		return 1;
+#endif
+
 	return done;
 }
 
@@ -776,17 +1100,9 @@ static int airoha_qdma_rx_napi_poll(struct napi_struct *napi, int budget)
 
 	if (done < budget && napi_complete(napi)) {
 		struct airoha_qdma *qdma = q->qdma;
-		int i, qid = q - &qdma->q_rx[0];
-		int intr_reg = qid < RX_DONE_HIGH_OFFSET ? QDMA_INT_REG_IDX1
-							 : QDMA_INT_REG_IDX2;
-
-		for (i = 0; i < ARRAY_SIZE(qdma->irq_banks); i++) {
-			if (!(BIT(qid) & RX_IRQ_BANK_PIN_MASK(i)))
-				continue;
+		int qid = q - &qdma->q_rx[0];
 
-			airoha_qdma_irq_enable(&qdma->irq_banks[i], intr_reg,
-					       BIT(qid % RX_DONE_HIGH_OFFSET));
-		}
+		airoha_qdma_set_rx_done_irq(qdma, qid, true);
 	}
 
 	return done;
@@ -795,7 +1111,7 @@ static int airoha_qdma_rx_napi_poll(struct napi_struct *napi, int budget)
 static int airoha_qdma_init_rx_queue(struct airoha_queue *q,
 				     struct airoha_qdma *qdma, int ndesc)
 {
-	const struct page_pool_params pp_params = {
+	struct page_pool_params pp_params = {
 		.order = 0,
 		.pool_size = 256,
 		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
@@ -809,8 +1125,21 @@ static int airoha_qdma_init_rx_queue(struct airoha_queue *q,
 	int qid = q - &qdma->q_rx[0], thr;
 	dma_addr_t dma_addr;
 
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	if (qid == AIROHA_SOE_RX_RING) {
+		ndesc = max(ndesc, AIROHA_SOE_RX_ALLOC_NDESC);
+		if (airoha_soe_probe_rx_ndesc > 0)
+			ndesc = clamp(airoha_soe_probe_rx_ndesc, 1,
+				      AIROHA_SOE_RX_ALLOC_NDESC);
+	}
+#endif
+
 	q->buf_size = PAGE_SIZE / 2;
 	q->qdma = qdma;
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	if (qid == AIROHA_SOE_RX_RING)
+		q->rx_coherent = airoha_soe_probe_rx_coherent;
+#endif
 
 	q->entry = devm_kzalloc(eth->dev, ndesc * sizeof(*q->entry),
 				GFP_KERNEL);
@@ -821,19 +1150,45 @@ static int airoha_qdma_init_rx_queue(struct airoha_queue *q,
 				      &dma_addr, GFP_KERNEL);
 	if (!q->desc)
 		return -ENOMEM;
+	q->desc_dma = dma_addr;
 
-	q->page_pool = page_pool_create(&pp_params);
-	if (IS_ERR(q->page_pool)) {
-		int err = PTR_ERR(q->page_pool);
+	if (!q->rx_coherent) {
+		q->page_pool = page_pool_create(&pp_params);
+		if (IS_ERR(q->page_pool)) {
+			int err = PTR_ERR(q->page_pool);
 
-		q->page_pool = NULL;
-		return err;
+			q->page_pool = NULL;
+			return err;
+		}
+	}
+
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	if (qid == AIROHA_SOE_RX_RING) {
+		size_t buf_size;
+		int max_buf_size;
+
+		q->rx_alloc_ndesc = ndesc;
+		max_buf_size = q->rx_coherent ? AIROHA_SOE_RX_BUF_SIZE_MAX :
+						AIROHA_SOE_RX_BUF_SIZE;
+		q->buf_size = clamp(airoha_soe_probe_rx_buf_size,
+				    AIROHA_SOE_RX_BUF_SIZE_MIN, max_buf_size);
+		if (q->rx_coherent) {
+			buf_size = q->buf_size * ndesc;
+			q->rx_coherent_buf =
+				dmam_alloc_coherent(eth->dev, buf_size,
+						    &q->rx_coherent_dma,
+						    GFP_KERNEL);
+			if (!q->rx_coherent_buf)
+				return -ENOMEM;
+			q->rx_coherent_buf_size = buf_size;
+		}
 	}
+#endif
 
 	q->ndesc = ndesc;
 	netif_napi_add(eth->napi_dev, &q->napi, airoha_qdma_rx_napi_poll);
 
-	airoha_qdma_wr(qdma, REG_RX_RING_BASE(qid), dma_addr);
+	airoha_qdma_wr(qdma, REG_RX_RING_BASE(qid), q->desc_dma);
 	airoha_qdma_rmw(qdma, REG_RX_RING_SIZE(qid),
 			RX_RING_SIZE_MASK,
 			FIELD_PREP(RX_RING_SIZE_MASK, ndesc));
@@ -843,7 +1198,14 @@ static int airoha_qdma_init_rx_queue(struct airoha_queue *q,
 			FIELD_PREP(RX_RING_THR_MASK, thr));
 	airoha_qdma_rmw(qdma, REG_RX_DMA_IDX(qid), RX_RING_DMA_IDX_MASK,
 			FIELD_PREP(RX_RING_DMA_IDX_MASK, q->head));
-	airoha_qdma_set(qdma, REG_RX_SCATTER_CFG(qid), RX_RING_SG_EN_MASK);
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	if (qid == AIROHA_SOE_RX_RING && !airoha_soe_probe_rx_scatter)
+		airoha_qdma_clear(qdma, REG_RX_SCATTER_CFG(qid),
+				  RX_RING_SG_EN_MASK);
+	else
+#endif
+		airoha_qdma_set(qdma, REG_RX_SCATTER_CFG(qid),
+				RX_RING_SG_EN_MASK);
 
 	airoha_qdma_fill_rx_queue(q);
 
@@ -859,11 +1221,14 @@ static void airoha_qdma_cleanup_rx_queue(struct airoha_queue *q)
 	while (q->queued) {
 		struct airoha_queue_entry *e = &q->entry[q->tail];
 		struct airoha_qdma_desc *desc = &q->desc[q->tail];
-		struct page *page = virt_to_head_page(e->buf);
 
-		dma_sync_single_for_cpu(eth->dev, e->dma_addr, e->dma_len,
-					page_pool_get_dma_dir(q->page_pool));
-		page_pool_put_full_page(q->page_pool, page, false);
+		if (!q->rx_coherent) {
+			struct page *page = virt_to_head_page(e->buf);
+
+			dma_sync_single_for_cpu(eth->dev, e->dma_addr, e->dma_len,
+						page_pool_get_dma_dir(q->page_pool));
+			page_pool_put_full_page(q->page_pool, page, false);
+		}
 		/* Reset DMA descriptor */
 		WRITE_ONCE(desc->ctrl, 0);
 		WRITE_ONCE(desc->addr, 0);
@@ -1045,8 +1410,24 @@ static int airoha_qdma_tx_napi_poll(struct napi_struct *napi, int budget)
 			airoha_qdma_rmw(qdma, REG_IRQ_CLEAR_LEN(id),
 					IRQ_CLEAR_LEN_MASK, 0x80);
 		airoha_qdma_rmw(qdma, REG_IRQ_CLEAR_LEN(id),
-				IRQ_CLEAR_LEN_MASK, (done & 0x7f));
+				IRQ_CLEAR_LEN_MASK,
+				(done & 0x7f));
+	}
+
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	if (done && airoha_soe_has_pending_rx(eth->soe)) {
+		int i;
+
+		/* SOE decrypt completions can lag behind TX cleanup IRQs. */
+		for (i = 0; i < ARRAY_SIZE(eth->qdma); i++) {
+			struct airoha_queue *rxq =
+				&eth->qdma[i].q_rx[AIROHA_SOE_RX_RING];
+
+			if (rxq->ndesc)
+				napi_schedule(&rxq->napi);
+		}
 	}
+#endif
 
 	if (done < budget && napi_complete(napi))
 		airoha_qdma_irq_enable(&qdma->irq_banks[0], QDMA_INT_REG_IDX0,
@@ -1346,16 +1727,11 @@ static int airoha_qdma_hw_init(struct airoha_qdma *qdma)
 	for (i = 0; i < ARRAY_SIZE(qdma->irq_banks); i++) {
 		/* clear pending irqs */
 		airoha_qdma_wr(qdma, REG_INT_STATUS(i), 0xffffffff);
-		/* setup rx irqs */
-		airoha_qdma_irq_enable(&qdma->irq_banks[i], QDMA_INT_REG_IDX0,
-				       INT_RX0_MASK(RX_IRQ_BANK_PIN_MASK(i)));
-		airoha_qdma_irq_enable(&qdma->irq_banks[i], QDMA_INT_REG_IDX1,
-				       INT_RX1_MASK(RX_IRQ_BANK_PIN_MASK(i)));
-		airoha_qdma_irq_enable(&qdma->irq_banks[i], QDMA_INT_REG_IDX2,
-				       INT_RX2_MASK(RX_IRQ_BANK_PIN_MASK(i)));
-		airoha_qdma_irq_enable(&qdma->irq_banks[i], QDMA_INT_REG_IDX3,
-				       INT_RX3_MASK(RX_IRQ_BANK_PIN_MASK(i)));
 	}
+	airoha_qdma_apply_rx_irq_bank_masks(qdma);
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	airoha_qdma_set_rx_done_irq(qdma, AIROHA_SOE_RX_RING, true);
+#endif
 	/* setup tx irqs */
 	airoha_qdma_irq_enable(&qdma->irq_banks[0], QDMA_INT_REG_IDX0,
 			       TX_COHERENT_LOW_INT_MASK | INT_TX_MASK);
@@ -2176,6 +2552,110 @@ int airoha_get_fe_port(struct airoha_gdm_dev *dev)
 	}
 }
 
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+int airoha_qdma_xmit_skb(struct airoha_gdm_dev *dev, struct sk_buff *skb,
+			 u32 msg0, u32 msg1, u32 msg2)
+{
+	struct net_device *netdev = netdev_from_priv(dev);
+	struct airoha_qdma *qdma = dev->qdma;
+	u32 nr_frags, len;
+	struct airoha_queue_entry *e;
+	struct netdev_queue *txq;
+	struct airoha_queue *q;
+	LIST_HEAD(tx_list);
+	int i = 0, qid;
+	void *data;
+	u16 index;
+
+	qid = airoha_qdma_get_txq(qdma, skb_get_queue_mapping(skb));
+	q = &qdma->q_tx[qid];
+	if (WARN_ON_ONCE(!q->ndesc))
+		return -ENODEV;
+
+	spin_lock_bh(&q->lock);
+
+	txq = skb_get_tx_queue(netdev, skb);
+	nr_frags = 1 + skb_shinfo(skb)->nr_frags;
+	if (q->queued + nr_frags >= q->ndesc) {
+		netif_tx_stop_queue(txq);
+		q->txq_stopped = true;
+		spin_unlock_bh(&q->lock);
+		return -EBUSY;
+	}
+
+	len = skb_headlen(skb);
+	data = skb->data;
+
+	e = list_first_entry(&q->tx_list, struct airoha_queue_entry, list);
+	index = e - q->entry;
+
+	while (true) {
+		struct airoha_qdma_desc *desc = &q->desc[index];
+		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+		dma_addr_t addr;
+		u32 val;
+
+		addr = dma_map_single(netdev->dev.parent, data, len,
+				      DMA_TO_DEVICE);
+		if (unlikely(dma_mapping_error(netdev->dev.parent, addr)))
+			goto error_unmap;
+
+		list_move_tail(&e->list, &tx_list);
+		e->skb = i == nr_frags - 1 ? skb : NULL;
+		e->dma_addr = addr;
+		e->dma_len = len;
+
+		e = list_first_entry(&q->tx_list, struct airoha_queue_entry,
+				     list);
+		index = e - q->entry;
+
+		val = FIELD_PREP(QDMA_DESC_LEN_MASK, len);
+		if (i < nr_frags - 1)
+			val |= FIELD_PREP(QDMA_DESC_MORE_MASK, 1);
+		WRITE_ONCE(desc->ctrl, cpu_to_le32(val));
+		WRITE_ONCE(desc->addr, cpu_to_le32(addr));
+		val = FIELD_PREP(QDMA_DESC_NEXT_ID_MASK, index);
+		WRITE_ONCE(desc->data, cpu_to_le32(val));
+		WRITE_ONCE(desc->msg0, cpu_to_le32(msg0));
+		WRITE_ONCE(desc->msg1, cpu_to_le32(msg1));
+		WRITE_ONCE(desc->msg2, cpu_to_le32(msg2));
+
+		if (++i == nr_frags)
+			break;
+
+		data = skb_frag_address(frag);
+		len = skb_frag_size(frag);
+	}
+	q->queued += i;
+
+	skb_tx_timestamp(skb);
+	netdev_tx_sent_queue(txq, skb->len);
+	if (q->ndesc - q->queued < q->free_thr) {
+		netif_tx_stop_queue(txq);
+		q->txq_stopped = true;
+	}
+
+	/* SOE submits do not run in the regular ndo_start_xmit batching path. */
+	airoha_qdma_rmw(qdma, REG_TX_CPU_IDX(qid), TX_RING_CPU_IDX_MASK,
+			FIELD_PREP(TX_RING_CPU_IDX_MASK, index));
+
+	spin_unlock_bh(&q->lock);
+
+	return 0;
+
+error_unmap:
+	list_for_each_entry(e, &tx_list, list) {
+		dma_unmap_single(netdev->dev.parent, e->dma_addr, e->dma_len,
+				 DMA_TO_DEVICE);
+		e->dma_addr = 0;
+	}
+	list_splice(&tx_list, &q->tx_list);
+	spin_unlock_bh(&q->lock);
+
+	return -ENOMEM;
+}
+#endif
+
 static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
 				   struct net_device *netdev)
 {
@@ -2185,6 +2665,7 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
 	struct airoha_queue_entry *e;
 	struct netdev_queue *txq;
 	struct airoha_queue *q;
+	bool soe_decrypt_skb = false;
 	LIST_HEAD(tx_list);
 	int i = 0, qid;
 	void *data;
@@ -2223,6 +2704,11 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
 	       FIELD_PREP(QDMA_ETH_TXMSG_FPORT_MASK, fport) |
 	       FIELD_PREP(QDMA_ETH_TXMSG_METER_MASK, 0x7f);
 
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	if (dev->eth->ppe)
+		soe_decrypt_skb = airoha_ppe_soe_skb_marked(skb);
+#endif
+
 	q = &qdma->q_tx[qid];
 	if (WARN_ON_ONCE(!q->ndesc))
 		goto error;
@@ -2293,13 +2779,24 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
 		q->txq_stopped = true;
 	}
 
-	if (netif_xmit_stopped(txq) || !netdev_xmit_more())
+	if (netif_xmit_stopped(txq) || !netdev_xmit_more() ||
+	    soe_decrypt_skb)
 		airoha_qdma_rmw(qdma, REG_TX_CPU_IDX(qid),
 				TX_RING_CPU_IDX_MASK,
 				FIELD_PREP(TX_RING_CPU_IDX_MASK, index));
 
 	spin_unlock_bh(&q->lock);
 
+	/* SOE decrypt flow binding needs the final egress netdev and QDMA
+	 * descriptor context. The SOE RX path marks only candidate packets; bind
+	 * only after the current plaintext packet has been queued and kicked so
+	 * the newly bound decrypt entry cannot race the CPU-transmitted packet.
+	 */
+#if IS_REACHABLE(CONFIG_NET_AIROHA_SOE)
+	if (soe_decrypt_skb)
+		airoha_ppe_soe_xmit_skb(&dev->eth->ppe->dev, skb, netdev);
+#endif
+
 	return NETDEV_TX_OK;
 
 error_unmap:
@@ -3096,6 +3593,12 @@ static int airoha_dev_tc_setup(struct net_device *dev,
 	}
 }
 
+static int airoha_dev_set_features(struct net_device *netdev,
+				   netdev_features_t features)
+{
+	return airoha_soe_set_features(netdev, features);
+}
+
 static const struct net_device_ops airoha_netdev_ops = {
 	.ndo_init		= airoha_dev_init,
 	.ndo_open		= airoha_dev_open,
@@ -3105,6 +3608,7 @@ static const struct net_device_ops airoha_netdev_ops = {
 	.ndo_start_xmit		= airoha_dev_xmit,
 	.ndo_get_stats64        = airoha_dev_get_stats64,
 	.ndo_set_mac_address	= airoha_dev_set_macaddr,
+	.ndo_set_features	= airoha_dev_set_features,
 	.ndo_setup_tc		= airoha_dev_tc_setup,
 };
 
@@ -3230,6 +3734,7 @@ static int airoha_alloc_gdm_device(struct airoha_eth *eth,
 	dev->eth = eth;
 	dev->nbq = nbq;
 	port->devs[index] = dev;
+	airoha_soe_build_netdev(netdev, airoha_qdma_xmit_skb);
 
 	return 0;
 }
@@ -3409,10 +3914,14 @@ static int airoha_probe(struct platform_device *pdev)
 	strscpy(eth->napi_dev->name, "qdma_eth", sizeof(eth->napi_dev->name));
 	platform_set_drvdata(pdev, eth);
 
-	err = airoha_hw_init(pdev, eth);
+	err = airoha_soe_init(eth);
 	if (err)
 		goto error_netdev_free;
 
+	err = airoha_hw_init(pdev, eth);
+	if (err)
+		goto error_soe_deinit;
+
 	for (i = 0; i < ARRAY_SIZE(eth->qdma); i++)
 		airoha_qdma_start_napi(&eth->qdma[i]);
 
@@ -3457,11 +3966,14 @@ static int airoha_probe(struct platform_device *pdev)
 			netdev = netdev_from_priv(dev);
 			if (netdev->reg_state == NETREG_REGISTERED)
 				unregister_netdev(netdev);
+			airoha_soe_teardown_netdev(netdev);
 			of_node_put(netdev->dev.of_node);
 		}
 		airoha_metadata_dst_free(port);
 	}
 	airoha_hw_cleanup(eth);
+error_soe_deinit:
+	airoha_soe_deinit(eth);
 error_netdev_free:
 	free_netdev(eth->napi_dev);
 	platform_set_drvdata(pdev, NULL);
@@ -3492,12 +4004,14 @@ static void airoha_remove(struct platform_device *pdev)
 				continue;
 
 			netdev = netdev_from_priv(dev);
+			airoha_soe_teardown_netdev(netdev);
 			unregister_netdev(netdev);
 			of_node_put(netdev->dev.of_node);
 		}
 		airoha_metadata_dst_free(port);
 	}
 	airoha_hw_cleanup(eth);
+	airoha_soe_deinit(eth);
 
 	free_netdev(eth->napi_dev);
 	platform_set_drvdata(pdev, NULL);
-- 
2.53.0


^ permalink raw reply related

* [RFC PATCH net-next 4/7] net: airoha: add SOE registers and driver state
From: Jihong Min @ 2026-06-14  4:00 UTC (permalink / raw)
  To: netdev, Lorenzo Bianconi
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Simon Horman, Herbert Xu, Steffen Klassert,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, devicetree,
	Matthias Brugger, AngeloGioacchino Del Regno, linux-arm-kernel,
	linux-mediatek, Christian Marangi, Felix Fietkau, linux-kernel,
	Jihong Min
In-Reply-To: <20260614040032.1567994-1-hurryman2212@gmail.com>

Add the FE/PPE/QDMA register definitions and shared driver state needed
by the Secure Offload Engine. Add a small SOE-facing header with stubs so
later Ethernet and PPE changes remain buildable until the provider is
enabled.

Signed-off-by: Jihong Min <hurryman2212@gmail.com>
---
 drivers/net/ethernet/airoha/airoha_eth.h  |  40 +++++++
 drivers/net/ethernet/airoha/airoha_regs.h |  16 +++
 drivers/net/ethernet/airoha/airoha_soe.h  | 126 ++++++++++++++++++++++
 3 files changed, 182 insertions(+)
 create mode 100644 drivers/net/ethernet/airoha/airoha_soe.h

diff --git a/drivers/net/ethernet/airoha/airoha_eth.h b/drivers/net/ethernet/airoha/airoha_eth.h
index 46b1c31939de..c5f09aedd7e7 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.h
+++ b/drivers/net/ethernet/airoha/airoha_eth.h
@@ -16,6 +16,8 @@
 #include <linux/soc/airoha/airoha_offload.h>
 #include <net/dsa.h>
 
+struct airoha_soe;
+
 #define AIROHA_MAX_NUM_GDM_PORTS	4
 #define AIROHA_MAX_NUM_GDM_DEVS		2
 #define AIROHA_MAX_NUM_QDMA		2
@@ -189,18 +191,31 @@ struct airoha_queue {
 	spinlock_t lock;
 	struct airoha_queue_entry *entry;
 	struct airoha_qdma_desc *desc;
+	/* Preserved for RX ring reprogramming; dmam_alloc_coherent() owns it. */
+	dma_addr_t desc_dma;
 	u16 head;
 	u16 tail;
 
 	int queued;
 	int ndesc;
+	/* SOE RX rings may use coherent slots instead of page_pool fragments. */
+	int rx_alloc_ndesc;
 	int free_thr;
 	int buf_size;
 	bool txq_stopped;
+	bool rx_coherent;
+	bool rx_drop_chain;
+	void *rx_coherent_buf;
+	dma_addr_t rx_coherent_dma;
+	size_t rx_coherent_buf_size;
 
 	struct napi_struct napi;
 	struct page_pool *page_pool;
 	struct sk_buff *skb;
+	/* First SOE descriptor metadata is kept while a scattered frame is built. */
+	int rx_frame_descs;
+	u32 soe_rx_msg0;
+	u32 soe_rx_msg2;
 
 	struct list_head tx_list;
 };
@@ -434,6 +449,16 @@ struct airoha_foe_stats64 {
 	u64 packets;
 };
 
+struct airoha_ppe_soe_meta {
+	unsigned long expires;
+	u32 key_hash;
+	u16 foe_hash;
+	u8 valid;
+	u8 sa_index;
+	u8 hop;
+	u8 seen;
+};
+
 struct airoha_flow_data {
 	struct ethhdr eth;
 
@@ -552,6 +577,11 @@ struct airoha_gdm_dev {
 
 	u32 flags;
 	int nbq;
+	/* Prevent toggling NETIF_F_HW_ESP while programmed SAs still exist. */
+	atomic_t soe_xfrm_state_count;
+	/* Private SOE submit path into this GDM's active QDMA instance. */
+	int (*soe_xmit_skb)(struct airoha_gdm_dev *dev, struct sk_buff *skb,
+			    u32 msg0, u32 msg1, u32 msg2);
 
 	struct airoha_hw_stats stats;
 };
@@ -581,6 +611,7 @@ struct airoha_ppe {
 
 	struct hlist_head *foe_flow;
 	u16 *foe_check_time;
+	struct airoha_ppe_soe_meta *soe_meta;
 
 	struct airoha_foe_stats *foe_stats;
 	dma_addr_t foe_stats_dma;
@@ -621,6 +652,7 @@ struct airoha_eth {
 
 	struct airoha_qdma qdma[AIROHA_MAX_NUM_QDMA];
 	struct airoha_gdm_port *ports[AIROHA_MAX_NUM_GDM_PORTS];
+	struct airoha_soe *soe;
 };
 
 u32 airoha_rr(void __iomem *base, u32 offset);
@@ -676,6 +708,14 @@ static inline bool airoha_is_7583(struct airoha_eth *eth)
 int airoha_get_fe_port(struct airoha_gdm_dev *dev);
 bool airoha_is_valid_gdm_dev(struct airoha_eth *eth,
 			     struct airoha_gdm_dev *dev);
+int airoha_qdma_xmit_skb(struct airoha_gdm_dev *dev, struct sk_buff *skb,
+			 u32 msg0, u32 msg1, u32 msg2);
+void airoha_ppe_soe_mark_skb(struct airoha_ppe_dev *dev, struct sk_buff *skb,
+			     u16 hash, u8 sa_index, u8 hop);
+bool airoha_ppe_soe_skb_marked(struct sk_buff *skb);
+void airoha_ppe_soe_xmit_skb(struct airoha_ppe_dev *dev, struct sk_buff *skb,
+			     struct net_device *netdev);
+void airoha_ppe_soe_flush_sa(struct airoha_ppe *ppe, u8 sa_index);
 
 void airoha_ppe_set_cpu_port(struct airoha_gdm_dev *dev, u8 ppe_id, u8 fport);
 bool airoha_ppe_is_enabled(struct airoha_eth *eth, int index);
diff --git a/drivers/net/ethernet/airoha/airoha_regs.h b/drivers/net/ethernet/airoha/airoha_regs.h
index 436f3c8779c1..27e158d0fa4b 100644
--- a/drivers/net/ethernet/airoha/airoha_regs.h
+++ b/drivers/net/ethernet/airoha/airoha_regs.h
@@ -82,6 +82,10 @@
 #define PSE_SHARE_USED_MTHD_MASK	GENMASK(31, 16)
 #define PSE_SHARE_USED_HTHD_MASK	GENMASK(15, 0)
 
+/* TDMA/SOE port 7 needs shared-buffer flow control enabled in the PSE. */
+#define REG_PSE_FC_CFG			0x0098
+#define PSE_TDMA_SHARE_BUF_DIS_MASK	BIT(23)
+
 #define REG_GDM_MISC_CFG		0x0148
 #define GDM2_RDM_ACK_WAIT_PREF_MASK	BIT(9)
 #define GDM2_CHN_VLD_MODE_MASK		BIT(5)
@@ -252,6 +256,8 @@
 #define PPE_GLO_CFG_EN_MASK			BIT(0)
 
 #define REG_PPE_PPE_FLOW_CFG(_n)		(((_n) ? PPE2_BASE : PPE1_BASE) + 0x204)
+#define PPE_FLOW_CFG_IP6_IPSEC_MASK		BIT(28)
+#define PPE_FLOW_CFG_IP4_IPSEC_MASK		BIT(27)
 #define PPE_FLOW_CFG_IP6_HASH_GRE_KEY_MASK	BIT(20)
 #define PPE_FLOW_CFG_IP4_HASH_GRE_KEY_MASK	BIT(19)
 #define PPE_FLOW_CFG_IP4_HASH_FLOW_LABEL_MASK	BIT(18)
@@ -851,6 +857,8 @@
 #define QDMA_DESC_NEXT_ID_MASK		GENMASK(15, 0)
 /* TX MSG0 */
 #define QDMA_ETH_TXMSG_MIC_IDX_MASK	BIT(30)
+/* SOE submit metadata: msg0 carries SA index, msg1 selects port 7 OQ8/OQ9. */
+#define QDMA_ETH_TXMSG_SOE_SA_MASK	GENMASK(29, 24)
 #define QDMA_ETH_TXMSG_SP_TAG_MASK	GENMASK(29, 14)
 #define QDMA_ETH_TXMSG_ICO_MASK		BIT(13)
 #define QDMA_ETH_TXMSG_UCO_MASK		BIT(12)
@@ -873,6 +881,11 @@
 
 /* RX MSG0 */
 #define QDMA_ETH_RXMSG_SPTAG		GENMASK(21, 14)
+/* SOE completion metadata can use the full 16-bit SP tag word. */
+#define QDMA_ETH_RXMSG_SPTAG_FULL	GENMASK(29, 14)
+/* SOE completion metadata returned by the QDMA RX descriptor. */
+#define QDMA_ETH_RXMSG_SOE_MASK		BIT(10)
+#define QDMA_ETH_RXMSG_HOP_FLAGS_MASK	GENMASK(2, 0)
 /* RX MSG1 */
 #define QDMA_ETH_RXMSG_DEI_MASK		BIT(31)
 #define QDMA_ETH_RXMSG_IP6_MASK		BIT(30)
@@ -883,6 +896,9 @@
 #define QDMA_ETH_RXMSG_SPORT_MASK	GENMASK(25, 21)
 #define QDMA_ETH_RXMSG_CRSN_MASK	GENMASK(20, 16)
 #define QDMA_ETH_RXMSG_PPE_ENTRY_MASK	GENMASK(15, 0)
+/* RX MSG2 */
+/* SW_UDF carries the SA index for SOE completion frames. */
+#define QDMA_ETH_RXMSG_SW_UDF_MASK	GENMASK(31, 24)
 
 struct airoha_qdma_desc {
 	__le32 rsv;
diff --git a/drivers/net/ethernet/airoha/airoha_soe.h b/drivers/net/ethernet/airoha/airoha_soe.h
new file mode 100644
index 000000000000..0bde2e9c6b5b
--- /dev/null
+++ b/drivers/net/ethernet/airoha/airoha_soe.h
@@ -0,0 +1,126 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Ethernet-facing declarations for the Airoha Secure Offload Engine (SOE)
+ * packet offload provider.
+ *
+ * airoha_eth owns SOE lifetime and calls these helpers to expose xfrm
+ * ESP/NAT-T offload on its netdevs. When CONFIG_NET_AIROHA_SOE is disabled,
+ * the stubs keep the Ethernet driver buildable without SOE support.
+ */
+
+#ifndef AIROHA_SOE_H
+#define AIROHA_SOE_H
+
+#include <linux/bitops.h>
+#include <linux/errno.h>
+#include <linux/kconfig.h>
+#include <linux/netdev_features.h>
+#include <linux/types.h>
+
+struct airoha_soe;
+struct airoha_soe_sa;
+struct airoha_eth;
+struct airoha_gdm_dev;
+struct device;
+struct dst_entry;
+struct net_device;
+struct netlink_ext_ack;
+struct sk_buff;
+struct xfrm_state;
+
+#define AIROHA_SOE_FEATURE_ESP		BIT(0)
+
+typedef int (*airoha_soe_xmit_skb_t)(struct airoha_gdm_dev *dev,
+				     struct sk_buff *skb, u32 msg0, u32 msg1,
+				     u32 msg2);
+
+#if IS_ENABLED(CONFIG_NET_AIROHA_SOE)
+int airoha_soe_init(struct airoha_eth *eth);
+void airoha_soe_deinit(struct airoha_eth *eth);
+bool airoha_soe_available(struct airoha_soe *soe);
+u32 airoha_soe_features(struct airoha_soe *soe);
+void airoha_soe_build_netdev(struct net_device *dev,
+			     airoha_soe_xmit_skb_t xmit_skb);
+void airoha_soe_teardown_netdev(struct net_device *dev);
+int airoha_soe_set_features(struct net_device *dev,
+			    netdev_features_t features);
+bool airoha_soe_rx_skb(struct airoha_soe *soe, struct sk_buff *skb,
+		       unsigned int sa_index, u32 hop_flags);
+bool airoha_soe_rx_plain_skb(struct airoha_gdm_dev *dev,
+			     struct sk_buff *skb, struct net_device *rx_dev,
+			     u16 foe_hash, u32 foe_reason, bool foe_valid);
+bool airoha_soe_has_pending_rx(struct airoha_soe *soe);
+int airoha_soe_xfrm_ppe_info(const struct dst_entry *dst, u8 *sa_index,
+			     u8 *hop);
+int airoha_soe_xmit(struct airoha_soe_sa *sa, struct airoha_gdm_dev *dev,
+		    struct sk_buff *skb, struct xfrm_state *x);
+#else
+static inline int airoha_soe_init(struct airoha_eth *eth)
+{
+	return 0;
+}
+
+static inline void airoha_soe_deinit(struct airoha_eth *eth)
+{
+}
+
+static inline bool airoha_soe_available(struct airoha_soe *soe)
+{
+	return false;
+}
+
+static inline u32 airoha_soe_features(struct airoha_soe *soe)
+{
+	return 0;
+}
+
+static inline void airoha_soe_build_netdev(struct net_device *dev,
+					   airoha_soe_xmit_skb_t xmit_skb)
+{
+}
+
+static inline void airoha_soe_teardown_netdev(struct net_device *dev)
+{
+}
+
+static inline int airoha_soe_set_features(struct net_device *dev,
+					  netdev_features_t features)
+{
+	return 0;
+}
+
+static inline bool airoha_soe_rx_skb(struct airoha_soe *soe,
+				     struct sk_buff *skb,
+				     unsigned int sa_index, u32 hop_flags)
+{
+	return false;
+}
+
+static inline bool airoha_soe_rx_plain_skb(struct airoha_gdm_dev *dev,
+					   struct sk_buff *skb,
+					   struct net_device *rx_dev,
+					   u16 foe_hash, u32 foe_reason,
+					   bool foe_valid)
+{
+	return false;
+}
+
+static inline bool airoha_soe_has_pending_rx(struct airoha_soe *soe)
+{
+	return false;
+}
+
+static inline int airoha_soe_xfrm_ppe_info(const struct dst_entry *dst,
+					   u8 *sa_index, u8 *hop)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int airoha_soe_xmit(struct airoha_soe_sa *sa,
+				  struct airoha_gdm_dev *dev,
+				  struct sk_buff *skb, struct xfrm_state *x)
+{
+	return -EOPNOTSUPP;
+}
+#endif
+
+#endif /* AIROHA_SOE_H */
-- 
2.53.0


^ permalink raw reply related

* [RFC PATCH net-next 3/7] arm64: dts: airoha: add EN7581 SOE node
From: Jihong Min @ 2026-06-14  4:00 UTC (permalink / raw)
  To: netdev, Lorenzo Bianconi
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Simon Horman, Herbert Xu, Steffen Klassert,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, devicetree,
	Matthias Brugger, AngeloGioacchino Del Regno, linux-arm-kernel,
	linux-mediatek, Christian Marangi, Felix Fietkau, linux-kernel,
	Jihong Min
In-Reply-To: <20260614040032.1567994-1-hurryman2212@gmail.com>

Describe the EN7581 SOE register window and interrupt so the Ethernet driver can discover and initialize the packet offload engine.

Signed-off-by: Jihong Min <hurryman2212@gmail.com>
---
 arch/arm64/boot/dts/airoha/en7581.dtsi | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/boot/dts/airoha/en7581.dtsi b/arch/arm64/boot/dts/airoha/en7581.dtsi
index ff6908a76e8e..a3c1033d2437 100644
--- a/arch/arm64/boot/dts/airoha/en7581.dtsi
+++ b/arch/arm64/boot/dts/airoha/en7581.dtsi
@@ -347,6 +347,12 @@ i2c1: i2c@1fbf8100 {
 			status = "disabled";
 		};
 
+		soe: soe@1fbfa000 {
+			compatible = "airoha,en7581-soe";
+			reg = <0x0 0x1fbfa000 0x0 0x268>;
+			interrupts = <GIC_SPI 79 IRQ_TYPE_LEVEL_HIGH>;
+		};
+
 		eth: ethernet@1fb50000 {
 			compatible = "airoha,en7581-eth";
 			reg = <0 0x1fb50000 0 0x2600>,
-- 
2.53.0



^ permalink raw reply related

* [RFC PATCH net-next 2/7] dt-bindings: net: airoha: add EN7581 SOE
From: Jihong Min @ 2026-06-14  4:00 UTC (permalink / raw)
  To: netdev, Lorenzo Bianconi
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Simon Horman, Herbert Xu, Steffen Klassert,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, devicetree,
	Matthias Brugger, AngeloGioacchino Del Regno, linux-arm-kernel,
	linux-mediatek, Christian Marangi, Felix Fietkau, linux-kernel,
	Jihong Min
In-Reply-To: <20260614040032.1567994-1-hurryman2212@gmail.com>

Document the EN7581 Secure Offload Engine register window used by the Ethernet driver for ESP packet offload, and add the new binding to the Airoha Ethernet MAINTAINERS entry.

Signed-off-by: Jihong Min <hurryman2212@gmail.com>
---
 .../bindings/net/airoha,en7581-soe.yaml       | 48 +++++++++++++++++++
 MAINTAINERS                                   |  1 +
 2 files changed, 49 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/airoha,en7581-soe.yaml

diff --git a/Documentation/devicetree/bindings/net/airoha,en7581-soe.yaml b/Documentation/devicetree/bindings/net/airoha,en7581-soe.yaml
new file mode 100644
index 000000000000..24aecafecc70
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/airoha,en7581-soe.yaml
@@ -0,0 +1,48 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/airoha,en7581-soe.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Airoha EN7581 Secure Offload Engine
+
+maintainers:
+  - Lorenzo Bianconi <lorenzo@kernel.org>
+
+description:
+  The Secure Offload Engine provides inline ESP packet offload resources used
+  by the Airoha Ethernet controller.
+
+properties:
+  compatible:
+    const: airoha,en7581-soe
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - interrupts
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+    #include <dt-bindings/interrupt-controller/irq.h>
+
+    soc {
+      #address-cells = <2>;
+      #size-cells = <2>;
+
+      soe@1fbfa000 {
+        compatible = "airoha,en7581-soe";
+        reg = <0 0x1fbfa000 0 0x268>;
+        interrupts = <GIC_SPI 79 IRQ_TYPE_LEVEL_HIGH>;
+      };
+    };
+...
diff --git a/MAINTAINERS b/MAINTAINERS
index cc1dde0c9067..7c338e670572 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -757,6 +757,7 @@ L:	linux-mediatek@lists.infradead.org (moderated for non-subscribers)
 L:	netdev@vger.kernel.org
 S:	Maintained
 F:	Documentation/devicetree/bindings/net/airoha,en7581-eth.yaml
+F:	Documentation/devicetree/bindings/net/airoha,en7581-soe.yaml
 F:	drivers/net/ethernet/airoha/
 
 AIROHA PCIE PHY DRIVER
-- 
2.53.0



^ permalink raw reply related

* [RFC PATCH net-next 1/7] xfrm: allow packet offload drivers to own transmit
From: Jihong Min @ 2026-06-14  4:00 UTC (permalink / raw)
  To: netdev, Lorenzo Bianconi
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Simon Horman, Herbert Xu, Steffen Klassert,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, devicetree,
	Matthias Brugger, AngeloGioacchino Del Regno, linux-arm-kernel,
	linux-mediatek, Christian Marangi, Felix Fietkau, linux-kernel,
	Jihong Min
In-Reply-To: <20260614040032.1567994-1-hurryman2212@gmail.com>

Packet offload drivers can currently program state and validate whether an skb can be offloaded, but they cannot take ownership of a packet that needs driver-specific TX preparation before the regular XFRM output path continues.

Add an optional xdo_dev_packet_xmit() callback. Drivers that implement it consume the skb and return the final TX status; all other drivers keep the existing XFRM output path.

Signed-off-by: Jihong Min <hurryman2212@gmail.com>
---
 include/linux/netdevice.h |  8 ++++++++
 net/xfrm/xfrm_output.c    | 11 +++++++++++
 2 files changed, 19 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 7f4f0837c09f..1552eb81ddf0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1048,6 +1048,14 @@ struct xfrmdev_ops {
 	int	(*xdo_dev_policy_add) (struct xfrm_policy *x, struct netlink_ext_ack *extack);
 	void	(*xdo_dev_policy_delete) (struct xfrm_policy *x);
 	void	(*xdo_dev_policy_free) (struct xfrm_policy *x);
+	/* Optional packet-offload TX path for devices that need
+	 * driver-specific transmit preparation instead of continuing through
+	 * the regular XFRM output path, such as adding offload metadata or
+	 * steering the packet to a private transmit queue. The driver consumes
+	 * skb and returns the final transmit status.
+	 */
+	int	(*xdo_dev_packet_xmit)(struct sk_buff *skb,
+				       struct xfrm_state *x);
 };
 #endif
 
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index cc35c2fcbbe0..9f11559b0221 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -770,6 +770,17 @@ int xfrm_output(struct sock *sk, struct sk_buff *skb)
 	}
 
 	if (x->xso.type == XFRM_DEV_OFFLOAD_PACKET) {
+#ifdef CONFIG_XFRM_OFFLOAD
+		const struct xfrmdev_ops *ops;
+#endif
+
+#ifdef CONFIG_XFRM_OFFLOAD
+		ops = x->xso.dev->xfrmdev_ops;
+		/* Callback validates, consumes skb and returns final TX status. */
+		if (ops && ops->xdo_dev_packet_xmit)
+			return ops->xdo_dev_packet_xmit(skb, x);
+#endif
+
 		if (!xfrm_dev_offload_ok(skb, x)) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
 			kfree_skb(skb);
-- 
2.53.0



^ permalink raw reply related

* [RFC PATCH net-next 0/7] net: airoha: add EN7581 SOE ESP packet offload
From: Jihong Min @ 2026-06-14  4:00 UTC (permalink / raw)
  To: netdev, Lorenzo Bianconi
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Simon Horman, Herbert Xu, Steffen Klassert,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, devicetree,
	Matthias Brugger, AngeloGioacchino Del Regno, linux-arm-kernel,
	linux-mediatek, Christian Marangi, Felix Fietkau, linux-kernel,
	Jihong Min

Add Secure Offload Engine (SOE) support for the Airoha EN7581 Ethernet
driver. SOE provides inline ESP packet offload for native ESP and NAT-T
traffic, with the Ethernet/QDMA path used to submit packets to the SOE
block and the PPE path used to bind eligible ESP flows. NETIF_F_GSO_ESP
and NETIF_F_HW_ESP_TX_CSUM are intentionally left out for now and will be
revisited separately for feasibility.

This is posted as RFC because the code was originally developed and tested
against an OpenWrt 6.18 Airoha tree, not against the current upstream
net-next driver. The original OpenWrt commit used as the source for this
RFC is available at:
https://github.com/hurryman2212/OpenW1700k-test/commit/7c1b5e662f7790b3d23ed143beadc1dcbf6d15f7

The SOE part is intentionally linked into the airoha Ethernet module
instead of being exposed as an independent crypto or platform driver. The
user-visible ESP offload control is a netdev capability: xfrmdev_ops and
NETIF_F_HW_ESP live on the target netdev, and the feature can be controlled
through the usual netdev feature path. SOE also shares the FE/QDMA/PPE
datapath, private queues, DSA conduit handling and netdev lifetime owned by
airoha_eth.

Patch 1 adds xdo_dev_packet_xmit() because the existing XFRM packet
offload transmit path does not provide a hook for hardware whose ESP engine
is reached through device-specific packet forwarding. SOE needs to consume
the skb, add a hardware hop descriptor, steer it to a private QDMA path and
return the final transmit status. Drivers that do not implement the
optional callback keep the existing XFRM output behavior.

Jihong Min (7):
  xfrm: allow packet offload drivers to own transmit
  dt-bindings: net: airoha: add EN7581 SOE
  arm64: dts: airoha: add EN7581 SOE node
  net: airoha: add SOE registers and driver state
  net: airoha: add QDMA support for SOE packets
  net: airoha: add PPE support for SOE flows
  net: airoha: add SOE XFRM packet offload support

 .../bindings/net/airoha,en7581-soe.yaml       |   48 +
 MAINTAINERS                                   |    1 +
 arch/arm64/boot/dts/airoha/en7581.dtsi        |    6 +
 drivers/net/ethernet/airoha/Kconfig           |   13 +
 drivers/net/ethernet/airoha/Makefile          |    1 +
 drivers/net/ethernet/airoha/airoha_eth.c      |  668 +++++-
 drivers/net/ethernet/airoha/airoha_eth.h      |   40 +
 drivers/net/ethernet/airoha/airoha_ppe.c      |  606 +++++-
 drivers/net/ethernet/airoha/airoha_regs.h     |   16 +
 drivers/net/ethernet/airoha/airoha_soe.c      | 1896 +++++++++++++++++
 drivers/net/ethernet/airoha/airoha_soe.h      |  126 ++
 include/linux/netdevice.h                     |    8 +
 include/linux/soc/airoha/airoha_offload.h     |    5 +
 net/xfrm/xfrm_output.c                        |   11 +
 14 files changed, 3342 insertions(+), 103 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/airoha,en7581-soe.yaml
 create mode 100644 drivers/net/ethernet/airoha/airoha_soe.c
 create mode 100644 drivers/net/ethernet/airoha/airoha_soe.h

-- 
2.53.0


^ permalink raw reply

* [PATCH] PCI: meson: Fix PERST# timing by asserting reset before LTSSM enable
From: Gowtham Kudupudi @ 2026-06-14  1:56 UTC (permalink / raw)
  To: yue.wang, lpieralisi, kwilczynski, mani
  Cc: robh, bhelgaas, neil.armstrong, khilman, jbrunet,
	martin.blumenstingl, linux-pci, linux-amlogic, linux-arm-kernel,
	linux-kernel, Gowtham Kudupudi

On warm reboot, the PCIe controller's LTSSM starts link training
immediately if PERST# is already deasserted from the previous boot.
The driver then pulses PERST# for only 500us, which is too short to
properly reset the endpoint device that has already started training.

Fix by moving the PERST# assert/deassert pulse BEFORE enabling LTSSM,
so the endpoint gets a clean reset cycle before link training begins.

This was found on Amlogic G12B (A311D) with NVMe on an M.2 slot.
Cold boot worked because POR held PERST# low; warm reboot did not.
The fix was confirmed on a Banana Pi CM4 with Waveshare IO base board.

Signed-off-by: Gowtham Kudupudi <gowtham@ferryfair.com>
---
 drivers/pci/controller/dwc/pci-meson.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c
index 5f8e2f4b3c12..3a7e9f1d5b8c 100644
--- a/drivers/pci/controller/dwc/pci-meson.c
+++ b/drivers/pci/controller/dwc/pci-meson.c
@@ -310,8 +310,8 @@ static int meson_pcie_start_link(struct dw_pcie *pci)
 {
 	struct meson_pcie *mp = to_meson_pcie(pci);
 
+	meson_pcie_assert_reset(mp);
 	meson_pcie_ltssm_enable(mp);
-	meson_pcie_assert_reset(mp);
 
 	return 0;
 }
-- 
2.49.0


^ permalink raw reply related

* Re: [RFC PATCH] ARM: move reserve_lp[012] handling into affected machines
From: Ethan Nelson-Moore @ 2026-06-14  2:20 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: Russell King, Andrew Morton, Jiri Bohac, Linus Walleij,
	Arnd Bergmann
In-Reply-To: <20260511011504.77760-1-enelsonmoore@gmail.com>

On Sun, May 10, 2026 at 6:15 PM Ethan Nelson-Moore
<enelsonmoore@gmail.com> wrote:
> arch/arm/kernel/setup.c contains code to reserve lp0/1/2 I/O ports for
> machines that can't possibly have these ports. This code is only used
> by netwinder and footbridge, and is small enough that it can just be
> moved into these machines. Do so to make the setup code more generic
> and the machine code more self-contained.
>
> This patch is an RFC because I'm not sure if using .init_early is
> actually necessary. I did it to match the place the original code was
> called as closely as possible. Can anyone weigh in on this?

Hi, everyone,

Gentle ping (+ cc Arnd, LinusW) - anyone have any thoughts on this?

I meant rpc and footbridge in the commit message - I'll fix it in a
future revision.

Ethan


^ permalink raw reply

* Re: [PATCH] ARM: remove references to removed CONFIG_CPU_ARM92x_CPU_IDLE options
From: Ethan Nelson-Moore @ 2026-06-14  2:06 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-arm-kernel, Nathan Chancellor, Kees Cook, Russell King
In-Reply-To: <CAD++jLnD2EU=YfABb1ySRtgeOPVzQKH3iMjZtj2AsW+RK8M6mQ@mail.gmail.com>

On Thu, Jun 11, 2026 at 5:38 AM Linus Walleij <linusw@kernel.org> wrote:
> Please put this patch into Russell's patch tracker.

Here:
https://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=9478/1

Ethan


^ permalink raw reply

* Re: [PATCH] ARM: disable broken eBPF JIT on the Risc PC
From: Ethan Nelson-Moore @ 2026-06-14  1:50 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-arm-kernel, linux-kernel, stable, Russell King,
	Russell King (Oracle), Arnd Bergmann, Kees Cook,
	Nathan Chancellor, Thomas Weissschuh, Peter Zijlstra,
	Shubham Bansal, David S. Miller
In-Reply-To: <CAD++jL=0qYGoygUwGEXQL7C_ROnC7kfpRv8RA+H5tNWwYu+pQA@mail.gmail.com>

On Mon, May 25, 2026 at 1:18 AM Linus Walleij <linusw@kernel.org> wrote:
> Looks correct to me.
> Reviewed-by: Linus Walleij <linusw@kernel.org>
>
> Please put this into Russell's patch tracker!

Done!

https://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=9477/1


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox