Linux-ARM-Kernel Archive on lore.kernel.org
* RE: [External Mail] [PATCH v2 6/7] net: wwan: t9xx: Add AT & MBIM WWAN ports
From: Wu. JackBB (GSM) @ 2026-06-24  9:24 UTC (permalink / raw)
  To: Loic Poulain, Sergey Ryazanov, Johannes Berg, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Wen-Zhi Huang, Shi-Wei Yeh, Minano Tseng, Matthias Brugger,
	AngeloGioacchino Del Regno, Simon Horman, Jonathan Corbet,
	Shuah Khan, Wu. JackBB (GSM)
  Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, linux-doc@vger.kernel.org
In-Reply-To: <20260610-t9xx_driver_v1-v2-6-c65addf23b3f@compal.com>

Hi Jakub,

Addressing sashiko AI code review comments for this patch, as
requested by you in the patch 3/7 review:
https://patchwork.kernel.org/project/netdevbpf/patch/20260610-t9xx_driver_v1-v2-3-c65addf23b3f@compal.com/#27006088

Q1: Is this mutex ever acquired? It is initialized during port setup, but
it does not appear to be used to serialize operations in
mtk_port_common_write() or when sending data.

  Valid. Fixed in v3 by removing the unused write_lock mutex from
  struct mtk_port and its mutex_init call.

Q2: Will concurrent writes safely execute here without holding write_lock?
Could this lack of serialization lead to sequence number corruption?

  The WWAN core framework holds port->ops_lock (a mutex) around the
  tx/tx_blocking callback invocations, serializing write operations
  on the same WWAN port. For internal ports, writes are exclusively
  performed by the FSM kthread, which is single-threaded. Concurrent
  writes to the same port do not occur, and port->tx_seq is always
  modified by a single thread.

Q3: What happens if a blocking write is interrupted by a port teardown?
If PORT_S_WR is cleared, trb->status remains at MTK_DFLT_TRB_STATUS (1).
ret = (!trb->status) ? len : trb->status evaluates to 1, causing an
incorrect byte count to be returned.

  This only occurs during port teardown (PORT_S_WR cleared by
  mtk_port_common_close or disable). At this point the port is being
  shut down and the return value is largely irrelevant — the caller
  cannot meaningfully use the port afterward. The submitted data may
  or may not have been transmitted by DMA, depending on timing. This
  is a teardown-only scenario with no practical impact on data
  integrity.
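
The return-value concern in Q3 can be reproduced with a small user-space
model. This is only an illustrative sketch, not driver code:
quoted_write_result() is a hypothetical helper wrapping the expression
quoted in the question, and only MTK_DFLT_TRB_STATUS (with the value 1)
is taken from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Value quoted in the review comment: status of a TRB nobody completed. */
#define MTK_DFLT_TRB_STATUS 1

/*
 * Hypothetical helper wrapping the expression quoted in Q3:
 *   ret = (!trb->status) ? len : trb->status;
 * trb_status is 0 on completion, a negative errno on failure, or
 * MTK_DFLT_TRB_STATUS if the wait ended before the completion path ran.
 */
static long quoted_write_result(int trb_status, size_t len)
{
	assert(len > 0);	/* the model only covers non-empty writes */
	return !trb_status ? (long)len : (long)trb_status;
}
```

For a 64-byte write, a completed transfer yields 64, an error status of
-5 yields -5, and an untouched status yields 1, which a caller would read
as a 1-byte write.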

Q4: Is it safe to mutate the flags directly here? This is a non-atomic
read-modify-write on the shared port structure.

  The WWAN core holds port->ops_lock (a mutex) around tx/tx_blocking
  callbacks, serializing flag modifications. mtk_port_wwan_write()
  and mtk_port_wwan_write_blocking() are never called concurrently
  for the same port.

Q5: Does this silently drop data on partial writes? If
mtk_port_common_write() returns a positive value (partial success),
consume_skb is called unconditionally and 0 (success) is returned.

  This is a design limitation of the WWAN port API: wwan_port_op_tx
  returns 0 for success or negative for error — there is no mechanism
  to report partial writes back to the WWAN core. The already-submitted
  fragments cannot be recalled from the DMA engine. For the AT/MBIM
  control ports in this driver, messages are small (typically under
  1KB, within a single MTU). The multi-fragment path is rarely
  exercised for control plane traffic.
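
The limitation described for Q5 can be modelled in a few lines. This is a
hedged sketch under stated assumptions, not the driver's implementation:
tx_op_result() and its parameters are hypothetical names, and the only
behaviour taken from the thread is that the tx callback reports 0 on
success or a negative value on error.

```c
#include <assert.h>

/*
 * Hypothetical model of a tx callback that can only report 0 or a negative
 * errno. lower_ret stands for the lower-layer write result: the number of
 * bytes accepted (> 0) or a negative errno.
 */
static int tx_op_result(int lower_ret, int msg_len)
{
	assert(msg_len > 0);
	assert(lower_ret <= msg_len);

	/* Errors pass through; any byte count, full or partial, becomes 0. */
	return lower_ret < 0 ? lower_ret : 0;
}
```

In this model a full write of 1024 bytes and a partial write of 512 bytes
both return 0, so the caller cannot tell them apart; only a negative
errno is visible.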

Q6: As with mtk_port_wwan_write(), mutating the shared blocking flag
without atomics could race, and ignoring positive return values could
lead to silent data loss.

  Same as Q4 (flags serialized by WWAN core ops_lock) and Q5
  (partial write is a WWAN API limitation, rare for control messages).

Q7: Is there a race condition here if wwan_create_port() fails? The return
value is directly assigned to w_port without checking IS_ERR() first.
Could concurrent RX pass the error pointer to wwan_port_rx()?

  No race. mtk_port_wwan_enable() is called from the FSM thread
  during the handshake sequence, before the port starts receiving
  data. The CLDMA RX queue for this port has not been opened at this
  point — RX data only arrives after the modem completes its
  handshake. The RX path cannot observe the error pointer because no
  data arrives until after the port is fully enabled.
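
For readers unfamiliar with the error-pointer convention behind Q7, the
sketch below models it in user space. Everything here is an assumption
made for illustration: the model_* helpers imitate the kernel's
ERR_PTR()/IS_ERR()/PTR_ERR() idea, and struct port_model and
port_publish_checked() are hypothetical names showing one defensive way
to avoid storing an error pointer. It is not what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095	/* top 4095 addresses encode errno values */

static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

static long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Hypothetical stand-in for the port structure holding the WWAN handle. */
struct port_model {
	void *w_port;
};

/* Store the handle only if creation succeeded; otherwise return the errno. */
static long port_publish_checked(struct port_model *port, void *created)
{
	assert(port != NULL);

	if (model_is_err(created)) {
		port->w_port = NULL;
		return model_ptr_err(created);
	}
	port->w_port = created;
	return 0;
}
```

With this ordering, a reader of w_port sees either NULL or a valid
handle, never an encoded errno.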

Q8: Is the WWAN port exposed to userspace before its state is fully
initialized? wwan_create_port() registers the character device and
triggers a uevent. If userspace opens immediately, PORT_S_ENABLE is
not set yet so open returns -ENODEV.

  The window between wwan_create_port() returning and
  set_bit(PORT_S_ENABLE) is a few instructions (nanoseconds). If
  userspace opens in that window, the open returns -ENODEV and the
  application retries. In practice, user space WWAN managers (e.g.,
  ModemManager) wait for udev events to settle before opening ports.
  Reordering to set PORT_S_ENABLE before wwan_create_port is not
  correct either — the port should not be marked enabled before the
  WWAN port object exists.

Thanks.




* [PATCH v3 01/12] cpu/hotplug: Introduce CONFIG_HOTPLUG_PARALLEL_SMT
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

During parallel CPU bringup, x86 requires primary SMT threads to boot
first to avoid siblings stopping during microcode updates. This constraint
is architecture-specific and unnecessary for other platforms
like arm64.

Introduce CONFIG_HOTPLUG_PARALLEL_SMT to decouple this constraint.
Platforms requiring this temporal order (e.g., x86) can select it
in Kconfig. Other architectures (e.g., arm64) can leave it unselected
to entirely bypass the SMT branch.

Suggested-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/Kconfig      | 5 +++++
 arch/mips/Kconfig | 3 +--
 arch/x86/Kconfig  | 2 +-
 kernel/cpu.c      | 4 ++--
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index e86880045158..d25b61dc03b2 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -102,6 +102,11 @@ config HOTPLUG_PARALLEL
 	bool
 	select HOTPLUG_SPLIT_STARTUP
 
+config HOTPLUG_PARALLEL_SMT
+	bool
+	select HOTPLUG_PARALLEL
+	select HOTPLUG_SMT
+
 config GENERIC_IRQ_ENTRY
 	bool
 
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 4364f3dba688..8d9c57f3df23 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -660,7 +660,7 @@ config EYEQ
 	select USB_UHCI_BIG_ENDIAN_DESC if CPU_BIG_ENDIAN
 	select USB_UHCI_BIG_ENDIAN_MMIO if CPU_BIG_ENDIAN
 	select USE_OF
-	select HOTPLUG_PARALLEL if HOTPLUG_CPU
+	select HOTPLUG_PARALLEL_SMT if HOTPLUG_CPU
 	help
 	  Select this to build a kernel supporting EyeQ SoC from Mobileye.
 
@@ -2301,7 +2301,6 @@ config MIPS_CPS
 	select MIPS_CM
 	select MIPS_CPS_PM if HOTPLUG_CPU
 	select SMP
-	select HOTPLUG_SMT if HOTPLUG_PARALLEL
 	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
 	select SYNC_R4K if (CEVT_R4K || CSRC_R4K)
 	select SYS_SUPPORTS_HOTPLUG_CPU
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f3f7cb01d69d..2ea80da1e4f8 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -305,7 +305,7 @@ config X86
 	select HAVE_USER_RETURN_NOTIFIER
 	select HAVE_GENERIC_VDSO
 	select VDSO_GETRANDOM			if X86_64
-	select HOTPLUG_PARALLEL			if SMP && X86_64
+	select HOTPLUG_PARALLEL_SMT		if SMP && X86_64
 	select HOTPLUG_SMT			if SMP
 	select HOTPLUG_SPLIT_STARTUP		if SMP && X86_32
 	select IRQ_FORCED_THREADING
diff --git a/kernel/cpu.c b/kernel/cpu.c
index bc4f7a9ba64e..5a90f60ff60e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1792,7 +1792,7 @@ static int __init parallel_bringup_parse_param(char *arg)
 }
 early_param("cpuhp.parallel", parallel_bringup_parse_param);
 
-#ifdef CONFIG_HOTPLUG_SMT
+#ifdef CONFIG_HOTPLUG_PARALLEL_SMT
 static inline bool cpuhp_smt_aware(void)
 {
 	return cpu_smt_max_threads > 1;
@@ -1811,7 +1811,7 @@ static inline const struct cpumask *cpuhp_get_primary_thread_mask(void)
 {
 	return cpu_none_mask;
 }
-#endif
+#endif /* CONFIG_HOTPLUG_PARALLEL_SMT */
 
 bool __weak arch_cpuhp_init_parallel_bringup(void)
 {
-- 
2.34.1




* [PATCH v3 02/12] cpu/hotplug: Propagate bring-up status to arch_cpuhp_cleanup_kick_cpu()
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

From: Will Deacon <will@kernel.org>

In preparation for enabling the generic CPU hotplug machinery on arm64,
which has architecture-specific handling of early bringup failures,
extend arch_cpuhp_cleanup_kick_cpu() to take an additional argument
indicating whether or not the target AP reached the alive state.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/x86/kernel/smpboot.c  | 4 ++--
 include/linux/cpuhotplug.h | 2 +-
 kernel/cpu.c               | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 294a8ea60298..637660b15aee 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1057,7 +1057,7 @@ static int do_boot_cpu(u32 apicid, unsigned int cpu, struct task_struct *idle)
 
 	/* If the wakeup mechanism failed, cleanup the warm reset vector */
 	if (ret)
-		arch_cpuhp_cleanup_kick_cpu(cpu);
+		arch_cpuhp_cleanup_kick_cpu(cpu, false);
 	return ret;
 }
 
@@ -1105,7 +1105,7 @@ int arch_cpuhp_kick_ap_alive(unsigned int cpu, struct task_struct *tidle)
 	return smp_ops.kick_ap_alive(cpu, tidle);
 }
 
-void arch_cpuhp_cleanup_kick_cpu(unsigned int cpu)
+void arch_cpuhp_cleanup_kick_cpu(unsigned int cpu, bool is_alive)
 {
 	/* Cleanup possible dangling ends... */
 	if (smp_ops.kick_ap_alive == native_kick_ap && x86_platform.legacy.warm_reset)
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 22ba327ec227..5c3b3e0bce47 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -511,7 +511,7 @@ struct task_struct;
 
 void cpuhp_ap_sync_alive(void);
 void arch_cpuhp_sync_state_poll(void);
-void arch_cpuhp_cleanup_kick_cpu(unsigned int cpu);
+void arch_cpuhp_cleanup_kick_cpu(unsigned int cpu, bool is_alive);
 int arch_cpuhp_kick_ap_alive(unsigned int cpu, struct task_struct *tidle);
 bool arch_cpuhp_init_parallel_bringup(void);
 
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 5a90f60ff60e..b0e31e624623 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -427,7 +427,7 @@ static bool cpuhp_can_boot_ap(unsigned int cpu)
 	return true;
 }
 
-void __weak arch_cpuhp_cleanup_kick_cpu(unsigned int cpu) { }
+void __weak arch_cpuhp_cleanup_kick_cpu(unsigned int cpu, bool is_alive) { }
 
 /*
  * Early CPU bringup synchronization point. Cannot use cpuhp_state::done_up
@@ -446,7 +446,7 @@ static int cpuhp_bp_sync_alive(unsigned int cpu)
 	}
 
 	/* Let the architecture cleanup the kick alive mechanics. */
-	arch_cpuhp_cleanup_kick_cpu(cpu);
+	arch_cpuhp_cleanup_kick_cpu(cpu, !ret);
 	return ret;
 }
 #else /* CONFIG_HOTPLUG_CORE_SYNC_FULL */
-- 
2.34.1




* [PATCH v3 00/12] arm64: Add HOTPLUG_PARALLEL support for secondary CPUs
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie

Support for parallel secondary CPU bringup is already utilized by x86,
MIPS, and RISC-V. This series brings this capability to the arm64
architecture.

Introduce CONFIG_HOTPLUG_PARALLEL_SMT so that the "primary SMT threads
boot first" constraint applies only to architectures that select it.

Add a 'cpu' parameter to update_cpu_boot_status() so that the boot status
can be updated at per-CPU granularity during parallel bringup.

Rework the global `secondary_data` and `__early_cpu_boot_status`, which
are accessed during early boot, into per-CPU arrays so that secondary
CPUs can boot in parallel.

Reuse the `__cpu_logical_map` array in the early boot code in head.S
to resolve each secondary CPU's logical ID concurrently.
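
As a rough illustration of why the boot status moves from one shared word
to a per-CPU array, consider the user-space sketch below. All names are
hypothetical (the MODEL_/model_ prefixes mark them as such), the status
codes are placeholders rather than the kernel's values, and the real code
also touches head.S, which this sketch does not attempt to model.

```c
#include <assert.h>

#define MODEL_NR_CPUS 8		/* hypothetical CPU count for the sketch */

/* Placeholder status codes; not the kernel's actual values. */
enum model_boot_status {
	MODEL_BOOT_SUCCESS = 0,
	MODEL_STUCK_IN_KERNEL,
	MODEL_KILL_ME,
};

/* One slot per CPU instead of a single shared word. */
static long model_early_cpu_boot_status[MODEL_NR_CPUS];

/* Takes the CPU number, mirroring the added 'cpu' parameter. */
static void model_update_cpu_boot_status(unsigned int cpu, long status)
{
	assert(cpu < MODEL_NR_CPUS);
	model_early_cpu_boot_status[cpu] = status;
}

static long model_read_cpu_boot_status(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	return model_early_cpu_boot_status[cpu];
}
```

With a single shared word, two CPUs failing at about the same time would
overwrite each other's status. With one slot per CPU, the failures of
CPU4 and CPU6 in this model are both retained and can be reported
separately.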

This series includes a subset of the refactoring patches proposed
by Will Deacon, with further adjustments.

Link: https://web.git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=cpu-hotplug

Bringup time comparison on real hardware (ms, lower is better):

 | Platform                | Baseline | P=0     | P=1    | Delta (%) |
 | ----------------------- | -------- | ------- | ------ | --------- |
 | 192-core server (HIP12) | 14619.2  | 14619.1 | 8589.4 | 41.21%    |
 | 32-core board           | 2776.5   | 2881.0  | 1045.0 | 62.36%    |
 | 64-core board           | 2297.0   | /       | 814.4  | 64.5%     |

Below is the actual dmesg output demonstrating four concurrent boot
failures on different CPUs:

	CPU4 failed to report alive state
	CPU4: is stuck in kernel
	CPU4: does not support 52-bit VAs

	CPU6 failed to report alive state
	CPU6: is stuck in kernel
	CPU6: does not support 4K granule

	GICv3: CPU8: found redistributor 8 region 0:0x00000000081a0000
	GICv3: CPU8: using allocated LPI pending table @0x0000000100360000
	CPU8: Booted secondary processor 0x0000000008 [0x410fd034]
	...
	CPU16 failed to report alive state
	psci: CPU16 killed (polled 0 ms)
	CPU16: died during early boot

	CPU17: will not boot
	CPU17 failed to report alive state
	psci: CPU17 killed (polled 0 ms)
	CPU17: died during early boot

	CPU18 failed to report alive state
	Kernel panic - not syncing: CPU18 detected unsupported configuration
	CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.1.0-rc1-gdd2d3bbca3b5 #151 PREEMPT
	Hardware name: linux,dummy-virt (DT)
	Call trace:
	 show_stack+0x18/0x24 (C)
	 dump_stack_lvl+0x38/0xd0
	 dump_stack+0x18/0x24
	 vpanic+0x4f8/0x4fc
	 do_panic_on_target_cpu+0x0/0x1c
	 secondary_start_kernel+0x0/0x17c
	 cpuhp_bringup_ap+0x2cc/0x2dc
	 cpuhp_invoke_callback+0x168/0x2ac
	 __cpuhp_invoke_callback_range+0x90/0x118
	 _cpu_up+0x148/0x220
	 cpu_up+0xcc/0x158
	 cpuhp_bringup_mask.constprop.0+0x80/0xcc
	 bringup_nonboot_cpus+0x38/0x80
	 smp_init+0x30/0x8c
	 kernel_init_freeable+0x170/0x35c
	 kernel_init+0x24/0x1e0
	 ret_from_fork+0x10/0x20
	SMP: stopping secondary CPUs
	---[ end Kernel panic - not syncing: CPU18 detected unsupported configuration ]---

Changes in v3:
- Add necessary rework patches.
- Fix AI review issues in [2]:
  1. Use lockdep_off() and lockdep_on() to resolve the lockdep splat
     on failure paths, which avoids printk_deferred() and solves
     the pr_fmt() prefix issue or missing update.
  2. Ensure atomic updates to system_cpucaps bitmap for arm64 cpufeature.
  3. Use an NR_CPUS-sized __early_cpu_boot_status array in head.S and do
     not clear __early_cpu_boot_status.
  4. Fix the issue that get_cpu_ops(0) evaluates to the boot CPU in
     arch_cpuhp_init_parallel_bringup(), using Will's rework patch.
- Handle `__early_cpu_boot_status` properly, as Will pointed out, with
  Will's patch.
- Implement arch_cpuhp_cleanup_kick_cpu() to clean up after secondary
  CPUs that fail to boot, and add error handling support with Will's patch.
- Update the code as Thomas suggested. Rename PARALLEL_SMT_PRIMARY_FIRST
  to `HOTPLUG_PARALLEL_SMT` and do not select it for RISC-V.
- Select HOTPLUG_PARALLEL if SMP as Thomas suggested.
- Rework early boot data into per-CPU arrays directly rather than using
  CONFIG_HOTPLUG_PARALLEL to differentiate code paths as Thomas suggested.
- Remove `cpu_running` and the related complete() calls.
- Add new test data on new hardware.

[2]: https://sashiko.dev/#/patchset/20260618092444.1316336-1-ruanjinjie%40huawei.com

Changes in v2:
- Remove RFC.
- Add Tested-by.
- Fix AI review issues in [1].
- Add arch_cpuhp_init_parallel_bringup() to check psci boot.
- Reuse `__cpu_logical_map` instead of a new array.
- Defer rcutree_report_cpu_starting() until after
  check_local_cpu_capabilities() to prevent a potential control CPU
  deadlock if an early capability check fails.
- Move the assembly in head.S to a macro called `mpidr_to_cpuid`.
- Add `SECONDARY_DATA_SHIFT` for `lsl` to access `cpu_boot_data`.
- Add an assertion that sizeof(struct secondary_data) is a power of 2.
- Expand testing with more data collected from real hardware.

[1]: https://sashiko.dev/#/patchset/20260611133809.3854977-1-ruanjinjie%40huawei.com

Jinjie Ruan (5):
  cpu/hotplug: Introduce CONFIG_HOTPLUG_PARALLEL_SMT
  arm64: cpufeature: Ensure atomic updates to system_cpucaps bitmap
  arm64: smp: Pass CPU ID to update_cpu_boot_status()
  arm64: smp: Rework early boot data into per-CPU arrays
  arm64: Add HOTPLUG_PARALLEL support for secondary CPUs

Will Deacon (7):
  cpu/hotplug: Propagate bring-up status to
    arch_cpuhp_cleanup_kick_cpu()
  arm64: smp: Tidy up smp_prepare_cpus()
  arm64: smp: Tidy up cpuinfo init and cpufeature updates
  arm64: smp: Defer RCU registration during secondary CPU bringup
  arm64: smp: Use generic HOTPLUG_SPLIT_STARTUP machinery for CPU
    onlining
  arm64: cpu_ops: Make 'cpu_operations' pointer global instead of
    per-cpu
  arm64: cpu_ops: Introduce get_secondary_cpu_ops()

 arch/Kconfig                     |   5 ++
 arch/arm64/Kconfig               |   2 +-
 arch/arm64/include/asm/cpu.h     |   6 +-
 arch/arm64/include/asm/cpu_ops.h |   1 +
 arch/arm64/include/asm/smp.h     |  22 +++---
 arch/arm64/kernel/asm-offsets.c  |   2 +
 arch/arm64/kernel/cpu_ops.c      |  34 ++++++---
 arch/arm64/kernel/cpufeature.c   |  27 +++++--
 arch/arm64/kernel/cpuinfo.c      |  11 ---
 arch/arm64/kernel/head.S         |  44 ++++++++++--
 arch/arm64/kernel/setup.c        |   9 +++
 arch/arm64/kernel/smp.c          | 118 +++++++++++++++++--------------
 arch/arm64/mm/context.c          |   5 +-
 arch/arm64/mm/mmu.c              |   2 +-
 arch/mips/Kconfig                |   3 +-
 arch/x86/Kconfig                 |   2 +-
 arch/x86/kernel/smpboot.c        |   4 +-
 include/linux/cpuhotplug.h       |   2 +-
 kernel/cpu.c                     |   8 +--
 19 files changed, 189 insertions(+), 118 deletions(-)

-- 
2.34.1




* [PATCH v3 03/12] arm64: smp: Tidy up smp_prepare_cpus()
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

From: Will Deacon <will@kernel.org>

smp_prepare_cpus() is always run on the boot CPU (i.e. CPU 0) but goes
to great lengths to support running on a CPU where smp_processor_id()
is non-zero.

Clean up the code a little by hardcoding zero for the boot CPU ID.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/kernel/smp.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 1aa324104afb..e858d7d64d1f 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -772,16 +772,14 @@ void __init smp_init_cpus(void)
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
 	const struct cpu_operations *ops;
-	int err;
 	unsigned int cpu;
-	unsigned int this_cpu;
+	int err;
 
 	init_cpu_topology();
 
-	this_cpu = smp_processor_id();
-	store_cpu_topology(this_cpu);
-	numa_store_cpu_info(this_cpu);
-	numa_add_cpu(this_cpu);
+	store_cpu_topology(0);
+	numa_store_cpu_info(0);
+	numa_add_cpu(0);
 
 	/*
 	 * If UP is mandated by "nosmp" (which implies "maxcpus=0"), don't set
@@ -796,8 +794,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	 * secondaries from the bootloader.
 	 */
 	for_each_possible_cpu(cpu) {
-
-		if (cpu == smp_processor_id())
+		if (cpu == 0)
 			continue;
 
 		ops = get_cpu_ops(cpu);
-- 
2.34.1




* [PATCH v3 04/12] arm64: smp: Tidy up cpuinfo init and cpufeature updates
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

From: Will Deacon <will@kernel.org>

Populating the 'cpuinfo_arm64' structure during CPU bringup and
subsequently checking/updating cpufeature structures is slightly
convoluted and differs unnecessarily between the boot CPU and secondary
CPUs.

Rework the code so that cpuinfo_store_cpu() is used to populate the
'cpuinfo_arm64' structure for each CPU, with secondary CPUs then calling
update_cpu_features() to update the global view of the available
features. This allows us to internalise the 'boot_cpu_data' in
cpufeature.c and paves the way for parallelising the ID register probing
during bring-up of secondary CPUs.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/include/asm/cpu.h   |  6 ++----
 arch/arm64/kernel/cpufeature.c | 21 +++++++++++++++++----
 arch/arm64/kernel/cpuinfo.c    | 11 -----------
 arch/arm64/kernel/smp.c        |  3 ++-
 4 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index 71493b760b83..b77af3b1bde6 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -73,10 +73,8 @@ struct cpuinfo_arm64 {
 DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
 
 void cpuinfo_store_cpu(void);
-void __init cpuinfo_store_boot_cpu(void);
 
-void __init init_cpu_features(struct cpuinfo_arm64 *info);
-void update_cpu_features(int cpu, struct cpuinfo_arm64 *info,
-				 struct cpuinfo_arm64 *boot);
+void __init init_cpu_features(void);
+void update_cpu_features(int cpu);
 
 #endif /* __ASM_CPU_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 6d53bb15cf7b..be75e60d56ca 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -117,6 +117,7 @@ EXPORT_SYMBOL(system_cpucaps);
 static struct arm64_cpu_capabilities const __ro_after_init *cpucap_ptrs[ARM64_NCAPS];
 
 DECLARE_BITMAP(boot_cpucaps, ARM64_NCAPS);
+static struct cpuinfo_arm64 boot_cpu_data;
 
 /*
  * arm64_use_ng_mappings must be placed in the .data section, otherwise it
@@ -1164,11 +1165,19 @@ static __init void detect_system_supports_pseudo_nmi(void)
 static inline void detect_system_supports_pseudo_nmi(void) { }
 #endif
 
-void __init init_cpu_features(struct cpuinfo_arm64 *info)
+void __init init_cpu_features(void)
 {
+	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, 0);
+
 	/* Before we start using the tables, make sure it is sorted */
 	sort_ftr_regs();
 
+	/*
+	 * We keep a copy of the boot CPU registers so that physical hotplug
+	 * of CPU 0 can still be properly checked.
+	 */
+	boot_cpu_data = *info;
+
 	init_cpu_ftr_reg(SYS_CTR_EL0, info->reg_ctr);
 	init_cpu_ftr_reg(SYS_DCZID_EL0, info->reg_dczid);
 	init_cpu_ftr_reg(SYS_CNTFRQ_EL0, info->reg_cntfrq);
@@ -1363,12 +1372,14 @@ static int update_32bit_cpu_features(int cpu, struct cpuinfo_32bit *info,
  * non-boot CPU. Also performs SANITY checks to make sure that there
  * aren't any insane variations from that of the boot CPU.
  */
-void update_cpu_features(int cpu,
-			 struct cpuinfo_arm64 *info,
-			 struct cpuinfo_arm64 *boot)
+void update_cpu_features(int cpu)
 {
+	struct cpuinfo_arm64 *boot, *info;
 	int taint = 0;
 
+	boot = &boot_cpu_data;
+	info = per_cpu_ptr(&cpu_data, cpu);
+
 	/*
 	 * The kernel can handle differing I-cache policies, but otherwise
 	 * caches should look identical. Userspace JITs will make use of
@@ -3924,6 +3935,8 @@ static void __init setup_boot_cpu_capabilities(void)
 
 void __init setup_boot_cpu_features(void)
 {
+	init_cpu_features();
+
 	/*
 	 * Initialize the indirect array of CPU capabilities pointers before we
 	 * handle the boot CPU.
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 6149bc91251d..df740dc478b2 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -31,7 +31,6 @@
  * values depending on configuration at or after reset.
  */
 DEFINE_PER_CPU(struct cpuinfo_arm64, cpu_data);
-static struct cpuinfo_arm64 boot_cpu_data;
 
 static inline const char *icache_policy_str(int l1ip)
 {
@@ -523,14 +522,4 @@ void cpuinfo_store_cpu(void)
 {
 	struct cpuinfo_arm64 *info = this_cpu_ptr(&cpu_data);
 	__cpuinfo_store_cpu(info);
-	update_cpu_features(smp_processor_id(), info, &boot_cpu_data);
-}
-
-void __init cpuinfo_store_boot_cpu(void)
-{
-	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, 0);
-	__cpuinfo_store_cpu(info);
-
-	boot_cpu_data = *info;
-	init_cpu_features(&boot_cpu_data);
 }
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index e858d7d64d1f..c14b179c595d 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -233,6 +233,7 @@ asmlinkage notrace void secondary_start_kernel(void)
 	 * Log the CPU info before it is marked online and might get read.
 	 */
 	cpuinfo_store_cpu();
+	update_cpu_features(cpu);
 	store_cpu_topology(cpu);
 
 	/*
@@ -453,7 +454,7 @@ void __init smp_prepare_boot_cpu(void)
 	 */
 	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
 
-	cpuinfo_store_boot_cpu();
+	cpuinfo_store_cpu();
 	setup_boot_cpu_features();
 
 	/* Conditionally switch to GIC PMR for interrupt masking */
-- 
2.34.1




* [PATCH v3 08/12] arm64: cpu_ops: Introduce get_secondary_cpu_ops()
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

From: Will Deacon <will@kernel.org>

Introduce get_secondary_cpu_ops() to retrieve a pointer to the
'cpu_operations' structure for the non-boot CPUs and use it instead of
get_cpu_ops() where we are dealing with secondary CPUs.

This is a pre-requisite for enabling parallel CPU bring-up.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/include/asm/cpu_ops.h |  1 +
 arch/arm64/kernel/cpu_ops.c      |  5 +++++
 arch/arm64/kernel/smp.c          | 19 +++++++------------
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/cpu_ops.h b/arch/arm64/include/asm/cpu_ops.h
index a444c8915e88..cd298a8710d8 100644
--- a/arch/arm64/include/asm/cpu_ops.h
+++ b/arch/arm64/include/asm/cpu_ops.h
@@ -48,6 +48,7 @@ struct cpu_operations {
 
 int __init init_cpu_ops(int cpu);
 extern const struct cpu_operations *get_cpu_ops(int cpu);
+extern const struct cpu_operations *get_secondary_cpu_ops(void);
 
 static inline void __init init_bootcpu_ops(void)
 {
diff --git a/arch/arm64/kernel/cpu_ops.c b/arch/arm64/kernel/cpu_ops.c
index eacfb88a0c0c..7d183ca31dc8 100644
--- a/arch/arm64/kernel/cpu_ops.c
+++ b/arch/arm64/kernel/cpu_ops.c
@@ -127,3 +127,8 @@ const struct cpu_operations *get_cpu_ops(int cpu)
 
 	return NULL;
 }
+
+const struct cpu_operations *get_secondary_cpu_ops(void)
+{
+	return cpu_ops;
+}
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 9482e8d38b98..6b9586a69429 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -99,7 +99,7 @@ static inline int op_cpu_kill(unsigned int cpu)
  */
 static int boot_secondary(unsigned int cpu, struct task_struct *idle)
 {
-	const struct cpu_operations *ops = get_cpu_ops(cpu);
+	const struct cpu_operations *ops = get_secondary_cpu_ops();
 
 	if (ops->cpu_boot)
 		return ops->cpu_boot(cpu);
@@ -234,7 +234,7 @@ asmlinkage notrace void secondary_start_kernel(void)
 	rcutree_report_cpu_starting(cpu);
 	trace_hardirqs_off_finish();
 
-	ops = get_cpu_ops(cpu);
+	ops = get_secondary_cpu_ops();
 	if (ops->cpu_postboot)
 		ops->cpu_postboot();
 
@@ -334,7 +334,7 @@ int __cpu_disable(void)
 
 static int op_cpu_kill(unsigned int cpu)
 {
-	const struct cpu_operations *ops = get_cpu_ops(cpu);
+	const struct cpu_operations *ops = get_secondary_cpu_ops();
 
 	/*
 	 * If we have no means of synchronising with the dying CPU, then assume
@@ -375,7 +375,7 @@ void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu)
 void __noreturn cpu_die(void)
 {
 	unsigned int cpu = smp_processor_id();
-	const struct cpu_operations *ops = get_cpu_ops(cpu);
+	const struct cpu_operations *ops = get_secondary_cpu_ops();
 
 	idle_task_exit();
 
@@ -501,7 +501,7 @@ static int __init smp_cpu_setup(int cpu)
 	if (init_cpu_ops(cpu))
 		return -ENODEV;
 
-	ops = get_cpu_ops(cpu);
+	ops = get_secondary_cpu_ops();
 	if (ops->cpu_init(cpu))
 		return -ENODEV;
 
@@ -780,7 +780,7 @@ void __init smp_init_cpus(void)
 
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
-	const struct cpu_operations *ops;
+	const struct cpu_operations *ops = get_secondary_cpu_ops();
 	unsigned int cpu;
 	int err;
 
@@ -806,10 +806,6 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 		if (cpu == 0)
 			continue;
 
-		ops = get_cpu_ops(cpu);
-		if (!ops)
-			continue;
-
 		err = ops->cpu_prepare(cpu);
 		if (err)
 			continue;
@@ -1299,8 +1295,7 @@ bool smp_crash_stop_failed(void)
 static bool have_cpu_die(void)
 {
 #ifdef CONFIG_HOTPLUG_CPU
-	int any_cpu = raw_smp_processor_id();
-	const struct cpu_operations *ops = get_cpu_ops(any_cpu);
+	const struct cpu_operations *ops = get_secondary_cpu_ops();
 
 	if (ops && ops->cpu_die)
 		return true;
-- 
2.34.1




* [PATCH v3 05/12] arm64: smp: Defer RCU registration during secondary CPU bringup
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

From: Will Deacon <will@kernel.org>

Calling rcutree_report_cpu_starting() early during boot can lead to
livelocks with the generic CPU hotplug mechanism if the boot CPU blocks
on an RCU grace period while the CPU being onlined is spinning in
cpuhp_ap_sync_alive(). So cpuhp_ap_sync_alive() must be called
before rcutree_report_cpu_starting().

In addition, to prevent a potential deadlock on the boot CPU,
check_local_cpu_capabilities() must be executed before
cpuhp_ap_sync_alive(). This ensures that if an early capability mismatch
occurs and the AP invokes cpu_die_early(), the boot CPU can detect
the boot timeout and proceed, rather than hanging indefinitely.

In preparation for enabling the generic CPU hotplug code on arm64, split
up the trace_hardirqs_off() call during secondary CPU bringup so that we
update lockdep early but defer the tracing updates until after
RCU is ready.

Furthermore, to support parallel bringup without triggering false RCU CPU
stall warnings or deadlocks, the initialization order must be:

    secondary_start_kernel()
        -> lockdep_hardirqs_off()
        -> check_local_cpu_capabilities()
        -> cpuhp_ap_sync_alive()
        -> rcutree_report_cpu_starting()
        -> trace_hardirqs_off_finish()

Because check_local_cpu_capabilities() must execute while RCU is still
offline on the local CPU, it normally triggers a false-positive lockdep
"suspicious RCU usage" splat during early lock acquisitions as commit
ce3d31ad3cac ("arm64/smp: Move rcu_cpu_starting() earlier") pointed out.

Resolve this lockdep splat by wrapping the early capability verification
path within lockdep_off() and lockdep_on(). This safely suppresses
false-positive RCU validation flags on the offline CPU while maintaining
the strictly mandated initialization order for race-free parallel bringup.

Signed-off-by: Will Deacon <will@kernel.org>
Co-developed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/kernel/smp.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index c14b179c595d..87f92cf9ffa8 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -35,6 +35,7 @@
 #include <linux/kgdb.h>
 #include <linux/kvm_host.h>
 #include <linux/nmi.h>
+#include <linux/lockdep.h>
 
 #include <asm/alternative.h>
 #include <asm/atomic.h>
@@ -215,15 +216,23 @@ asmlinkage notrace void secondary_start_kernel(void)
 	if (system_uses_irq_prio_masking())
 		init_gic_priority_masking();
 
-	rcutree_report_cpu_starting(cpu);
-	trace_hardirqs_off();
+	lockdep_hardirqs_off(CALLER_ADDR0);
 
+	/*
+	 * Since RCU is still offline on this CPU, any nested native printk
+	 * or lock acquisition would normally trigger a false-positive
+	 * "suspicious RCU usage" lockdep splat.
+	 */
+	lockdep_off();
 	/*
 	 * If the system has established the capabilities, make sure
 	 * this CPU ticks all of those. If it doesn't, the CPU will
 	 * fail to come online.
 	 */
 	check_local_cpu_capabilities();
+	lockdep_on();
+	rcutree_report_cpu_starting(cpu);
+	trace_hardirqs_off_finish();
 
 	ops = get_cpu_ops(cpu);
 	if (ops->cpu_postboot)
-- 
2.34.1




* [PATCH v3 06/12] arm64: smp: Use generic HOTPLUG_SPLIT_STARTUP machinery for CPU onlining
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

From: Will Deacon <will@kernel.org>

In preparation for enabling parallel bringup of secondary CPUs
on arm64, take the baby step of moving from HOTPLUG_CORE_SYNC_DEAD
to HOTPLUG_SPLIT_STARTUP.

To fully enable HOTPLUG_SPLIT_STARTUP, this patch implements:

1) arch_cpuhp_kick_ap_alive(): kicks the secondary CPU via firmware
without blocking.

2) arch_cpuhp_cleanup_kick_cpu(): reads back the early boot status when
AP bringup times out.

3) A call to cpuhp_ap_sync_alive() inside secondary_start_kernel():
performs the initial pre-online boot handshake from the secondary
CPU side.
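
The handshake described above can be modelled in user space roughly as
follows. This is only a sketch under stated assumptions: every
`sketch_*` name and the four states are hypothetical, C11 atomics stand
in for the kernel's synchronisation, and the control side is reduced to
a single polling step with no timeout handling.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical per-CPU handshake states; not the kernel's definitions. */
enum sketch_sync_state {
	SKETCH_STATE_DEAD,		/* zero-initialised default */
	SKETCH_STATE_KICKED,
	SKETCH_STATE_ALIVE,
	SKETCH_STATE_SHOULD_ONLINE,
};

static atomic_int sketch_state[SKETCH_NR_CPUS];

/* Control CPU: start the secondary and return at once, without waiting. */
static void sketch_kick_ap(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	atomic_store(&sketch_state[cpu], SKETCH_STATE_KICKED);
}

/* Secondary CPU: announce that it has made it into the kernel. */
static void sketch_ap_report_alive(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	atomic_store(&sketch_state[cpu], SKETCH_STATE_ALIVE);
}

/* Secondary CPU: it may continue only once the control CPU released it. */
static bool sketch_ap_may_proceed(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	return atomic_load(&sketch_state[cpu]) == SKETCH_STATE_SHOULD_ONLINE;
}

/*
 * Control CPU: one polling step. If the secondary has reported in,
 * release it and return true; otherwise leave the state untouched.
 */
static bool sketch_bp_poll_alive(unsigned int cpu)
{
	int expected = SKETCH_STATE_ALIVE;

	assert(cpu < SKETCH_NR_CPUS);
	return atomic_compare_exchange_strong(&sketch_state[cpu], &expected,
					      SKETCH_STATE_SHOULD_ONLINE);
}
```

A false return from the polling step after the timeout has expired
corresponds to the case where the cleanup hook would be called with
is_alive == false.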

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/Kconfig      |  2 +-
 arch/arm64/kernel/smp.c | 39 +++++++++++++++++++--------------------
 2 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fe60738e5943..24496e9967a8 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -231,7 +231,7 @@ config ARM64
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
 	select HAVE_GENERIC_VDSO
-	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
+	select HOTPLUG_SPLIT_STARTUP if SMP
 	select HOTPLUG_SMT if HOTPLUG_CPU
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 87f92cf9ffa8..9482e8d38b98 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -107,12 +107,9 @@ static int boot_secondary(unsigned int cpu, struct task_struct *idle)
 	return -EOPNOTSUPP;
 }
 
-static DECLARE_COMPLETION(cpu_running);
-
-int __cpu_up(unsigned int cpu, struct task_struct *idle)
+int arch_cpuhp_kick_ap_alive(unsigned int cpu, struct task_struct *idle)
 {
 	int ret;
-	long status;
 
 	/*
 	 * We need to tell the secondary core where to find its stack and the
@@ -123,22 +120,22 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 
 	/* Now bring the CPU into our world */
 	ret = boot_secondary(cpu, idle);
-	if (ret) {
-		if (ret != -EPERM)
-			pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
-		return ret;
-	}
+	if (ret && ret != -EPERM)
+		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
+	return ret;
+}
+
+void arch_cpuhp_cleanup_kick_cpu(unsigned int cpu, bool is_alive)
+{
+	long status;
+
+	if (is_alive)
+		return;
 
 	/*
-	 * CPU was successfully started, wait for it to come online or
-	 * time out.
+	 * We failed to synchronise with the CPU, so check if it left us
+	 * any breadcrumbs.
 	 */
-	wait_for_completion_timeout(&cpu_running,
-				    msecs_to_jiffies(5000));
-	if (cpu_online(cpu))
-		return 0;
-
-	pr_crit("CPU%u: failed to come online\n", cpu);
 	secondary_data.task = NULL;
 	status = READ_ONCE(secondary_data.status);
 	if (status == CPU_MMU_OFF)
@@ -170,8 +167,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	case CPU_PANIC_KERNEL:
 		panic("CPU%u detected unsupported configuration\n", cpu);
 	}
-
-	return -EIO;
 }
 
 static void init_gic_priority_masking(void)
@@ -231,6 +226,11 @@ asmlinkage notrace void secondary_start_kernel(void)
 	 */
 	check_local_cpu_capabilities();
 	lockdep_on();
+	/*
+	 * Synchronise with the core bringing us online so that it knows
+	 * we made it into the kernel. We're still not 'online'.
+	 */
+	cpuhp_ap_sync_alive();
 	rcutree_report_cpu_starting(cpu);
 	trace_hardirqs_off_finish();
 
@@ -264,7 +264,6 @@ asmlinkage notrace void secondary_start_kernel(void)
 					 read_cpuid_id());
 	update_cpu_boot_status(CPU_BOOT_SUCCESS);
 	set_cpu_online(cpu, true);
-	complete(&cpu_running);
 
 	/*
 	 * Secondary CPUs enter the kernel with all DAIF exceptions masked.
-- 
2.34.1




* [PATCH v3 09/12] arm64: cpufeature: Ensure atomic updates to system_cpucaps bitmap
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

Parallel CPU bringup allows multiple secondary CPUs to concurrently
execute update_cpu_capabilities() during early boot.

The current non-atomic __set_bit() and __clear_bit() helpers perform
unserialized updates on the shared global bitmap, risking data races
and feature flag erasure.

Upgrade these operations to set_bit() and clear_bit() to ensure all
concurrent modifications are properly serialized via arm64 atomics.
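
As a rough user-space illustration (not kernel code), the difference
between the two helper families can be sketched with C11 atomics. All
`sketch_*` names, the bitmap size and the relaxed memory ordering are
assumptions made for this example; the kernel's set_bit() and
clear_bit() are implemented differently.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NCAPS		128
#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NWORDS		(SKETCH_NCAPS / SKETCH_BITS_PER_LONG)

/* Hypothetical stand-in for the shared system_cpucaps bitmap. */
static atomic_ulong sketch_caps[SKETCH_NWORDS];

/*
 * Plain read-modify-write, in the spirit of __set_bit(): load, OR, store.
 * Two CPUs updating the same word concurrently can overwrite each other.
 */
static inline void sketch_set_bit_nonatomic(unsigned int nr, unsigned long *map)
{
	assert(nr < SKETCH_NCAPS);
	map[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* One atomic read-modify-write, in the spirit of set_bit(). */
static inline void sketch_set_bit(unsigned int nr)
{
	assert(nr < SKETCH_NCAPS);
	atomic_fetch_or_explicit(&sketch_caps[nr / SKETCH_BITS_PER_LONG],
				 1UL << (nr % SKETCH_BITS_PER_LONG),
				 memory_order_relaxed);
}

/* One atomic read-modify-write, in the spirit of clear_bit(). */
static inline void sketch_clear_bit(unsigned int nr)
{
	assert(nr < SKETCH_NCAPS);
	atomic_fetch_and_explicit(&sketch_caps[nr / SKETCH_BITS_PER_LONG],
				  ~(1UL << (nr % SKETCH_BITS_PER_LONG)),
				  memory_order_relaxed);
}

static inline bool sketch_test_bit(unsigned int nr)
{
	unsigned long word;

	assert(nr < SKETCH_NCAPS);
	word = atomic_load_explicit(&sketch_caps[nr / SKETCH_BITS_PER_LONG],
				    memory_order_relaxed);
	return (word >> (nr % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

With the atomic variants, concurrent updates to different bits in the
same word cannot erase one another, which is the property the patch
relies on.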

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/kernel/cpufeature.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index be75e60d56ca..a1a13f3e01ed 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -3548,7 +3548,7 @@ static void update_cpu_capabilities(u16 scope_mask)
 
 		if (!caps->matches(caps, cpucap_default_scope(caps))) {
 			if (match_all)
-				__clear_bit(caps->capability, system_cpucaps);
+				clear_bit(caps->capability, system_cpucaps);
 			continue;
 		}
 
@@ -3559,7 +3559,7 @@ static void update_cpu_capabilities(u16 scope_mask)
 		if (!match_all && caps->desc && !caps->cpus)
 			pr_info("detected: %s\n", caps->desc);
 
-		__set_bit(caps->capability, system_cpucaps);
+		set_bit(caps->capability, system_cpucaps);
 
 		if (boot_cpu && (caps->type & SCOPE_BOOT_CPU))
 			set_bit(caps->capability, boot_cpucaps);
-- 
2.34.1




* [PATCH v3 12/12] arm64: Add HOTPLUG_PARALLEL support for secondary CPUs
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

Support for parallel secondary CPU bringup is already utilized by x86,
MIPS and RISC-V. This patch brings this capability to the arm64
architecture.

To fully enable HOTPLUG_PARALLEL, this patch implements an arm64-specific
arch_cpuhp_init_parallel_bringup() handler.

In parallel bringup, the early `set_cpu_present(cpu, 0)` call in
cpu_die_early() marks the secondary CPU absent prematurely, causing the
primary CPU's second-stage cpuhp_bringup_mask() sweep to skip it and drop
the failure logs.

Remove this early unregistration from the secondary CPU, deferring the
set_cpu_present(cpu, 0) call to the primary CPU's cleanup path to ensure
robust parallel boot timeout detection.
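
The effect of clearing the present bit too early can be illustrated
with a toy model. This is a sketch only: the `sketch_*` names are
hypothetical, the present mask is a plain integer, and the sweep merely
records which secondary CPUs it would examine.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 8

/* Bit n set means CPU n is marked present (toy stand-in for the cpumask). */
static unsigned int sketch_present_mask;

static void sketch_set_present(unsigned int cpu, bool present)
{
	assert(cpu < SKETCH_NR_CPUS);
	if (present)
		sketch_present_mask |= 1u << cpu;
	else
		sketch_present_mask &= ~(1u << cpu);
}

/*
 * Second-stage sweep over the secondary CPUs: only CPUs still marked
 * present are examined. Returns the number of CPUs visited and stores
 * their ids in 'visited', which must hold SKETCH_NR_CPUS entries.
 */
static unsigned int sketch_second_sweep(unsigned int *visited)
{
	unsigned int cpu, n = 0;

	for (cpu = 1; cpu < SKETCH_NR_CPUS; cpu++) {
		if (!(sketch_present_mask & (1u << cpu)))
			continue;	/* absent CPUs are silently skipped */
		visited[n++] = cpu;
	}
	return n;
}
```

If a failing CPU clears its own bit before the sweep runs, the sweep
never looks at it; leaving the bit set until the cleanup path runs keeps
the CPU visible to the sweep.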

Tested natively with ATF on a QEMU arm64 virt machine with 64 cores,
and in a KVM arm64 guest with 128 vCPUs.

Tested-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/Kconfig      |  2 +-
 arch/arm64/kernel/smp.c | 12 ++++++++++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 24496e9967a8..a9d8030e7492 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -231,7 +231,7 @@ config ARM64
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
 	select HAVE_GENERIC_VDSO
-	select HOTPLUG_SPLIT_STARTUP if SMP
+	select HOTPLUG_PARALLEL if SMP
 	select HOTPLUG_SMT if HOTPLUG_CPU
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 98ddbe50081d..a973b2d3bab1 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -93,6 +93,15 @@ static inline int op_cpu_kill(unsigned int cpu)
 }
 #endif
 
+extern const struct cpu_operations cpu_psci_ops;
+
+/* Establish whether parallel bringup can be supported. */
+bool __init arch_cpuhp_init_parallel_bringup(void)
+{
+	const struct cpu_operations *ops = get_secondary_cpu_ops();
+
+	return ops == &cpu_psci_ops;
+}
 
 /*
  * Boot a secondary CPU, and assign it the specified idle task.
@@ -137,6 +146,7 @@ void arch_cpuhp_cleanup_kick_cpu(unsigned int cpu, bool is_alive)
 	 * We failed to synchronise with the CPU, so check if it left us
 	 * any breadcrumbs.
 	 */
+	set_cpu_present(cpu, 0);
 	cpu_boot_data[cpu].task = NULL;
 	status = READ_ONCE(cpu_boot_data[cpu].status);
 	if (status == CPU_MMU_OFF)
@@ -416,8 +426,6 @@ void __noreturn cpu_die_early(void)
 
 	pr_crit("CPU%d: will not boot\n", cpu);
 
-	/* Mark this CPU absent */
-	set_cpu_present(cpu, 0);
 	rcutree_report_cpu_dead();
 
 	if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
-- 
2.34.1




* [PATCH v3 07/12] arm64: cpu_ops: Make 'cpu_operations' pointer global instead of per-cpu
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

From: Will Deacon <will@kernel.org>

'cpu_ops' is an NR_CPUS-length array of 'cpu_operations' pointers, which
theoretically allows for different CPUs to have different bringup and
hotplug backends.

In reality, this complexity exists only to deal with the case where CPU0
is not hotpluggable, so replace the array with a single, global pointer
and record separately whether or not it applies to the boot CPU. Update
the logic in init_cpu_ops() to enforce that only a single set of
'cpu_ops' is in use.
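
The resulting behaviour can be mirrored in a small user-space model.
This is a sketch under stated assumptions: the `sketch_*` names and the
one-field ops structure are hypothetical, and the enable-method lookup
is replaced by passing the ops pointer in directly.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct cpu_operations. */
struct sketch_cpu_ops {
	const char *name;
};

static const struct sketch_cpu_ops sketch_psci_ops = { .name = "psci" };
static const struct sketch_cpu_ops sketch_spin_table_ops = { .name = "spin-table" };

/* A single global pointer instead of an NR_CPUS-sized array. */
static const struct sketch_cpu_ops *sketch_cpu_ops_ptr;
static bool sketch_boot_cpu_has_enable_method;

static int sketch_init_cpu_ops(int cpu, const struct sketch_cpu_ops *ops)
{
	assert(cpu >= 0);

	if (!ops)
		return -ENODEV;

	if (!sketch_cpu_ops_ptr)
		sketch_cpu_ops_ptr = ops;	/* first caller picks the backend */
	else if (sketch_cpu_ops_ptr != ops)
		return -EBUSY;			/* mixed backends are rejected */

	if (cpu == 0)
		sketch_boot_cpu_has_enable_method = true;

	return 0;
}

static const struct sketch_cpu_ops *sketch_get_cpu_ops(int cpu)
{
	/* CPU 0 only sees the ops if it registered an enable method itself. */
	if (cpu || sketch_boot_cpu_has_enable_method)
		return sketch_cpu_ops_ptr;

	return NULL;
}
```

In this model a second CPU that names a different backend gets -EBUSY,
and CPU 0 reports no ops until it has been initialised itself.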

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/kernel/cpu_ops.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kernel/cpu_ops.c b/arch/arm64/kernel/cpu_ops.c
index e133011f64b5..eacfb88a0c0c 100644
--- a/arch/arm64/kernel/cpu_ops.c
+++ b/arch/arm64/kernel/cpu_ops.c
@@ -20,7 +20,8 @@ extern const struct cpu_operations acpi_parking_protocol_ops;
 #endif
 extern const struct cpu_operations cpu_psci_ops;
 
-static const struct cpu_operations *cpu_ops[NR_CPUS] __ro_after_init;
+static const struct cpu_operations *cpu_ops __ro_after_init;
+static bool boot_cpu_has_enable_method __ro_after_init;
 
 static const struct cpu_operations *const dt_supported_cpu_ops[] __initconst = {
 	&smp_spin_table_ops,
@@ -40,6 +41,9 @@ static const struct cpu_operations * __init cpu_get_ops(const char *name)
 {
 	const struct cpu_operations *const *ops;
 
+	if (!name)
+		return NULL;
+
 	ops = acpi_disabled ? dt_supported_cpu_ops : acpi_supported_cpu_ops;
 
 	while (*ops) {
@@ -49,6 +53,7 @@ static const struct cpu_operations * __init cpu_get_ops(const char *name)
 		ops++;
 	}
 
+	pr_warn("Unsupported enable-method: %s\n", name);
 	return NULL;
 }
 
@@ -94,25 +99,31 @@ static const char *__init cpu_read_enable_method(int cpu)
 	return enable_method;
 }
 /*
- * Read a cpu's enable method and record it in cpu_ops.
+ * Read a cpu's enable method and update/check cpu_ops.
  */
 int __init init_cpu_ops(int cpu)
 {
 	const char *enable_method = cpu_read_enable_method(cpu);
+	const struct cpu_operations *ops = cpu_get_ops(enable_method);
 
-	if (!enable_method)
+	if (!ops)
 		return -ENODEV;
 
-	cpu_ops[cpu] = cpu_get_ops(enable_method);
-	if (!cpu_ops[cpu]) {
-		pr_warn("Unsupported enable-method: %s\n", enable_method);
-		return -EOPNOTSUPP;
-	}
+	if (!cpu_ops)
+		cpu_ops = ops;
+	else if (cpu_ops != ops)
+		return -EBUSY;
+
+	if (cpu == 0)
+		boot_cpu_has_enable_method = true;
 
 	return 0;
 }
 
 const struct cpu_operations *get_cpu_ops(int cpu)
 {
-	return cpu_ops[cpu];
+	if (cpu || boot_cpu_has_enable_method)
+		return cpu_ops;
+
+	return NULL;
 }
-- 
2.34.1




* [PATCH v3 10/12] arm64: smp: Pass CPU ID to update_cpu_boot_status()
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

To support CONFIG_HOTPLUG_PARALLEL, the CPU boot status tracking must
be refactored from a single global variable (secondary_data.status)
to a per-CPU tracking structure to prevent multi-core race conditions.

Add a 'cpu' parameter to update_cpu_boot_status() and update all its
callsites to pass the corresponding CPU ID. This allows updating the
boot status at a per-CPU granularity during parallel bringup.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/include/asm/smp.h   | 6 +++---
 arch/arm64/kernel/cpufeature.c | 2 +-
 arch/arm64/kernel/smp.c        | 8 ++++----
 arch/arm64/mm/context.c        | 5 +++--
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 10ea4f543069..e2151a01731f 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -122,7 +122,7 @@ static inline void __noreturn cpu_park_loop(void)
 	}
 }
 
-static inline void update_cpu_boot_status(int val)
+static inline void update_cpu_boot_status(unsigned int cpu, int val)
 {
 	WRITE_ONCE(secondary_data.status, val);
 	/* Ensure the visibility of the status update */
@@ -134,9 +134,9 @@ static inline void update_cpu_boot_status(int val)
  * which calls for a kernel panic. Update the boot status and park the calling
  * CPU.
  */
-static inline void __noreturn cpu_panic_kernel(void)
+static inline void __noreturn cpu_panic_kernel(unsigned int cpu)
 {
-	update_cpu_boot_status(CPU_PANIC_KERNEL);
+	update_cpu_boot_status(cpu, CPU_PANIC_KERNEL);
 	cpu_park_loop();
 }
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a1a13f3e01ed..abc3aef9c206 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -3685,7 +3685,7 @@ static void verify_local_cpu_caps(u16 scope_mask)
 			caps->desc, system_has_cap, cpu_has_cap);
 
 		if (cpucap_panic_on_conflict(caps))
-			cpu_panic_kernel();
+			cpu_panic_kernel(smp_processor_id());
 		else
 			cpu_die_early();
 	}
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 6b9586a69429..14b94df26b44 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -116,7 +116,7 @@ int arch_cpuhp_kick_ap_alive(unsigned int cpu, struct task_struct *idle)
 	 * page tables.
 	 */
 	secondary_data.task = idle;
-	update_cpu_boot_status(CPU_MMU_OFF);
+	update_cpu_boot_status(cpu, CPU_MMU_OFF);
 
 	/* Now bring the CPU into our world */
 	ret = boot_secondary(cpu, idle);
@@ -262,7 +262,7 @@ asmlinkage notrace void secondary_start_kernel(void)
 	pr_info("CPU%u: Booted secondary processor 0x%010lx [0x%08x]\n",
 					 cpu, (unsigned long)mpidr,
 					 read_cpuid_id());
-	update_cpu_boot_status(CPU_BOOT_SUCCESS);
+	update_cpu_boot_status(cpu, CPU_BOOT_SUCCESS);
 	set_cpu_online(cpu, true);
 
 	/*
@@ -420,11 +420,11 @@ void __noreturn cpu_die_early(void)
 	rcutree_report_cpu_dead();
 
 	if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
-		update_cpu_boot_status(CPU_KILL_ME);
+		update_cpu_boot_status(cpu, CPU_KILL_ME);
 		__cpu_try_die(cpu);
 	}
 
-	update_cpu_boot_status(CPU_STUCK_IN_KERNEL);
+	update_cpu_boot_status(cpu, CPU_STUCK_IN_KERNEL);
 
 	cpu_park_loop();
 }
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 0f4a28b87469..e78ae989ad57 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -63,6 +63,7 @@ static u32 get_cpu_asid_bits(void)
 /* Check if the current cpu's ASIDBits is compatible with asid_bits */
 void verify_cpu_asid_bits(void)
 {
+	unsigned int cpu = smp_processor_id();
 	u32 asid = get_cpu_asid_bits();
 
 	if (asid < asid_bits) {
@@ -71,8 +72,8 @@ void verify_cpu_asid_bits(void)
 		 * fewer ASID bits than the boot CPU.
 		 */
 		pr_crit("CPU%d: smaller ASID size(%u) than boot CPU (%u)\n",
-				smp_processor_id(), asid, asid_bits);
-		cpu_panic_kernel();
+			cpu, asid, asid_bits);
+		cpu_panic_kernel(cpu);
 	}
 }
 
-- 
2.34.1




* [PATCH v3 11/12] arm64: smp: Rework early boot data into per-CPU arrays
From: Jinjie Ruan @ 2026-06-24  9:25 UTC (permalink / raw)
  To: catalin.marinas, will, tsbogend, tglx, mingo, bp, dave.hansen,
	hpa, peterz, kees, nathan, linusw, ojeda, david.kaplan,
	lukas.bulwahn, ryan.roberts, maz, timothy.hayes, lpieralisi,
	thuth, menglong8.dong, oupton, yeoreum.yun, miko.lenczewski,
	broonie, kevin.brodsky, james.clark, yangyicong, tabba, osandov,
	arnd, anshuman.khandual, david, akpm, ljs, dev.jain, yang,
	chaitanyas.prakash, kprateek.nayak, chenl311, sshegde,
	thorsten.blum, chang.seok.bae, tim.c.chen, x86, linux-kernel,
	linux-arm-kernel, linux-mips
  Cc: ruanjinjie
In-Reply-To: <20260624092537.2916971-1-ruanjinjie@huawei.com>

Rework the global `secondary_data` and `__early_cpu_boot_status`
accessed during early boot into per-CPU arrays to allow secondary
CPUs to boot in parallel.

Also reuse the `__cpu_logical_map` array in the early boot code in head.S
so that each secondary CPU can resolve its own logical ID concurrently.
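
In C terms, the lookup and the per-CPU indexing done by the assembly
can be sketched as below. This is an illustration only: all `sketch_*`
names, the array size and the affinity mask value are assumptions of
the example, and a miss returns -1 here whereas the assembly branches
to __secondary_too_slow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS		8
#define SKETCH_INVALID_HWID	UINT64_MAX
/* Assumed affinity mask: Aff0..Aff2 in bits 23:0, Aff3 in bits 39:32. */
#define SKETCH_HWID_MASK	UINT64_C(0xff00ffffff)

struct sketch_boot_data {
	void *task;
	long status;
};

/* Indexing by shift only works if the element size is a power of two. */
static_assert((sizeof(struct sketch_boot_data) &
	       (sizeof(struct sketch_boot_data) - 1)) == 0,
	      "element size must be a power of two");

static uint64_t sketch_logical_map[SKETCH_NR_CPUS];
static struct sketch_boot_data sketch_cpu_boot_data[SKETCH_NR_CPUS];

static void sketch_map_init(void)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sketch_logical_map[cpu] = SKETCH_INVALID_HWID;
}

/* Linear search of the logical map; returns the logical id or -1. */
static int sketch_mpidr_to_cpuid(uint64_t mpidr)
{
	uint64_t hwid = mpidr & SKETCH_HWID_MASK;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (sketch_logical_map[cpu] == SKETCH_INVALID_HWID)
			continue;	/* unused slot */
		if (sketch_logical_map[cpu] == hwid)
			return cpu;
	}
	return -1;
}

/* Integer log2 of a power of two, computed at run time for simplicity. */
static unsigned int sketch_ilog2(size_t n)
{
	unsigned int shift = 0;

	while (n >>= 1)
		shift++;
	return shift;
}

/* base + (cpu << shift), the address computation the assembly performs. */
static struct sketch_boot_data *sketch_boot_slot(int cpu)
{
	uintptr_t base = (uintptr_t)sketch_cpu_boot_data;
	unsigned int shift = sketch_ilog2(sizeof(struct sketch_boot_data));

	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return (struct sketch_boot_data *)(base + ((uintptr_t)cpu << shift));
}
```

Because each CPU only reads the shared map and then touches its own
slot, several secondaries can run this path at the same time without
sharing mutable state.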

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 arch/arm64/include/asm/smp.h    | 16 ++++++------
 arch/arm64/kernel/asm-offsets.c |  2 ++
 arch/arm64/kernel/head.S        | 44 +++++++++++++++++++++++++++------
 arch/arm64/kernel/setup.c       |  9 +++++++
 arch/arm64/kernel/smp.c         | 11 +++++----
 arch/arm64/mm/mmu.c             |  2 +-
 6 files changed, 62 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index e2151a01731f..7e69219e6e33 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -34,14 +34,9 @@
 /*
  * Logical CPU mapping.
  */
-extern u64 __cpu_logical_map[NR_CPUS];
+extern void set_cpu_logical_map(unsigned int cpu, u64 hwid);
 extern u64 cpu_logical_map(unsigned int cpu);
 
-static inline void set_cpu_logical_map(unsigned int cpu, u64 hwid)
-{
-	__cpu_logical_map[cpu] = hwid;
-}
-
 struct seq_file;
 
 /*
@@ -92,8 +87,11 @@ struct secondary_data {
 	long status;
 };
 
-extern struct secondary_data secondary_data;
-extern long __early_cpu_boot_status;
+static_assert((sizeof(struct secondary_data) & (sizeof(struct secondary_data) - 1)) == 0,
+	      "secondary_data size must be a power of 2 for lsl indexing in assembly!");
+
+extern struct secondary_data cpu_boot_data[NR_CPUS];
+extern long __early_cpu_boot_status[NR_CPUS];
 extern void secondary_entry(void);
 
 extern void arch_send_call_function_single_ipi(int cpu);
@@ -124,7 +122,7 @@ static inline void __noreturn cpu_park_loop(void)
 
 static inline void update_cpu_boot_status(unsigned int cpu, int val)
 {
-	WRITE_ONCE(secondary_data.status, val);
+	WRITE_ONCE(cpu_boot_data[cpu].status, val);
 	/* Ensure the visibility of the status update */
 	dsb(ishst);
 }
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index b6367ff3a49c..566e2222af5b 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -11,6 +11,7 @@
 #include <linux/arm_sdei.h>
 #include <linux/sched.h>
 #include <linux/ftrace.h>
+#include <linux/log2.h>
 #include <linux/kexec.h>
 #include <linux/mm.h>
 #include <linux/kvm_host.h>
@@ -97,6 +98,7 @@ int main(void)
   BLANK();
 #endif
   DEFINE(CPU_BOOT_TASK,		offsetof(struct secondary_data, task));
+  DEFINE(SECONDARY_DATA_SHIFT,	ilog2(sizeof(struct secondary_data)));
   BLANK();
   DEFINE(FTR_OVR_VAL_OFFSET,	offsetof(struct arm64_ftr_override, val));
   DEFINE(FTR_OVR_MASK_OFFSET,	offsetof(struct arm64_ftr_override, mask));
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 87a822e5c4ca..f58de58c4edc 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -12,6 +12,7 @@
 #include <linux/linkage.h>
 #include <linux/init.h>
 #include <linux/pgtable.h>
+#include <linux/threads.h>
 
 #include <asm/asm_pointer_auth.h>
 #include <asm/assembler.h>
@@ -348,6 +349,31 @@ pen:	ldr	x4, [x3]
 	b	pen
 SYM_FUNC_END(secondary_holding_pen)
 
+	/*
+	 * Convert the physical MPIDR of the current secondary CPU
+	 * to its logical CPUID by traversing __cpu_logical_map
+	 * in parallel.
+	 */
+	.macro	mpidr_to_cpuid, mpidr, cpuid, tmp1, tmp2
+	mov_q	\tmp1, MPIDR_HWID_BITMASK
+	and	\mpidr, \mpidr, \tmp1
+
+	adr_l	\tmp1, __cpu_logical_map
+	mov	\cpuid, #0
+.Lfind_cpuid\@:
+	ldr	\tmp2, [\tmp1, \cpuid, lsl #3]
+	cmp	\tmp2, #-1
+	b.eq	.Lnext_cpu\@
+	cmp	\tmp2, \mpidr
+	b.eq	.Lfound_cpuid\@
+.Lnext_cpu\@:
+	add	\cpuid, \cpuid, #1
+	cmp	\cpuid, #NR_CPUS
+	b.ne	.Lfind_cpuid\@
+	b	__secondary_too_slow
+.Lfound_cpuid\@:
+	.endm
+
 	/*
 	 * Secondary entry point that jumps straight into the kernel. Only to
 	 * be used where CPUs are brought online dynamically by the kernel.
@@ -363,6 +389,8 @@ SYM_FUNC_START_LOCAL(secondary_startup)
 	 * Common entry point for secondary CPUs.
 	 */
 	mov	x20, x0				// preserve boot mode
+	mrs	x0, mpidr_el1
+	mpidr_to_cpuid mpidr=x0, cpuid=x19, tmp1=x1, tmp2=x3
 
 #ifdef CONFIG_ARM64_VA_BITS_52
 alternative_if ARM64_HAS_VA52
@@ -386,12 +414,12 @@ SYM_FUNC_START_LOCAL(__secondary_switched)
 	mov	x0, x20
 	bl	finalise_el2
 
-	str_l	xzr, __early_cpu_boot_status, x3
 	adr_l	x5, vectors
 	msr	vbar_el1, x5
 	isb
 
-	adr_l	x0, secondary_data
+	adr_l	x0, cpu_boot_data
+	add	x0, x0, x19, lsl #SECONDARY_DATA_SHIFT
 	ldr	x2, [x0, #CPU_BOOT_TASK]
 	cbz	x2, __secondary_too_slow
 
@@ -430,13 +458,14 @@ SYM_FUNC_END(set_cpu_boot_mode_flag)
  *
  * update_early_cpu_boot_status tmp, status
  *  - Corrupts tmp1, tmp2
- *  - Writes 'status' to __early_cpu_boot_status and makes sure
+ *  - Writes 'status' to __early_cpu_boot_status[cpu] and makes sure
  *    it is committed to memory.
  */
 
-	.macro	update_early_cpu_boot_status status, tmp1, tmp2
-	mov	\tmp2, #\status
+	.macro	update_early_cpu_boot_status status, cpu_reg, tmp1, tmp2
 	adr_l	\tmp1, __early_cpu_boot_status
+	mov	\tmp2, #\status
+	add	\tmp1, \tmp1, \cpu_reg, lsl #3
 	str	\tmp2, [\tmp1]
 	dmb	sy
 	dc	ivac, \tmp1			// Invalidate potentially stale cache line
@@ -486,7 +515,7 @@ SYM_FUNC_START(__cpu_secondary_check52bitva)
 #endif
 
 	update_early_cpu_boot_status \
-		CPU_STUCK_IN_KERNEL | CPU_STUCK_REASON_52_BIT_VA, x0, x1
+		CPU_STUCK_IN_KERNEL | CPU_STUCK_REASON_52_BIT_VA, x19, x0, x1
 1:	wfe
 	wfi
 	b	1b
@@ -498,7 +527,7 @@ SYM_FUNC_END(__cpu_secondary_check52bitva)
 SYM_FUNC_START_LOCAL(__no_granule_support)
 	/* Indicate that this CPU can't boot and is stuck in the kernel */
 	update_early_cpu_boot_status \
-		CPU_STUCK_IN_KERNEL | CPU_STUCK_REASON_NO_GRAN, x1, x2
+		CPU_STUCK_IN_KERNEL | CPU_STUCK_REASON_NO_GRAN, x19, x1, x2
 1:
 	wfe
 	wfi
@@ -508,6 +537,7 @@ SYM_FUNC_END(__no_granule_support)
 SYM_FUNC_START_LOCAL(__primary_switch)
 	adrp	x1, reserved_pg_dir
 	adrp	x2, __pi_init_idmap_pg_dir
+	mov	x19, #0
 	bl	__enable_mmu
 
 	adrp	x1, early_init_stack
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 23c05dc7a8f2..856c00ab6f19 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -273,6 +273,15 @@ arch_initcall(reserve_memblock_reserved_regions);
 
 u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
 
+void set_cpu_logical_map(unsigned int cpu, u64 hwid)
+{
+	unsigned long start = (unsigned long)&__cpu_logical_map[cpu];
+	unsigned long end = start + sizeof(__cpu_logical_map[cpu]);
+
+	__cpu_logical_map[cpu] = hwid;
+	dcache_clean_inval_poc(start, end);
+}
+
 u64 cpu_logical_map(unsigned int cpu)
 {
 	return __cpu_logical_map[cpu];
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 14b94df26b44..98ddbe50081d 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -61,7 +61,8 @@
  * so we need some other way of telling a new secondary core
  * where to place its SVC stack
  */
-struct secondary_data secondary_data;
+struct secondary_data cpu_boot_data[NR_CPUS] ____cacheline_aligned;
+
 /* Number of CPUs which aren't online, but looping in kernel text. */
 static int cpus_stuck_in_kernel;
 
@@ -115,7 +116,7 @@ int arch_cpuhp_kick_ap_alive(unsigned int cpu, struct task_struct *idle)
 	 * We need to tell the secondary core where to find its stack and the
 	 * page tables.
 	 */
-	secondary_data.task = idle;
+	cpu_boot_data[cpu].task = idle;
 	update_cpu_boot_status(cpu, CPU_MMU_OFF);
 
 	/* Now bring the CPU into our world */
@@ -136,10 +137,10 @@ void arch_cpuhp_cleanup_kick_cpu(unsigned int cpu, bool is_alive)
 	 * We failed to synchronise with the CPU, so check if it left us
 	 * any breadcrumbs.
 	 */
-	secondary_data.task = NULL;
-	status = READ_ONCE(secondary_data.status);
+	cpu_boot_data[cpu].task = NULL;
+	status = READ_ONCE(cpu_boot_data[cpu].status);
 	if (status == CPU_MMU_OFF)
-		status = READ_ONCE(__early_cpu_boot_status);
+		status = READ_ONCE(__early_cpu_boot_status[cpu]);
 
 	switch (status & CPU_BOOT_STATUS_MASK) {
 	default:
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index dd85e093ffdb..71c160c6c383 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -62,7 +62,7 @@ static bool rodata_is_rw __ro_after_init = true;
  * The booting CPU updates the failed status @__early_cpu_boot_status,
  * with MMU turned off.
  */
-long __section(".mmuoff.data.write") __early_cpu_boot_status;
+long __section(".mmuoff.data.write") __early_cpu_boot_status[NR_CPUS];
 
 static DEFINE_SPINLOCK(swapper_pgdir_lock);
 static DEFINE_MUTEX(fixmap_lock);
-- 
2.34.1



^ permalink raw reply related

* Re: [PATCH RFC 3/3] arm64: Add HOTPLUG_PARALLEL support for secondary CPUs
From: Jinjie Ruan @ 2026-06-24  9:29 UTC (permalink / raw)
  To: Will Deacon
  Cc: Michael Kelley, catalin.marinas@arm.com,
	tsbogend@alpha.franken.de, pjw@kernel.org, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, alex@ghiti.fr, tglx@kernel.org,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, peterz@infradead.org, kees@kernel.org,
	nathan@kernel.org, linusw@kernel.org, ojeda@kernel.org,
	david.kaplan@amd.com, lukas.bulwahn@redhat.com,
	ryan.roberts@arm.com, maz@kernel.org, timothy.hayes@arm.com,
	lpieralisi@kernel.org, thuth@redhat.com, oupton@kernel.org,
	yeoreum.yun@arm.com, miko.lenczewski@arm.com, broonie@kernel.org,
	kevin.brodsky@arm.com, james.clark@linaro.org, tabba@google.com,
	mrigendra.chaubey@gmail.com, arnd@arndb.de,
	anshuman.khandual@arm.com, x86@kernel.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
	linux-riscv@lists.infradead.org
In-Reply-To: <ajqZJVBCu_D-BkTy@willie-the-truck>



On 6/23/2026 10:33 PM, Will Deacon wrote:
> On Mon, Jun 22, 2026 at 05:16:30PM +0800, Jinjie Ruan wrote:
>>
>>
>> On 6/18/2026 8:21 PM, Will Deacon wrote:
>>> Hi Jinjie,
>>>
>>> On Mon, Jun 15, 2026 at 04:51:48PM +0800, Jinjie Ruan wrote:
>>>> On 6/12/2026 11:45 PM, Michael Kelley wrote:
>>>>> From: Jinjie Ruan <ruanjinjie@huawei.com> Sent: Thursday, June 11, 2026 6:38 AM
>>>>>>
>>>>>> Support for parallel secondary CPU bringup is already utilized by x86,
>>>>>> MIPS, and RISC-V. This patch brings this capability to the arm64
>>>>>> architecture.
>>>>>>
>>>>>> Rework the global `secondary_data` accessed during early boot into
>>>>>> a per-CPU array. This array maps logical CPU IDs to MPIDR_EL1 values,
>>>>>> enabling the early boot code in head.S to resolve each secondary CPU's
>>>>>> logical ID concurrently.
>>>>>>
>>>>>> To fully enable HOTPLUG_PARALLEL, this patch implements:
>>>>>> 1) An arm64-specific arch_cpuhp_kick_ap_alive() handler.
>>>>>> 2) Callbacks to cpuhp_ap_sync_alive() inside secondary_start_kernel().
>>>>>>
>>>>>> Successfully tested on QEMU ARM64 virt machine (KVM on, 128 vCPUs).
>>>>>>
>>>>>> |     test kernel	   | secondary CPUs boot time |
>>>>>> |  ---------------------   |	--------------------  |
>>>>>> |   Without this patch     |		155.672	      |
>>>>>> |   cpuhp.parallel=0	   |		62.897	      |
>>>>>> |   cpuhp.parallel=1	   |		166.703	      |
>>>>>
>>>>> The last two rows seem mixed up. I would expect parallel=0 to
>>>>> result in a longer boot time.
>>>>
>>>> Hi, Michael,
>>>>
>>>> The results are correct and not mixed up.
>>>>
>>>> Compared to the original non‑HOTPLUG_PARALLEL approach, the advantage of
>>>> cpuhp.parallel=0 lies in its use of cpu_relax(`yield` on arm64) instead
>>>> of the wait_for_completion_timeout() mechanism (which may cause sleep
>>>> and context switching). This significantly reduces the overhead of VM
>>>> exits and context switches in a KVM guest, thereby cutting the secondary
>>>> CPU boot time by more than half.
>>>
>>> I don't think that's a particularly compelling reason to enable this for
>>> arm64, in all honesty. The yield instruction typically doesn't do
>>> anything on actual arm64 silicon, so this probably means that you're
>>> introducing busy-loops which tend to be bad for power and scalability.
>>>
>>> I implemented this a while ago [1] but didn't manage to see much in terms
>>> of performance improvement and so I didn't bother to send the patches out
>>> after talking about it at KVM forum [2]. However, as mentioned at the end
>>> of that talk, it _is_ still useful for confidential VMs using PSCI so
>>> let me dust off my old series and send it out to see what you think.
>>
>> Hi Will,
>>
>> Thanks for the insights! Your point about using PSCI v0.2's Context ID
>> to avoid the NR_CPUS array for input parameters (like
>> secondary_data.task) is incredibly elegant.
>>
>> However, if I understand your series correctly, it seems your approach
>> primarily targets preventing the concurrent use of secondary_data.task,
>> but it doesn't seem to account for the potential data trampling on
>> secondary_data.status when multiple secondary CPUs are brought up
>> simultaneously.
>>
>> update_cpu_boot_status()
>>   -> WRITE_ONCE(secondary_data.status.flags[val], 1)
>>
>> arch_cpuhp_cleanup_kick_cpu()
>>   -> status = READ_ONCE(secondary_data.status)
> 
> I need to dust it back off but IIRC I made that thing a byte array, with
> a separate byte for each failure reason.
> 
> Will

Hi, Will,

Thanks for the clarification. A byte array with a separate byte per
failure reason does prevent trampling between different failure types.

However, the issue arises if multiple secondary CPUs fail for the exact
same reason simultaneously. In that scenario, they will still attempt to
write to the same byte index at the same time. As a result, the primary
CPU reading the status later won't be able to distinguish which specific
CPUs encountered the problem, or how many of them failed.
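
To illustrate the point above, here is a minimal user-space sketch (not
kernel code) comparing a shared byte-per-reason layout with a per-CPU
status array. All names, sizes and reason numbers (NR_CPUS_DEMO,
NR_REASONS_DEMO, report_shared(), report_percpu()) are hypothetical and
chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS_DEMO	32	/* hypothetical, for illustration only */
#define NR_REASONS_DEMO	4	/* hypothetical number of failure reasons */

/* Layout A: one shared byte per failure reason. */
static uint8_t shared_flags[NR_REASONS_DEMO];

/* Layout B: one status byte per logical CPU; 0 means "no failure". */
static uint8_t percpu_status[NR_CPUS_DEMO];

static void demo_reset(void)
{
	memset(shared_flags, 0, sizeof(shared_flags));
	memset(percpu_status, 0, sizeof(percpu_status));
}

/* Layout A: every CPU failing with @reason writes the same byte. */
static void report_shared(unsigned int cpu, unsigned int reason)
{
	(void)cpu;	/* the CPU identity is lost in this layout */
	assert(reason < NR_REASONS_DEMO);
	shared_flags[reason] = 1;
}

/* Layout B: each CPU writes only its own slot. */
static void report_percpu(unsigned int cpu, unsigned int reason)
{
	assert(cpu < NR_CPUS_DEMO && reason < NR_REASONS_DEMO);
	percpu_status[cpu] = (uint8_t)(reason + 1);
}

/* Number of failures with @reason that layout A can still report. */
static unsigned int count_shared(unsigned int reason)
{
	assert(reason < NR_REASONS_DEMO);
	return shared_flags[reason];	/* never more than 1 */
}

/* Number of CPUs that layout B records as failed with @reason. */
static unsigned int count_percpu(unsigned int reason)
{
	unsigned int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
		if (percpu_status[cpu] == reason + 1)
			n++;
	return n;
}
```

If CPU16 and CPU17 both report the same reason, count_shared() can only
say that the reason occurred, while count_percpu() still sees two
distinct CPUs.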

I tested your patch with error injection, configuring CPU4 and CPU6,
along with CPU16 and CPU18, to generate distinct boot failures, while
making CPU17 hit the same boot failure as CPU16. The output is
incorrect, as shown below:

[    0.332528] smp: Bringing up secondary CPUs ...
[   10.674114] CPU1 failed to report alive state
[   10.674392] CPU1 detected lack of support for 52-bit VAs
[   10.674610] CPUs may be stuck in kernel
[   21.016707] CPU2 failed to report alive state
[   31.357320] CPU3 failed to report alive state
[   41.693228] CPU4 failed to report alive state
[   52.033112] CPU5 failed to report alive state
[   62.378198] CPU6 failed to report alive state
[   72.716467] CPU7 failed to report alive state
[   83.046746] CPU8 failed to report alive state
[   93.338020] CPU9 failed to report alive state
[  103.591986] CPU10 failed to report alive state
[  113.893741] CPU11 failed to report alive state
[  124.230870] CPU12 failed to report alive state
[  134.567597] CPU13 failed to report alive state
[  144.905256] CPU14 failed to report alive state
[  155.247633] CPU15 failed to report alive state
[  165.584891] CPU16 failed to report alive state
[  175.920794] CPU17 failed to report alive state
[  186.256323] CPU18 failed to report alive state
[  196.596136] CPU19 failed to report alive state

The expected output is as below:

        CPU4 failed to report alive state
        CPU4: is stuck in kernel
        CPU4: does not support 52-bit VAs

        CPU6 failed to report alive state
        CPU6: is stuck in kernel
        CPU6: does not support 4K granule

        GICv3: CPU8: found redistributor 8 region 0:0x00000000081a0000
        GICv3: CPU8: using allocated LPI pending table @0x0000000100360000
        CPU8: Booted secondary processor 0x0000000008 [0x410fd034]
        ...
        CPU16 failed to report alive state
        psci: CPU16 killed (polled 0 ms)
        CPU16: died during early boot

        CPU17: will not boot
        CPU17 failed to report alive state
        psci: CPU17 killed (polled 0 ms)
        CPU17: died during early boot

        CPU18 failed to report alive state
        Kernel panic - not syncing: CPU18 detected unsupported configuration

Best regards,
Jinjie


> 



^ permalink raw reply

* Re: [RFC PATCH] irqchip/gic-v3-its: enable dynamic MSI-X allocation
From: Jinqian Yang @ 2026-06-24  9:29 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: lpieralisi, tglx, alex, linux-kernel, linux-arm-kernel,
	liuyonglong, wangzhou1, linuxarm
In-Reply-To: <86o6h0quvj.wl-maz@kernel.org>



On 2026/6/24 15:07, Marc Zyngier wrote:
> On Wed, 24 Jun 2026 03:53:45 +0100,
> Jinqian Yang <yangjinqian1@huawei.com> wrote:
>>
>> On ARM64 platforms with GICv3 ITS, VFIO PCI passthrough currently
>> cannot dynamically allocate MSI-X vectors after MSI-X has been
>> enabled. When QEMU needs to extend the vector range, it must
>> disable MSI-X, free all interrupts, then re-enable with a larger
>> allocation. This creates an interrupt loss window for already-active
>> vectors.
>>
>> Consider HNS3 with RoCE: NIC and RDMA share one PCI device and
>> ITS DeviceID, with MSI-X vectors partitioned as NIC (lower range)
>> then RoCE (starting at base_vector = num_nic_msi). In VFIO
>> passthrough, loading hns_roce after hns3 forces QEMU to tear down
>> all interrupts before re-allocating the larger range. During this
>> process, NIC interrupts may be lost. Testing confirmed that this
>> occasionally occurs, causing the network port reset to fail.
> 
> Well, that's what you get for not exposing differentiated functions.
> Eventually, you face the reality that this is a poor design.
> 

Fair point, though this is not unique to HNS3. All major NIC+RDMA
vendors share the same PCI function.

>>
>> ITS_MSI_FLAGS_SUPPORTED lacks MSI_FLAG_PCI_MSIX_ALLOC_DYN, causing
>> pci_msix_can_alloc_dyn() to return false. VFIO then sets
>> has_dyn_msix=false and never clears VFIO_IRQ_INFO_NORESIZE for
>> MSI-X, keeping the old "disable and reallocate" behavior.
>>
>> The essential prerequisite for enabling this flag is the fix to
>> msi_prepare() call timing (commit 1396e89e09f0 ("genirq/msi: Move
>> prepare() call to per-device allocation")): msi_prepare() is
>> now called once at per-device domain creation with hwsize, so ITS
>> creates an ITT with sufficient capacity for all MSI-X vectors.
>> Without this fix, msi_prepare() was called per-allocation with
>> semi-random nvec, maybe resulting in an ITT too small for dynamic
>> vector addition.
> 
> How is this paragraph relevant? The kernel has had this fix for over a
> year, and backporting this series is not something I plan to ever do.
> 

Will remove from commit msg.

>>
>> With this in place, dynamic MSI-X allocation works correctly:
>> msi_domain_alloc_irq_at() uses populate_alloc_info() to copy the
>> pre-prepared alloc_data without re-invoking msi_prepare(), so each
>> new vector simply gets a LPI entry in the already-allocated ITT,
>> without affecting existing vectors.
>>
>> Signed-off-by: Jinqian Yang <yangjinqian1@huawei.com>
>> ---
>>   drivers/irqchip/irq-gic-its-msi-parent.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-its-msi-parent.c b/drivers/irqchip/irq-gic-its-msi-parent.c
>> index b9257103a999..b2b9d2068bb1 100644
>> --- a/drivers/irqchip/irq-gic-its-msi-parent.c
>> +++ b/drivers/irqchip/irq-gic-its-msi-parent.c
>> @@ -18,7 +18,8 @@
>>   
>>   #define ITS_MSI_FLAGS_SUPPORTED (MSI_GENERIC_FLAGS_MASK |	\
>>   				 MSI_FLAG_PCI_MSIX      |	\
>> -				 MSI_FLAG_MULTI_PCI_MSI)
>> +				 MSI_FLAG_MULTI_PCI_MSI |	\
>> +				 MSI_FLAG_PCI_MSIX_ALLOC_DYN)
>>   
>>   static int its_translate_frame_address(struct fwnode_handle *msi_node, phys_addr_t *pa)
>>   {
> 
> What has this been tested with? In which conditions?
> 

Tested on Hisilicon HIP09 (ARM64, GICv3/GICv4.1) with latest
upstream kernel and QEMU 8.2.

VFIO passthrough of HNS3 NIC to VM: load both hns3 and
hns_roce_hw_v2 drivers, then trigger FLR. Without the flag,
QEMU disables/re-enables MSI-X around FLR, causing occasional
link up failure due to interrupt loss.
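
For reference, the flag gating discussed earlier in this thread can be
sketched in user-space C as below. This is only a model under stated
assumptions: the DEMO_* bit values are placeholders, not the kernel's
real MSI_FLAG_* values, and demo_can_alloc_dyn() / demo_noresize() are
hypothetical stand-ins for the pci_msix_can_alloc_dyn() check and the
VFIO_IRQ_INFO_NORESIZE decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values, NOT the kernel's real MSI_FLAG_* definitions. */
#define DEMO_FLAG_PCI_MSIX		(1u << 0)
#define DEMO_FLAG_MULTI_PCI_MSI		(1u << 1)
#define DEMO_FLAG_PCI_MSIX_ALLOC_DYN	(1u << 2)

/* Supported-flags mask before and after the one-line change. */
#define DEMO_ITS_FLAGS_OLD	(DEMO_FLAG_PCI_MSIX | DEMO_FLAG_MULTI_PCI_MSI)
#define DEMO_ITS_FLAGS_NEW	(DEMO_ITS_FLAGS_OLD | DEMO_FLAG_PCI_MSIX_ALLOC_DYN)

/* Models the "can this domain allocate MSI-X vectors dynamically?" check. */
static bool demo_can_alloc_dyn(uint32_t supported)
{
	return (supported & DEMO_FLAG_PCI_MSIX_ALLOC_DYN) != 0;
}

/* Models the consumer side: NORESIZE stays set unless dynamic alloc works. */
static bool demo_noresize(uint32_t supported)
{
	return !demo_can_alloc_dyn(supported);
}
```

With the old mask demo_noresize() returns true, which corresponds to the
"disable and reallocate" path; with the new mask it returns false.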

Thanks,
Jinqian



^ permalink raw reply

* Re: [PATCH v3 07/15] drm/tidss: oldi: Remove define for unused register OLDI_LB_CTRL
From: Swamil Jain @ 2026-06-24  9:41 UTC (permalink / raw)
  To: Tomi Valkeinen, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Lee Jones, Aradhya Bhatia,
	Nishanth Menon, Vignesh Raghavendra, Devarsh Thakkar,
	Louis Chauvet
  Cc: devicetree, dri-devel, linux-kernel, linux-arm-kernel
In-Reply-To: <20260529-beagley-ai-display-v3-7-7fefdc5d1adf@ideasonboard.com>



On 5/29/26 14:15, Tomi Valkeinen wrote:
> OLDI_LB_CTRL define is not used, and doesn't seem to exist at least on
> some SoCs. Let's remove the define.
> 
> Tested-by: Swamil Jain <s-jain1@ti.com>
> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
> ---

Reviewed-by: Swamil Jain <s-jain1@ti.com>

>   drivers/gpu/drm/tidss/tidss_oldi.h | 1 -
>   1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/tidss/tidss_oldi.h b/drivers/gpu/drm/tidss/tidss_oldi.h
> index 8cd535c5ee65..a361e6dbfce3 100644
> --- a/drivers/gpu/drm/tidss/tidss_oldi.h
> +++ b/drivers/gpu/drm/tidss/tidss_oldi.h
> @@ -20,7 +20,6 @@ struct tidss_oldi;
>   
>   /* Register offsets */
>   #define OLDI_PD_CTRL            0x100
> -#define OLDI_LB_CTRL            0x104
>   
>   /* Power control bits */
>   #define OLDI_PWRDOWN_TX(n)	BIT(n)
> 



^ permalink raw reply

* Re: [PATCH 1/5] dt-bindings: vendor-prefixes: Add Opto Logic
From: Krzysztof Kozlowski @ 2026-06-24  9:42 UTC (permalink / raw)
  To: Leonardo Costa
  Cc: laurent.pinchart, neil.armstrong, jesszhan0024, maarten.lankhorst,
	mripard, tzimmermann, airlied, simona, robh, krzk+dt, conor+dt,
	nm, vigneshr, kristo, prabhakar.mahadev-lad.rj, thierry.reding,
	sam, leonardo.costa, dri-devel, devicetree, linux-kernel,
	linux-arm-kernel
In-Reply-To: <20260623195741.495734-2-leoreis.costa@gmail.com>

On Tue, Jun 23, 2026 at 04:57:37PM -0300, Leonardo Costa wrote:
> From: Leonardo Costa <leonardo.costa@toradex.com>
> 
> Add vendor prefix for Opto Logic, a Swiss display solutions provider and
> printing systems manufacturer.
> 
> Link: https://optologic.ch/
> Signed-off-by: Leonardo Costa <leonardo.costa@toradex.com>
> ---
>  Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++
>  1 file changed, 2 insertions(+)

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>

Best regards,
Krzysztof



^ permalink raw reply

* Re: [PATCH 2/5] dt-bindings: display: panel-lvds: Add compatible for Opto Logic SCX1001511GGC49
From: Krzysztof Kozlowski @ 2026-06-24  9:43 UTC (permalink / raw)
  To: Leonardo Costa
  Cc: laurent.pinchart, neil.armstrong, jesszhan0024, maarten.lankhorst,
	mripard, tzimmermann, airlied, simona, robh, krzk+dt, conor+dt,
	nm, vigneshr, kristo, prabhakar.mahadev-lad.rj, thierry.reding,
	sam, leonardo.costa, dri-devel, devicetree, linux-kernel,
	linux-arm-kernel
In-Reply-To: <20260623195741.495734-3-leoreis.costa@gmail.com>

On Tue, Jun 23, 2026 at 04:57:38PM -0300, Leonardo Costa wrote:
> From: Leonardo Costa <leonardo.costa@toradex.com>
> 
> The Opto Logic SCX1001511GGC49 is a 10.1" WXGA (1280x800) TFT LCD LVDS
> panel.
> 
> Signed-off-by: Leonardo Costa <leonardo.costa@toradex.com>
> ---
>  Documentation/devicetree/bindings/display/panel/panel-lvds.yaml | 2 ++
>  1 file changed, 2 insertions(+)

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>

Best regards,
Krzysztof



^ permalink raw reply

* [PATCH v2 0/2] Fix traceNoC probe issue on Kaanapali
From: Jie Gan @ 2026-06-24  9:49 UTC (permalink / raw)
  To: Bjorn Andersson, Konrad Dybcio, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Tingwei Zhang, Jingyi Wang, Jie Gan, Abel Vesa,
	Suzuki K Poulose, Mike Leach, James Clark, Leo Yan,
	Yuanfang Zhang
  Cc: Konrad Dybcio, linux-arm-msm, devicetree, linux-kernel, coresight,
	linux-arm-kernel

Patch 1 changes the binding to allow the TraceNoC device to accept the
arm,primecell-periphid property.

Patch 2 fixes the deferred probe issue for the TraceNoC device by
adding the arm,primecell-periphid property to bypass the AMBA check.

Signed-off-by: Jie Gan <jie.gan@oss.qualcomm.com>
---
Changes in v2:
- address the ATID issue reported by Sashiko.
- update binding to accept arm,primecell-periphid property.
- Link to v1: https://lore.kernel.org/r/20260624-fix-tracenoc-probe-issue-v1-1-bcc785198fc5@oss.qualcomm.com

---
Jie Gan (2):
      dt-bindings: arm: qcom,coresight-tnoc: allow arm,primecell-periphid
      arm64: dts: qcom: kaanapali: fix traceNoC probe issue

 Documentation/devicetree/bindings/arm/qcom,coresight-tnoc.yaml | 5 ++++-
 arch/arm64/boot/dts/qcom/kaanapali.dtsi                        | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)
---
base-commit: 4e5dfb7c84012007c3c7061126491bbc92d71bf1
change-id: 20260624-fix-tracenoc-probe-issue-c6429da28df4

Best regards,
-- 
Jie Gan <jie.gan@oss.qualcomm.com>



^ permalink raw reply

* [PATCH v2 1/2] dt-bindings: arm: qcom,coresight-tnoc: allow arm,primecell-periphid
From: Jie Gan @ 2026-06-24  9:49 UTC (permalink / raw)
  To: Bjorn Andersson, Konrad Dybcio, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Tingwei Zhang, Jingyi Wang, Jie Gan, Abel Vesa,
	Suzuki K Poulose, Mike Leach, James Clark, Leo Yan,
	Yuanfang Zhang
  Cc: Konrad Dybcio, linux-arm-msm, devicetree, linux-kernel, coresight,
	linux-arm-kernel
In-Reply-To: <20260624-fix-tracenoc-probe-issue-v2-0-786520f62f21@oss.qualcomm.com>

The TNOC device is an AMBA primecell and may carry the standard
arm,primecell-periphid property, which is used to supply the
peripheral ID when it cannot be read from the device registers.

Reference primecell.yaml and set additionalProperties to true so the
binding accepts arm,primecell-periphid along with the other common
primecell properties.

Signed-off-by: Jie Gan <jie.gan@oss.qualcomm.com>
---
 Documentation/devicetree/bindings/arm/qcom,coresight-tnoc.yaml | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/arm/qcom,coresight-tnoc.yaml b/Documentation/devicetree/bindings/arm/qcom,coresight-tnoc.yaml
index ef648a15b806..9624fc0adfdc 100644
--- a/Documentation/devicetree/bindings/arm/qcom,coresight-tnoc.yaml
+++ b/Documentation/devicetree/bindings/arm/qcom,coresight-tnoc.yaml
@@ -32,6 +32,9 @@ select:
   required:
     - compatible
 
+allOf:
+  - $ref: /schemas/arm/primecell.yaml#
+
 properties:
   $nodename:
     pattern: "^tn(@[0-9a-f]+)$"
@@ -78,7 +81,7 @@ required:
   - in-ports
   - out-ports
 
-additionalProperties: false
+additionalProperties: true
 
 examples:
   - |

-- 
2.34.1



^ permalink raw reply related

* [PATCH v2 2/2] arm64: dts: qcom: kaanapali: fix traceNoC probe issue
From: Jie Gan @ 2026-06-24  9:49 UTC (permalink / raw)
  To: Bjorn Andersson, Konrad Dybcio, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Tingwei Zhang, Jingyi Wang, Jie Gan, Abel Vesa,
	Suzuki K Poulose, Mike Leach, James Clark, Leo Yan,
	Yuanfang Zhang
  Cc: Konrad Dybcio, linux-arm-msm, devicetree, linux-kernel, coresight,
	linux-arm-kernel
In-Reply-To: <20260624-fix-tracenoc-probe-issue-v2-0-786520f62f21@oss.qualcomm.com>

The AMBA bus attempts to read the CID/PID of a device before invoking
its probe function if the arm,primecell-periphid property is absent.
This causes a deferred probe issue for the TraceNoC device, as the
CID/PID cannot be read from the periphid register.
Add the arm,primecell-periphid property to bypass the AMBA bus
check and resolve the probe issue.

Fixes: f73959d86c15 ("arm64: dts: qcom: kaanapali: add coresight nodes")
Signed-off-by: Jie Gan <jie.gan@oss.qualcomm.com>
---
 arch/arm64/boot/dts/qcom/kaanapali.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/qcom/kaanapali.dtsi b/arch/arm64/boot/dts/qcom/kaanapali.dtsi
index 7aa9653bd456..25820f7c04cd 100644
--- a/arch/arm64/boot/dts/qcom/kaanapali.dtsi
+++ b/arch/arm64/boot/dts/qcom/kaanapali.dtsi
@@ -5009,6 +5009,7 @@ tn@111b8000 {
 
 			clocks = <&aoss_qmp>;
 			clock-names = "apb_pclk";
+			arm,primecell-periphid = <0x000f0c00>;
 
 			in-ports {
 				#address-cells = <1>;

-- 
2.34.1



^ permalink raw reply related

* [RFC] drm/imx: upstream direction for i.MX95 display support
From: Piyush Patle @ 2026-06-24 10:03 UTC (permalink / raw)
  To: dri-devel, imx, linux-arm-kernel
  Cc: victor.liu, marex, daniel.baluta, Frank.Li, shawnguo, tzimmermann,
	maarten.lankhorst, mripard, airlied, simona

Hi all,

This is an RFC to settle the i.MX95 display architecture before any code is
(re)posted. It is a question, not a submission.

It follows Marek's earlier [1] and Liu Ying's reply there proposing a separate 
i.MX95 driver plus a shared helper library rather than extending the existing 
i.MX8QXP DC driver (drivers/gpu/drm/imx/dc/). That question was never resolved, 
and it gates any serious submission.

The implementation I evaluated is based on the existing NXP downstream 
dpu95 driver. My work has focused on bringing it up on current
mainline, DT integration, FRDM enablement and validation rather than
developing a new driver. I am not proposing to repost that driver as-is;
I would rather settle the architecture first.

The current dc/ implementation is a multi-device component driver with one
platform_driver per block bound via the component framework. The downstream
i.MX95 driver is a single monolithic platform_driver mapping all blocks from
one register base. Unifying appears to require reconciling two bind models,
rather than only adding match_data.

DomainBlend is i.MX95-only and sits on the atomic CRTC path, with no
i.MX8QXP analogue.

The block decomposition also differs: i.MX95 has
dither/hscaler/vscaler/fetcheco/fetchyuv/domainblend, while i.MX8QXP uses
fetchwarp.

There is also anticipated divergence which is not yet upstream (i.MX8QXP
prefetch/PRG, LTS and tiling modifiers, and the downstream i.MX95 blit
engine), although mainline dc/ is KMS-only today.

A single parametrised driver may still be possible, but these
differences led me to revisit the question before preparing a series.

The ported stack is functional on an i.MX95 15x15 FRDM with an IT6263
LVDS-to-HDMI bridge on LVDS channel 1. The DPU probes successfully,
EDID is read through the bridge, and modesetting works at
1280x720@60 and 1920x1080@60. Weston and sway both run correctly.
Tested pipeline DPU -> pixel-interleaver -> pixel-link -> LDB -> 
LVDS PHY -> IT6263 -> HDMI, using JEIDA-24 mapping. DSI is not covered.

One question for Liu Ying is whether the separate-driver plus shared
helper-library approach is still the preferred direction, and where the
helper boundary would be drawn (which blocks/ops are shared versus
implemented per driver).

If that approach is still preferred, I would be interested in working on
the helper-library extraction. Before spending time on it, I would like
to understand whether it matches the intended upstream direction or
whether similar work is already planned.

Likewise, it would be useful to understand whether extending dc/ is still
considered preferable, and how the component and monolithic driver models
would be reconciled given the differences described above.

Thanks,
Piyush Patle

References
[1]  https://lore.kernel.org/dri-devel/20251011170213.128907-1-marek.vasut@mailbox.org/


^ permalink raw reply

* [PATCH v3 3/7] net: wwan: t9xx: Add control DMA interface
From: Jack Wu via B4 Relay @ 2026-06-24 10:04 UTC (permalink / raw)
  To: Loic Poulain, Sergey Ryazanov, Johannes Berg, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Jack Wu, Wen-Zhi Huang, Shi-Wei Yeh, Minano Tseng,
	Matthias Brugger, AngeloGioacchino Del Regno, Simon Horman,
	Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, netdev, linux-arm-kernel, linux-mediatek, linux-doc
In-Reply-To: <20260624-t9xx_driver_v1-v3-0-73ff03f60c48@compal.com>

From: Jack Wu <jackbb_wu@compal.com>

Cross Layer Direct Memory Access (CLDMA) is the hardware
interface used by the control plane and designated to
translate data between the host and the device. It supports
8 hardware queues for the device AP and modem respectively.

CLDMA driver uses General Purpose Descriptor (GPD) to
describe transaction information that can be recognized by
CLDMA hardware. Once CLDMA hardware transaction is started,
it would fetch and parse GPD to transfer data correctly.
To facilitate the CLDMA transaction, a GPD ring for each
queue is used. Once the transaction is started, CLDMA
hardware will traverse the GPD ring to transfer data between
the host and the device until no GPD is available.
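
The ring traversal described above can be modelled with a small
user-space sketch. It is not the driver's implementation: the demo_*
names, the ring size and the DEMO_GPD_FLAG_HWO value are assumptions
made for illustration, and the "hardware" is simulated by a plain
function that walks the ring while the hardware-owned (HWO) bit is set.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_GPDS		4	/* hypothetical ring size */
#define DEMO_GPD_FLAG_HWO	0x1u	/* "hardware owns" bit, placeholder value */

struct demo_gpd {
	uint32_t flags;
	uint32_t len;
};

struct demo_ring {
	struct demo_gpd gpd[DEMO_NR_GPDS];
	unsigned int wr_idx;	/* next GPD the host fills */
	unsigned int hw_idx;	/* next GPD the simulated hardware fetches */
};

/* Host side: hand one GPD to the hardware; returns -1 if the ring is full. */
static int demo_submit(struct demo_ring *r, uint32_t len)
{
	struct demo_gpd *g = &r->gpd[r->wr_idx];

	if (g->flags & DEMO_GPD_FLAG_HWO)
		return -1;	/* still owned by hardware */

	g->len = len;
	g->flags |= DEMO_GPD_FLAG_HWO;
	r->wr_idx = (r->wr_idx + 1) % DEMO_NR_GPDS;
	return 0;
}

/* Simulated hardware: traverse until no GPD is available; returns bytes moved. */
static uint32_t demo_hw_run(struct demo_ring *r)
{
	uint32_t total = 0;

	while (r->gpd[r->hw_idx].flags & DEMO_GPD_FLAG_HWO) {
		total += r->gpd[r->hw_idx].len;
		r->gpd[r->hw_idx].flags &= ~DEMO_GPD_FLAG_HWO;	/* hand back to host */
		r->hw_idx = (r->hw_idx + 1) % DEMO_NR_GPDS;
	}
	return total;
}
```

The HWO bit is the only hand-off point in this model: the host sets it
when a GPD is ready, and the traversal stops at the first GPD where it
is clear.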

CLDMA TX flow:
Once a TX service receives the TX data from the port layer,
it uses APIs exported by the CLDMA driver to configure GPD
with the DMA address of TX data. After that, the service
triggers CLDMA to fetch the first available GPD to transfer
data.

CLDMA RX flow:
When there is RX data from the MD, CLDMA hardware asserts an
interrupt to notify the host to fetch data and dispatch it
to FSM (for handshake messages) or the port layer.
After CLDMA opening is finished, all RX GPDs are fulfilled
and ready to receive data from the device.

Signed-off-by: Jack Wu <jackbb_wu@compal.com>
---
 drivers/net/wwan/t9xx/mtk_ctrl_plane.c          |    4 +-
 drivers/net/wwan/t9xx/mtk_ctrl_plane.h          |   52 +-
 drivers/net/wwan/t9xx/pcie/Makefile             |    7 +-
 drivers/net/wwan/t9xx/pcie/mtk_cldma.c          | 1200 +++++++++++++++++++++++
 drivers/net/wwan/t9xx/pcie/mtk_cldma.h          |  170 ++++
 drivers/net/wwan/t9xx/pcie/mtk_cldma_drv.c      |  371 +++++++
 drivers/net/wwan/t9xx/pcie/mtk_cldma_drv.h      |  177 ++++
 drivers/net/wwan/t9xx/pcie/mtk_cldma_drv_m9xx.c |  183 ++++
 drivers/net/wwan/t9xx/pcie/mtk_cldma_drv_m9xx.h |  103 ++
 drivers/net/wwan/t9xx/pcie/mtk_ctrl_cfg_m9xx.c  |   24 +
 drivers/net/wwan/t9xx/pcie/mtk_pci.c            |   39 +
 drivers/net/wwan/t9xx/pcie/mtk_pci_reg.h        |    1 +
 drivers/net/wwan/t9xx/pcie/mtk_trans_ctrl.c     |  579 +++++++++++
 drivers/net/wwan/t9xx/pcie/mtk_trans_ctrl.h     |   86 ++
 14 files changed, 2992 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wwan/t9xx/mtk_ctrl_plane.c b/drivers/net/wwan/t9xx/mtk_ctrl_plane.c
index 07938f3e6fe2..70348696ac44 100644
--- a/drivers/net/wwan/t9xx/mtk_ctrl_plane.c
+++ b/drivers/net/wwan/t9xx/mtk_ctrl_plane.c
@@ -11,13 +11,14 @@
 /**
  * mtk_ctrl_init() - Initialize the control plane block.
  * @mdev: Pointer to the MTK modem device.
+ * @ops: HIF operations for the control plane.
  *
  * Allocates and initializes the control plane block
  * associated with @mdev.
  *
  * Return: 0 on success, -ENOMEM on allocation failure.
  */
-int mtk_ctrl_init(struct mtk_md_dev *mdev)
+int mtk_ctrl_init(struct mtk_md_dev *mdev, struct mtk_ctrl_hif_ops *ops)
 {
 	struct mtk_ctrl_blk *ctrl_blk;
 
@@ -27,6 +28,7 @@ int mtk_ctrl_init(struct mtk_md_dev *mdev)
 
 	ctrl_blk->mdev = mdev;
 	mdev->ctrl_blk = ctrl_blk;
+	ctrl_blk->ops = ops;
 
 	return 0;
 }
diff --git a/drivers/net/wwan/t9xx/mtk_ctrl_plane.h b/drivers/net/wwan/t9xx/mtk_ctrl_plane.h
index c141876ef95d..88d71ac92084 100644
--- a/drivers/net/wwan/t9xx/mtk_ctrl_plane.h
+++ b/drivers/net/wwan/t9xx/mtk_ctrl_plane.h
@@ -11,12 +11,60 @@
 
 #include "mtk_dev.h"
 
+enum mtk_trb_cmd_type {
+	TRB_CMD_MIN,
+	TRB_CMD_ENABLE,
+	TRB_CMD_TX,
+	TRB_CMD_DISABLE,
+	TRB_CMD_STOP,
+	TRB_CMD_RECOVER,
+	TRB_CMD_MAX,
+};
+
+enum mtk_hif_dev_ctrl_cmd {
+	HIF_CTRL_CMD_CHECK_TX_FULL,
+};
+
+struct trb_open_priv {
+	u8 log_rg_offset;
+	u32 tx_mtu;
+	u32 rx_mtu;
+	u32 tx_frag_size;
+	u32 rx_frag_size;
+	int (*rx_done)(struct sk_buff *skb, void *priv, bool force_recv);
+};
+
+struct trb {
+	u32 channel_id;
+	enum mtk_trb_cmd_type cmd;
+	int status;
+	struct kref kref;
+	void *priv;
+	int (*trb_complete)(struct sk_buff *skb);
+};
+
+union ctrl_hif_cmd_data {
+	u32 rx_ch;
+};
+
+struct mtk_ctrl_hif_ops {
+	int (*init)(struct mtk_md_dev *mdev);
+	int (*exit)(struct mtk_md_dev *mdev);
+	int (*submit_skb)(struct mtk_md_dev *mdev, struct sk_buff *skb, bool force_send);
+	int (*send_cmd)(struct mtk_md_dev *mdev, int cmd, void *data);
+};
+
+struct mtk_ctrl_cfg;
+struct mtk_ctrl_trans;
+
 struct mtk_ctrl_blk {
 	struct mtk_md_dev *mdev;
-	struct mtk_ctrl_trans *trans;
+	struct mtk_ctrl_hif_ops *ops;
+	void *ctrl_hw_priv;
+	struct mtk_ctrl_cfg *cfg;
 };
 
-int mtk_ctrl_init(struct mtk_md_dev *mdev);
+int mtk_ctrl_init(struct mtk_md_dev *mdev, struct mtk_ctrl_hif_ops *ops);
 void mtk_ctrl_exit(struct mtk_md_dev *mdev);
 
 #endif /* __MTK_CTRL_PLANE_H__ */
diff --git a/drivers/net/wwan/t9xx/pcie/Makefile b/drivers/net/wwan/t9xx/pcie/Makefile
index 7410d1796d27..5252f158b058 100644
--- a/drivers/net/wwan/t9xx/pcie/Makefile
+++ b/drivers/net/wwan/t9xx/pcie/Makefile
@@ -7,4 +7,9 @@ obj-$(CONFIG_MTK_T9XX_PCI) += mtk_t9xx_pcie.o
 
 mtk_t9xx_pcie-y := \
 	mtk_pci_drv_m9xx.o \
-	mtk_pci.o
+	mtk_cldma_drv_m9xx.o \
+	mtk_ctrl_cfg_m9xx.o \
+	mtk_pci.o \
+	mtk_trans_ctrl.o \
+	mtk_cldma.o \
+	mtk_cldma_drv.o
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_cldma.c b/drivers/net/wwan/t9xx/pcie/mtk_cldma.c
new file mode 100644
index 000000000000..7a0815aa2fc8
--- /dev/null
+++ b/drivers/net/wwan/t9xx/pcie/mtk_cldma.c
@@ -0,0 +1,1200 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022, MediaTek Inc.
+ */
+
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/kdev_t.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/netdevice.h>
+#include <linux/sched.h>
+#include <linux/skbuff.h>
+#include <linux/slab.h>
+#include <linux/timer.h>
+#include <linux/wait.h>
+#include <linux/workqueue.h>
+#include "mtk_pci.h"
+#include "mtk_cldma.h"
+#include "mtk_cldma_drv.h"
+#include "mtk_dev.h"
+
+#define cldma_drv_ops_null	NULL
+#define DMA_POOL_NAME_LEN	(64)
+#define WAIT_HWO_ROUND		(10)
+#define WAIT_HWO_TIME		(5)
+#define CLDMA_RETRY_DELAY_MS	(100)
+#define NO_BUDGET		(0)
+
+static const int mtk_cldma_hw_id_tbl[NR_CLDMA] = {
+	[CLDMA0] = CLDMA0_HW_ID,
+	[CLDMA1] = CLDMA1_HW_ID,
+	[CLDMA4] = CLDMA4_HW_ID,
+};
+
+static inline void mtk_cldma_clr_bd_dsc(struct cldma_drv_info *drv_info,
+					struct bd_dsc *bd_dsc_pool, int nr_bds)
+{
+	struct bd_dsc *bd_dsc;
+	int i;
+
+	for (i = 0; i < nr_bds; i++) {
+		bd_dsc = bd_dsc_pool + i;
+		dma_unmap_single(drv_info->mdev->dev, bd_dsc->data_dma_addr,
+				 bd_dsc->data_len, DMA_TO_DEVICE);
+		bd_dsc->data_dma_addr = 0;
+		bd_dsc->data_len = 0;
+		if (bd_dsc->bd->tx_bd.bd_flags & CLDMA_BD_FLAG_EOL) {
+			bd_dsc->bd->tx_bd.bd_flags &= ~CLDMA_BD_FLAG_EOL;
+			break;
+		}
+	}
+}
+
+static void mtk_cldma_tx_done_work(struct work_struct *work)
+{
+	struct txq *txq = container_of(work, struct txq, tx_done_work);
+	struct cldma_drv_info *drv_info;
+	struct cldma_drv_ops *drv_ops;
+	struct mtk_ctrl_trans *trans;
+	struct mtk_md_dev *mdev;
+	struct tx_req *req;
+	unsigned int state;
+	struct trb *trb;
+	int i, hif_id;
+	u32 txqno;
+
+	drv_info = txq->drv_info;
+	hif_id = drv_info->hif_id;
+	txqno = txq->txqno;
+	mdev = drv_info->mdev;
+	drv_ops = drv_info->drv_ops;
+	trans = drv_info->cd->trans;
+
+again:
+	for (i = 0; i < txq->nr_gpds; i++) {
+		req = txq->req_pool + txq->free_idx;
+
+		rmb(); /* ensure HWO setup done before HWO read */
+
+		if (!req->data_vm_addr || (req->gpd->tx_gpd.gpd_flags & CLDMA_GPD_FLAG_HWO))
+			break;
+
+		if (txq->nr_bds)
+			mtk_cldma_clr_bd_dsc(drv_info, req->bd_dsc_pool, txq->nr_bds);
+		else
+			dma_unmap_single(mdev->dev, req->data_dma_addr,
+					 req->data_len, DMA_TO_DEVICE);
+
+		trb = (struct trb *)req->skb->cb;
+		trb->status = 0;
+		trb->trb_complete(req->skb);
+
+		req->data_vm_addr = NULL;
+		req->data_dma_addr = 0;
+		req->data_len = 0;
+		req->skb = NULL;
+
+		txq->free_idx = (txq->free_idx + 1) % txq->nr_gpds;
+		if (atomic_fetch_inc(&txq->req_budget) == NO_BUDGET)
+			wake_up(&trans->trb_srv[trans->srv_cfg[hif_id][txqno]]->trb_waitq);
+	}
+
+	state = drv_ops->cldma_check_intr_status(drv_info, DIR_TX, txqno, QUEUE_XFER_DONE);
+	if (state) {
+		if (unlikely(state == LINK_ERROR_VAL))
+			goto out;
+
+		drv_ops->cldma_clr_intr_status(drv_info, DIR_TX, txqno, QUEUE_XFER_DONE);
+
+		cond_resched();
+
+		goto again;
+	}
+
+out:
+	drv_ops->cldma_unmask_intr(drv_info, DIR_TX, txqno, QUEUE_XFER_DONE);
+}
+
+static void mtk_cldma_rx_skb_adjust(struct mtk_md_dev *mdev, struct rxq *rxq,
+				    struct rx_req *req)
+{
+	struct bd_dsc *bd_dsc;
+	int i;
+
+	for (i = 0; i < rxq->nr_bds; i++) {
+		bd_dsc = req->bd_dsc_pool + i;
+		if (bd_dsc->data_dma_addr) {
+			dma_unmap_single(mdev->dev, bd_dsc->data_dma_addr,
+					 req->frag_size, DMA_FROM_DEVICE);
+			bd_dsc->data_dma_addr = 0;
+		}
+		bd_dsc->skb->len = 0;
+		skb_reset_tail_pointer(bd_dsc->skb);
+		skb_put(bd_dsc->skb,
+			min_t(u16, le16_to_cpu(bd_dsc->bd->rx_bd.data_recv_len),
+			      req->frag_size));
+		if (req->skb != bd_dsc->skb) {
+			req->skb->len += bd_dsc->skb->len;
+			req->skb->data_len += bd_dsc->skb->len;
+		}
+		bd_dsc->bd->rx_bd.data_recv_len = 0;
+		bd_dsc->skb = NULL;
+	}
+	if (!rxq->nr_bds) {
+		if (req->data_dma_addr) {
+			dma_unmap_single(mdev->dev, req->data_dma_addr,
+					 req->mtu, DMA_FROM_DEVICE);
+			req->data_dma_addr = 0;
+		}
+		req->skb->len = 0;
+		skb_reset_tail_pointer(req->skb);
+		skb_put(req->skb,
+			min_t(u16, le16_to_cpu(req->gpd->rx_gpd.data_recv_len),
+			      req->mtu));
+	}
+
+	req->gpd->rx_gpd.data_recv_len = 0;
+}
+
+static int mtk_cldma_reload_rx_skb(struct mtk_md_dev *mdev, struct rxq *rxq,
+				   struct rx_req *req)
+{
+	struct sk_buff *tail = NULL;
+	struct bd_dsc *bd_dsc;
+	int nr_bds;
+	int i, ret;
+
+	nr_bds = rxq->nr_bds;
+
+	for (i = 0; i < nr_bds; i++) {
+		bd_dsc = req->bd_dsc_pool + i;
+		bd_dsc->skb = __dev_alloc_skb(req->frag_size, GFP_KERNEL);
+		if (!bd_dsc->skb) {
+			dev_warn((mdev)->dev, "Failed to alloc SKB\n");
+			ret = -ENOMEM;
+			goto err_free_skb;
+		}
+		bd_dsc->skb->next = NULL;
+		bd_dsc->data_dma_addr = dma_map_single(mdev->dev, bd_dsc->skb->data,
+						       req->frag_size, DMA_FROM_DEVICE);
+		ret = dma_mapping_error(mdev->dev, bd_dsc->data_dma_addr);
+		if (unlikely(ret)) {
+			dev_warn((mdev)->dev, "Failed to map SKB data\n");
+			ret = -EFAULT;
+			goto err_free_skb;
+		}
+		bd_dsc->bd->rx_bd.data_buff_ptr_h =
+			cpu_to_le32((u64)(bd_dsc->data_dma_addr) >> 32);
+		bd_dsc->bd->rx_bd.data_buff_ptr_l =
+			cpu_to_le32(bd_dsc->data_dma_addr);
+		if (tail) {
+			tail->next = bd_dsc->skb;
+			tail = bd_dsc->skb;
+			continue;
+		}
+		if (!req->skb) {
+			req->skb = bd_dsc->skb;
+		} else {
+			skb_shinfo(req->skb)->frag_list = bd_dsc->skb;
+			tail = bd_dsc->skb;
+		}
+	}
+	if (!nr_bds) {
+		req->skb = __dev_alloc_skb(req->mtu, GFP_KERNEL);
+		if (!req->skb) {
+			ret = -ENOMEM;
+			goto err_free_skb;
+		}
+
+		req->data_dma_addr = dma_map_single(mdev->dev, req->skb->data,
+						    req->mtu, DMA_FROM_DEVICE);
+		ret = dma_mapping_error(mdev->dev, req->data_dma_addr);
+		if (unlikely(ret)) {
+			dev_warn((mdev)->dev, "Failed to map SKB data\n");
+			ret = -EFAULT;
+			goto err_free_skb;
+		}
+		req->gpd->rx_gpd.data_buff_ptr_h = cpu_to_le32((u64)req->data_dma_addr >> 32);
+		req->gpd->rx_gpd.data_buff_ptr_l = cpu_to_le32(req->data_dma_addr);
+	}
+	return 0;
+
+err_free_skb:
+	if (nr_bds) {
+		if (req->skb)
+			skb_shinfo(req->skb)->frag_list = NULL;
+		for (i = 0; i < nr_bds; i++) {
+			bd_dsc = req->bd_dsc_pool + i;
+			if (!bd_dsc->skb)
+				break;
+			if (!dma_mapping_error(mdev->dev, bd_dsc->data_dma_addr))
+				dma_unmap_single(mdev->dev, bd_dsc->data_dma_addr,
+						 req->frag_size, DMA_FROM_DEVICE);
+			bd_dsc->data_dma_addr = 0;
+			bd_dsc->skb->next = NULL;
+			dev_kfree_skb_any(bd_dsc->skb);
+		}
+	} else {
+		req->data_dma_addr = 0;
+		if (req->skb)
+			dev_kfree_skb_any(req->skb);
+	}
+	req->skb = NULL;
+
+	return ret;
+}
+
+static int mtk_cldma_check_rx_req(struct cldma_drv_info *drv_info, struct rxq *rxq)
+{
+	struct rx_req *req = rxq->req_pool + rxq->free_idx;
+	u64 curr_addr;
+	int i;
+
+	curr_addr = drv_info->drv_ops->cldma_get_rx_curr_addr(drv_info, rxq->rxqno);
+	if (unlikely(!curr_addr))
+		return -ENXIO;
+
+	if (req->gpd_dma_addr == curr_addr)
+		return -EAGAIN;
+	for (i = 0; i < WAIT_HWO_ROUND; i++) {
+		udelay(WAIT_HWO_TIME);
+		if (!(READ_ONCE(req->gpd->rx_gpd.gpd_flags) & CLDMA_GPD_FLAG_HWO))
+			break;
+	}
+	if (i == WAIT_HWO_ROUND) {
+		dev_err((drv_info->mdev)->dev, "Timed out waiting for HWO to clear\n");
+		return -EAGAIN;
+	}
+
+	return 0;
+}
+
+static bool mtk_cldma_rx_check_again(struct rxq *rxq)
+{
+	struct cldma_drv_info *drv_info;
+	struct cldma_drv_ops *drv_ops;
+	bool need_check_again = false;
+	u32 state;
+	int rxqno;
+
+	drv_info = rxq->drv_info;
+	drv_ops = drv_info->drv_ops;
+	rxqno = rxq->rxqno;
+
+	do {
+		state = drv_ops->cldma_check_intr_status(drv_info, DIR_RX,
+							 rxqno, QUEUE_XFER_DONE);
+		if (state) {
+			if (unlikely(state == LINK_ERROR_VAL))
+				break;
+
+			drv_ops->cldma_clr_intr_status(drv_info, DIR_RX,
+						       rxqno, QUEUE_XFER_DONE);
+			cond_resched();
+			return true;
+		}
+	} while (need_check_again);
+
+	return false;
+}
+
+static void mtk_cldma_rx_done_work(struct work_struct *work)
+{
+	struct rx_req *req = NULL, *pre_req = NULL;
+	struct rxq *rxq = container_of(work, struct rxq, rx_done_work);
+	struct cldma_drv_info *drv_info;
+	struct cldma_drv_ops *drv_ops;
+	struct mtk_md_dev *mdev;
+	int i, ret, idx;
+
+	drv_info = rxq->drv_info;
+	mdev = drv_info->mdev;
+	drv_ops = drv_info->drv_ops;
+
+again:
+	for (i = 0; i < rxq->nr_gpds; i++) {
+		req = rxq->req_pool + rxq->free_idx;
+		if (!req->skb) {
+			dev_err((mdev)->dev,
+				"Failed to get valid req cldma%d rxq%d req%d\n",
+				drv_info->hw_id, rxq->rxqno, rxq->free_idx);
+			goto out;
+		}
+
+		if (req->gpd->rx_gpd.gpd_flags & CLDMA_GPD_FLAG_HWO)
+			break;
+
+		mtk_cldma_rx_skb_adjust(mdev, rxq, req);
+		do {
+			ret = rxq->rx_done(req->skb, rxq->arg,
+					   !!atomic_read(&rxq->need_exit));
+			if (ret == -EAGAIN)
+				usleep_range(1000, 2000);
+			else
+				req->skb = NULL;
+		} while (ret == -EAGAIN);
+
+		ret = mtk_cldma_reload_rx_skb(mdev, rxq, req);
+		if (ret)
+			goto out;
+
+		wmb(); /* ensure addr set done before HWO setup done */
+
+		idx = rxq->free_idx == 0 ? rxq->nr_gpds - 1 : rxq->free_idx - 1;
+		pre_req = rxq->req_pool + idx;
+		pre_req->gpd->rx_gpd.gpd_flags |= CLDMA_GPD_FLAG_HWO;
+		rxq->free_idx = (rxq->free_idx + 1) % rxq->nr_gpds;
+	}
+
+	ret = mtk_cldma_check_rx_req(drv_info, rxq);
+	if (!ret)
+		goto again;
+	else if (ret == -ENXIO)
+		goto out;
+
+	if (!atomic_read(&rxq->need_exit))
+		drv_ops->cldma_resume_queue(drv_info, DIR_RX, rxq->rxqno);
+
+	if (mtk_cldma_rx_check_again(rxq))
+		goto again;
+
+out:
+	drv_ops->cldma_unmask_intr(drv_info, DIR_RX, rxq->rxqno, QUEUE_XFER_DONE);
+	drv_ops->cldma_clear_ip_busy(drv_info);
+}
+
+static int mtk_cldma_alloc_tx_bd(struct cldma_drv_info *drv_info, struct txq *txq,
+				 struct tx_req *req)
+{
+	struct bd_dsc *bd_dsc, *last_bd_dsc = NULL;
+	int i;
+
+	req->bd_dsc_pool = devm_kcalloc(drv_info->mdev->dev, txq->nr_bds,
+					sizeof(*bd_dsc), GFP_KERNEL);
+	if (!req->bd_dsc_pool)
+		return -ENOMEM;
+
+	for (i = 0; i < txq->nr_bds; i++) {
+		bd_dsc = req->bd_dsc_pool + i;
+		bd_dsc->bd = dma_pool_zalloc(drv_info->bd_dma_pool, GFP_KERNEL,
+					     &bd_dsc->bd_dma_addr);
+		if (!bd_dsc->bd)
+			return -ENOMEM;
+		if (!last_bd_dsc) {
+			req->gpd->tx_gpd.data_buff_ptr_h =
+				cpu_to_le32((u64)(bd_dsc->bd_dma_addr) >> 32);
+			req->gpd->tx_gpd.data_buff_ptr_l =
+				cpu_to_le32(bd_dsc->bd_dma_addr);
+		} else {
+			last_bd_dsc->bd->tx_bd.next_bd_ptr_h =
+				cpu_to_le32((u64)(bd_dsc->bd_dma_addr) >> 32);
+			last_bd_dsc->bd->tx_bd.next_bd_ptr_l =
+				cpu_to_le32(bd_dsc->bd_dma_addr);
+		}
+		last_bd_dsc = bd_dsc;
+	}
+	return 0;
+}
+
+static struct txq *mtk_cldma_txq_alloc(struct cldma_drv_info *drv_info, struct sk_buff *skb)
+{
+	struct trb *trb = (struct trb *)skb->cb;
+	struct cldma_drv_ops *drv_ops;
+	struct mtk_ctrl_trans *trans;
+	struct mtk_ctrl_blk *ctrl_blk;
+	struct mtk_md_dev *mdev;
+	struct bd_dsc *bd_dsc;
+	struct tx_req *next;
+	struct tx_req *req;
+	u16 tx_frag_size;
+	struct txq *txq;
+	int i, j, ret;
+
+	mdev = drv_info->mdev;
+	ctrl_blk = mdev->ctrl_blk;
+	trans = ctrl_blk->ctrl_hw_priv;
+	drv_ops = drv_info->drv_ops;
+
+	txq = devm_kzalloc(mdev->dev, sizeof(*txq), GFP_KERNEL);
+	if (!txq)
+		return NULL;
+
+	txq->que = radix_tree_lookup(&trans->queue_tbl, trb->channel_id & 0xFFFF);
+	txq->drv_info = drv_info;
+	txq->txqno = txq->que->txqno;
+	txq->nr_gpds = txq->que->tx_nr_gpds;
+	atomic_set(&txq->req_budget, txq->que->tx_nr_gpds);
+	txq->is_stopping = false;
+	tx_frag_size = txq->que->tx_frag_size;
+	if (txq->que->tx_mtu > tx_frag_size && tx_frag_size)
+		txq->nr_bds = (txq->que->tx_mtu + tx_frag_size - 1) / tx_frag_size;
+
+	txq->req_pool = devm_kcalloc(mdev->dev, txq->nr_gpds, sizeof(*req), GFP_KERNEL);
+	if (!txq->req_pool)
+		goto err_free_txq;
+
+	for (i = 0; i < txq->nr_gpds; i++) {
+		req = txq->req_pool + i;
+		req->mtu = txq->que->tx_mtu;
+		req->frag_size = tx_frag_size;
+		req->gpd = dma_pool_zalloc(drv_info->gpd_dma_pool, GFP_KERNEL, &req->gpd_dma_addr);
+		if (!req->gpd)
+			goto err_free_req;
+		if (txq->nr_bds) {
+			ret = mtk_cldma_alloc_tx_bd(drv_info, txq, req);
+			if (ret)
+				goto err_free_req;
+			req->gpd->tx_gpd.gpd_flags |= CLDMA_GPD_FLAG_BDP;
+		}
+	}
+
+	for (i = 0; i < txq->nr_gpds; i++) {
+		req = txq->req_pool + i;
+		next = txq->req_pool + ((i + 1) % txq->nr_gpds);
+		req->gpd->tx_gpd.gpd_flags |= CLDMA_GPD_FLAG_IOC;
+		req->gpd->tx_gpd.next_gpd_ptr_h = cpu_to_le32((u64)(next->gpd_dma_addr) >> 32);
+		req->gpd->tx_gpd.next_gpd_ptr_l = cpu_to_le32(next->gpd_dma_addr);
+	}
+
+	INIT_WORK(&txq->tx_done_work, mtk_cldma_tx_done_work);
+
+	drv_ops->cldma_stop_queue(drv_info, DIR_TX, txq->txqno);
+	txq->tx_started = false;
+	drv_ops->cldma_setup_start_addr(drv_info, DIR_TX, txq->txqno,
+					txq->req_pool[0].gpd_dma_addr);
+	drv_ops->cldma_unmask_intr(drv_info, DIR_TX, txq->txqno, QUEUE_ERROR);
+	drv_ops->cldma_unmask_intr(drv_info, DIR_TX, txq->txqno, QUEUE_XFER_DONE);
+
+	drv_info->txq[txq->txqno] = txq;
+	return txq;
+
+err_free_req:
+	for (i = 0; i < txq->nr_gpds; i++) {
+		req = txq->req_pool + i;
+		if (!req->gpd)
+			break;
+		if (req->bd_dsc_pool) {
+			for (j = 0; j < txq->nr_bds; j++) {
+				bd_dsc = req->bd_dsc_pool + j;
+				if (!bd_dsc->bd)
+					break;
+				dma_pool_free(drv_info->bd_dma_pool, bd_dsc->bd,
+					      bd_dsc->bd_dma_addr);
+			}
+			devm_kfree(mdev->dev, req->bd_dsc_pool);
+		}
+		dma_pool_free(drv_info->gpd_dma_pool, req->gpd, req->gpd_dma_addr);
+	}
+	devm_kfree(mdev->dev, txq->req_pool);
+err_free_txq:
+	devm_kfree(mdev->dev, txq);
+	return NULL;
+}
+
+static void mtk_cldma_txq_free(struct cldma_drv_info *drv_info, u32 txqno)
+{
+	struct cldma_drv_ops *drv_ops;
+	struct mtk_md_dev *mdev;
+	struct bd_dsc *bd_dsc;
+	struct tx_req *req;
+	struct txq *txq;
+	struct trb *trb;
+	int irq_id;
+	int i, j;
+
+	mdev = drv_info->mdev;
+	drv_ops = drv_info->drv_ops;
+
+	txq = drv_info->txq[txqno];
+	drv_info->txq[txqno] = NULL;
+	/* stop HW tx transaction */
+	drv_ops->cldma_stop_queue(drv_info, DIR_TX, txqno);
+	txq->tx_started = false;
+
+	irq_id = mtk_pci_get_virq_id(mdev, drv_info->pci_ext_irq_id);
+	synchronize_irq(irq_id);
+	/* flush on-going work */
+	flush_work(&txq->tx_done_work);
+	drv_ops->cldma_mask_intr(drv_info, DIR_TX, txqno, QUEUE_XFER_DONE);
+	drv_ops->cldma_mask_intr(drv_info, DIR_TX, txqno, QUEUE_ERROR);
+
+	/* free tx req resource */
+	for (i = 0; i < txq->nr_gpds; i++) {
+		req = txq->req_pool + txq->free_idx;
+		if (req->skb && req->data_len) {
+			if (!txq->nr_bds)
+				dma_unmap_single(mdev->dev, req->data_dma_addr,
+						 req->data_len, DMA_TO_DEVICE);
+			for (j = 0; j < txq->nr_bds; j++) {
+				bd_dsc = req->bd_dsc_pool + j;
+				if (!bd_dsc->data_dma_addr)
+					continue;
+				dma_unmap_single(mdev->dev, bd_dsc->data_dma_addr,
+						 bd_dsc->data_len, DMA_TO_DEVICE);
+			}
+			trb = (struct trb *)req->skb->cb;
+			trb->status = -EPIPE;
+			trb->trb_complete(req->skb);
+		}
+		for (j = 0; j < txq->nr_bds; j++) {
+			bd_dsc = req->bd_dsc_pool + j;
+			dma_pool_free(drv_info->bd_dma_pool, bd_dsc->bd,
+				      bd_dsc->bd_dma_addr);
+		}
+		if (req->bd_dsc_pool)
+			devm_kfree(mdev->dev, req->bd_dsc_pool);
+		dma_pool_free(drv_info->gpd_dma_pool, req->gpd, req->gpd_dma_addr);
+		txq->free_idx = (txq->free_idx + 1) % txq->nr_gpds;
+	}
+
+	devm_kfree(mdev->dev, txq->req_pool);
+	devm_kfree(mdev->dev, txq);
+}
+
+static int mtk_cldma_alloc_rx_bd(struct cldma_drv_info *drv_info, struct rx_req *req,
+				 int nr_bds)
+{
+	struct bd_dsc *bd_dsc, *last_bd_dsc = NULL;
+	struct sk_buff *tail = NULL;
+	struct mtk_md_dev *mdev;
+	u32 left_size;
+	int ret;
+	int i;
+
+	mdev = drv_info->mdev;
+	left_size = req->mtu;
+
+	req->bd_dsc_pool = devm_kcalloc(mdev->dev, nr_bds,
+					sizeof(*bd_dsc), GFP_KERNEL);
+	if (!req->bd_dsc_pool)
+		return -ENOMEM;
+	for (i = 0; i < nr_bds; i++) {
+		bd_dsc = req->bd_dsc_pool + i;
+		bd_dsc->bd = dma_pool_zalloc(drv_info->bd_dma_pool, GFP_KERNEL,
+					     &bd_dsc->bd_dma_addr);
+		if (!bd_dsc->bd)
+			return -ENOMEM;
+
+		bd_dsc->skb = __dev_alloc_skb(req->frag_size, GFP_KERNEL);
+		if (!bd_dsc->skb)
+			return -ENOMEM;
+		bd_dsc->skb->next = NULL;
+		bd_dsc->data_dma_addr =
+			dma_map_single(mdev->dev, bd_dsc->skb->data,
+				       req->frag_size, DMA_FROM_DEVICE);
+		ret = dma_mapping_error(mdev->dev, bd_dsc->data_dma_addr);
+		if (unlikely(ret))
+			return -ENOMEM;
+
+		bd_dsc->bd->rx_bd.data_buff_ptr_h =
+			cpu_to_le32((u64)(bd_dsc->data_dma_addr) >> 32);
+		bd_dsc->bd->rx_bd.data_buff_ptr_l =
+			cpu_to_le32(bd_dsc->data_dma_addr);
+		bd_dsc->bd->rx_bd.data_allow_len =
+			cpu_to_le16(min(req->frag_size, left_size));
+		left_size -= min(req->frag_size, left_size);
+		if (!last_bd_dsc) {
+			req->gpd->rx_gpd.data_buff_ptr_h =
+				cpu_to_le32((u64)(bd_dsc->bd_dma_addr) >> 32);
+			req->gpd->rx_gpd.data_buff_ptr_l =
+				cpu_to_le32(bd_dsc->bd_dma_addr);
+		} else {
+			last_bd_dsc->bd->rx_bd.next_bd_ptr_h =
+				cpu_to_le32((u64)(bd_dsc->bd_dma_addr) >> 32);
+			last_bd_dsc->bd->rx_bd.next_bd_ptr_l =
+				cpu_to_le32(bd_dsc->bd_dma_addr);
+		}
+		last_bd_dsc = bd_dsc;
+		if (tail) {
+			tail->next = bd_dsc->skb;
+			tail = bd_dsc->skb;
+			continue;
+		}
+		if (!req->skb) {
+			req->skb = bd_dsc->skb;
+		} else {
+			skb_shinfo(req->skb)->frag_list = bd_dsc->skb;
+			tail = bd_dsc->skb;
+		}
+	}
+	last_bd_dsc->bd->rx_bd.bd_flags |= CLDMA_BD_FLAG_EOL;
+	return 0;
+}
+
+static void mtk_cldma_rxq_alloc_cancel(struct cldma_drv_info *drv_info, struct rx_req *req,
+				       int nr_bds)
+{
+	struct mtk_md_dev *mdev;
+	struct bd_dsc *bd_dsc;
+	int i;
+
+	mdev = drv_info->mdev;
+
+	if (nr_bds) {
+		if (req->skb)
+			skb_shinfo(req->skb)->frag_list = NULL;
+		if (req->bd_dsc_pool) {
+			for (i = 0; i < nr_bds; i++) {
+				bd_dsc = req->bd_dsc_pool + i;
+				if (!bd_dsc->bd)
+					break;
+				if (bd_dsc->skb) {
+					if (!dma_mapping_error(mdev->dev, bd_dsc->data_dma_addr))
+						dma_unmap_single(mdev->dev, bd_dsc->data_dma_addr,
+								 req->frag_size, DMA_FROM_DEVICE);
+					bd_dsc->data_dma_addr = 0;
+					bd_dsc->skb->next = NULL;
+					dev_kfree_skb_any(bd_dsc->skb);
+				}
+				dma_pool_free(drv_info->bd_dma_pool, bd_dsc->bd,
+					      bd_dsc->bd_dma_addr);
+			}
+			devm_kfree(mdev->dev, req->bd_dsc_pool);
+		}
+	} else {
+		if (req->skb) {
+			if (!dma_mapping_error(mdev->dev, req->data_dma_addr))
+				dma_unmap_single(mdev->dev, req->data_dma_addr,
+						 req->mtu, DMA_FROM_DEVICE);
+			req->data_dma_addr = 0;
+			dev_kfree_skb_any(req->skb);
+		}
+	}
+	dma_pool_free(drv_info->gpd_dma_pool, req->gpd, req->gpd_dma_addr);
+}
+
+static struct rxq *mtk_cldma_rxq_alloc(struct cldma_drv_info *drv_info, struct sk_buff *skb)
+{
+	struct trb_open_priv *trb_open_priv = (struct trb_open_priv *)skb->data;
+	struct trb *trb = (struct trb *)skb->cb;
+	struct cldma_drv_ops *drv_ops;
+	struct mtk_ctrl_trans *trans;
+	struct mtk_ctrl_blk *ctrl_blk;
+	struct mtk_md_dev *mdev;
+	struct rx_req *next;
+	struct rx_req *req;
+	u16 rx_frag_size;
+	struct rxq *rxq;
+	int ret;
+	int i;
+
+	mdev = drv_info->mdev;
+	ctrl_blk = mdev->ctrl_blk;
+	trans = ctrl_blk->ctrl_hw_priv;
+	drv_ops = drv_info->drv_ops;
+
+	rxq = devm_kzalloc(mdev->dev, sizeof(*rxq), GFP_KERNEL);
+	if (!rxq)
+		return NULL;
+
+	rxq->que = radix_tree_lookup(&trans->queue_tbl, trb->channel_id & 0xFFFF);
+	if (rxq->que->rx_nr_gpds < MIN_GPD_NUM) {
+		dev_err((mdev)->dev,
+			"Failed to alloc cldma%d rxq%d due to gpd number < 2\n",
+			drv_info->hw_id, rxq->que->rxqno);
+		goto err_free_rxq;
+	}
+	rxq->drv_info = drv_info;
+	rxq->rxqno = rxq->que->rxqno;
+	rxq->nr_gpds = rxq->que->rx_nr_gpds;
+	rxq->arg = trb->priv;
+	rxq->rx_done = trb_open_priv->rx_done;
+	atomic_set(&rxq->need_exit, 0);
+	rx_frag_size = rxq->que->rx_frag_size;
+	if (rxq->que->rx_mtu > rx_frag_size && rx_frag_size)
+		rxq->nr_bds = (rxq->que->rx_mtu + rx_frag_size - 1) / rx_frag_size;
+
+	rxq->req_pool = devm_kcalloc(mdev->dev, rxq->nr_gpds, sizeof(*req), GFP_KERNEL);
+	if (!rxq->req_pool)
+		goto err_free_rxq;
+
+	/* setup rx request */
+	for (i = 0; i < rxq->nr_gpds; i++) {
+		req = rxq->req_pool + i;
+		req->mtu = rxq->que->rx_mtu;
+		req->frag_size = rx_frag_size;
+		req->gpd = dma_pool_zalloc(drv_info->gpd_dma_pool, GFP_KERNEL, &req->gpd_dma_addr);
+		if (!req->gpd)
+			goto err_free_req;
+		if (rxq->nr_bds) {
+			ret = mtk_cldma_alloc_rx_bd(drv_info, req, rxq->nr_bds);
+			if (ret)
+				goto err_free_req;
+			req->gpd->rx_gpd.gpd_flags |= CLDMA_GPD_FLAG_BDP;
+		} else {
+			req->skb = __dev_alloc_skb(req->mtu, GFP_KERNEL);
+			if (!req->skb)
+				goto err_free_req;
+			req->data_dma_addr = dma_map_single(mdev->dev, req->skb->data,
+							    req->mtu, DMA_FROM_DEVICE);
+			ret = dma_mapping_error(mdev->dev, req->data_dma_addr);
+			if (unlikely(ret))
+				goto err_free_req;
+		}
+	}
+
+	for (i = 0; i < rxq->nr_gpds; i++) {
+		req = rxq->req_pool + i;
+		next = rxq->req_pool + ((i + 1) % rxq->nr_gpds);
+		req->gpd->rx_gpd.gpd_flags |= CLDMA_GPD_FLAG_IOC;
+		req->gpd->rx_gpd.data_allow_len = cpu_to_le16(req->mtu);
+		req->gpd->rx_gpd.next_gpd_ptr_h = cpu_to_le32((u64)(next->gpd_dma_addr) >> 32);
+		req->gpd->rx_gpd.next_gpd_ptr_l = cpu_to_le32(next->gpd_dma_addr);
+		if (!rxq->nr_bds) {
+			req->gpd->rx_gpd.data_buff_ptr_h =
+				cpu_to_le32((u64)(req->data_dma_addr) >> 32);
+			req->gpd->rx_gpd.data_buff_ptr_l = cpu_to_le32(req->data_dma_addr);
+		}
+		if (i != rxq->nr_gpds - 1)
+			req->gpd->rx_gpd.gpd_flags |= CLDMA_GPD_FLAG_HWO;
+	}
+
+	INIT_WORK(&rxq->rx_done_work, mtk_cldma_rx_done_work);
+
+	drv_info->rxq[rxq->rxqno] = rxq;
+	drv_ops->cldma_stop_queue(drv_info, DIR_RX, rxq->rxqno);
+	drv_ops->cldma_setup_start_addr(drv_info, DIR_RX,
+					rxq->rxqno, rxq->req_pool[0].gpd_dma_addr);
+	drv_ops->cldma_start_queue(drv_info, DIR_RX, rxq->rxqno);
+	drv_ops->cldma_unmask_intr(drv_info, DIR_RX, rxq->rxqno, QUEUE_ERROR);
+	drv_ops->cldma_unmask_intr(drv_info, DIR_RX, rxq->rxqno, QUEUE_XFER_DONE);
+
+	return rxq;
+
+err_free_req:
+	for (i = 0; i < rxq->nr_gpds; i++) {
+		req = rxq->req_pool + i;
+		if (!req->gpd)
+			break;
+		mtk_cldma_rxq_alloc_cancel(drv_info, req, rxq->nr_bds);
+	}
+
+	devm_kfree(mdev->dev, rxq->req_pool);
+err_free_rxq:
+	devm_kfree(mdev->dev, rxq);
+	return NULL;
+}
+
+static void mtk_cldma_rxq_free(struct cldma_drv_info *drv_info, u32 rxqno)
+{
+	struct cldma_drv_ops *drv_ops;
+	struct mtk_md_dev *mdev;
+	struct bd_dsc *bd_dsc;
+	struct rx_req *req;
+	struct rxq *rxq;
+	int irq_id;
+	int i, j;
+
+	mdev = drv_info->mdev;
+	drv_ops = drv_info->drv_ops;
+
+	rxq = drv_info->rxq[rxqno];
+	drv_info->rxq[rxqno] = NULL;
+
+	/* stop HW rx transaction */
+	atomic_set(&rxq->need_exit, 1);
+	drv_ops->cldma_stop_queue(drv_info, DIR_RX, rxqno);
+
+	irq_id = mtk_pci_get_virq_id(mdev, drv_info->pci_ext_irq_id);
+	synchronize_irq(irq_id);
+	/* flush on-going work */
+	flush_work(&rxq->rx_done_work);
+	/* mask L2 RX interrupt again to avoid race condition causing use-after-free issue */
+	drv_ops->cldma_mask_intr(drv_info, DIR_RX, rxqno, QUEUE_XFER_DONE);
+	drv_ops->cldma_mask_intr(drv_info, DIR_RX, rxqno, QUEUE_ERROR);
+
+	/* free rx req resource */
+	for (i = 0; i < rxq->nr_gpds; i++) {
+		req = rxq->req_pool + rxq->free_idx;
+		if (!(req->gpd->rx_gpd.gpd_flags & CLDMA_GPD_FLAG_HWO) &&
+		    le16_to_cpu(req->gpd->rx_gpd.data_recv_len)) {
+			mtk_cldma_rx_skb_adjust(mdev, rxq, req);
+			rxq->rx_done(req->skb, rxq->arg, true);
+			req->skb = NULL;
+		}
+		if (req->skb) {
+			if (rxq->nr_bds) {
+				skb_shinfo(req->skb)->frag_list = NULL;
+			} else {
+				if (req->data_dma_addr)
+					dma_unmap_single(mdev->dev, req->data_dma_addr,
+							 req->mtu, DMA_FROM_DEVICE);
+				dev_kfree_skb_any(req->skb);
+			}
+		}
+		for (j = 0; j < rxq->nr_bds; j++) {
+			bd_dsc = req->bd_dsc_pool + j;
+			if (bd_dsc->skb) {
+				if (bd_dsc->data_dma_addr)
+					dma_unmap_single(mdev->dev, bd_dsc->data_dma_addr,
+							 req->frag_size, DMA_FROM_DEVICE);
+				bd_dsc->skb->next = NULL;
+				dev_kfree_skb_any(bd_dsc->skb);
+			}
+			dma_pool_free(drv_info->bd_dma_pool,
+				      bd_dsc->bd, bd_dsc->bd_dma_addr);
+		}
+		if (req->bd_dsc_pool)
+			devm_kfree(mdev->dev, req->bd_dsc_pool);
+		dma_pool_free(drv_info->gpd_dma_pool, req->gpd, req->gpd_dma_addr);
+		rxq->free_idx = (rxq->free_idx + 1) % rxq->nr_gpds;
+	}
+
+	devm_kfree(mdev->dev, rxq->req_pool);
+	devm_kfree(mdev->dev, rxq);
+}
+
+static int mtk_cldma_start_xfer(struct cldma_drv_info *drv_info, u32 qno)
+{
+	struct cldma_drv_ops *drv_ops;
+	struct txq *txq;
+	u32 val;
+
+	txq = drv_info->txq[qno];
+	drv_ops = drv_info->drv_ops;
+
+	val = drv_ops->cldma_get_tx_start_addr(drv_info, qno);
+	if (unlikely(val == LINK_ERROR_VAL))
+		return -EIO;
+
+	if (unlikely(!val)) {
+		drv_ops->cldma_drv_init(drv_info);
+		txq = drv_info->txq[qno];
+		drv_ops->cldma_setup_start_addr(drv_info, DIR_TX, qno,
+						txq->req_pool[txq->free_idx].gpd_dma_addr);
+		drv_ops->cldma_start_queue(drv_info, DIR_TX, qno);
+		txq->tx_started = true;
+	} else if (unlikely(!txq->tx_started)) {
+		drv_ops->cldma_start_queue(drv_info, DIR_TX, qno);
+		txq->tx_started = true;
+	} else {
+		drv_ops->cldma_resume_queue(drv_info, DIR_TX, qno);
+	}
+
+	return 0;
+}
+
+int mtk_cldma_init(struct mtk_ctrl_trans *trans)
+{
+	struct cldma_dev *cd;
+
+	cd = devm_kzalloc(trans->mdev->dev, sizeof(*cd), GFP_KERNEL);
+	if (!cd)
+		return -ENOMEM;
+
+	cd->trans = trans;
+	trans->dev = cd;
+
+	return 0;
+}
+
+void mtk_cldma_exit(struct mtk_ctrl_trans *trans)
+{
+	if (!trans->dev)
+		return;
+
+	devm_kfree(trans->mdev->dev, trans->dev);
+	trans->dev = NULL;
+}
+
+static int mtk_cldma_open(struct cldma_dev *cd, struct sk_buff *skb)
+{
+	struct trb_open_priv *trb_open_priv = (struct trb_open_priv *)skb->data;
+	struct trb *trb = (struct trb *)skb->cb;
+	struct cldma_drv_info *drv_info;
+	struct queue_info *que;
+	struct txq *txq;
+	struct rxq *rxq;
+	int ret = 0;
+
+	que = radix_tree_lookup(&cd->trans->queue_tbl, trb->channel_id & 0xFFFF);
+	drv_info = cd->cldma_drv_info[que->hif_id];
+	if (!drv_info) {
+		ret = -EIO;
+		goto out;
+	}
+
+	if (que->tx_mtu == 0 || que->rx_mtu == 0) {
+		dev_err((cd->trans->mdev)->dev,
+			"Failed to enable cldma%d txq%d rxq%d due to wrong mtu\n",
+			drv_info->hw_id, que->txqno, que->rxqno);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	trb_open_priv->tx_mtu = que->tx_mtu;
+	trb_open_priv->rx_mtu = que->rx_mtu;
+	trb_open_priv->tx_frag_size = que->tx_frag_size;
+	trb_open_priv->rx_frag_size = que->rx_frag_size;
+
+	if (drv_info->txq[que->txqno] || drv_info->rxq[que->rxqno]) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	txq = mtk_cldma_txq_alloc(drv_info, skb);
+	if (!txq) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	rxq = mtk_cldma_rxq_alloc(drv_info, skb);
+	if (!rxq) {
+		ret = -ENOMEM;
+		mtk_cldma_txq_free(drv_info, txq->txqno);
+		goto out;
+	}
+
+out:
+	trb->status = ret;
+	trb->trb_complete(skb);
+
+	return ret;
+}
+
+static int mtk_cldma_tx(struct cldma_dev *cd, struct sk_buff *skb)
+{
+	struct trb *trb = (struct trb *)skb->cb;
+	struct cldma_drv_info *drv_info;
+	struct mtk_md_dev *mdev;
+	struct queue_info *que;
+	struct txq *txq;
+	int ret;
+
+	que = radix_tree_lookup(&cd->trans->queue_tbl, trb->channel_id & 0xFFFF);
+	drv_info = cd->cldma_drv_info[que->hif_id];
+	if (unlikely(!drv_info))
+		return -EPIPE;
+	txq = drv_info->txq[que->txqno];
+	if (unlikely(!txq) || txq->is_stopping)
+		return -EPIPE;
+
+	mdev = drv_info->mdev;
+
+	ret = mtk_cldma_start_xfer(drv_info, que->txqno);
+	if (unlikely(ret))
+		dev_err((mdev)->dev, "Failed to trigger cldma tx\n");
+
+	return ret;
+}
+
+static int mtk_cldma_close(struct cldma_dev *cd, struct sk_buff *skb)
+{
+	struct trb *trb = (struct trb *)skb->cb;
+	struct cldma_drv_info *drv_info;
+	struct queue_info *que;
+
+	que = radix_tree_lookup(&cd->trans->queue_tbl, trb->channel_id & 0xFFFF);
+	drv_info = cd->cldma_drv_info[que->hif_id];
+	if (unlikely(!drv_info))
+		return -EPIPE;
+
+	if (drv_info->txq[que->txqno])
+		mtk_cldma_txq_free(drv_info, que->txqno);
+	if (drv_info->rxq[que->rxqno])
+		mtk_cldma_rxq_free(drv_info, que->rxqno);
+
+	trb->status = 0;
+	trb->trb_complete(skb);
+
+	return 0;
+}
+
+static int mtk_cldma_txbuf_set(struct cldma_drv_info *drv_info, struct sk_buff *skb,
+			       struct tx_req *req, int nr_bds)
+{
+	struct sk_buff *curr_skb, *next_skb;
+	struct mtk_md_dev *mdev;
+	struct bd_dsc *bd_dsc;
+	int ret;
+	int i;
+
+	mdev = drv_info->mdev;
+
+	if (nr_bds) {
+		bd_dsc = req->bd_dsc_pool;
+		curr_skb = skb;
+		for (i = 0; i < nr_bds && curr_skb; i++) {
+			bd_dsc = req->bd_dsc_pool + i;
+			if (req->bd_dsc_pool == bd_dsc) {
+				bd_dsc->data_len = skb->len - skb->data_len;
+				next_skb = skb_shinfo(skb)->frag_list;
+			} else {
+				bd_dsc->data_len = curr_skb->len;
+				next_skb = curr_skb->next;
+			}
+			bd_dsc->data_dma_addr = dma_map_single(mdev->dev, curr_skb->data,
+							       bd_dsc->data_len, DMA_TO_DEVICE);
+			ret = dma_mapping_error(mdev->dev, bd_dsc->data_dma_addr);
+			if (unlikely(ret))
+				goto err_unmap_buffer;
+
+			bd_dsc->bd->tx_bd.data_buff_ptr_h =
+				cpu_to_le32((u64)(bd_dsc->data_dma_addr) >> 32);
+			bd_dsc->bd->tx_bd.data_buff_ptr_l = cpu_to_le32(bd_dsc->data_dma_addr);
+			bd_dsc->bd->tx_bd.data_buffer_len = cpu_to_le16(bd_dsc->data_len);
+			curr_skb = next_skb;
+		}
+		bd_dsc->bd->tx_bd.bd_flags = CLDMA_BD_FLAG_EOL;
+	} else {
+		req->data_dma_addr = dma_map_single(mdev->dev, skb->data,
+						    skb->len, DMA_TO_DEVICE);
+		ret = dma_mapping_error(mdev->dev, req->data_dma_addr);
+		if (unlikely(ret)) {
+			req->data_dma_addr = 0;
+			goto err_exit;
+		}
+
+		req->gpd->tx_gpd.data_buff_ptr_h = cpu_to_le32((u64)(req->data_dma_addr) >> 32);
+		req->gpd->tx_gpd.data_buff_ptr_l = cpu_to_le32(req->data_dma_addr);
+	}
+
+	return 0;
+
+err_unmap_buffer:
+	for (i = 0; i < nr_bds; i++) {
+		bd_dsc = req->bd_dsc_pool + i;
+		if (dma_mapping_error(mdev->dev, bd_dsc->data_dma_addr)) {
+			bd_dsc->data_dma_addr = 0;
+			break;
+		}
+		dma_unmap_single(mdev->dev, bd_dsc->data_dma_addr,
+				 bd_dsc->data_len, DMA_TO_DEVICE);
+		bd_dsc->data_dma_addr = 0;
+	}
+err_exit:
+	dev_err((mdev)->dev, "Failed to map dma! error:%d\n", ret);
+	return -EAGAIN;
+}
+
+int mtk_cldma_submit_tx(void *dev, struct sk_buff *skb)
+{
+	struct trb *trb = (struct trb *)skb->cb;
+	struct cldma_drv_info *drv_info;
+	struct cldma_dev *cd = dev;
+	struct queue_info *que;
+	struct tx_req *req;
+	struct txq *txq;
+	int ret;
+
+	que = radix_tree_lookup(&cd->trans->queue_tbl, trb->channel_id & 0xFFFF);
+	drv_info = cd->cldma_drv_info[que->hif_id];
+	if (unlikely(!drv_info))
+		return -EINVAL;
+
+	txq = drv_info->txq[que->txqno];
+	if (unlikely(!txq))
+		return -EINVAL;
+
+	if (!atomic_read(&txq->req_budget))
+		return -EAGAIN;
+
+	req = txq->req_pool + txq->wr_idx;
+	req->gpd->tx_gpd.debug_id = 0x01;
+	ret = mtk_cldma_txbuf_set(drv_info, skb, req, txq->nr_bds);
+	if (ret)
+		return ret;
+
+	req->gpd->tx_gpd.data_buff_len = cpu_to_le16(skb->len);
+
+	req->data_len = skb->len;
+	req->skb = skb;
+	req->data_vm_addr = skb->data;
+
+	wmb(); /* ensure req and data msg set done before HWO setup */
+
+	req->gpd->tx_gpd.gpd_flags |= CLDMA_GPD_FLAG_HWO;
+
+	wmb(); /* ensure HWO setup done before index update */
+
+	txq->wr_idx = (txq->wr_idx + 1) % txq->nr_gpds;
+	atomic_dec(&txq->req_budget);
+
+	return 0;
+}
+
+int mtk_cldma_get_tx_budget(void *dev, enum mtk_hif_id hif_id, u32 qno)
+{
+	struct cldma_drv_info *drv_info;
+	struct cldma_dev *cd = dev;
+	struct txq *txq;
+
+	if (unlikely(hif_id >= NR_CLDMA || qno >= HW_QUE_NUM || !cd))
+		return -EINVAL;
+
+	drv_info = cd->cldma_drv_info[hif_id];
+	if (!drv_info)
+		return -EINVAL;
+	txq = drv_info->txq[qno];
+	if (!txq)
+		return -EINVAL;
+	return atomic_read(&txq->req_budget);
+}
+
+static int (*trb_act_tbl[TRB_CMD_MAX])(struct cldma_dev *cd, struct sk_buff *skb) = {
+	[TRB_CMD_ENABLE] = mtk_cldma_open,
+	[TRB_CMD_TX] = mtk_cldma_tx,
+	[TRB_CMD_DISABLE] = mtk_cldma_close,
+};
+
+int mtk_cldma_trb_process(void *dev, struct sk_buff *skb)
+{
+	struct cldma_dev *cd;
+	struct trb *trb;
+
+	if (!dev || !skb)
+		return -EINVAL;
+
+	cd = (struct cldma_dev *)dev;
+	trb = (struct trb *)skb->cb;
+
+	if (!(trb->cmd > TRB_CMD_MIN && trb->cmd < TRB_CMD_STOP))
+		return -EINVAL;
+
+	return trb_act_tbl[trb->cmd](cd, skb);
+}
+
+int mtk_cldma_check_ch_cfg(void *dev, struct queue_info *que)
+{
+	struct cldma_drv_info *drv_info;
+	struct cldma_dev *cd = dev;
+	struct mtk_md_dev *mdev;
+	struct txq *txq;
+	struct rxq *rxq;
+
+	mdev = cd->trans->mdev;
+	drv_info = cd->cldma_drv_info[que->hif_id];
+
+	if (!drv_info) {
+		dev_err((mdev)->dev, "CLDMA%d has not been initialized\n",
+			mtk_cldma_hw_id_tbl[que->hif_id]);
+		return -EINVAL;
+	}
+
+	txq = drv_info->txq[que->txqno];
+	rxq = drv_info->rxq[que->rxqno];
+	if (!txq || !rxq) {
+		dev_err((mdev)->dev,
+			"CLDMA%d txq%d rxq%d has not been enabled\n",
+			mtk_cldma_hw_id_tbl[que->hif_id], que->txqno, que->rxqno);
+		return -EINVAL;
+	}
+
+	if (que->tx_mtu != txq->que->tx_mtu || que->rx_mtu != rxq->que->rx_mtu) {
+		dev_err((mdev)->dev,
+			"Channel:%08x tx_mtu:%08x rx_mtu:%08x do not match ch cfg\n",
+			que->tx_chl, que->tx_mtu, que->rx_mtu);
+		return -EINVAL;
+	}
+
+	return 0;
+}
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_cldma.h b/drivers/net/wwan/t9xx/pcie/mtk_cldma.h
new file mode 100644
index 000000000000..74ce4f2f0b30
--- /dev/null
+++ b/drivers/net/wwan/t9xx/pcie/mtk_cldma.h
@@ -0,0 +1,170 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2022, MediaTek Inc.
+ */
+
+#ifndef __MTK_CLDMA_H__
+#define __MTK_CLDMA_H__
+
+#include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
+#include <linux/interrupt.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/types.h>
+
+#include "mtk_ctrl_plane.h"
+#include "mtk_trans_ctrl.h"
+
+struct mtk_fsm_param;
+
+#define TXQ(N)					(N)
+#define RXQ(N)					(N)
+
+#define CLDMA_GPD_FLAG_HWO			BIT(0)
+#define CLDMA_GPD_FLAG_BDP			BIT(1)
+#define CLDMA_GPD_FLAG_BPS			BIT(2)
+#define CLDMA_GPD_FLAG_IOC			BIT(7)
+#define CLDMA_BD_FLAG_EOL			BIT(0)
+
+union gpd {
+	struct {
+		u8 gpd_flags;
+		u8 non_used1;
+		__le16 data_allow_len;
+		__le32 next_gpd_ptr_h;
+		__le32 next_gpd_ptr_l;
+		__le32 data_buff_ptr_h;
+		__le32 data_buff_ptr_l;
+		__le16 data_recv_len;
+		u8 non_used2;
+		u8 debug_id;
+	} rx_gpd;
+
+	struct {
+		u8 gpd_flags;
+		u8 non_used1;
+		u8 non_used2;
+		u8 debug_id;
+		__le32 next_gpd_ptr_h;
+		__le32 next_gpd_ptr_l;
+		__le32 data_buff_ptr_h;
+		__le32 data_buff_ptr_l;
+		__le16 data_buff_len;
+		__le16 non_used3;
+	} tx_gpd;
+} __packed;
+
+union bd {
+	struct {
+		u8 bd_flags;
+		u8 non_used1;
+		__le16 data_allow_len;
+		__le32 next_bd_ptr_h;
+		__le32 next_bd_ptr_l;
+		__le32 data_buff_ptr_h;
+		__le32 data_buff_ptr_l;
+		__le16 data_recv_len;
+		__le16 non_used2;
+	} rx_bd;
+
+	struct {
+		u8 bd_flags;
+		u8 non_used1;
+		__le16 non_used2;
+		__le32 next_bd_ptr_h;
+		__le32 next_bd_ptr_l;
+		__le32 data_buff_ptr_h;
+		__le32 data_buff_ptr_l;
+		__le16 data_buffer_len;
+		u8 extension_len;
+		u8 non_used3;
+	} tx_bd;
+} __packed;
+
+struct bd_dsc {
+	union bd *bd;
+	struct sk_buff *skb;
+	dma_addr_t bd_dma_addr;
+	dma_addr_t data_dma_addr;
+	size_t data_len;
+};
+
+struct rx_req {
+	union gpd *gpd;
+	u32 mtu;
+	struct sk_buff *skb;
+	size_t data_len;
+	dma_addr_t gpd_dma_addr;
+	dma_addr_t data_dma_addr;
+	u32 frag_size;
+	struct bd_dsc *bd_dsc_pool;
+};
+
+struct rxq {
+	struct cldma_drv_info *drv_info;
+	u32 rxqno;
+	struct queue_info *que;
+	struct work_struct rx_done_work;
+	struct rx_req *req_pool;
+	u32 nr_gpds;
+	u32 free_idx;
+	unsigned short rx_done_cnt;
+	void *arg;
+	int (*rx_done)(struct sk_buff *skb, void *priv, bool force_recv);
+	u32 nr_bds;
+	atomic_t need_exit;
+};
+
+struct tx_req {
+	union gpd *gpd;
+	u32 mtu;
+	void *data_vm_addr;
+	size_t data_len;
+	dma_addr_t data_dma_addr;
+	dma_addr_t gpd_dma_addr;
+	struct sk_buff *skb;
+	int (*trb_complete)(struct sk_buff *skb);
+	u32 frag_size;
+	struct bd_dsc *bd_dsc_pool;
+};
+
+struct txq {
+	struct cldma_drv_info *drv_info;
+	u32 txqno;
+	struct queue_info *que;
+	struct work_struct tx_done_work;
+	struct tx_req *req_pool;
+	u32 nr_gpds;
+	atomic_t req_budget;
+	u32 wr_idx;
+	u32 free_idx;
+	bool tx_started;
+	bool is_stopping;
+	unsigned short tx_done_cnt;
+	u32 nr_bds;
+};
+
+struct cldma_dev {
+	struct cldma_drv_info *cldma_drv_info[NR_CLDMA];
+	struct mtk_ctrl_trans *trans;
+};
+
+struct cldma_drv_info_desc {
+	u32 hw_ver;
+	struct cldma_drv_ops *drv_ops;
+	struct cldma_hw_regs *hw_regs;
+};
+
+int mtk_cldma_init(struct mtk_ctrl_trans *trans);
+void mtk_cldma_exit(struct mtk_ctrl_trans *trans);
+int mtk_cldma_submit_tx(void *dev, struct sk_buff *skb);
+int mtk_cldma_get_tx_budget(void *dev, enum mtk_hif_id hif_id, u32 qno);
+int mtk_cldma_trb_process(void *dev, struct sk_buff *skb);
+void mtk_cldma_fsm_state_listener(struct mtk_fsm_param *param, struct mtk_ctrl_trans *trans);
+int mtk_cldma_check_ch_cfg(void *dev, struct queue_info *que);
+
+#define drv_ops_name(NAME) cldma_drv_ops_##NAME
+#define cldma_regs_name(NAME) mtk_cldma_regs_##NAME
+
+#endif
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv.c b/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv.c
new file mode 100644
index 000000000000..b5d3894dd62c
--- /dev/null
+++ b/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv.c
@@ -0,0 +1,371 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023, MediaTek Inc.
+ */
+
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/kdev_t.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/netdevice.h>
+#include <linux/sched.h>
+#include <linux/skbuff.h>
+#include <linux/slab.h>
+#include <linux/timer.h>
+#include <linux/wait.h>
+#include <linux/workqueue.h>
+
+#include "mtk_cldma_drv.h"
+#include "mtk_dev.h"
+#include "mtk_pci.h"
+#include "mtk_pci_reg.h"
+
+#define WAIT_QUEUE_STOP		(70)
+
+void mtk_cldma_drv_init(struct cldma_drv_info *drv_info)
+{
+	struct cldma_hw_regs *hw_regs;
+	struct mtk_md_dev *mdev;
+	int base;
+	u32 val;
+
+	mdev = drv_info->mdev;
+	base = drv_info->base_addr;
+	hw_regs = drv_info->hw_regs;
+
+	/* switch CLDMA to 64-bit GPD mode */
+	val = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_ul_cfg);
+	val = (val & (~(0x7 << 5))) | ((0x4) << 5);
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_ul_cfg, val);
+
+	val = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_so_cfg);
+	val = (val & (~(0x7 << 10))) | ((0x4) << 10) | (1 << 2);
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_so_cfg, val);
+
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_rx_work_to_reg_mask_set, ALLQ);
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_ip_busy_to_pcie_mask_set,
+			ALLQ << 16);
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_ip_busy_to_pcie_mask_clr,
+			ALLQ << 24);
+
+	/* enable interrupt to PCIe */
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_int_mask, 0);
+
+	/* disable illegal memory check */
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_ul_dummy_0, 1);
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_so_dummy_0, 1);
+}
+
+void mtk_cldma_setup_start_addr(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+				u32 qno, dma_addr_t addr)
+{
+	struct cldma_hw_regs *hw_regs;
+	unsigned int addr_l;
+	unsigned int addr_h;
+	int base;
+
+	hw_regs = drv_info->hw_regs;
+	base = drv_info->base_addr;
+
+	if (dir == DIR_TX) {
+		addr_l = base + hw_regs->reg_cldma_ul_start_addrl_0 + qno * HW_QUEUE_NUM;
+		addr_h = base + hw_regs->reg_cldma_ul_start_addrh_0 + qno * HW_QUEUE_NUM;
+	} else {
+		addr_l = base + hw_regs->reg_cldma_so_start_addrl_0 + qno * HW_QUEUE_NUM;
+		addr_h = base + hw_regs->reg_cldma_so_start_addrh_0 + qno * HW_QUEUE_NUM;
+	}
+
+	mtk_pci_write32(drv_info->mdev, addr_l, lower_32_bits(addr));
+	mtk_pci_write32(drv_info->mdev, addr_h, upper_32_bits(addr));
+}
+
+void mtk_cldma_mask_intr(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+			 u32 qno, enum mtk_intr_type type)
+{
+	struct cldma_hw_regs *hw_regs;
+	int base;
+	u32 addr;
+	u32 val;
+
+	hw_regs = drv_info->hw_regs;
+	base = drv_info->base_addr;
+
+	if (dir == DIR_TX)
+		addr = base + hw_regs->reg_cldma_l2timsr0;
+	else
+		addr = base + hw_regs->reg_cldma_l2rimsr0;
+
+	if (qno == ALLQ)
+		val = qno << type;
+	else
+		val = BIT(qno) << type;
+
+	mtk_pci_write32(drv_info->mdev, addr, val);
+}
+
+void mtk_cldma_unmask_intr(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+			   u32 qno, enum mtk_intr_type type)
+{
+	struct cldma_hw_regs *hw_regs;
+	int base;
+	u32 addr;
+	u32 val;
+
+	hw_regs = drv_info->hw_regs;
+	base = drv_info->base_addr;
+
+	if (dir == DIR_TX)
+		addr = base + hw_regs->reg_cldma_l2timcr0;
+	else
+		addr = base + hw_regs->reg_cldma_l2rimcr0;
+
+	if (qno == ALLQ)
+		val = qno << type;
+	else
+		val = BIT(qno) << type;
+
+	mtk_pci_write32(drv_info->mdev, addr, val);
+}
+
+void mtk_cldma_clr_intr_status(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+			       u32 qno, enum mtk_intr_type type)
+{
+	struct cldma_hw_regs *hw_regs;
+	struct mtk_md_dev *mdev;
+	int base;
+	u32 addr;
+	u32 val;
+
+	hw_regs = drv_info->hw_regs;
+	base = drv_info->base_addr;
+	mdev = drv_info->mdev;
+
+	if (type == QUEUE_ERROR) {
+		if (dir == DIR_TX) {
+			val = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_l3tisar0);
+			mtk_pci_write32(mdev, base + hw_regs->reg_cldma_l3tisar0, val);
+			val = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_l3tisar1);
+			mtk_pci_write32(mdev, base + hw_regs->reg_cldma_l3tisar1, val);
+			val = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_l3tisar2);
+			mtk_pci_write32(mdev, base + hw_regs->reg_cldma_l3tisar2, val);
+		} else {
+			val = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_l3risar0);
+			mtk_pci_write32(mdev, base + hw_regs->reg_cldma_l3risar0, val);
+			val = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_l3risar1);
+			mtk_pci_write32(mdev, base + hw_regs->reg_cldma_l3risar1, val);
+		}
+	}
+
+	if (dir == DIR_TX)
+		addr = base + hw_regs->reg_cldma_l2tisar0;
+	else
+		addr = base + hw_regs->reg_cldma_l2risar0;
+
+	if (qno == ALLQ)
+		val = qno << type;
+	else
+		val = BIT(qno) << type;
+
+	mtk_pci_write32(mdev, addr, val);
+	val = mtk_pci_read32(mdev, addr);
+}
+
+u32 mtk_cldma_check_intr_status(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+				u32 qno, enum mtk_intr_type type)
+{
+	struct cldma_hw_regs *hw_regs;
+	u32 addr, val, sta;
+	int base;
+
+	hw_regs = drv_info->hw_regs;
+	base = drv_info->base_addr;
+
+	if (dir == DIR_TX)
+		addr = base + hw_regs->reg_cldma_l2tisar0;
+	else
+		addr = base + hw_regs->reg_cldma_l2risar0;
+
+	val = mtk_pci_read32(drv_info->mdev, addr);
+	if (val == LINK_ERROR_VAL)
+		sta = val;
+	else if (qno == ALLQ)
+		sta = (val >> type) & 0xFF;
+	else
+		sta = (val >> type) & BIT(qno);
+
+	return sta;
+}
+
+void mtk_cldma_start_queue(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno)
+{
+	struct cldma_hw_regs *hw_regs;
+	u32 val = BIT(qno);
+	int base;
+	u32 addr;
+
+	hw_regs = drv_info->hw_regs;
+	base = drv_info->base_addr;
+
+	if (dir == DIR_TX)
+		addr = base + hw_regs->reg_cldma_ul_start_cmd;
+	else
+		addr = base + hw_regs->reg_cldma_so_start_cmd;
+
+	mtk_pci_write32(drv_info->mdev, addr, val);
+}
+
+void mtk_cldma_resume_queue(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno)
+{
+	struct cldma_hw_regs *hw_regs;
+	u32 val = BIT(qno);
+	int base;
+	u32 addr;
+
+	hw_regs = drv_info->hw_regs;
+	base = drv_info->base_addr;
+
+	if (dir == DIR_TX)
+		addr = base + hw_regs->reg_cldma_ul_resume_cmd;
+	else
+		addr = base + hw_regs->reg_cldma_so_resume_cmd;
+
+	mtk_pci_write32(drv_info->mdev, addr, val);
+}
+
+u32 mtk_cldma_queue_status(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno)
+{
+	struct cldma_hw_regs *hw_regs;
+	int base;
+	u32 addr;
+	u32 val;
+
+	hw_regs = drv_info->hw_regs;
+	base = drv_info->base_addr;
+
+	if (dir == DIR_TX)
+		addr = base + hw_regs->reg_cldma_ul_status;
+	else
+		addr = base + hw_regs->reg_cldma_so_status;
+
+	val = mtk_pci_read32(drv_info->mdev, addr);
+
+	if (qno == ALLQ || val == LINK_ERROR_VAL)
+		return val;
+
+	return val & BIT(qno);
+}
+
+u32 mtk_cldma_stop_queue(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno)
+{
+	u32 val = (qno == ALLQ) ? qno : BIT(qno);
+	struct cldma_hw_regs *hw_regs;
+	unsigned int active;
+	int cnt = 0;
+	int base;
+	u32 addr;
+
+	hw_regs = drv_info->hw_regs;
+	base = drv_info->base_addr;
+
+	if (dir == DIR_TX)
+		addr = base + hw_regs->reg_cldma_ul_stop_cmd;
+	else
+		addr = base + hw_regs->reg_cldma_so_stop_cmd;
+
+	mtk_pci_write32(drv_info->mdev, addr, val);
+
+	do {
+		active = drv_info->drv_ops->cldma_queue_status(drv_info, dir, qno);
+		if (active == LINK_ERROR_VAL || !active)
+			break;
+		usleep_range(WAIT_QUEUE_STOP, 2 * WAIT_QUEUE_STOP);
+	} while (++cnt < 10);
+
+	return active;
+}
+
+void mtk_cldma_clear_ip_busy(struct cldma_drv_info *drv_info)
+{
+	mtk_pci_write32(drv_info->mdev, drv_info->base_addr +
+			drv_info->hw_regs->reg_cldma_ip_busy, 0x01);
+}
+
+void mtk_cldma_get_intr_status(struct cldma_drv_info *drv_info, u32 *tx_sta, u32 *rx_sta)
+{
+	struct cldma_hw_regs *hw_regs;
+	struct mtk_md_dev *mdev;
+	u32 tx_mask, rx_mask;
+	int base;
+
+	mdev = drv_info->mdev;
+	base = drv_info->base_addr;
+	hw_regs = drv_info->hw_regs;
+
+	*tx_sta = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_l2tisar0);
+	tx_mask = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_l2timr0);
+	*rx_sta = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_l2risar0);
+	rx_mask = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_l2rimr0);
+
+	*tx_sta = (*tx_sta) & (~tx_mask);
+	*rx_sta = (*rx_sta) & (~rx_mask);
+
+	if (*tx_sta) {
+		/* TX XFER_DONE and QUEUE_ERROR mask */
+		mtk_pci_write32(mdev, base + hw_regs->reg_cldma_l2timsr0, *tx_sta);
+		/* TX XFER_DONE clear */
+		mtk_pci_write32(mdev, base + hw_regs->reg_cldma_l2tisar0,
+				(*tx_sta) & (0xFF << QUEUE_XFER_DONE));
+	}
+
+	if (*rx_sta) {
+		/* RX XFER_DONE and QUEUE_ERROR mask */
+		mtk_pci_write32(mdev, base + hw_regs->reg_cldma_l2rimsr0, *rx_sta);
+		/* RX XFER_DONE clear */
+		mtk_pci_write32(mdev, base + hw_regs->reg_cldma_l2risar0,
+				(*rx_sta) & (0xFF << QUEUE_XFER_DONE));
+	}
+}
+
+u32 mtk_cldma_get_tx_start_addr(struct cldma_drv_info *drv_info, u32 qno)
+{
+	u32 addr, val;
+
+	addr = drv_info->base_addr + drv_info->hw_regs->reg_cldma_ul_start_addrl_0 +
+	       qno * HW_QUEUE_NUM;
+	val = mtk_pci_read32(drv_info->mdev, addr);
+
+	return val;
+}
+
+u64 mtk_cldma_get_rx_curr_addr(struct cldma_drv_info *drv_info, u32 qno)
+{
+	struct cldma_hw_regs *hw_regs;
+	u32 curr_addr_h, curr_addr_l;
+	struct mtk_md_dev *mdev;
+	u64 curr_addr;
+	int base;
+	u64 addr;
+
+	hw_regs = drv_info->hw_regs;
+	base = drv_info->base_addr;
+	mdev = drv_info->mdev;
+
+	addr = base + hw_regs->reg_cldma_so_current_addrh_0 +
+	       (u64)qno * HW_QUEUE_NUM;
+	curr_addr_h = mtk_pci_read32(mdev, addr);
+	addr = base + hw_regs->reg_cldma_so_current_addrl_0 +
+	       (u64)qno * HW_QUEUE_NUM;
+	curr_addr_l = mtk_pci_read32(mdev, addr);
+	curr_addr = ((u64)curr_addr_h << 32) | curr_addr_l;
+	if (curr_addr_h == LINK_ERROR_VAL && curr_addr_l == LINK_ERROR_VAL)
+		curr_addr = 0;
+	return curr_addr;
+}
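
Reviewer aid, not part of the patch: the start-address and current-address helpers above compute a per-queue register offset and move a 64-bit DMA address through two 32-bit registers. The userspace sketch below mirrors that arithmetic under stated assumptions. All `sketch_*` names are hypothetical. The 8-byte stride is inferred from the patch multiplying `qno` by `HW_QUEUE_NUM` (8) and from `REG_CLDMA_UL_START_ADDRL_0` (0x0004) and `REG_CLDMA_UL_CURRENT_ADDRL_0` (0x0044) lying 0x40 bytes apart, and is not confirmed by a datasheet.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LINK_ERROR_VAL	0xFFFFFFFFu	/* all-ones read, as in LINK_ERROR_VAL */
#define SKETCH_QUEUE_STRIDE	8u		/* assumed: one low/high pair per queue */

struct sketch_addr_regs {
	uint32_t lo;
	uint32_t hi;
};

/* Offset of a per-queue register: base + register 0 + queue stride. */
static uint32_t sketch_queue_reg(uint32_t base, uint32_t reg0, uint32_t qno)
{
	return base + reg0 + qno * SKETCH_QUEUE_STRIDE;
}

/* Split a 64-bit address into the two values written to the low/high registers. */
static struct sketch_addr_regs sketch_split_addr(uint64_t addr)
{
	struct sketch_addr_regs r = {
		.lo = (uint32_t)addr,
		.hi = (uint32_t)(addr >> 32),
	};

	return r;
}

/* Rebuild the address from two reads; all-ones in both halves is reported as 0. */
static uint64_t sketch_join_addr(uint32_t hi, uint32_t lo)
{
	if (hi == SKETCH_LINK_ERROR_VAL && lo == SKETCH_LINK_ERROR_VAL)
		return 0;

	return ((uint64_t)hi << 32) | lo;
}

/* Small self-check that round-trips one address through the helpers. */
void sketch_addr_demo(void)
{
	struct sketch_addr_regs r = sketch_split_addr(UINT64_C(0x0000000123456789));

	assert(sketch_join_addr(r.hi, r.lo) == UINT64_C(0x0000000123456789));
	assert(sketch_queue_reg(0x1021C000u, 0x0004u, 0) == 0x1021C004u);
}
```

Returning 0 for the all-ones case follows what mtk_cldma_get_rx_curr_addr() does in the patch. A caller would still need to tell that apart from a genuinely null address.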
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv.h b/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv.h
new file mode 100644
index 000000000000..8763c23abf54
--- /dev/null
+++ b/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv.h
@@ -0,0 +1,177 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2023, MediaTek Inc.
+ */
+
+#ifndef __MTK_CLDMA_DRV_H__
+#define __MTK_CLDMA_DRV_H__
+
+#define HW_QUEUE_NUM		(8)
+#define ALLQ			(0xFF)
+#define LINK_ERROR_VAL		(0xFFFFFFFF)
+#define CLDMA0_HW_ID		(0)
+#define CLDMA1_HW_ID		(1)
+#define CLDMA4_HW_ID		(4)
+
+struct cldma_hw_regs {
+	u8 cldma_rx_skb_pool_max_size;
+	u8 cldma_rx_skb_reload_threshold;
+	u8 tq_err_int_offset;
+	u8 tq_active_start_err_int_offset;
+	u8 rq_err_int_offset;
+	u8 rq_active_start_err_int_offset;
+	u16 reg_cldma_so_cfg;
+	u16 reg_cldma_so_start_addrl_0;
+	u16 reg_cldma_so_start_addrh_0;
+	u16 reg_cldma_so_current_addrl_0;
+	u16 reg_cldma_so_current_addrh_0;
+	u16 reg_cldma_so_status;
+	u16 reg_cldma_debug_id_en;
+	u16 reg_cldma_so_last_update_addrl_0;
+	u16 reg_cldma_so_last_update_addrh_0;
+	u16 reg_cldma_l2rimr0;
+	u16 reg_cldma_l2rimr1;
+	u16 reg_cldma_l2rimcr0;
+	u16 reg_cldma_l2rimcr1;
+	u16 reg_cldma_l2rimsr0;
+	u16 reg_cldma_l2rimsr1;
+	u16 reg_cldma_int_mask;
+	u16 reg_cldma4_int_mask;
+	u16 reg_cldma_slp_mem_ctl;
+	u16 reg_cldma_busy_mask;
+	u16 reg_cldma_ip_busy_to_pcie_mask;
+	u16 reg_cldma_ip_busy_to_pcie_mask_set;
+	u16 reg_cldma_ip_busy_to_pcie_mask_clr;
+	u16 reg_cldma_ip_busy_to_ap_mask;
+	u16 reg_cldma_ip_busy_to_ap_mask_set;
+	u16 reg_cldma_ip_busy_to_ap_mask_clr;
+	u16 reg_cldma_ip_busy_to_md_mask_set;
+	u16 reg_cldma_rx_work_to_reg_mask_set;
+	u16 reg_infra_rst4_set;
+	u16 reg_infra_rst4_clr;
+	u16 reg_infra_rst2_set;
+	u16 reg_infra_rst2_clr;
+	u16 reg_infra_rst0_set;
+	u16 reg_infra_rst0_clr;
+	u32 tq_err_int_bitmask;
+	u32 tq_active_start_err_int_bitmask;
+	u32 rq_err_int_bitmask;
+	u32 cldma0_base_addr;
+	u32 cldma1_base_addr;
+	u32 cldma4_base_addr;
+	u32 rq_active_start_err_int_bitmask;
+	u32 reg_cldma_ul_start_addrl_0;
+	u32 reg_cldma_ul_start_addrh_0;
+	u32 reg_cldma_ul_current_addrl_0;
+	u32 reg_cldma_ul_current_addrh_0;
+	u32 reg_cldma_ul_status;
+	u32 reg_cldma_ul_start_cmd;
+	u32 reg_cldma_ul_resume_cmd;
+	u32 reg_cldma_ul_stop_cmd;
+	u32 reg_cldma_ul_error;
+	u32 reg_cldma_ul_cfg;
+	u32 reg_cldma_ul_dummy_0;
+	u32 reg_cldma_so_error;
+	u32 reg_cldma_so_start_cmd;
+	u32 reg_cldma_so_resume_cmd;
+	u32 reg_cldma_so_stop_cmd;
+	u32 reg_cldma_so_dummy_0;
+	u32 reg_cldma_l2tisar0;
+	u32 reg_cldma_l2tisar1;
+	u32 reg_cldma_l2timr0;
+	u32 reg_cldma_l2timr1;
+	u32 reg_cldma_l2timcr0;
+	u32 reg_cldma_l2timcr1;
+	u32 reg_cldma_l2timsr0;
+	u32 reg_cldma_l2timsr1;
+	u32 reg_cldma_l2risar0;
+	u32 reg_cldma_l2risar1;
+	u32 reg_cldma_l3tisar0;
+	u32 reg_cldma_l3tisar1;
+	u32 reg_cldma_l3tisar2;
+	u32 reg_cldma_l3risar0;
+	u32 reg_cldma_l3risar1;
+	u32 reg_cldma_ip_busy;
+};
+
+enum mtk_ip_busy_src {
+	IP_BUSY_TXDONE = 0,
+	IP_BUSY_TXEMPTY = 8,
+	IP_BUSY_TXACTIVE = 16,
+	IP_BUSY_RXDONE = 24
+};
+
+enum mtk_intr_type {
+	QUEUE_XFER_DONE = 0,
+	QUEUE_EMPTY = 8,
+	QUEUE_ERROR = 16,
+	QUEUE_ACTIVE_START = 24,
+	INVALID_TYPE
+};
+
+enum mtk_tx_rx {
+	DIR_TX,
+	DIR_RX,
+	DIR_MAX
+};
+
+struct cldma_drv_info {
+	int hif_id;
+	int hw_id;
+	int base_addr;
+	int pci_ext_irq_id;
+	struct mtk_md_dev *mdev;
+	struct cldma_dev *cd;
+	struct txq *txq[HW_QUEUE_NUM];
+	struct rxq *rxq[HW_QUEUE_NUM];
+	struct dma_pool *gpd_dma_pool;
+	struct dma_pool *bd_dma_pool;
+	struct workqueue_struct *wq;
+	struct cldma_hw_regs *hw_regs;
+	struct cldma_drv_ops *drv_ops;
+};
+
+struct cldma_drv_ops {
+	void (*cldma_drv_init)(struct cldma_drv_info *drv_info);
+	void (*cldma_drv_reset)(struct cldma_drv_info *drv_info);
+	void (*cldma_setup_start_addr)(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+				       u32 qno, dma_addr_t addr);
+	void (*cldma_mask_intr)(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+				u32 qno, enum mtk_intr_type type);
+	void (*cldma_unmask_intr)(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+				  u32 qno, enum mtk_intr_type type);
+	void (*cldma_clr_intr_status)(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+				      u32 qno, enum mtk_intr_type type);
+	u32 (*cldma_check_intr_status)(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+				       u32 qno, enum mtk_intr_type type);
+	void (*cldma_start_queue)(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno);
+	void (*cldma_resume_queue)(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno);
+	u32 (*cldma_queue_status)(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno);
+	u32 (*cldma_stop_queue)(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno);
+	void (*cldma_clear_ip_busy)(struct cldma_drv_info *drv_info);
+	void (*cldma_get_intr_status)(struct cldma_drv_info *drv_info, u32 *tx_sta, u32 *rx_sta);
+	u32 (*cldma_get_tx_start_addr)(struct cldma_drv_info *drv_info, u32 qno);
+	u64 (*cldma_get_rx_curr_addr)(struct cldma_drv_info *drv_info, u32 qno);
+};
+
+void mtk_cldma_drv_init(struct cldma_drv_info *drv_info);
+void mtk_cldma_setup_start_addr(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+				u32 qno, dma_addr_t addr);
+void mtk_cldma_mask_intr(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+			 u32 qno, enum mtk_intr_type type);
+void mtk_cldma_unmask_intr(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+			   u32 qno, enum mtk_intr_type type);
+void mtk_cldma_clr_intr_status(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+			       u32 qno, enum mtk_intr_type type);
+u32 mtk_cldma_check_intr_status(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir,
+				u32 qno, enum mtk_intr_type type);
+void mtk_cldma_start_queue(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno);
+void mtk_cldma_resume_queue(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno);
+u32 mtk_cldma_queue_status(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno);
+u32 mtk_cldma_stop_queue(struct cldma_drv_info *drv_info, enum mtk_tx_rx dir, u32 qno);
+void mtk_cldma_clear_ip_busy(struct cldma_drv_info *drv_info);
+void mtk_cldma_get_intr_status(struct cldma_drv_info *drv_info, u32 *tx_sta, u32 *rx_sta);
+u32 mtk_cldma_get_tx_start_addr(struct cldma_drv_info *drv_info, u32 qno);
+u64 mtk_cldma_get_rx_curr_addr(struct cldma_drv_info *drv_info, u32 qno);
+
+#endif
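
Reviewer aid, not part of the patch: the mask, unmask, clear and check helpers all treat the L2 interrupt registers as four 8-bit fields, one bit per queue, with the field selected by the `enum mtk_intr_type` value used as a shift count. The standalone sketch below reproduces that bit arithmetic in userspace C. The `sketch_*` and `SKETCH_*` names are hypothetical stand-ins for the patch's identifiers, and the all-ones link-error shortcut in mtk_cldma_check_intr_status() is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Field offsets, mirroring enum mtk_intr_type in the patch. */
enum sketch_intr_type {
	SKETCH_XFER_DONE = 0,
	SKETCH_EMPTY = 8,
	SKETCH_ERROR = 16,
	SKETCH_ACTIVE_START = 24,
};

#define SKETCH_ALLQ	0xFFu	/* selects every queue, as ALLQ does */

/* Value to write to a mask-set, mask-clear or status register. */
static uint32_t sketch_intr_bits(uint32_t qno, enum sketch_intr_type type)
{
	uint32_t field = (qno == SKETCH_ALLQ) ? SKETCH_ALLQ : (UINT32_C(1) << qno);

	return field << type;
}

/* Pick one queue's bit, or the whole 8-bit field, out of a status word. */
static uint32_t sketch_intr_status(uint32_t reg, uint32_t qno,
				   enum sketch_intr_type type)
{
	uint32_t field = (reg >> type) & 0xFFu;

	return (qno == SKETCH_ALLQ) ? field : (field & (UINT32_C(1) << qno));
}

/* Small self-check: queue 3 in the error field is bit 19. */
void sketch_intr_demo(void)
{
	assert(sketch_intr_bits(3, SKETCH_ERROR) == 0x00080000u);
	assert(sketch_intr_status(0x00080000u, 3, SKETCH_ERROR) == 0x8u);
}
```

Queue numbers other than `SKETCH_ALLQ` are assumed to be in the range 0 to 7, matching `HW_QUEUE_NUM`. Like the patch, the sketch does not validate them.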
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv_m9xx.c b/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv_m9xx.c
new file mode 100644
index 000000000000..d9145d146a5c
--- /dev/null
+++ b/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv_m9xx.c
@@ -0,0 +1,183 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023, MediaTek Inc.
+ */
+
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/kdev_t.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/netdevice.h>
+#include <linux/sched.h>
+#include <linux/skbuff.h>
+#include <linux/slab.h>
+#include <linux/timer.h>
+#include <linux/wait.h>
+#include <linux/workqueue.h>
+
+#include "mtk_cldma.h"
+#include "mtk_cldma_drv.h"
+#include "mtk_cldma_drv_m9xx.h"
+#include "mtk_dev.h"
+#include "mtk_pci.h"
+#include "mtk_pci_reg.h"
+#include "mtk_trans_ctrl.h"
+
+struct cldma_hw_regs mtk_cldma_regs_m9xx = {
+	.cldma0_base_addr = CLDMA0_BASE_ADDR,
+	.cldma1_base_addr = CLDMA1_BASE_ADDR,
+	.cldma4_base_addr = CLDMA4_BASE_ADDR,
+	.cldma_rx_skb_pool_max_size = CLDMA_RX_SKB_POOL_MAX_SIZE,
+	.cldma_rx_skb_reload_threshold = CLDMA_RX_SKB_RELOAD_THRESHOLD,
+	.tq_err_int_offset = TQ_ERR_INT_OFFSET,
+	.tq_err_int_bitmask = TQ_ERR_INT_BITMASK,
+	.tq_active_start_err_int_offset = TQ_ACTIVE_START_ERR_INT_OFFSET,
+	.tq_active_start_err_int_bitmask = TQ_ACTIVE_START_ERR_INT_BITMASK,
+	.rq_err_int_offset = RQ_ERR_INT_OFFSET,
+	.rq_err_int_bitmask = RQ_ERR_INT_BITMASK,
+	.rq_active_start_err_int_offset = RQ_ACTIVE_START_ERR_INT_OFFSET,
+	.rq_active_start_err_int_bitmask = RQ_ACTIVE_START_ERR_INT_BITMASK,
+	.reg_cldma_ul_start_addrl_0 = REG_CLDMA_UL_START_ADDRL_0,
+	.reg_cldma_ul_start_addrh_0 = REG_CLDMA_UL_START_ADDRH_0,
+	.reg_cldma_ul_current_addrl_0 = REG_CLDMA_UL_CURRENT_ADDRL_0,
+	.reg_cldma_ul_current_addrh_0 = REG_CLDMA_UL_CURRENT_ADDRH_0,
+	.reg_cldma_ul_status = REG_CLDMA_UL_STATUS,
+	.reg_cldma_ul_start_cmd = REG_CLDMA_UL_START_CMD,
+	.reg_cldma_ul_resume_cmd = REG_CLDMA_UL_RESUME_CMD,
+	.reg_cldma_ul_stop_cmd = REG_CLDMA_UL_STOP_CMD,
+	.reg_cldma_ul_error = REG_CLDMA_UL_ERROR,
+	.reg_cldma_ul_cfg = REG_CLDMA_UL_CFG,
+	.reg_cldma_ul_dummy_0 = REG_CLDMA_UL_DUMMY_0,
+	.reg_cldma_so_error = REG_CLDMA_SO_ERROR,
+	.reg_cldma_so_start_cmd = REG_CLDMA_SO_START_CMD,
+	.reg_cldma_so_resume_cmd = REG_CLDMA_SO_RESUME_CMD,
+	.reg_cldma_so_stop_cmd = REG_CLDMA_SO_STOP_CMD,
+	.reg_cldma_so_dummy_0 = REG_CLDMA_SO_DUMMY_0,
+	.reg_cldma_so_cfg = REG_CLDMA_SO_CFG,
+	.reg_cldma_so_start_addrl_0 = REG_CLDMA_SO_START_ADDRL_0,
+	.reg_cldma_so_start_addrh_0 = REG_CLDMA_SO_START_ADDRH_0,
+	.reg_cldma_so_current_addrl_0 = REG_CLDMA_SO_CUR_ADDRL_0,
+	.reg_cldma_so_current_addrh_0 = REG_CLDMA_SO_CUR_ADDRH_0,
+	.reg_cldma_so_status = REG_CLDMA_SO_STATUS,
+	.reg_cldma_debug_id_en = REG_CLDMA_DEBUG_ID_EN,
+	.reg_cldma_so_last_update_addrl_0 = REG_CLDMA_SO_LAST_UPDATE_ADDRL_0,
+	.reg_cldma_so_last_update_addrh_0 = REG_CLDMA_SO_LAST_UPDATE_ADDRH_0,
+	.reg_cldma_l2tisar0 = REG_CLDMA_L2TISAR0,
+	.reg_cldma_l2tisar1 = REG_CLDMA_L2TISAR1,
+	.reg_cldma_l2timr0 = REG_CLDMA_L2TIMR0,
+	.reg_cldma_l2timr1 = REG_CLDMA_L2TIMR1,
+	.reg_cldma_l2timcr0 = REG_CLDMA_L2TIMCR0,
+	.reg_cldma_l2timcr1 = REG_CLDMA_L2TIMCR1,
+	.reg_cldma_l2timsr0 = REG_CLDMA_L2TIMSR0,
+	.reg_cldma_l2timsr1 = REG_CLDMA_L2TIMSR1,
+	.reg_cldma_l3tisar0 = REG_CLDMA_L3TISAR0,
+	.reg_cldma_l3tisar1 = REG_CLDMA_L3TISAR1,
+	.reg_cldma_l3tisar2 = REG_CLDMA_L3TISAR2,
+	.reg_cldma_l2risar0 = REG_CLDMA_L2RISAR0,
+	.reg_cldma_l2risar1 = REG_CLDMA_L2RISAR1,
+	.reg_cldma_l2rimr0 = REG_CLDMA_L2RIMR0,
+	.reg_cldma_l2rimr1 = REG_CLDMA_L2RIMR1,
+	.reg_cldma_l2rimcr0 = REG_CLDMA_L2RIMCR0,
+	.reg_cldma_l2rimcr1 = REG_CLDMA_L2RIMCR1,
+	.reg_cldma_l2rimsr0 = REG_CLDMA_L2RIMSR0,
+	.reg_cldma_l2rimsr1 = REG_CLDMA_L2RIMSR1,
+	.reg_cldma_l3risar0 = REG_CLDMA_L3RISAR0,
+	.reg_cldma_l3risar1 = REG_CLDMA_L3RISAR1,
+	.reg_cldma_ip_busy = REG_CLDMA_IP_BUSY,
+	.reg_cldma_int_mask = REG_CLDMA_INT_EAP_USIP_MASK,
+	.reg_cldma4_int_mask = REG_CLDMA_INT_WF_MASK,
+	.reg_cldma_ip_busy_to_pcie_mask = REG_CLDMA_IP_BUSY_TO_PCIE_MASK,
+	.reg_cldma_ip_busy_to_pcie_mask_set = REG_CLDMA_IP_BUSY_TO_PCIE_MASK_SET,
+	.reg_cldma_ip_busy_to_pcie_mask_clr = REG_CLDMA_IP_BUSY_TO_PCIE_MASK_CLR,
+	.reg_cldma_ip_busy_to_ap_mask = REG_CLDMA_IP_BUSY_TO_AP_MASK,
+	.reg_cldma_ip_busy_to_ap_mask_set = REG_CLDMA_IP_BUSY_TO_AP_MASK_SET,
+	.reg_cldma_ip_busy_to_ap_mask_clr = REG_CLDMA_IP_BUSY_TO_AP_MASK_CLR,
+	.reg_cldma_ip_busy_to_md_mask_set = REG_CLDMA_IP_BUSY_TO_MD_MASK_SET,
+	.reg_cldma_rx_work_to_reg_mask_set = REG_CLDMA_RX_WORK_TO_REG_MASK_SET,
+	.reg_infra_rst0_set = REG_INFRA_RST0_SET,
+	.reg_infra_rst0_clr = REG_INFRA_RST0_CLR,
+};
+
+static void mtk_cldma_drv_init_m9xx(struct cldma_drv_info *drv_info)
+{
+	struct cldma_hw_regs *hw_regs;
+	struct mtk_md_dev *mdev;
+	int base;
+	u32 val;
+
+	mdev = drv_info->mdev;
+	base = drv_info->base_addr;
+	hw_regs = drv_info->hw_regs;
+
+	/* switch CLDMA to 64-bit GPD mode */
+	val = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_ul_cfg);
+
+	val = (val & (~(0x7 << 5))) | ((0x4) << 5);
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_ul_cfg, val);
+
+	val = mtk_pci_read32(mdev, base + hw_regs->reg_cldma_so_cfg);
+	val = (val & (~(0x7 << 10))) | ((0x4) << 10) | (1 << 2);
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_so_cfg, val);
+
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_rx_work_to_reg_mask_set, ALLQ);
+
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_ip_busy_to_pcie_mask_set,
+			ALLQ << 16);
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_ip_busy_to_pcie_mask_clr,
+			ALLQ << 24);
+
+	/* enable interrupt to PCIe */
+	if (drv_info->hw_id == CLDMA4_HW_ID)
+		mtk_pci_write32(mdev, base + hw_regs->reg_cldma4_int_mask, 0);
+	else
+		mtk_pci_write32(mdev, base + hw_regs->reg_cldma_int_mask, 0);
+
+	/* disable illegal memory check */
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_ul_dummy_0, 1);
+	mtk_pci_write32(mdev, base + hw_regs->reg_cldma_so_dummy_0, 1);
+}
+
+static void mtk_cldma_drv_reset_m9xx(struct cldma_drv_info *drv_info)
+{
+	struct cldma_hw_regs *hw_regs;
+	struct mtk_md_dev *mdev;
+	u32 val;
+
+	mdev = drv_info->mdev;
+	hw_regs = drv_info->hw_regs;
+
+	val = mtk_pci_read32(mdev, REG_DEV_INFRA_BASE + hw_regs->reg_infra_rst0_set);
+
+	val |= 1 << (REG_CLDMA0_RST_SET_BIT + drv_info->hw_id);
+	mtk_pci_write32(mdev, REG_DEV_INFRA_BASE + hw_regs->reg_infra_rst0_set, val);
+	udelay(1);
+	val = mtk_pci_read32(mdev, REG_DEV_INFRA_BASE + hw_regs->reg_infra_rst0_clr);
+	val |= 1 << (REG_CLDMA0_RST_CLR_BIT + drv_info->hw_id);
+	mtk_pci_write32(mdev, REG_DEV_INFRA_BASE + hw_regs->reg_infra_rst0_clr, val);
+}
+
+struct cldma_drv_ops cldma_drv_ops_m9xx = {
+	.cldma_drv_init = mtk_cldma_drv_init_m9xx,
+	.cldma_drv_reset = mtk_cldma_drv_reset_m9xx,
+	.cldma_setup_start_addr = mtk_cldma_setup_start_addr,
+	.cldma_mask_intr = mtk_cldma_mask_intr,
+	.cldma_unmask_intr = mtk_cldma_unmask_intr,
+	.cldma_clr_intr_status = mtk_cldma_clr_intr_status,
+	.cldma_check_intr_status = mtk_cldma_check_intr_status,
+	.cldma_start_queue = mtk_cldma_start_queue,
+	.cldma_resume_queue = mtk_cldma_resume_queue,
+	.cldma_queue_status = mtk_cldma_queue_status,
+	.cldma_stop_queue = mtk_cldma_stop_queue,
+	.cldma_clear_ip_busy = mtk_cldma_clear_ip_busy,
+	.cldma_get_intr_status = mtk_cldma_get_intr_status,
+	.cldma_get_tx_start_addr = mtk_cldma_get_tx_start_addr,
+	.cldma_get_rx_curr_addr = mtk_cldma_get_rx_curr_addr,
+};
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv_m9xx.h b/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv_m9xx.h
new file mode 100644
index 000000000000..2c63c43ff065
--- /dev/null
+++ b/drivers/net/wwan/t9xx/pcie/mtk_cldma_drv_m9xx.h
@@ -0,0 +1,103 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2023, MediaTek Inc.
+ */
+
+#ifndef __MTK_CLDMA_DRV_M9XX_H__
+#define __MTK_CLDMA_DRV_M9XX_H__
+
+#define CLDMA0_BASE_ADDR				(0x1021C000)
+#define CLDMA1_BASE_ADDR				(0x1021E000)
+#define CLDMA4_BASE_ADDR				(0x10224000)
+
+#define CLDMA_RX_SKB_POOL_MAX_SIZE			(64)
+#define CLDMA_RX_SKB_RELOAD_THRESHOLD			(16)
+
+/* L2TISAR0 */
+#define TQ_ERR_INT_OFFSET				(16)
+#define TQ_ERR_INT_BITMASK				(0x00FF0000)
+#define TQ_ACTIVE_START_ERR_INT_OFFSET			(24)
+#define TQ_ACTIVE_START_ERR_INT_BITMASK			(0xFF000000)
+
+/* L2RISAR0 */
+#define RQ_ERR_INT_OFFSET				(16)
+#define RQ_ERR_INT_BITMASK				(0x00FF0000)
+#define RQ_ACTIVE_START_ERR_INT_OFFSET			(24)
+#define RQ_ACTIVE_START_ERR_INT_BITMASK			(0xFF000000)
+
+/* CLDMA IN(Tx) */
+#define REG_CLDMA_UL_START_ADDRL_0			(0x0004)
+#define REG_CLDMA_UL_START_ADDRH_0			(0x0008)
+#define REG_CLDMA_UL_CURRENT_ADDRL_0			(0x0044)
+#define REG_CLDMA_UL_CURRENT_ADDRH_0			(0x0048)
+#define REG_CLDMA_UL_STATUS				(0x0084)
+#define REG_CLDMA_UL_START_CMD				(0x0088)
+#define REG_CLDMA_UL_RESUME_CMD				(0x008C)
+#define REG_CLDMA_UL_STOP_CMD				(0x0090)
+#define REG_CLDMA_UL_ERROR				(0x0094)
+#define REG_CLDMA_UL_CFG				(0x0098)
+#define REG_CLDMA_UL_DUMMY_0				(0x009C)
+
+/* CLDMA OUT(Rx) */
+#define REG_CLDMA_SO_ERROR				(0x0400 + 0x0100)
+#define REG_CLDMA_SO_START_CMD				(0x0400 + 0x01BC)
+#define REG_CLDMA_SO_RESUME_CMD				(0x0400 + 0x01C0)
+#define REG_CLDMA_SO_STOP_CMD				(0x0400 + 0x01C4)
+#define REG_CLDMA_SO_DUMMY_0				(0x0400 + 0x0108)
+#define REG_CLDMA_SO_CFG				(0x0400 + 0x0004)
+#define REG_CLDMA_SO_START_ADDRL_0			(0x0400 + 0x0078)
+#define REG_CLDMA_SO_START_ADDRH_0			(0x0400 + 0x007C)
+#define REG_CLDMA_SO_CUR_ADDRL_0			(0x0400 + 0x00B8)
+#define REG_CLDMA_SO_CUR_ADDRH_0			(0x0400 + 0x00BC)
+#define REG_CLDMA_SO_STATUS				(0x0400 + 0x00F8)
+#define REG_CLDMA_DEBUG_ID_EN				(0x0400 + 0x00FC)
+#define REG_CLDMA_SO_LAST_UPDATE_ADDRL_0		(0x0400 + 0x01C8)
+#define REG_CLDMA_SO_LAST_UPDATE_ADDRH_0		(0x0400 + 0x01CC)
+
+/* CLDMA MISC */
+#define REG_CLDMA_L2TISAR0				(0x0800 + 0x0010)
+#define REG_CLDMA_L2TISAR1				(0x0800 + 0x0014)
+#define REG_CLDMA_L2TIMR0				(0x0800 + 0x0018)
+#define REG_CLDMA_L2TIMR1				(0x0800 + 0x001C)
+#define REG_CLDMA_L2TIMCR0				(0x0800 + 0x0020)
+#define REG_CLDMA_L2TIMCR1				(0x0800 + 0x0024)
+#define REG_CLDMA_L2TIMSR0				(0x0800 + 0x0028)
+#define REG_CLDMA_L2TIMSR1				(0x0800 + 0x002C)
+#define REG_CLDMA_L3TISAR0				(0x0800 + 0x0030)
+#define REG_CLDMA_L3TISAR1				(0x0800 + 0x0034)
+#define REG_CLDMA_L2RISAR0				(0x0800 + 0x0050)
+#define REG_CLDMA_L2RISAR1				(0x0800 + 0x0054)
+#define REG_CLDMA_L3RISAR0				(0x0800 + 0x0070)
+#define REG_CLDMA_L3RISAR1				(0x0800 + 0x0074)
+#define REG_CLDMA_IP_BUSY				(0x0800 + 0x00B4)
+#define REG_CLDMA_L3TISAR2				(0x0800 + 0x00C0)
+
+#define REG_CLDMA_L2RIMR0				(0x0800 + 0x00E8)
+#define REG_CLDMA_L2RIMR1				(0x0800 + 0x00EC)
+#define REG_CLDMA_L2RIMCR0				(0x0800 + 0x00F0)
+#define REG_CLDMA_L2RIMCR1				(0x0800 + 0x00F4)
+#define REG_CLDMA_L2RIMSR0				(0x0800 + 0x00F8)
+#define REG_CLDMA_L2RIMSR1				(0x0800 + 0x00FC)
+
+#define REG_CLDMA_INT_EAP_USIP_MASK			(0x0800 + 0x011C)
+#define REG_CLDMA_INT_WF_MASK				(0x0800 + 0x0120)
+#define REG_CLDMA_RQ1_GPD_DONE_CNT			(0x0800 + 0x0174)
+#define REG_CLDMA_TQ1_GPD_DONE_CNT			(0x0800 + 0x0184)
+
+#define REG_CLDMA_IP_BUSY_TO_PCIE_MASK			(0x0800 + 0x0194)
+#define REG_CLDMA_IP_BUSY_TO_PCIE_MASK_SET		(0x0800 + 0x0198)
+#define REG_CLDMA_IP_BUSY_TO_PCIE_MASK_CLR		(0x0800 + 0x019C)
+
+#define REG_CLDMA_IP_BUSY_TO_AP_MASK			(0x0800 + 0x0200)
+#define REG_CLDMA_IP_BUSY_TO_AP_MASK_SET		(0x0800 + 0x0204)
+#define REG_CLDMA_IP_BUSY_TO_AP_MASK_CLR		(0x0800 + 0x0208)
+#define REG_CLDMA_IP_BUSY_TO_MD_MASK_SET		(0x0800 + 0x0210)
+#define REG_CLDMA_RX_WORK_TO_REG_MASK_SET		(0x0800 + 0x021C)
+
+/* CLDMA RESET */
+#define REG_INFRA_RST0_SET				(0x120)
+#define REG_INFRA_RST0_CLR				(0x124)
+#define REG_CLDMA0_RST_SET_BIT				(8)
+#define REG_CLDMA0_RST_CLR_BIT				(8)
+
+#endif
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_ctrl_cfg_m9xx.c b/drivers/net/wwan/t9xx/pcie/mtk_ctrl_cfg_m9xx.c
new file mode 100644
index 000000000000..c1bb787ee981
--- /dev/null
+++ b/drivers/net/wwan/t9xx/pcie/mtk_ctrl_cfg_m9xx.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022, MediaTek Inc.
+ */
+
+#include "mtk_cldma.h"
+#include "mtk_trans_ctrl.h"
+
+#define TRB_SRV_NUM	(1)
+
+static const int mtk_srv_cfg_m9xx[NR_CLDMA][HW_QUE_NUM] = {
+	{0},
+	{0},
+};
+
+static const struct queue_info mtk_queue_info_m9xx[] = {
+};
+
+struct mtk_ctrl_info mtk_ctrl_info_m9xx = {
+	.queue_info = (struct queue_info *)mtk_queue_info_m9xx,
+	.queue_info_num = ARRAY_SIZE(mtk_queue_info_m9xx),
+	.srv_cfg = (int **)mtk_srv_cfg_m9xx,
+	.trb_srv_num = TRB_SRV_NUM,
+};
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_pci.c b/drivers/net/wwan/t9xx/pcie/mtk_pci.c
index 90b33dd6effd..6efbcd0cba73 100644
--- a/drivers/net/wwan/t9xx/pcie/mtk_pci.c
+++ b/drivers/net/wwan/t9xx/pcie/mtk_pci.c
@@ -891,6 +891,28 @@ static void mtk_pci_free_irq(struct mtk_md_dev *mdev)
 	pci_free_irq_vectors(pdev);
 }
 
+static int mtk_pci_dev_init(struct mtk_md_dev *mdev)
+{
+	int ret;
+
+	ret = mtk_trans_ctrl_init(mdev);
+	if (ret) {
+		dev_err(mdev->dev, "Failed to initialize control plane: %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void mtk_pci_dev_exit(struct mtk_md_dev *mdev)
+{
+	mtk_trans_ctrl_exit(mdev);
+}
+
+static int mtk_pci_dev_start(struct mtk_md_dev *mdev)
+{
+	return 0;
+}
 static const struct mtk_dev_ops pci_hw_ops = {
 	.get_dev_state = mtk_pci_get_dev_state,
 	.ack_dev_state = mtk_pci_ack_dev_state,
@@ -965,6 +987,12 @@ static int mtk_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (ret)
 		goto free_mhccif;
 
+	ret = mtk_pci_dev_init(mdev);
+	if (ret) {
+		dev_err((mdev)->dev, "Failed to init dev.\n");
+		goto free_irq;
+	}
+
 	pci_set_master(pdev);
 	mtk_pci_unmask_irq(mdev, priv->mhccif_irq_id);
 
@@ -981,10 +1009,20 @@ static int mtk_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		goto clear_master;
 	}
 
+	ret = mtk_pci_dev_start(mdev);
+	if (ret) {
+		dev_err((mdev)->dev, "Failed to start dev.\n");
+		goto free_saved_state;
+	}
+
 	return 0;
 
+free_saved_state:
+	pci_load_and_free_saved_state(pdev, &priv->saved_state);
 clear_master:
 	pci_clear_master(pdev);
+	mtk_pci_dev_exit(mdev);
+free_irq:
 	mtk_pci_free_irq(mdev);
 free_mhccif:
 	mtk_mhccif_exit(mdev);
@@ -1012,6 +1050,7 @@ static void mtk_pci_remove(struct pci_dev *pdev)
 	}
 
 	pci_clear_master(pdev);
+	mtk_pci_dev_exit(mdev);
 	mtk_pci_free_irq(mdev);
 	mtk_mhccif_exit(mdev);
 	pci_load_and_free_saved_state(pdev, &priv->saved_state);
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_pci_reg.h b/drivers/net/wwan/t9xx/pcie/mtk_pci_reg.h
index 3f0667e8a846..73299ae03f89 100644
--- a/drivers/net/wwan/t9xx/pcie/mtk_pci_reg.h
+++ b/drivers/net/wwan/t9xx/pcie/mtk_pci_reg.h
@@ -21,6 +21,7 @@
 #define REG_IMASK_HOST_MSIX_SET_GRP0_0		0x3000
 #define REG_IMASK_HOST_MSIX_CLR_GRP0_0		0x3080
 #define REG_IMASK_HOST_MSIX_GRP0_0		0x3100
+#define REG_DEV_INFRA_BASE			0x10001000
 
 /* mhccif registers */
 #define MHCCIF_RC2EP_SW_BSY			0x4
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_trans_ctrl.c b/drivers/net/wwan/t9xx/pcie/mtk_trans_ctrl.c
new file mode 100644
index 000000000000..32f00c15f383
--- /dev/null
+++ b/drivers/net/wwan/t9xx/pcie/mtk_trans_ctrl.c
@@ -0,0 +1,579 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022, MediaTek Inc.
+ */
+
+#include <linux/device.h>
+#include <linux/freezer.h>
+#include <linux/hashtable.h>
+#include <linux/kthread.h>
+#include <linux/list.h>
+#include <linux/nospec.h>
+#include <linux/sched.h>
+#include <linux/wait.h>
+
+#include "mtk_cldma.h"
+#include "mtk_ctrl_plane.h"
+#include "mtk_dev.h"
+#include "mtk_pci.h"
+#include "mtk_trans_ctrl.h"
+
+#define MTK_DFLT_PORT_NAME_LEN			(20)
+
+static struct mtk_ctrl_info_desc mtk_ctrl_info_tbl[] = {
+	{2304, &ctrl_info_name(m9xx)},
+	{0, NULL},
+};
+
+#define RX_CH_ID_SHIFT	16
+#define PORT_MTU_MASK	0xFFFF
+#define QUEUE_CHL_MASK	0xFFFF
+
+static bool mtk_queue_list_is_full(struct mtk_ctrl_trans *trans, struct queue_info *que)
+{
+	return trans->trans_list[que->hif_id].skb_list[que->txqno].qlen >= SKB_LIST_MAX_LEN;
+}
+
+static bool mtk_ctrl_chs_is_busy_or_empty(struct trb_srv *srv)
+{
+	struct srv_que *srv_que;
+	int i;
+
+	for (i = 0; i < NR_CLDMA; i++)
+		list_for_each_entry(srv_que, &srv->srv_q_list[i], list)
+			if (!skb_queue_empty(&srv->trans->trans_list[i].skb_list[srv_que->qno]) &&
+			    mtk_cldma_get_tx_budget(srv->trans->dev, i, srv_que->qno))
+				return false;
+
+	return true;
+}
+
+static void mtk_ctrl_ch_flush(struct sk_buff_head *skb_list)
+{
+	struct sk_buff *skb;
+	struct trb *trb;
+
+	while (!skb_queue_empty(skb_list)) {
+		skb = skb_dequeue(skb_list);
+		trb = (struct trb *)skb->cb;
+		trb->status = -EIO;
+		trb->trb_complete(skb);
+	}
+}
+
+static void mtk_ctrl_chs_flush(struct trb_srv *srv)
+{
+	struct srv_que *srv_que;
+	int i;
+
+	for (i = 0; i < NR_CLDMA; i++)
+		list_for_each_entry(srv_que, &srv->srv_q_list[i], list)
+			mtk_ctrl_ch_flush(&srv->trans->trans_list[i].skb_list[srv_que->qno]);
+}
+
+static int mtk_ch_status_check(struct mtk_ctrl_trans *trans, struct sk_buff *skb)
+{
+	struct trb *trb = (struct trb *)skb->cb;
+	struct trb_open_priv *trb_open_priv;
+	struct queue_info *que;
+	int ret = 0;
+
+	que = radix_tree_lookup(&trans->queue_tbl, trb->channel_id & QUEUE_CHL_MASK);
+
+	switch (trb->cmd) {
+	case TRB_CMD_ENABLE:
+		trb_open_priv = (struct trb_open_priv *)skb->data;
+		trb_open_priv->log_rg_offset = que->log_rg_offset;
+		trans->usr_cnt[que->hif_id][que->txqno]++;
+		if (trans->usr_cnt[que->hif_id][que->txqno] == 1)
+			break;
+		trb_open_priv->tx_mtu = que->tx_mtu;
+		trb_open_priv->rx_mtu = que->rx_mtu;
+		trb_open_priv->tx_frag_size = que->tx_frag_size;
+		trb_open_priv->rx_frag_size = que->rx_frag_size;
+		if (mtk_cldma_check_ch_cfg(trans->dev, que)) {
+			trb->status = -EINVAL;
+			ret = -EINVAL;
+		} else {
+			trb->status = -EBUSY;
+			ret = -EBUSY;
+		}
+		trb->trb_complete(skb);
+		break;
+	case TRB_CMD_DISABLE:
+		if (trans->usr_cnt[que->hif_id][que->txqno] > 0) {
+			trans->usr_cnt[que->hif_id][que->txqno]--;
+			if (!trans->usr_cnt[que->hif_id][que->txqno])
+				break;
+		}
+		trb->status = -EBUSY;
+		trb->trb_complete(skb);
+		ret = -EBUSY;
+		break;
+	default:
+		dev_err((trans->mdev)->dev, "Invalid trb command(%d)\n", trb->cmd);
+		ret = -EINVAL;
+		break;
+	}
+	return ret;
+}
+
+static void mtk_ctrl_trb_handler(struct trb_srv *srv, struct trans_list *trans_list, u32 qno)
+{
+	struct sk_buff_head *skb_list = &trans_list->skb_list[qno];
+	struct mtk_ctrl_trans *trans = srv->trans;
+	struct sk_buff *skb, *skb_next;
+	struct trb *trb, *trb_next;
+	bool kick = false;
+	int loop = 0;
+	int err;
+
+	do {
+		skb = skb_peek(skb_list);
+		if (!skb)
+			break;
+		trb = (struct trb *)skb->cb;
+
+		switch (trb->cmd) {
+		case TRB_CMD_ENABLE:
+		case TRB_CMD_DISABLE:
+			skb_unlink(skb, skb_list);
+			err = mtk_ch_status_check(trans, skb);
+			if (!err) {
+				kick = true;
+				if (trb->cmd == TRB_CMD_DISABLE)
+					mtk_ctrl_ch_flush(skb_list);
+			}
+			break;
+		case TRB_CMD_TX:
+			err = mtk_cldma_submit_tx(trans->dev, skb);
+			if (err) {
+				if (trans_list->tx_burst_cnt[qno]) {
+					kick = true;
+					break;
+				}
+				if (err == -EAGAIN)
+					return;
+
+				skb_unlink(skb, skb_list);
+				trb->status = err;
+				trb->trb_complete(skb);
+				break;
+			}
+
+			trans_list->tx_burst_cnt[qno]++;
+			if (trans_list->tx_burst_cnt[qno] >= TX_BURST_MAX_CNT ||
+			    skb_queue_is_last(skb_list, skb)) {
+				kick = true;
+			} else {
+				skb_next = skb_peek_next(skb, skb_list);
+				trb_next = (struct trb *)skb_next->cb;
+				if (trb_next->cmd != TRB_CMD_TX)
+					kick = true;
+			}
+
+			skb_unlink(skb, skb_list);
+			break;
+		default:
+			skb_unlink(skb, skb_list);
+		}
+
+		if (kick) {
+			mtk_cldma_trb_process(trans->dev, skb);
+			trans_list->tx_burst_cnt[qno] = 0;
+			kick = false;
+		}
+
+		loop++;
+	} while (loop < TRB_NUM_PER_ROUND);
+}
+
+static void mtk_ctrl_trb_process(struct trb_srv *srv)
+{
+	struct mtk_ctrl_trans *trans = srv->trans;
+	struct srv_que *srv_que;
+	int i;
+
+	for (i = 0; i < NR_CLDMA; i++)
+		list_for_each_entry(srv_que, &srv->srv_q_list[i], list)
+			mtk_ctrl_trb_handler(srv, &trans->trans_list[i], srv_que->qno);
+}
+
+static int mtk_ctrl_trb_thread(void *args)
+{
+	struct trb_srv *srv = args;
+
+	for (;;) {
+		wait_event_interruptible(srv->trb_waitq,
+					 !mtk_ctrl_chs_is_busy_or_empty(srv) ||
+					 kthread_should_stop() || kthread_should_park());
+		if (kthread_should_stop())
+			break;
+
+		if (kthread_should_park())
+			kthread_parkme();
+
+		do {
+			mtk_ctrl_trb_process(srv);
+			cond_resched();
+		} while (!mtk_ctrl_chs_is_busy_or_empty(srv) && !kthread_should_stop() &&
+			 !kthread_should_park());
+	}
+	mtk_ctrl_chs_flush(srv);
+	return 0;
+}
+
+static int mtk_ctrl_trb_srv_init(struct mtk_ctrl_trans *trans)
+{
+	struct srv_que *srv_que;
+	struct trb_srv *srv;
+	int i, j;
+	int ret;
+
+	for (i = 0; i < trans->trb_srv_num; i++) {
+		srv = devm_kzalloc(trans->mdev->dev, sizeof(*srv), GFP_KERNEL);
+		if (!srv) {
+			ret = -ENOMEM;
+			goto err_free_srv;
+		}
+
+		srv->trans = trans;
+		srv->srv_id = i;
+		trans->trb_srv[i] = srv;
+
+		init_waitqueue_head(&srv->trb_waitq);
+		for (j = 0; j < NR_CLDMA; j++)
+			INIT_LIST_HEAD(&srv->srv_q_list[j]);
+	}
+
+	for (i = 0; i < NR_CLDMA; i++)
+		for (j = 0; j < HW_QUE_NUM; j++) {
+			if (trans->srv_cfg[i][j] < 0 ||
+			    trans->srv_cfg[i][j] >= trans->trb_srv_num)
+				trans->srv_cfg[i][j] = 0;
+			srv_que = devm_kzalloc(trans->mdev->dev, sizeof(*srv_que), GFP_KERNEL);
+			if (!srv_que) {
+				ret = -ENOMEM;
+				goto err_free_srv_que;
+			}
+			srv_que->hif_id = i;
+			srv_que->qno = j;
+			list_add_tail(&srv_que->list,
+				      &trans->trb_srv[trans->srv_cfg[i][j]]->srv_q_list[i]);
+		}
+
+	for (i = 0; i < trans->trb_srv_num; i++) {
+		trans->trb_srv[i]->trb_thread = kthread_run(mtk_ctrl_trb_thread, trans->trb_srv[i],
+							    "mtk_trb_srv%d_%s", i,
+							    trans->mdev->dev_str);
+		if (IS_ERR(trans->trb_srv[i]->trb_thread)) {
+			ret = PTR_ERR(trans->trb_srv[i]->trb_thread);
+			trans->trb_srv[i]->trb_thread = NULL;
+			goto err_stop_kthread;
+		}
+	}
+
+	return 0;
+err_stop_kthread:
+	while (--i >= 0)
+		kthread_stop(trans->trb_srv[i]->trb_thread);
+err_free_srv_que:
+	for (i = 0; i < trans->trb_srv_num; i++) {
+		for (j = 0; j < NR_CLDMA; j++) {
+			struct srv_que *next_srv_que;
+
+			list_for_each_entry_safe(srv_que, next_srv_que,
+						 &trans->trb_srv[i]->srv_q_list[j], list) {
+				list_del(&srv_que->list);
+				devm_kfree(trans->mdev->dev, srv_que);
+			}
+		}
+	}
+err_free_srv:
+	for (i = 0; i < trans->trb_srv_num; i++) {
+		if (!trans->trb_srv[i])
+			break;
+		devm_kfree(trans->mdev->dev, trans->trb_srv[i]);
+		trans->trb_srv[i] = NULL;
+	}
+
+	return ret;
+}
+
+static void mtk_ctrl_trb_srv_exit(struct mtk_ctrl_trans *trans)
+{
+	struct srv_que *srv_que, *next_srv_que;
+	struct trb_srv *srv;
+	int i, j;
+
+	for (i = 0; i < trans->trb_srv_num; i++) {
+		srv = trans->trb_srv[i];
+		kthread_stop(srv->trb_thread);
+		for (j = 0; j < NR_CLDMA; j++) {
+			list_for_each_entry_safe(srv_que, next_srv_que,
+						 &trans->trb_srv[i]->srv_q_list[j], list) {
+				list_del(&srv_que->list);
+				devm_kfree(trans->mdev->dev, srv_que);
+			}
+		}
+		devm_kfree(trans->mdev->dev, srv);
+		trans->trb_srv[i] = NULL;
+	}
+}
+
+static void mtk_ctrl_remove_radix_tree(struct mtk_ctrl_trans *trans)
+{
+	struct radix_tree_iter iter;
+	struct queue_info *queue;
+	void __rcu **slot;
+
+	radix_tree_for_each_slot(slot, &trans->queue_tbl, &iter, 0) {
+		queue = radix_tree_deref_slot(slot);
+		if (!queue)
+			continue;
+		radix_tree_delete(&trans->queue_tbl, iter.index);
+		kfree(queue);
+	}
+}
+
+static void mtk_ctrl_queue_info_update(struct radix_tree_root *queue_tbl, u32 port_chl_mtu)
+{
+	struct queue_info *queue;
+	u32 rx_chl, mtu;
+
+	if (!port_chl_mtu)
+		return;
+
+	rx_chl = port_chl_mtu >> RX_CH_ID_SHIFT;
+	mtu = port_chl_mtu & PORT_MTU_MASK;
+	queue = radix_tree_lookup(queue_tbl, rx_chl);
+	if (!queue)
+		return;
+
+	queue->tx_mtu = mtu;
+	queue->rx_mtu = mtu;
+	queue->tx_frag_size = mtu;
+	queue->rx_frag_size = mtu;
+}
+
+static unsigned int ctrl_port_chl_mtu;
+
+static int mtk_pcie_hif_init(struct mtk_md_dev *mdev)
+{
+	struct mtk_ctrl_blk *ctrl_blk = mdev->ctrl_blk;
+	struct queue_info *queue, *queue_info;
+	struct mtk_ctrl_trans *trans;
+	int i, j;
+	int ret;
+
+	trans = ctrl_blk->ctrl_hw_priv;
+	trans->ctrl_blk = ctrl_blk;
+	queue_info = trans->queue_info;
+
+	INIT_RADIX_TREE(&trans->queue_tbl, GFP_KERNEL);
+	for (i = 0; i < trans->queue_info_num; i++) {
+		queue = kmemdup(queue_info + i, sizeof(*queue), GFP_KERNEL);
+		if (!queue) {
+			ret = -ENOMEM;
+			goto err_free_radix_tree;
+		}
+		if (queue->txqno >= HW_QUE_NUM || queue->rxqno >= HW_QUE_NUM ||
+		    queue->hif_id >= NR_CLDMA) {
+			dev_err((mdev)->dev, "Invalid queue info for rx_chl %x\n",
+				queue->rx_chl);
+			kfree(queue);
+			ret = -EINVAL;
+			goto err_free_radix_tree;
+		}
+		ret = radix_tree_insert(&trans->queue_tbl, queue->rx_chl & QUEUE_CHL_MASK, queue);
+		if (ret) {
+			dev_err((mdev)->dev, "Insert %x failed, ret: %d\n", queue->rx_chl, ret);
+			kfree(queue);
+			goto err_free_radix_tree;
+		}
+		trans->queues_cnt++;
+	}
+
+	mtk_ctrl_queue_info_update(&trans->queue_tbl, ctrl_port_chl_mtu);
+
+	for (i = 0; i < NR_CLDMA; i++) {
+		for (j = 0; j < HW_QUE_NUM; j++) {
+			skb_queue_head_init(&trans->trans_list[i].skb_list[j]);
+			trans->trans_list[i].tx_burst_cnt[j] = 0;
+		}
+	}
+	ret = mtk_cldma_init(trans);
+	if (ret)
+		goto err_free_radix_tree;
+
+	ret = mtk_ctrl_trb_srv_init(trans);
+	if (ret)
+		goto err_cldma_exit;
+
+	atomic_set(&trans->available, 1);
+
+	return 0;
+
+err_cldma_exit:
+	mtk_cldma_exit(trans);
+err_free_radix_tree:
+	mtk_ctrl_remove_radix_tree(trans);
+
+	return ret;
+}
+
+static int mtk_pcie_hif_exit(struct mtk_md_dev *mdev)
+{
+	struct mtk_ctrl_blk *ctrl_blk = mdev->ctrl_blk;
+	struct mtk_ctrl_trans *trans;
+
+	trans = ctrl_blk->ctrl_hw_priv;
+
+	atomic_set(&trans->available, 0);
+	mtk_ctrl_trb_srv_exit(trans);
+	mtk_ctrl_remove_radix_tree(trans);
+	mtk_cldma_exit(trans);
+
+	return 0;
+}
+
+static int mtk_pcie_hif_submit_skb(struct mtk_md_dev *mdev, struct sk_buff *skb, bool force_send)
+{
+	struct mtk_ctrl_blk *ctrl_blk = mdev->ctrl_blk;
+	struct mtk_ctrl_trans *trans;
+	struct queue_info *que;
+	struct trb *trb;
+
+	trans = ctrl_blk->ctrl_hw_priv;
+	trb = (struct trb *)skb->cb;
+
+	if (trb->cmd == TRB_CMD_STOP || trb->cmd == TRB_CMD_RECOVER) {
+		trb->trb_complete(skb);
+		return 0;
+	}
+
+	que = radix_tree_lookup(&trans->queue_tbl, trb->channel_id & QUEUE_CHL_MASK);
+	if (!que) {
+		dev_warn((mdev)->dev, "Failed to look up queue, ch_id: %x\n",
+			 trb->channel_id);
+		return -EINVAL;
+	}
+
+	if (!atomic_read(&trans->available))
+		return -EIO;
+
+	if (mtk_queue_list_is_full(trans, que) && !force_send)
+		return -EAGAIN;
+
+	if (trb->cmd == TRB_CMD_DISABLE)
+		skb_queue_head(&trans->trans_list[que->hif_id].skb_list[que->txqno], skb);
+	else
+		skb_queue_tail(&trans->trans_list[que->hif_id].skb_list[que->txqno], skb);
+
+	wake_up(&trans->trb_srv[trans->srv_cfg[que->hif_id][que->txqno]]->trb_waitq);
+
+	return 0;
+}
+
+static int mtk_pcie_hif_cmd_func(struct mtk_md_dev *mdev, int cmd, void *data)
+{
+	struct mtk_ctrl_blk *ctrl_blk = mdev->ctrl_blk;
+	struct mtk_ctrl_trans *trans;
+	struct queue_info *que;
+
+	switch (cmd) {
+	case HIF_CTRL_CMD_CHECK_TX_FULL:
+		trans = ctrl_blk->ctrl_hw_priv;
+		que = radix_tree_lookup(&trans->queue_tbl,
+					((union ctrl_hif_cmd_data *)data)->rx_ch & QUEUE_CHL_MASK);
+		if (!que) {
+			dev_warn((mdev)->dev, "Failed to find que to check tx full\n");
+			return -EINVAL;
+		}
+		return mtk_queue_list_is_full(trans, que);
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static struct mtk_ctrl_hif_ops pcie_ctrl_ops = {
+	.init = mtk_pcie_hif_init,
+	.exit = mtk_pcie_hif_exit,
+	.submit_skb = mtk_pcie_hif_submit_skb,
+	.send_cmd = mtk_pcie_hif_cmd_func,
+};
+
+static void mtk_trans_get_ctrl_info(struct mtk_ctrl_cfg *cfg,
+				    struct mtk_ctrl_trans *trans, u32 hw_ver)
+{
+	struct mtk_ctrl_info_desc *ctrl_info_desc;
+	struct mtk_ctrl_info *ctrl_info;
+	u8 i;
+
+	for (i = 0; mtk_ctrl_info_tbl[i].ctrl_info; i++) {
+		ctrl_info_desc = &mtk_ctrl_info_tbl[i];
+		if (ctrl_info_desc->hw_ver != hw_ver)
+			continue;
+
+		ctrl_info = ctrl_info_desc->ctrl_info;
+		memcpy(trans->srv_cfg, ctrl_info->srv_cfg,
+		       sizeof(int) * NR_CLDMA * HW_QUE_NUM);
+		trans->queue_info = ctrl_info->queue_info;
+		trans->queue_info_num = ctrl_info->queue_info_num;
+		trans->trb_srv_num = ctrl_info->trb_srv_num;
+	}
+}
+
+int mtk_trans_ctrl_init(struct mtk_md_dev *mdev)
+{
+	struct mtk_ctrl_trans *trans;
+	struct mtk_ctrl_blk *ctrl_blk;
+	int err = -EINVAL;
+
+	trans = devm_kzalloc(mdev->dev, sizeof(*trans), GFP_KERNEL);
+	if (!trans)
+		return -ENOMEM;
+	trans->mdev = mdev;
+	trans->queues_cnt = 0;
+
+	mtk_trans_get_ctrl_info(NULL, trans, mdev->hw_ver);
+	if (!trans->queue_info ||
+	    trans->trb_srv_num <= 0 || trans->trb_srv_num > TRB_SRV_MAX_NUM ||
+	    trans->queue_info_num <= 0) {
+		dev_err((mdev)->dev, "Failed to get ctrl info!\n");
+		goto err_free_cfg;
+	}
+
+	err = mtk_ctrl_init(mdev, &pcie_ctrl_ops);
+	if (err)
+		goto err_free_cfg;
+
+	ctrl_blk = mdev->ctrl_blk;
+	ctrl_blk->ctrl_hw_priv = trans;
+
+	return 0;
+
+err_free_cfg:
+	devm_kfree(mdev->dev, trans);
+	return err;
+}
+
+int mtk_trans_ctrl_exit(struct mtk_md_dev *mdev)
+{
+	struct mtk_ctrl_trans *trans;
+	struct mtk_ctrl_blk *ctrl_blk;
+
+	ctrl_blk = mdev->ctrl_blk;
+	trans = ctrl_blk->ctrl_hw_priv;
+
+	devm_kfree(mdev->dev, ctrl_blk->cfg);
+	mtk_ctrl_exit(mdev);
+	devm_kfree(mdev->dev, trans);
+
+	return 0;
+}
+
+module_param(ctrl_port_chl_mtu, uint, 0644);
+MODULE_PARM_DESC(ctrl_port_chl_mtu, "Control port MTU override: (RX channel ID << 16) | MTU");
diff --git a/drivers/net/wwan/t9xx/pcie/mtk_trans_ctrl.h b/drivers/net/wwan/t9xx/pcie/mtk_trans_ctrl.h
index d6de4c43b529..0d25d1f51671 100644
--- a/drivers/net/wwan/t9xx/pcie/mtk_trans_ctrl.h
+++ b/drivers/net/wwan/t9xx/pcie/mtk_trans_ctrl.h
@@ -13,9 +13,95 @@
 
 #include "mtk_dev.h"
 
+#define TRB_SRV_MAX_NUM			(1)
+#define HW_QUE_NUM			(8)
+#define TX_GPD_NUM			(16)
+#define RX_GPD_NUM			(TX_GPD_NUM)
+#define MIN_GPD_NUM			(2)
+#define SKB_LIST_MAX_LEN		(16)
+#define MTU_RSV_ROOM			(0x100)
+#define TRB_NUM_PER_ROUND		(TX_GPD_NUM)
+#define TX_BURST_MAX_CNT		(TX_GPD_NUM / 4 + 1)
+
+#define HIF_ID(peer_id)			((peer_id) - 1)
+
+enum mtk_hif_id {
+	CLDMA0,
+	CLDMA1,
+	CLDMA4,
+	NR_CLDMA
+};
+
+struct queue_info {
+	u32 tx_chl;
+	u32 rx_chl;
+	enum mtk_hif_id hif_id;
+	u32 txqno;
+	u32 rxqno;
+	u32 tx_mtu;
+	u32 rx_mtu;
+	u32 tx_nr_gpds;
+	u32 rx_nr_gpds;
+	u32 tx_frag_size;
+	u32 rx_frag_size;
+	u8 log_rg_offset;
+};
+
+struct trans_list {
+	struct sk_buff_head skb_list[HW_QUE_NUM];
+	u8 tx_burst_cnt[HW_QUE_NUM];
+};
+
 struct mtk_ctrl_trans {
 	struct mtk_ctrl_blk *ctrl_blk;
+	struct trb_srv *trb_srv[TRB_SRV_MAX_NUM];
+	struct trans_list trans_list[NR_CLDMA];
+	void *dev;
+	struct radix_tree_root queue_tbl;
 	struct mtk_md_dev *mdev;
+	int usr_cnt[NR_CLDMA][HW_QUE_NUM];
+	u32 tx_mtu_cfg[NR_CLDMA][HW_QUE_NUM];
+	u32 rx_mtu_cfg[NR_CLDMA][HW_QUE_NUM];
+	atomic_t available;
+	int queues_cnt;
+	int srv_cfg[NR_CLDMA][HW_QUE_NUM];
+	struct queue_info *queue_info;
+	int queue_info_num;
+	int trb_srv_num;
+};
+
+struct srv_que {
+	u32 hif_id;
+	u32 qno;
+	struct list_head list;
 };
 
+struct trb_srv {
+	u32 srv_id;
+	struct list_head srv_q_list[NR_CLDMA];
+	struct mtk_ctrl_trans *trans;
+	wait_queue_head_t trb_waitq;
+	struct task_struct *trb_thread;
+};
+
+struct mtk_ctrl_info {
+	struct mtk_ctrl_cfg *ctrl_cfg;
+	int **srv_cfg;
+	struct queue_info *queue_info;
+	u32 queue_info_num;
+	u32 trb_srv_num;
+};
+
+struct mtk_ctrl_info_desc {
+	u32 hw_ver;
+	struct mtk_ctrl_info *ctrl_info;
+};
+
+#define ctrl_info_name(NAME)	mtk_ctrl_info_##NAME
+
+extern struct mtk_ctrl_info mtk_ctrl_info_m9xx;
+
+int mtk_trans_ctrl_init(struct mtk_md_dev *mdev);
+int mtk_trans_ctrl_exit(struct mtk_md_dev *mdev);
+
 #endif

-- 
2.34.1





* [PATCH v3 7/7] net: wwan: t9xx: Add maintainers entry
From: Jack Wu via B4 Relay @ 2026-06-24 10:04 UTC (permalink / raw)
  To: Loic Poulain, Sergey Ryazanov, Johannes Berg, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Jack Wu, Wen-Zhi Huang, Shi-Wei Yeh, Minano Tseng,
	Matthias Brugger, AngeloGioacchino Del Regno, Simon Horman,
	Jonathan Corbet, Shuah Khan
  Cc: linux-kernel, netdev, linux-arm-kernel, linux-mediatek, linux-doc
In-Reply-To: <20260624-t9xx_driver_v1-v3-0-73ff03f60c48@compal.com>

From: Jack Wu <jackbb_wu@compal.com>

Add MAINTAINERS entry for the MediaTek T9XX 5G WWAN modem device
driver.

Signed-off-by: Jack Wu <jackbb_wu@compal.com>
---
 MAINTAINERS | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 461a3eed6129..8155d26bff03 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16494,6 +16494,15 @@ L:	netdev@vger.kernel.org
 S:	Supported
 F:	drivers/net/wwan/t7xx/
 
+MEDIATEK T9XX 5G WWAN MODEM DRIVER
+M:	Jack Wu <jackbb_wu@compal.com>
+R:	Wen-Zhi Huang <wen-zhi.huang@mediatek.com>
+R:	Shi-Wei Yeh <shi-wei.yeh@mediatek.com>
+R:	Minano Tseng <Minano.tseng@mediatek.com>
+L:	netdev@vger.kernel.org
+S:	Supported
+F:	drivers/net/wwan/t9xx/
+
 MEDIATEK USB3 DRD IP DRIVER
 M:	Chunfeng Yun <chunfeng.yun@mediatek.com>
 L:	linux-usb@vger.kernel.org

-- 
2.34.1



