Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH v1 net 08/15] net: Fix data-races around netdev_tstamp_prequeue.
From: Kuniyuki Iwashima @ 2022-08-16  5:23 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, linux-kernel
In-Reply-To: <20220816052347.70042-1-kuniyu@amazon.com>

While reading netdev_tstamp_prequeue, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 3b098e2d7c69 ("net: Consistent skb timestamping")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/dev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 07da69c1ac0a..4705e6630efa 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4928,7 +4928,7 @@ static int netif_rx_internal(struct sk_buff *skb)
 {
 	int ret;
 
-	net_timestamp_check(netdev_tstamp_prequeue, skb);
+	net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);
 
 	trace_netif_rx(skb);
 
@@ -5281,7 +5281,7 @@ static int __netif_receive_skb_core(struct sk_buff **pskb, bool pfmemalloc,
 	int ret = NET_RX_DROP;
 	__be16 type;
 
-	net_timestamp_check(!netdev_tstamp_prequeue, skb);
+	net_timestamp_check(!READ_ONCE(netdev_tstamp_prequeue), skb);
 
 	trace_netif_receive_skb(skb);
 
@@ -5664,7 +5664,7 @@ static int netif_receive_skb_internal(struct sk_buff *skb)
 {
 	int ret;
 
-	net_timestamp_check(netdev_tstamp_prequeue, skb);
+	net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);
 
 	if (skb_defer_rx_timestamp(skb))
 		return NET_RX_SUCCESS;
@@ -5694,7 +5694,7 @@ void netif_receive_skb_list_internal(struct list_head *head)
 
 	INIT_LIST_HEAD(&sublist);
 	list_for_each_entry_safe(skb, next, head, list) {
-		net_timestamp_check(netdev_tstamp_prequeue, skb);
+		net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);
 		skb_list_del_init(skb);
 		if (!skb_defer_rx_timestamp(skb))
 			list_add_tail(&skb->list, &sublist);
-- 
2.30.2


^ permalink raw reply related

* [PATCH v1 net 09/15] ratelimit: Fix data-races in ___ratelimit().
From: Kuniyuki Iwashima @ 2022-08-16  5:23 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, linux-kernel
In-Reply-To: <20220816052347.70042-1-kuniyu@amazon.com>

While reading rs->interval and rs->burst, they can be changed
concurrently.  Thus, we need to add READ_ONCE() to their readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 lib/ratelimit.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/ratelimit.c b/lib/ratelimit.c
index e01a93f46f83..b59a1d3d0cc3 100644
--- a/lib/ratelimit.c
+++ b/lib/ratelimit.c
@@ -26,10 +26,12 @@
  */
 int ___ratelimit(struct ratelimit_state *rs, const char *func)
 {
+	int interval = READ_ONCE(rs->interval);
+	int burst = READ_ONCE(rs->burst);
 	unsigned long flags;
 	int ret;
 
-	if (!rs->interval)
+	if (!interval)
 		return 1;
 
 	/*
@@ -44,7 +46,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
 	if (!rs->begin)
 		rs->begin = jiffies;
 
-	if (time_is_before_jiffies(rs->begin + rs->interval)) {
+	if (time_is_before_jiffies(rs->begin + interval)) {
 		if (rs->missed) {
 			if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE)) {
 				printk_deferred(KERN_WARNING
@@ -56,7 +58,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
 		rs->begin   = jiffies;
 		rs->printed = 0;
 	}
-	if (rs->burst && rs->burst > rs->printed) {
+	if (burst && burst > rs->printed) {
 		rs->printed++;
 		ret = 1;
 	} else {
-- 
2.30.2


^ permalink raw reply related

* [PATCH v1 net 07/15] bpf: Fix a data-race around bpf_jit_limit.
From: Kuniyuki Iwashima @ 2022-08-16  5:23 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, linux-kernel,
	Daniel Borkmann
In-Reply-To: <20220816052347.70042-1-kuniyu@amazon.com>

While reading bpf_jit_limit, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
CC: Daniel Borkmann <daniel@iogearbox.net>
---
 kernel/bpf/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index c1e10d088dbb..3d9eb3ae334c 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -971,7 +971,7 @@ pure_initcall(bpf_jit_charge_init);
 
 int bpf_jit_charge_modmem(u32 size)
 {
-	if (atomic_long_add_return(size, &bpf_jit_current) > bpf_jit_limit) {
+	if (atomic_long_add_return(size, &bpf_jit_current) > READ_ONCE(bpf_jit_limit)) {
 		if (!bpf_capable()) {
 			atomic_long_sub(size, &bpf_jit_current);
 			return -EPERM;
-- 
2.30.2


^ permalink raw reply related

* [PATCH v1 net 04/15] bpf: Fix data-races around bpf_jit_enable.
From: Kuniyuki Iwashima @ 2022-08-16  5:23 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, linux-kernel
In-Reply-To: <20220816052347.70042-1-kuniyu@amazon.com>

A sysctl variable bpf_jit_enable is accessed concurrently, and there is
always a chance of data-race.  So, all readers and a writer need some
basic protection to avoid load/store-tearing.

Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 arch/arm/net/bpf_jit_32.c        | 2 +-
 arch/arm64/net/bpf_jit_comp.c    | 2 +-
 arch/mips/net/bpf_jit_comp.c     | 2 +-
 arch/powerpc/net/bpf_jit_comp.c  | 5 +++--
 arch/riscv/net/bpf_jit_core.c    | 2 +-
 arch/s390/net/bpf_jit_comp.c     | 2 +-
 arch/sparc/net/bpf_jit_comp_32.c | 5 +++--
 arch/sparc/net/bpf_jit_comp_64.c | 5 +++--
 arch/x86/net/bpf_jit_comp.c      | 2 +-
 arch/x86/net/bpf_jit_comp32.c    | 2 +-
 include/linux/filter.h           | 2 +-
 net/core/sysctl_net_core.c       | 4 ++--
 12 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 6a1c9fca5260..4b6b62a6fdd4 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1999,7 +1999,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	}
 	flush_icache_range((u32)header, (u32)(ctx.target + ctx.idx));
 
-	if (bpf_jit_enable > 1)
+	if (READ_ONCE(bpf_jit_enable) > 1)
 		/* there are 2 passes here */
 		bpf_jit_dump(prog->len, image_size, 2, ctx.target);
 
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 389623ae5a91..03bb40352d2c 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1568,7 +1568,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	}
 
 	/* And we're done. */
-	if (bpf_jit_enable > 1)
+	if (READ_ONCE(bpf_jit_enable) > 1)
 		bpf_jit_dump(prog->len, prog_size, 2, ctx.image);
 
 	bpf_flush_icache(header, ctx.image + ctx.idx);
diff --git a/arch/mips/net/bpf_jit_comp.c b/arch/mips/net/bpf_jit_comp.c
index b17130d510d4..1e623ae7eadf 100644
--- a/arch/mips/net/bpf_jit_comp.c
+++ b/arch/mips/net/bpf_jit_comp.c
@@ -1012,7 +1012,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	flush_icache_range((unsigned long)header,
 			   (unsigned long)&ctx.target[ctx.jit_index]);
 
-	if (bpf_jit_enable > 1)
+	if (READ_ONCE(bpf_jit_enable) > 1)
 		bpf_jit_dump(prog->len, image_size, 2, ctx.target);
 
 	prog->bpf_func = (void *)ctx.target;
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 43e634126514..c71d1e94ee7e 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -122,6 +122,7 @@ bool bpf_jit_needs_zext(void)
 
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 {
+	int jit_enable = READ_ONCE(bpf_jit_enable);
 	u32 proglen;
 	u32 alloclen;
 	u8 *image = NULL;
@@ -263,13 +264,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 		}
 		bpf_jit_build_epilogue(code_base, &cgctx);
 
-		if (bpf_jit_enable > 1)
+		if (jit_enable > 1)
 			pr_info("Pass %d: shrink = %d, seen = 0x%x\n", pass,
 				proglen - (cgctx.idx * 4), cgctx.seen);
 	}
 
 skip_codegen_passes:
-	if (bpf_jit_enable > 1)
+	if (jit_enable > 1)
 		/*
 		 * Note that we output the base address of the code_base
 		 * rather than image, since opcodes are in code_base.
diff --git a/arch/riscv/net/bpf_jit_core.c b/arch/riscv/net/bpf_jit_core.c
index 737baf8715da..603b5b66379b 100644
--- a/arch/riscv/net/bpf_jit_core.c
+++ b/arch/riscv/net/bpf_jit_core.c
@@ -151,7 +151,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	}
 	bpf_jit_build_epilogue(ctx);
 
-	if (bpf_jit_enable > 1)
+	if (READ_ONCE(bpf_jit_enable) > 1)
 		bpf_jit_dump(prog->len, prog_size, pass, ctx->insns);
 
 	prog->bpf_func = (void *)ctx->insns;
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index af35052d06ed..06897a4e9c62 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -1831,7 +1831,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 		fp = orig_fp;
 		goto free_addrs;
 	}
-	if (bpf_jit_enable > 1) {
+	if (READ_ONCE(bpf_jit_enable) > 1) {
 		bpf_jit_dump(fp->len, jit.size, pass, jit.prg_buf);
 		print_fn_code(jit.prg_buf, jit.size_prg);
 	}
diff --git a/arch/sparc/net/bpf_jit_comp_32.c b/arch/sparc/net/bpf_jit_comp_32.c
index b1dbf2fa8c0a..7c454b920250 100644
--- a/arch/sparc/net/bpf_jit_comp_32.c
+++ b/arch/sparc/net/bpf_jit_comp_32.c
@@ -326,13 +326,14 @@ do {	*prog++ = BR_OPC | WDISP22(OFF);		\
 void bpf_jit_compile(struct bpf_prog *fp)
 {
 	unsigned int cleanup_addr, proglen, oldproglen = 0;
+	int jit_enable = READ_ONCE(bpf_jit_enable);
 	u32 temp[8], *prog, *func, seen = 0, pass;
 	const struct sock_filter *filter = fp->insns;
 	int i, flen = fp->len, pc_ret0 = -1;
 	unsigned int *addrs;
 	void *image;
 
-	if (!bpf_jit_enable)
+	if (!jit_enable)
 		return;
 
 	addrs = kmalloc_array(flen, sizeof(*addrs), GFP_KERNEL);
@@ -743,7 +744,7 @@ cond_branch:			f_offset = addrs[i + filter[i].jf];
 		oldproglen = proglen;
 	}
 
-	if (bpf_jit_enable > 1)
+	if (jit_enable > 1)
 		bpf_jit_dump(flen, proglen, pass + 1, image);
 
 	if (image) {
diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c
index fa0759bfe498..74cc1fa1f97f 100644
--- a/arch/sparc/net/bpf_jit_comp_64.c
+++ b/arch/sparc/net/bpf_jit_comp_64.c
@@ -1479,6 +1479,7 @@ struct sparc64_jit_data {
 
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
+	int jit_enable = READ_ONCE(bpf_jit_enable);
 	struct bpf_prog *tmp, *orig_prog = prog;
 	struct sparc64_jit_data *jit_data;
 	struct bpf_binary_header *header;
@@ -1549,7 +1550,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		}
 		build_epilogue(&ctx);
 
-		if (bpf_jit_enable > 1)
+		if (jit_enable > 1)
 			pr_info("Pass %d: size = %u, seen = [%c%c%c%c%c%c]\n", pass,
 				ctx.idx * 4,
 				ctx.tmp_1_used ? '1' : ' ',
@@ -1596,7 +1597,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		goto out_off;
 	}
 
-	if (bpf_jit_enable > 1)
+	if (jit_enable > 1)
 		bpf_jit_dump(prog->len, image_size, pass, ctx.image);
 
 	bpf_flush_icache(header, (u8 *)header + header->size);
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index c1f6c1c51d99..a5c7df7cab2a 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -2439,7 +2439,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		cond_resched();
 	}
 
-	if (bpf_jit_enable > 1)
+	if (READ_ONCE(bpf_jit_enable) > 1)
 		bpf_jit_dump(prog->len, proglen, pass + 1, image);
 
 	if (image) {
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 429a89c5468b..745f15a29dd3 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -2597,7 +2597,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		cond_resched();
 	}
 
-	if (bpf_jit_enable > 1)
+	if (READ_ONCE(bpf_jit_enable) > 1)
 		bpf_jit_dump(prog->len, proglen, pass + 1, image);
 
 	if (image) {
diff --git a/include/linux/filter.h b/include/linux/filter.h
index a5f21dc3c432..ce8072626ccf 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1080,7 +1080,7 @@ static inline bool bpf_jit_is_ebpf(void)
 
 static inline bool ebpf_jit_enabled(void)
 {
-	return bpf_jit_enable && bpf_jit_is_ebpf();
+	return READ_ONCE(bpf_jit_enable) && bpf_jit_is_ebpf();
 }
 
 static inline bool bpf_prog_ebpf_jited(const struct bpf_prog *fp)
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index d82ba0c27175..022abf326dfe 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -265,7 +265,7 @@ static int proc_dointvec_minmax_bpf_enable(struct ctl_table *table, int write,
 					   void *buffer, size_t *lenp,
 					   loff_t *ppos)
 {
-	int ret, jit_enable = *(int *)table->data;
+	int ret, jit_enable = READ_ONCE(*(int *)table->data);
 	int min = *(int *)table->extra1;
 	int max = *(int *)table->extra2;
 	struct ctl_table tmp = *table;
@@ -278,7 +278,7 @@ static int proc_dointvec_minmax_bpf_enable(struct ctl_table *table, int write,
 	if (write && !ret) {
 		if (jit_enable < 2 ||
 		    (jit_enable == 2 && bpf_dump_raw_ok(current_cred()))) {
-			*(int *)table->data = jit_enable;
+			WRITE_ONCE(*(int *)table->data, jit_enable);
 			if (jit_enable == 2)
 				pr_warn("bpf_jit_enable = 2 was set! NEVER use this in production, only for JIT debugging!\n");
 		} else {
-- 
2.30.2


^ permalink raw reply related

* [PATCH v1 net 06/15] bpf: Fix data-races around bpf_jit_kallsyms.
From: Kuniyuki Iwashima @ 2022-08-16  5:23 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, linux-kernel,
	Daniel Borkmann
In-Reply-To: <20220816052347.70042-1-kuniyu@amazon.com>

While reading bpf_jit_kallsyms, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 74451e66d516 ("bpf: make jited programs visible in traces")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
CC: Daniel Borkmann <daniel@iogearbox.net>
---
 include/linux/filter.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 09566ad211bd..35881fccce05 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1110,14 +1110,16 @@ static inline bool bpf_jit_blinding_enabled(struct bpf_prog *prog)
 
 static inline bool bpf_jit_kallsyms_enabled(void)
 {
+	int jit_kallsyms = READ_ONCE(bpf_jit_kallsyms);
+
 	/* There are a couple of corner cases where kallsyms should
 	 * not be enabled f.e. on hardening.
 	 */
 	if (READ_ONCE(bpf_jit_harden))
 		return false;
-	if (!bpf_jit_kallsyms)
+	if (!jit_kallsyms)
 		return false;
-	if (bpf_jit_kallsyms == 1)
+	if (jit_kallsyms == 1)
 		return true;
 
 	return false;
-- 
2.30.2


^ permalink raw reply related

* [PATCH v1 net 05/15] bpf: Fix data-races around bpf_jit_harden.
From: Kuniyuki Iwashima @ 2022-08-16  5:23 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, linux-kernel,
	Daniel Borkmann
In-Reply-To: <20220816052347.70042-1-kuniyu@amazon.com>

While reading bpf_jit_harden, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 4f3446bb809f ("bpf: add generic constant blinding for use in jits")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
CC: Daniel Borkmann <daniel@iogearbox.net>
---
 include/linux/filter.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index ce8072626ccf..09566ad211bd 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1090,6 +1090,8 @@ static inline bool bpf_prog_ebpf_jited(const struct bpf_prog *fp)
 
 static inline bool bpf_jit_blinding_enabled(struct bpf_prog *prog)
 {
+	int jit_harden = READ_ONCE(bpf_jit_harden);
+
 	/* These are the prerequisites, should someone ever have the
 	 * idea to call blinding outside of them, we make sure to
 	 * bail out.
@@ -1098,9 +1100,9 @@ static inline bool bpf_jit_blinding_enabled(struct bpf_prog *prog)
 		return false;
 	if (!prog->jit_requested)
 		return false;
-	if (!bpf_jit_harden)
+	if (!jit_harden)
 		return false;
-	if (bpf_jit_harden == 1 && capable(CAP_SYS_ADMIN))
+	if (jit_harden == 1 && capable(CAP_SYS_ADMIN))
 		return false;
 
 	return true;
@@ -1111,7 +1113,7 @@ static inline bool bpf_jit_kallsyms_enabled(void)
 	/* There are a couple of corner cases where kallsyms should
 	 * not be enabled f.e. on hardening.
 	 */
-	if (bpf_jit_harden)
+	if (READ_ONCE(bpf_jit_harden))
 		return false;
 	if (!bpf_jit_kallsyms)
 		return false;
-- 
2.30.2


^ permalink raw reply related

* [PATCH v1 net 03/15] net: Fix data-races around netdev_max_backlog.
From: Kuniyuki Iwashima @ 2022-08-16  5:23 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, linux-kernel
In-Reply-To: <20220816052347.70042-1-kuniyu@amazon.com>

While reading netdev_max_backlog, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

While at it, we remove the unnecessary spaces in the doc.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 Documentation/admin-guide/sysctl/net.rst | 2 +-
 net/core/dev.c                           | 4 ++--
 net/core/gro_cells.c                     | 2 +-
 net/xfrm/espintcp.c                      | 2 +-
 net/xfrm/xfrm_input.c                    | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 805f2281e000..60d44165fba7 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -271,7 +271,7 @@ poll cycle or the number of packets processed reaches netdev_budget.
 netdev_max_backlog
 ------------------
 
-Maximum number  of  packets,  queued  on  the  INPUT  side, when the interface
+Maximum number of packets, queued on the INPUT side, when the interface
 receives packets faster than kernel can process them.
 
 netdev_rss_key
diff --git a/net/core/dev.c b/net/core/dev.c
index b5b92dcd5eea..07da69c1ac0a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4624,7 +4624,7 @@ static bool skb_flow_limit(struct sk_buff *skb, unsigned int qlen)
 	struct softnet_data *sd;
 	unsigned int old_flow, new_flow;
 
-	if (qlen < (netdev_max_backlog >> 1))
+	if (qlen < (READ_ONCE(netdev_max_backlog) >> 1))
 		return false;
 
 	sd = this_cpu_ptr(&softnet_data);
@@ -4672,7 +4672,7 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
 	if (!netif_running(skb->dev))
 		goto drop;
 	qlen = skb_queue_len(&sd->input_pkt_queue);
-	if (qlen <= netdev_max_backlog && !skb_flow_limit(skb, qlen)) {
+	if (qlen <= READ_ONCE(netdev_max_backlog) && !skb_flow_limit(skb, qlen)) {
 		if (qlen) {
 enqueue:
 			__skb_queue_tail(&sd->input_pkt_queue, skb);
diff --git a/net/core/gro_cells.c b/net/core/gro_cells.c
index 541c7a72a28a..21619c70a82b 100644
--- a/net/core/gro_cells.c
+++ b/net/core/gro_cells.c
@@ -26,7 +26,7 @@ int gro_cells_receive(struct gro_cells *gcells, struct sk_buff *skb)
 
 	cell = this_cpu_ptr(gcells->cells);
 
-	if (skb_queue_len(&cell->napi_skbs) > netdev_max_backlog) {
+	if (skb_queue_len(&cell->napi_skbs) > READ_ONCE(netdev_max_backlog)) {
 drop:
 		dev_core_stats_rx_dropped_inc(dev);
 		kfree_skb(skb);
diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
index 82d14eea1b5a..974eb97b77d2 100644
--- a/net/xfrm/espintcp.c
+++ b/net/xfrm/espintcp.c
@@ -168,7 +168,7 @@ int espintcp_queue_out(struct sock *sk, struct sk_buff *skb)
 {
 	struct espintcp_ctx *ctx = espintcp_getctx(sk);
 
-	if (skb_queue_len(&ctx->out_queue) >= netdev_max_backlog)
+	if (skb_queue_len(&ctx->out_queue) >= READ_ONCE(netdev_max_backlog))
 		return -ENOBUFS;
 
 	__skb_queue_tail(&ctx->out_queue, skb);
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 144238a50f3d..a3eb21a85810 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -783,7 +783,7 @@ int xfrm_trans_queue_net(struct net *net, struct sk_buff *skb,
 
 	trans = this_cpu_ptr(&xfrm_trans_tasklet);
 
-	if (skb_queue_len(&trans->queue) >= netdev_max_backlog)
+	if (skb_queue_len(&trans->queue) >= READ_ONCE(netdev_max_backlog))
 		return -ENOBUFS;
 
 	BUILD_BUG_ON(sizeof(struct xfrm_trans_cb) > sizeof(skb->cb));
-- 
2.30.2


^ permalink raw reply related

* [PATCH v1 net 02/15] net: Fix data-races around weight_p and dev_weight_[rt]x_bias.
From: Kuniyuki Iwashima @ 2022-08-16  5:23 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, linux-kernel,
	Matthias Tafelmeier
In-Reply-To: <20220816052347.70042-1-kuniyu@amazon.com>

While reading weight_p and dev_weight_[rt]x_bias, they can be changed
concurrently.  Thus, we need to add READ_ONCE() to their readers.

Fixes: 3d48b53fb2ae ("net: dev_weight: TX/RX orthogonality")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
CC: Matthias Tafelmeier <matthias.tafelmeier@gmx.net>
---
 net/core/dev.c             | 2 +-
 net/core/sysctl_net_core.c | 6 ++++--
 net/sched/sch_generic.c    | 2 +-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 716df64fcfa5..b5b92dcd5eea 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5918,7 +5918,7 @@ static int process_backlog(struct napi_struct *napi, int quota)
 		net_rps_action_and_irq_enable(sd);
 	}
 
-	napi->weight = dev_rx_weight;
+	napi->weight = READ_ONCE(dev_rx_weight);
 	while (again) {
 		struct sk_buff *skb;
 
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 71a13596ea2b..d82ba0c27175 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -240,8 +240,10 @@ static int proc_do_dev_weight(struct ctl_table *table, int write,
 	if (ret != 0)
 		return ret;
 
-	dev_rx_weight = weight_p * dev_weight_rx_bias;
-	dev_tx_weight = weight_p * dev_weight_tx_bias;
+	WRITE_ONCE(dev_rx_weight,
+		   READ_ONCE(weight_p) * READ_ONCE(dev_weight_rx_bias));
+	WRITE_ONCE(dev_tx_weight,
+		   READ_ONCE(weight_p) * READ_ONCE(dev_weight_tx_bias));
 
 	return ret;
 }
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index d47b9689eba6..99b697ad2b98 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -409,7 +409,7 @@ static inline bool qdisc_restart(struct Qdisc *q, int *packets)
 
 void __qdisc_run(struct Qdisc *q)
 {
-	int quota = dev_tx_weight;
+	int quota = READ_ONCE(dev_tx_weight);
 	int packets;
 
 	while (qdisc_restart(q, &packets)) {
-- 
2.30.2


^ permalink raw reply related

* [PATCH v1 net 01/15] net: Fix data-races around sysctl_[rw]mem_(max|default).
From: Kuniyuki Iwashima @ 2022-08-16  5:23 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, linux-kernel
In-Reply-To: <20220816052347.70042-1-kuniyu@amazon.com>

While reading sysctl_[rw]mem_(max|default), they can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/filter.c               | 4 ++--
 net/core/sock.c                 | 8 ++++----
 net/ipv4/ip_output.c            | 2 +-
 net/ipv4/tcp_output.c           | 2 +-
 net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index e8508aaafd27..c4f14ad82029 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5034,14 +5034,14 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
 		/* Only some socketops are supported */
 		switch (optname) {
 		case SO_RCVBUF:
-			val = min_t(u32, val, sysctl_rmem_max);
+			val = min_t(u32, val, READ_ONCE(sysctl_rmem_max));
 			val = min_t(int, val, INT_MAX / 2);
 			sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
 			WRITE_ONCE(sk->sk_rcvbuf,
 				   max_t(int, val * 2, SOCK_MIN_RCVBUF));
 			break;
 		case SO_SNDBUF:
-			val = min_t(u32, val, sysctl_wmem_max);
+			val = min_t(u32, val, READ_ONCE(sysctl_wmem_max));
 			val = min_t(int, val, INT_MAX / 2);
 			sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
 			WRITE_ONCE(sk->sk_sndbuf,
diff --git a/net/core/sock.c b/net/core/sock.c
index 4cb957d934a2..303af52f3b79 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1101,7 +1101,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		 * play 'guess the biggest size' games. RCVBUF/SNDBUF
 		 * are treated in BSD as hints
 		 */
-		val = min_t(u32, val, sysctl_wmem_max);
+		val = min_t(u32, val, READ_ONCE(sysctl_wmem_max));
 set_sndbuf:
 		/* Ensure val * 2 fits into an int, to prevent max_t()
 		 * from treating it as a negative value.
@@ -1133,7 +1133,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		 * play 'guess the biggest size' games. RCVBUF/SNDBUF
 		 * are treated in BSD as hints
 		 */
-		__sock_set_rcvbuf(sk, min_t(u32, val, sysctl_rmem_max));
+		__sock_set_rcvbuf(sk, min_t(u32, val, READ_ONCE(sysctl_rmem_max)));
 		break;
 
 	case SO_RCVBUFFORCE:
@@ -3309,8 +3309,8 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 	timer_setup(&sk->sk_timer, NULL, 0);
 
 	sk->sk_allocation	=	GFP_KERNEL;
-	sk->sk_rcvbuf		=	sysctl_rmem_default;
-	sk->sk_sndbuf		=	sysctl_wmem_default;
+	sk->sk_rcvbuf		=	READ_ONCE(sysctl_rmem_default);
+	sk->sk_sndbuf		=	READ_ONCE(sysctl_wmem_default);
 	sk->sk_state		=	TCP_CLOSE;
 	sk_set_socket(sk, sock);
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index d7bd1daf022b..04e2034f2f8e 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1730,7 +1730,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 
 	sk->sk_protocol = ip_hdr(skb)->protocol;
 	sk->sk_bound_dev_if = arg->bound_dev_if;
-	sk->sk_sndbuf = sysctl_wmem_default;
+	sk->sk_sndbuf = READ_ONCE(sysctl_wmem_default);
 	ipc.sockc.mark = fl4.flowi4_mark;
 	err = ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base,
 			     len, 0, &ipc, &rt, MSG_DONTWAIT);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 78b654ff421b..290019de766d 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -239,7 +239,7 @@ void tcp_select_initial_window(const struct sock *sk, int __space, __u32 mss,
 	if (wscale_ok) {
 		/* Set window scaling on max possible window */
 		space = max_t(u32, space, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2]));
-		space = max_t(u32, space, sysctl_rmem_max);
+		space = max_t(u32, space, READ_ONCE(sysctl_rmem_max));
 		space = min_t(u32, space, *window_clamp);
 		*rcv_wscale = clamp_t(int, ilog2(space) - 15,
 				      0, TCP_MAX_WSCALE);
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index 9d43277b8b4f..a56fd0b5a430 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1280,12 +1280,12 @@ static void set_sock_size(struct sock *sk, int mode, int val)
 	lock_sock(sk);
 	if (mode) {
 		val = clamp_t(int, val, (SOCK_MIN_SNDBUF + 1) / 2,
-			      sysctl_wmem_max);
+			      READ_ONCE(sysctl_wmem_max));
 		sk->sk_sndbuf = val * 2;
 		sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
 	} else {
 		val = clamp_t(int, val, (SOCK_MIN_RCVBUF + 1) / 2,
-			      sysctl_rmem_max);
+			      READ_ONCE(sysctl_rmem_max));
 		sk->sk_rcvbuf = val * 2;
 		sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
 	}
-- 
2.30.2


^ permalink raw reply related

* [PATCH v1 net 00/15] sysctl: Fix data-races around net.core.XXX (Round 1)
From: Kuniyuki Iwashima @ 2022-08-16  5:23 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, linux-kernel

This series fixes data-races around 22 knobs in net_core_table.
These knobs are skipped:

  - netdev_rss_key: Written only once by net_get_random_once() and
                    read-only knob
  - rps_sock_flow_entries: Protected with sock_flow_mutex
  - flow_limit_cpu_bitmap: Protected with flow_limit_update_mutex
  - flow_limit_table_len: Protected with flow_limit_update_mutex
  - default_qdisc: Protected with qdisc_mod_lock
  - warnings: Unused

Note 9th patch fixes net.core.message_cost and net.core.message_burst,
and lib/ratelimit.c does not have an explicit maintainer.

The next round is the final round for net.core.XXX and starts from
netdev_budget_usecs.

Kuniyuki Iwashima (15):
  net: Fix data-races around sysctl_[rw]mem_(max|default).
  net: Fix data-races around weight_p and dev_weight_[rt]x_bias.
  net: Fix data-races around netdev_max_backlog.
  bpf: Fix data-races around bpf_jit_enable.
  bpf: Fix data-races around bpf_jit_harden.
  bpf: Fix data-races around bpf_jit_kallsyms.
  bpf: Fix a data-race around bpf_jit_limit.
  net: Fix data-races around netdev_tstamp_prequeue.
  ratelimit: Fix data-races in ___ratelimit().
  net: Fix data-races around sysctl_optmem_max.
  net: Fix a data-race around sysctl_tstamp_allow_data.
  net: Fix a data-race around sysctl_net_busy_poll.
  net: Fix a data-race around sysctl_net_busy_read.
  net: Fix a data-race around netdev_budget.
  net: Fix data-races around sysctl_max_skb_frags.

 Documentation/admin-guide/sysctl/net.rst |  2 +-
 arch/arm/net/bpf_jit_32.c                |  2 +-
 arch/arm64/net/bpf_jit_comp.c            |  2 +-
 arch/mips/net/bpf_jit_comp.c             |  2 +-
 arch/powerpc/net/bpf_jit_comp.c          |  5 +++--
 arch/riscv/net/bpf_jit_core.c            |  2 +-
 arch/s390/net/bpf_jit_comp.c             |  2 +-
 arch/sparc/net/bpf_jit_comp_32.c         |  5 +++--
 arch/sparc/net/bpf_jit_comp_64.c         |  5 +++--
 arch/x86/net/bpf_jit_comp.c              |  2 +-
 arch/x86/net/bpf_jit_comp32.c            |  2 +-
 include/linux/filter.h                   | 16 ++++++++++------
 include/net/busy_poll.h                  |  2 +-
 kernel/bpf/core.c                        |  2 +-
 lib/ratelimit.c                          |  8 +++++---
 net/core/bpf_sk_storage.c                |  5 +++--
 net/core/dev.c                           | 16 ++++++++--------
 net/core/filter.c                        | 13 +++++++------
 net/core/gro_cells.c                     |  2 +-
 net/core/skbuff.c                        |  2 +-
 net/core/sock.c                          | 18 ++++++++++--------
 net/core/sysctl_net_core.c               | 10 ++++++----
 net/ipv4/ip_output.c                     |  2 +-
 net/ipv4/ip_sockglue.c                   |  6 +++---
 net/ipv4/tcp.c                           |  4 ++--
 net/ipv4/tcp_output.c                    |  2 +-
 net/ipv6/ipv6_sockglue.c                 |  4 ++--
 net/mptcp/protocol.c                     |  2 +-
 net/netfilter/ipvs/ip_vs_sync.c          |  4 ++--
 net/sched/sch_generic.c                  |  2 +-
 net/xfrm/espintcp.c                      |  2 +-
 net/xfrm/xfrm_input.c                    |  2 +-
 32 files changed, 85 insertions(+), 70 deletions(-)

-- 
2.30.2

^ permalink raw reply

* [PATCH net-next v2] net: dsa: microchip: ksz9477: fix fdb_dump last invalid entry
From: Arun Ramadoss @ 2022-08-16  4:26 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: Woojung Huh, UNGLinuxDriver, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Vladimir Oltean, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Russell King, Tristram Ha

In the ksz9477_fdb_dump function it reads the ALU control register and
exit from the timeout loop if there is valid entry or search is
complete. After exiting the loop, it reads the alu entry and report to
the user space irrespective of entry is valid. It works till the valid
entry. If the loop exited when search is complete, it reads the alu
table. The table returns all ones and it is reported to user space. So
bridge fdb show gives ff:ff:ff:ff:ff:ff as last entry for every port.
To fix it, after exiting the loop the entry is reported only if it is
valid one.

Fixes: b987e98e50ab ("dsa: add DSA switch driver for Microchip KSZ9477")
Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
---
changes in v2
- changed the fixes commit id
- reduced the indentation level by using ! and continue statement

 drivers/net/dsa/microchip/ksz9477.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index 4b14d80d27ed..e4f446db0ca1 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -613,6 +613,9 @@ int ksz9477_fdb_dump(struct ksz_device *dev, int port,
 			goto exit;
 		}

+		if (!(ksz_data & ALU_VALID))
+			continue;
+
 		/* read ALU table */
 		ksz9477_read_table(dev, alu_table);

base-commit: 7ebfc85e2cd7b08f518b526173e9a33b56b3913b
-- 
2.36.1

^ permalink raw reply related

* Re: [PATCH v2 net-next] net: ethernet: ti: davinci_mdio: Add workaround for errata i2329
From: Ravi Gunasekaran @ 2022-08-16  4:26 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: davem, edumazet, kuba, pabeni, linux-omap, netdev, linux-kernel,
	linux-arm-kernel, kishon, vigneshr
In-Reply-To: <YvZ3uc0xi18C1g5t@lunn.ch>

On 12/08/22 9:24 pm, Andrew Lunn wrote:
>> sh_eth is not configured for autosuspend and uses only pm_runtime_put().
> 
> I don't know the runtime power management code very well. We should
> probably ask somebody how does. However:
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/base/power/runtime.c#L168
> 
> This suggests it should be safe to perform an auto suspend on a device
> which is not configured for auto suspend. To me, it looks like is
> should directly suspend.

I checked the pm_runtime_put() and pm_runtime_put_autosuspend() and I 
have quite a few unanswered questions. I do not have much knowledge 
about the runtime PM, so I'm not confident about adding the runtime PM 
mdiobb_read/mdiobb_write variants in the mdio bitbang core. I would 
rather take a safe approach for now and stick with the implementation 
submitted in this v2 patch.

Thanks for your feedback on this submission so far. I will fix the two 
other comments and send out a v3 patch.

-- 
Regards,
Ravi

^ permalink raw reply

* Re: [PATCH 2/2] vDPA: conditionally read fields in virtio-net dev
From: Zhu, Lingshan @ 2022-08-16  4:26 UTC (permalink / raw)
  To: Si-Wei Liu, jasowang, mst
  Cc: virtualization, netdev, kvm, parav, xieyongji, gautam.dawar
In-Reply-To: <20e92551-a639-ec13-3d9c-13bb215422e1@intel.com>



On 8/16/2022 9:58 AM, Zhu, Lingshan wrote:
>
>
> On 8/16/2022 7:32 AM, Si-Wei Liu wrote:
>>
>>
>> On 8/15/2022 2:26 AM, Zhu Lingshan wrote:
>>> Some fields of virtio-net device config space are
>>> conditional on the feature bits, the spec says:
>>>
>>> "The mac address field always exists
>>> (though is only valid if VIRTIO_NET_F_MAC is set)"
>>>
>>> "max_virtqueue_pairs only exists if VIRTIO_NET_F_MQ
>>> or VIRTIO_NET_F_RSS is set"
>>>
>>> "mtu only exists if VIRTIO_NET_F_MTU is set"
>>>
>>> so we should read MTU, MAC and MQ in the device config
>>> space only when these feature bits are offered.
>>>
>>> For MQ, if both VIRTIO_NET_F_MQ and VIRTIO_NET_F_RSS are
>>> not set, the virtio device should have
>>> one queue pair as default value, so when userspace querying queue 
>>> pair numbers,
>>> it should return mq=1 than zero.
>>>
>>> For MTU, if VIRTIO_NET_F_MTU is not set, we should not read
>>> MTU from the device config sapce.
>>> RFC894 <A Standard for the Transmission of IP Datagrams over 
>>> Ethernet Networks>
>>> says:"The minimum length of the data field of a packet sent over an
>>> Ethernet is 1500 octets, thus the maximum length of an IP datagram
>>> sent over an Ethernet is 1500 octets.  Implementations are encouraged
>>> to support full-length packets"
>> Noted there's a typo in the above "The *maximum* length of the data 
>> field of a packet sent over an Ethernet is 1500 octets ..." and the 
>> RFC was written 1984.
> the spec RFC894 says it is 1500, see 
> https://www.rfc-editor.org/rfc/rfc894.txt
I have seen 
https://www.rfc-editor.org/errata_search.php?rfc=894&rec_status=0,

so I think we should return nothing if _F_MTU not set.

Thanks
>>
>> Apparently that is no longer true with the introduction of Jumbo size 
>> frame later in the 2000s. I'm not sure what is the point of mention 
>> this ancient RFC. It doesn't say default MTU of any Ethernet 
>> NIC/switch should be 1500 in either  case.
> This could be a larger number for sure, we are trying to find out the 
> min value for Ethernet here, to support 1500 octets, MTU should be 
> 1500 at least, so I assume 1500 should be the default value for MTU
>>
>>>
>>> virtio spec says:"The virtio network device is a virtual ethernet 
>>> card",
>> Right,
>>> so the default MTU value should be 1500 for virtio-net.
>> ... but it doesn't say the default is 1500. At least, not in explicit 
>> way. Why it can't be 1492 or even lower? In practice, if the network 
>> backend has a MTU higher than 1500, there's nothing wrong for guest 
>> to configure default MTU more than 1500.
> same as above
>>
>>>
>>> For MAC, the spec says:"If the VIRTIO_NET_F_MAC feature bit is set,
>>> the configuration space mac entry indicates the “physical” address
>>> of the network card, otherwise the driver would typically
>>> generate a random local MAC address." So there is no
>>> default MAC address if VIRTIO_NET_F_MAC not set.
>>>
>>> This commits introduces functions vdpa_dev_net_mtu_config_fill()
>>> and vdpa_dev_net_mac_config_fill() to fill MTU and MAC.
>>> It also fixes vdpa_dev_net_mq_config_fill() to report correct
>>> MQ when _F_MQ is not present.
>>>
>>> These functions should check devices features than driver
>>> features, and struct vdpa_device is not needed as a parameter
>>>
>>> The test & userspace tool output:
>>>
>>> Feature bit VIRTIO_NET_F_MTU, VIRTIO_NET_F_RSS, VIRTIO_NET_F_MQ
>>> and VIRTIO_NET_F_MAC can be mask out by hardcode.
>>>
>>> However, it is challenging to "disable" the related fields
>>> in the HW device config space, so let's just assume the values
>>> are meaningless if the feature bits are not set.
>>>
>>> Before this change, when feature bits for RSS, MQ, MTU and MAC
>>> are not set, iproute2 output:
>>> $vdpa vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false mtu 1500
>>>    negotiated_features
>>>
>>> without this commit, function vdpa_dev_net_config_fill()
>>> reads all config space fields unconditionally, so let's
>>> assume the MAC and MTU are meaningless, and it checks
>>> MQ with driver_features, so we don't see max_vq_pairs.
>>>
>>> After applying this commit, when feature bits for
>>> MQ, RSS, MAC and MTU are not set,iproute2 output:
>>> $vdpa dev config show vdpa0
>>> vdpa0: link up link_announce false max_vq_pairs 1 mtu 1500
>>>    negotiated_features
>>>
>>> As explained above:
>>> Here is no MAC, because VIRTIO_NET_F_MAC is not set,
>>> and there is no default value for MAC. It shows
>>> max_vq_paris = 1 because even without MQ feature,
>>> a functional virtio-net must have one queue pair.
>>> mtu = 1500 is the default value as ethernet
>>> required.
>>>
>>> This commit also add supplementary comments for
>>> __virtio16_to_cpu(true, xxx) operations in
>>> vdpa_dev_net_config_fill() and vdpa_fill_stats_rec()
>>>
>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>> ---
>>>   drivers/vdpa/vdpa.c | 60 
>>> +++++++++++++++++++++++++++++++++++----------
>>>   1 file changed, 47 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>>> index efb55a06e961..a74660b98979 100644
>>> --- a/drivers/vdpa/vdpa.c
>>> +++ b/drivers/vdpa/vdpa.c
>>> @@ -801,19 +801,44 @@ static int vdpa_nl_cmd_dev_get_dumpit(struct 
>>> sk_buff *msg, struct netlink_callba
>>>       return msg->len;
>>>   }
>>>   -static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
>>> -                       struct sk_buff *msg, u64 features,
>>> +static int vdpa_dev_net_mq_config_fill(struct sk_buff *msg, u64 
>>> features,
>>>                          const struct virtio_net_config *config)
>>>   {
>>>       u16 val_u16;
>>>   -    if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0)
>>> -        return 0;
>>> +    if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0 &&
>>> +        (features & BIT_ULL(VIRTIO_NET_F_RSS)) == 0)
>>> +        val_u16 = 1;
>>> +    else
>>> +        val_u16 = __virtio16_to_cpu(true, 
>>> config->max_virtqueue_pairs);
>>>   -    val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
>>>       return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, val_u16);
>>>   }
>>>   +static int vdpa_dev_net_mtu_config_fill(struct sk_buff *msg, u64 
>>> features,
>>> +                    const struct virtio_net_config *config)
>>> +{
>>> +    u16 val_u16;
>>> +
>>> +    if ((features & BIT_ULL(VIRTIO_NET_F_MTU)) == 0)
>>> +        val_u16 = 1500;
>> As said, there's no virtio spec defined value for MTU. Please leave 
>> this field out if feature VIRTIO_NET_F_MTU is not negotiated.
> same as above
>>> +    else
>>> +        val_u16 = __virtio16_to_cpu(true, config->mtu);
>>> +
>>> +    return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16);
>>> +}
>>> +
>>> +static int vdpa_dev_net_mac_config_fill(struct sk_buff *msg, u64 
>>> features,
>>> +                    const struct virtio_net_config *config)
>>> +{
>>> +    if ((features & BIT_ULL(VIRTIO_NET_F_MAC)) == 0)
>>> +        return 0;
>>> +    else
>>> +        return  nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR,
>>> +                sizeof(config->mac), config->mac);
>>> +}
>>> +
>>> +
>>>   static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, 
>>> struct sk_buff *msg)
>>>   {
>>>       struct virtio_net_config config = {};
>>> @@ -822,18 +847,16 @@ static int vdpa_dev_net_config_fill(struct 
>>> vdpa_device *vdev, struct sk_buff *ms
>>>         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>   -    if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, 
>>> sizeof(config.mac),
>>> -            config.mac))
>>> -        return -EMSGSIZE;
>>> +    /*
>>> +     * Assume little endian for now, userspace can tweak this for
>>> +     * legacy guest support.
>> You can leave it as a TODO for kernel (vdpa core limitation), but 
>> AFAIK there's nothing userspace needs to do to infer the endianness. 
>> IMHO it's the kernel's job to provide an abstraction rather than rely 
>> on userspace guessing it.
> we have discussed it in another thread, and this comment is suggested 
> by MST.
>>
>>> +     */
>>> +    val_u16 = __virtio16_to_cpu(true, config.status);
>>>         val_u16 = __virtio16_to_cpu(true, config.status);
>>>       if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>           return -EMSGSIZE;
>>>   -    val_u16 = __virtio16_to_cpu(true, config.mtu);
>>> -    if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>> -        return -EMSGSIZE;
>>> -
>>>       features_driver = vdev->config->get_driver_features(vdev);
>>>       if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, 
>>> features_driver,
>>>                     VDPA_ATTR_PAD))
>>> @@ -846,7 +869,13 @@ static int vdpa_dev_net_config_fill(struct 
>>> vdpa_device *vdev, struct sk_buff *ms
>>>                     VDPA_ATTR_PAD))
>>>           return -EMSGSIZE;
>>>   -    return vdpa_dev_net_mq_config_fill(vdev, msg, 
>>> features_driver, &config);
>>> +    if (vdpa_dev_net_mac_config_fill(msg, features_device, &config))
>>> +        return -EMSGSIZE;
>>> +
>>> +    if (vdpa_dev_net_mtu_config_fill(msg, features_device, &config))
>>> +        return -EMSGSIZE;
>>> +
>>> +    return vdpa_dev_net_mq_config_fill(msg, features_device, &config);
>>>   }
>>>     static int
>>> @@ -914,6 +943,11 @@ static int vdpa_fill_stats_rec(struct 
>>> vdpa_device *vdev, struct sk_buff *msg,
>>>       }
>>>       vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>   +    /*
>>> +     * Assume little endian for now, userspace can tweak this for
>>> +     * legacy guest support.
>>> +     */
>>> +
>> Ditto.
> same as above
>
> Thanks
>>
>> Thanks,
>> -Siwei
>>>       max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
>>>       if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
>>>           return -EMSGSIZE;
>>
>


^ permalink raw reply

* Re: [PATCH 2/2] vDPA: conditionally read fields in virtio-net dev
From: Zhu, Lingshan @ 2022-08-16  4:18 UTC (permalink / raw)
  To: Parav Pandit, jasowang@redhat.com, mst@redhat.com
  Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
	kvm@vger.kernel.org, xieyongji@bytedance.com,
	gautam.dawar@amd.com
In-Reply-To: <PH0PR12MB54815EF8C19F70072169FA56DC6B9@PH0PR12MB5481.namprd12.prod.outlook.com>



On 8/16/2022 10:32 AM, Parav Pandit wrote:
>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>> Sent: Monday, August 15, 2022 5:27 AM
>>
>> Some fields of virtio-net device config space are conditional on the feature
>> bits, the spec says:
>>
>> "The mac address field always exists
>> (though is only valid if VIRTIO_NET_F_MAC is set)"
>>
>> "max_virtqueue_pairs only exists if VIRTIO_NET_F_MQ or
>> VIRTIO_NET_F_RSS is set"
>>
>> "mtu only exists if VIRTIO_NET_F_MTU is set"
>>
>> so we should read MTU, MAC and MQ in the device config space only when
>> these feature bits are offered.
> Yes.
>
>> For MQ, if both VIRTIO_NET_F_MQ and VIRTIO_NET_F_RSS are not set, the
>> virtio device should have one queue pair as default value, so when userspace
>> querying queue pair numbers, it should return mq=1 than zero.
> No.
> No need to treat mac and max_qps differently.
> It is meaningless to differentiate when field exist/not-exists vs value valid/not valid.
as we discussed before, MQ has a default value 1, to be a functional 
virtio-net device,
while MAC has no default value, if no VIRTIO_NET_F_MAC set, the driver 
should generate
a random MAC.
>
>> For MTU, if VIRTIO_NET_F_MTU is not set, we should not read MTU from
>> the device config sapce.
>> RFC894 <A Standard for the Transmission of IP Datagrams over Ethernet
>> Networks> says:"The minimum length of the data field of a packet sent over
>> an Ethernet is 1500 octets, thus the maximum length of an IP datagram sent
>> over an Ethernet is 1500 octets.  Implementations are encouraged to support
>> full-length packets"
> This line in the RFC 894 of 1984 is wrong.
> Errata already exists for it at [1].
>
> [1] https://www.rfc-editor.org/errata_search.php?rfc=894&rec_status=0
OK, so I think we should return nothing if _F_MTU not set, like handling 
the MAC
>
>> virtio spec says:"The virtio network device is a virtual ethernet card", so the
>> default MTU value should be 1500 for virtio-net.
>>
> Practically I have seen 1500 and highe mtu.
> And this derivation is not good of what should be the default mtu as above errata exists.
>
> And I see the code below why you need to work so hard to define a default value so that _MQ and _MTU can report default values.
>
> There is really no need for this complexity and such a long commit message.
>
> Can we please expose feature bits as-is and report config space field which are valid?
>
> User space will be querying both.
I think MAC and MTU don't have default values, so return nothing if the 
feature bits not set,
for MQ, it is still max_vq_paris == 1 by default.
>
>> For MAC, the spec says:"If the VIRTIO_NET_F_MAC feature bit is set, the
>> configuration space mac entry indicates the “physical” address of the
>> network card, otherwise the driver would typically generate a random local
>> MAC address." So there is no default MAC address if VIRTIO_NET_F_MAC
>> not set.
>>
>> This commits introduces functions vdpa_dev_net_mtu_config_fill() and
>> vdpa_dev_net_mac_config_fill() to fill MTU and MAC.
>> It also fixes vdpa_dev_net_mq_config_fill() to report correct MQ when
>> _F_MQ is not present.
>>
> Multiple changes in single patch are not good idea.
> Its ok to split to smaller patches.
OK, I can try to split it.
>
>> +	if ((features & BIT_ULL(VIRTIO_NET_F_MTU)) == 0)
>> +		val_u16 = 1500;
>> +	else
>> +		val_u16 = __virtio16_to_cpu(true, config->mtu);
>> +
> Need to work hard to find default values and that too turned out had errata.
> There are more fields that doesn’t have default values.
>
> There is no point in kernel doing this guess work, that user space can figure out of what is valid/invalid.
It's not guest work, when guest finds no feature bits set, it can decide 
what to do. These code only for the
user space, just MST ever suggest, if there is a default value, we can 
return it from the kernel, once for all.

Thanks
>


^ permalink raw reply

* Re: [PATCH bpf-next v7 4/8] bpf: Introduce cgroup iter
From: Andrii Nakryiko @ 2022-08-16  4:19 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Hao Luo, Alexei Starovoitov, Linux Kernel Mailing List, bpf,
	Cgroups, Networking, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Tejun Heo, Zefan Li, KP Singh, Johannes Weiner, Michal Hocko,
	Benjamin Tissoires, John Fastabend, Michal Koutny, Roman Gushchin,
	David Rientjes, Stanislav Fomichev, Shakeel Butt
In-Reply-To: <CAEf4BzZTrsBOPpCTFouoWZJG9yXkz8LZgLQrqDREAY-XdGb7ew@mail.gmail.com>

On Mon, Aug 15, 2022 at 9:12 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Aug 11, 2022 at 7:10 AM Yosry Ahmed <yosryahmed@google.com> wrote:
> >
> > On Wed, Aug 10, 2022 at 8:10 PM Hao Luo <haoluo@google.com> wrote:
> > >
> > > On Tue, Aug 9, 2022 at 11:38 AM Hao Luo <haoluo@google.com> wrote:
> > > >
> > > > On Tue, Aug 9, 2022 at 9:23 AM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Mon, Aug 08, 2022 at 05:56:57PM -0700, Hao Luo wrote:
> > > > > > On Mon, Aug 8, 2022 at 5:19 PM Andrii Nakryiko
> > > > > > <andrii.nakryiko@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Aug 5, 2022 at 2:49 PM Hao Luo <haoluo@google.com> wrote:
> > > > > > > >
> > > > > > > > Cgroup_iter is a type of bpf_iter. It walks over cgroups in four modes:
> > > > > > > >
> > > > > > > >  - walking a cgroup's descendants in pre-order.
> > > > > > > >  - walking a cgroup's descendants in post-order.
> > > > > > > >  - walking a cgroup's ancestors.
> > > > > > > >  - process only the given cgroup.
> > > > > > > >
> > > > [...]
> > > > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > > > > index 59a217ca2dfd..4d758b2e70d6 100644
> > > > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > > > @@ -87,10 +87,37 @@ struct bpf_cgroup_storage_key {
> > > > > > > >         __u32   attach_type;            /* program attach type (enum bpf_attach_type) */
> > > > > > > >  };
> > > > > > > >
> > > > > > > > +enum bpf_iter_order {
> > > > > > > > +       BPF_ITER_ORDER_DEFAULT = 0,     /* default order. */
> > > > > > >
> > > > > > > why is this default order necessary? It just adds confusion (I had to
> > > > > > > look up source code to know what is default order). I might have
> > > > > > > missed some discussion, so if there is some very good reason, then
> > > > > > > please document this in commit message. But I'd rather not do some
> > > > > > > magical default order instead. We can set 0 to mean invalid and error
> > > > > > > out, or just do SELF as the very first value (and if user forgot to
> > > > > > > specify more fancy mode, they hopefully will quickly discover this in
> > > > > > > their testing).
> > > > > > >
> > > > > >
> > > > > > PRE/POST/UP are tree-specific orders. SELF applies on all iters and
> > > > > > yields only a single object. How does task_iter express a non-self
> > > > > > order? By non-self, I mean something like "I don't care about the
> > > > > > order, just scan _all_ the objects". And this "don't care" order, IMO,
> > > > > > may be the common case. I don't think everyone cares about walking
> > > > > > order for tasks. The DEFAULT is intentionally put at the first value,
> > > > > > so that if users don't care about order, they don't have to specify
> > > > > > this field.
> > > > > >
> > > > > > If that sounds valid, maybe using "UNSPEC" instead of "DEFAULT" is better?
> > > > >
> > > > > I agree with Andrii.
> > > > > This:
> > > > > +       if (order == BPF_ITER_ORDER_DEFAULT)
> > > > > +               order = BPF_ITER_DESCENDANTS_PRE;
> > > > >
> > > > > looks like an arbitrary choice.
> > > > > imo
> > > > > BPF_ITER_DESCENDANTS_PRE = 0,
> > > > > would have been more obvious. No need to dig into definition of "default".
> > > > >
> > > > > UNSPEC = 0
> > > > > is fine too if we want user to always be conscious about the order
> > > > > and the kernel will error if that field is not initialized.
> > > > > That would be my preference, since it will match the rest of uapi/bpf.h
> > > > >
> > > >
> > > > Sounds good. In the next version, will use
> > > >
> > > > enum bpf_iter_order {
> > > >         BPF_ITER_ORDER_UNSPEC = 0,
> > > >         BPF_ITER_SELF_ONLY,             /* process only a single object. */
> > > >         BPF_ITER_DESCENDANTS_PRE,       /* walk descendants in pre-order. */
> > > >         BPF_ITER_DESCENDANTS_POST,      /* walk descendants in post-order. */
> > > >         BPF_ITER_ANCESTORS_UP,          /* walk ancestors upward. */
> > > > };
> > > >
> > >
> > > Sigh, I find that having UNSPEC=0 and erroring out when seeing UNSPEC
> > > doesn't work. Basically, if we have a non-iter prog and a cgroup_iter
> > > prog written in the same source file, I can't use
> > > bpf_object__attach_skeleton to attach them. Because the default
> > > prog_attach_fn for iter initializes `order` to 0 (that is, UNSPEC),
> > > which is going to be rejected by the kernel. In order to make
> > > bpf_object__attach_skeleton work on cgroup_iter, I think I need to use
> > > the following
> > >
> > > enum bpf_iter_order {
>
> so first of all, this can't be called "bpf_iter_order" as it doesn't
> apply to BPF iterators in general. I think this should be called
> bpf_iter_cgroup_order (or maybe bpf_cgroup_iter_order) and if/when we
> add ability to iterate tasks within cgroups then we'll just reuse enum
> bpf_iter_cgroup_order as an extra parameter for task iterator.
>
> And with that future case in mind I do think that we should have 0
> being "UNSPEC" case.
>
> > >         BPF_ITER_DESCENDANTS_PRE,       /* walk descendants in pre-order. */
> > >         BPF_ITER_DESCENDANTS_POST,      /* walk descendants in post-order. */
> > >         BPF_ITER_ANCESTORS_UP,          /* walk ancestors upward. */
> > >         BPF_ITER_SELF_ONLY,             /* process only a single object. */
> > > };
> > >
> > > So that when calling bpf_object__attach_skeleton() on cgroup_iter, a
> > > link can be generated and the generated link defaults to pre-order
> > > walk on the whole hierarchy. Is there a better solution?
> > >
>
> I was actually surprised that we specify these additional parameters
> at attach (LINK_CREATE) time, and not at bpf_iter_create() call time.
> It seems more appropriate to allow to specify such runtime parameters
> very late, when we create a specific instance of seq_file. But I guess
> this was done because one of the initial motivations for iterators was
> to be pinned in BPFFS and read as a file, so it was more convenient to
> store such parameters upfront at link creation time to keep
> BPF_OBJ_PIN simpler. I guess it makes sense, worst case you'll need to
> create multiple bpf_link files, one for each cgroup hierarchy you'd
> like to query with the same single BPF program.
>
> But I digress.
>
> As for not being able to auto-attach cgroup iterator. I think that's
> sort of expected and is in line with not being able to auto-attach
> cgroup programs, as you need cgroup FD at runtime. So even if you had
> some reasonable default order, you still would need to specify target
> cgroup (either through FD or ID).
>
> So... either don't do skeleton auto-attach, or let's teach libbpf code
> to not auto-attach some iter types?
>
> Alternatively, we could teach libbpf to parse some sort of cgroup
> iterator spec, like:
>
> SEC("iter/cgroup:/path/to/cgroup:descendants_pre")
>
> But this approach won't work for a bunch of other parameterized
> iterators (e.g., task iter, or map elem iter), so I'm hesitant about
> adding this to libbpf as a generic functionality.

As yet another alternative (given you seem to default to root cgroup
when cgroup_fd or cgroup_id is not specified), we can teach libbpf to
just specify BPF_ITER_DESCENDANTS_PRE (or whichever mode seems like
best default) only for auto-attach case. If that makes most sense.

>
> >
> > I think this can be handled by userspace? We can attach the
> > cgroup_iter separately first (and maybe we will need to set prog->link
> > as well) so that bpf_object__attach_skeleton() doesn't try to attach
> > it? I am following this pattern in the selftest in the final patch,
> > although I think I might be missing setting prog->link, so I am
> > wondering why there are no issues in that selftest which has the same
> > scenario that you are talking about.
> >
> > I think such a pattern will need to be used anyway if the users need
> > to set any non-default arguments for the cgroup_iter prog (like the
> > selftest), right? The only case we are discussing here is the case
> > where the user wants to attach the cgroup_iter with all default
> > options (in which case the default order will fail).
> > I agree that it might be inconvenient if the default/uninitialized
> > options don't work for cgroup_iter, but Alexei pointed out that this
> > matches other bpf uapis.
> >
> > My concern is that in the future we try to reuse enum bpf_iter_order
> > to set ordering for other iterators, and then the
> > default/uninitialized value (BPF_ITER_DESCENDANTS_PRE) doesn't make
> > sense for that iterator (e.g. not a tree). In this case, the same
> > problem that we are avoiding for cgroup_iter here will show up for
> > that iterator, and we can't easily change it at this point because
> > it's uapi.
>
> Yep, valid concern, I agree.
>
> >
> >
> > > > and explicitly list the values acceptable by cgroup_iter, error out if
> > > > UNSPEC is detected.
> > > >
> > > > Also, following Andrii's comments, will change BPF_ITER_SELF to
> > > > BPF_ITER_SELF_ONLY, which does seem a little bit explicit in
> > > > comparison.
> > > >
> > > > > I applied the first 3 patches to ease respin.
> > > >
> > > > Thanks! This helps!
> > > >
> > > > > Thanks!

^ permalink raw reply

* Re: [PATCH bpf-next v8 1/5] bpf: Introduce cgroup iter
From: Andrii Nakryiko @ 2022-08-16  4:17 UTC (permalink / raw)
  To: Hao Luo
  Cc: linux-kernel, bpf, cgroups, netdev, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, Tejun Heo, Zefan Li, KP Singh, Johannes Weiner,
	Michal Hocko, John Fastabend, Michal Koutny, Roman Gushchin,
	David Rientjes, Stanislav Fomichev, Shakeel Butt, Yosry Ahmed
In-Reply-To: <20220812202802.3774257-2-haoluo@google.com>

On Fri, Aug 12, 2022 at 1:28 PM Hao Luo <haoluo@google.com> wrote:
>
> Cgroup_iter is a type of bpf_iter. It walks over cgroups in four modes:
>
>  - walking a cgroup's descendants in pre-order.
>  - walking a cgroup's descendants in post-order.
>  - walking a cgroup's ancestors.
>  - process only the given cgroup.
>
> When attaching cgroup_iter, one can set a cgroup to the iter_link
> created from attaching. This cgroup is passed as a file descriptor
> or cgroup id and serves as the starting point of the walk. If no
> cgroup is specified, the starting point will be the root cgroup v2.
>
> For walking descendants, one can specify the order: either pre-order or
> post-order. For walking ancestors, the walk starts at the specified
> cgroup and ends at the root.
>
> One can also terminate the walk early by returning 1 from the iter
> program.
>
> Note that because walking cgroup hierarchy holds cgroup_mutex, the iter
> program is called with cgroup_mutex held.
>
> Currently only one session is supported, which means, depending on the
> volume of data bpf program intends to send to user space, the number
> of cgroups that can be walked is limited. For example, given the current
> buffer size is 8 * PAGE_SIZE, if the program sends 64B data for each
> cgroup, assuming PAGE_SIZE is 4kb, the total number of cgroups that can
> be walked is 512. This is a limitation of cgroup_iter. If the output
> data is larger than the kernel buffer size, after all data in the
> kernel buffer is consumed by user space, the subsequent read() syscall
> will signal EOPNOTSUPP. In order to work around, the user may have to
> update their program to reduce the volume of data sent to output. For
> example, skip some uninteresting cgroups. In future, we may extend
> bpf_iter flags to allow customizing buffer size.
>
> Acked-by: Yonghong Song <yhs@fb.com>
> Acked-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Hao Luo <haoluo@google.com>
> ---
>  include/linux/bpf.h                           |   8 +
>  include/uapi/linux/bpf.h                      |  35 +++
>  kernel/bpf/Makefile                           |   3 +
>  kernel/bpf/cgroup_iter.c                      | 283 ++++++++++++++++++
>  tools/include/uapi/linux/bpf.h                |  35 +++
>  .../selftests/bpf/prog_tests/btf_dump.c       |   4 +-
>  6 files changed, 366 insertions(+), 2 deletions(-)
>  create mode 100644 kernel/bpf/cgroup_iter.c
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index a627a02cf8ab..ecb8c61178a1 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -48,6 +48,7 @@ struct mem_cgroup;
>  struct module;
>  struct bpf_func_state;
>  struct ftrace_ops;
> +struct cgroup;
>
>  extern struct idr btf_idr;
>  extern spinlock_t btf_idr_lock;
> @@ -1730,7 +1731,14 @@ int bpf_obj_get_user(const char __user *pathname, int flags);
>         int __init bpf_iter_ ## target(args) { return 0; }
>
>  struct bpf_iter_aux_info {
> +       /* for map_elem iter */
>         struct bpf_map *map;
> +
> +       /* for cgroup iter */
> +       struct {
> +               struct cgroup *start; /* starting cgroup */
> +               int order;

why not using enum as a type here?

> +       } cgroup;
>  };
>
>  typedef int (*bpf_iter_attach_target_t)(struct bpf_prog *prog,
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 7d1e2794d83e..bc3c901b9f70 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -87,10 +87,34 @@ struct bpf_cgroup_storage_key {
>         __u32   attach_type;            /* program attach type (enum bpf_attach_type) */
>  };
>
> +enum bpf_iter_order {
> +       BPF_ITER_DESCENDANTS_PRE = 0,   /* walk descendants in pre-order. */
> +       BPF_ITER_DESCENDANTS_POST,      /* walk descendants in post-order. */
> +       BPF_ITER_ANCESTORS_UP,          /* walk ancestors upward. */
> +       BPF_ITER_SELF_ONLY,             /* process only a single object. */
> +};
> +
>  union bpf_iter_link_info {
>         struct {
>                 __u32   map_fd;
>         } map;
> +       struct {
> +               /* Users must specify order using one of the following values:
> +                *  - BPF_ITER_DESCENDANTS_PRE
> +                *  - BPF_ITER_DESCENDANTS_POST
> +                *  - BPF_ITER_ANCESTORS_UP
> +                *  - BPF_ITER_SELF_ONLY
> +                */
> +               __u32   order;

same, we just declared the UAPI enum above, why not specify that this
is that enum here?

> +
> +               /* At most one of cgroup_fd and cgroup_id can be non-zero. If
> +                * both are zero, the walk starts from the default cgroup v2
> +                * root. For walking v1 hierarchy, one should always explicitly
> +                * specify cgroup_fd.
> +                */
> +               __u32   cgroup_fd;
> +               __u64   cgroup_id;

for my own education, does root cgroup has cgroup_id == 0?

> +       } cgroup;
>  };
>

[...]

^ permalink raw reply

* Re: [PATCH 1/2] vDPA: allow userspace to query features of a vDPA device
From: Zhu, Lingshan @ 2022-08-16  4:21 UTC (permalink / raw)
  To: Parav Pandit, Si-Wei Liu, jasowang@redhat.com, mst@redhat.com
  Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
	kvm@vger.kernel.org, xieyongji@bytedance.com,
	gautam.dawar@amd.com
In-Reply-To: <PH0PR12MB5481EC56E9951B99B98A6668DC6B9@PH0PR12MB5481.namprd12.prod.outlook.com>



On 8/16/2022 10:07 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Monday, August 15, 2022 9:49 PM
>>
>> On 8/16/2022 2:15 AM, Si-Wei Liu wrote:
>>>
>>> On 8/15/2022 2:26 AM, Zhu Lingshan wrote:
>>>> This commit adds a new vDPA netlink attribution
>>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
>> features
>>>> of vDPA devices through this new attr.
>>>>
>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>> ---
>>>>    drivers/vdpa/vdpa.c       | 17 +++++++++++++----
>>>>    include/uapi/linux/vdpa.h |  3 +++
>>>>    2 files changed, 16 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>> c06c02704461..efb55a06e961 100644
>>>> --- a/drivers/vdpa/vdpa.c
>>>> +++ b/drivers/vdpa/vdpa.c
>>>> @@ -491,6 +491,8 @@ static int vdpa_mgmtdev_fill(const struct
>>>> vdpa_mgmt_dev *mdev, struct sk_buff *m
>>>>            err = -EMSGSIZE;
>>>>            goto msg_err;
>>>>        }
>>>> +
>>>> +    /* report features of a vDPA management device through
>>>> VDPA_ATTR_DEV_SUPPORTED_FEATURES */
>>>>        if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_SUPPORTED_FEATURES,
>>>>                      mdev->supported_features, VDPA_ATTR_PAD)) {
>>>>            err = -EMSGSIZE;
>>>> @@ -815,7 +817,7 @@ static int vdpa_dev_net_mq_config_fill(struct
>>>> vdpa_device *vdev,
>>>>    static int vdpa_dev_net_config_fill(struct vdpa_device *vdev,
>>>> struct sk_buff *msg)
>>>>    {
>>>>        struct virtio_net_config config = {};
>>>> -    u64 features;
>>>> +    u64 features_device, features_driver;
>>>>        u16 val_u16;
>>>>          vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>> @@ -832,12 +834,19 @@ static int vdpa_dev_net_config_fill(struct
>>>> vdpa_device *vdev, struct sk_buff *ms
>>>>        if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>            return -EMSGSIZE;
>>>>    -    features = vdev->config->get_driver_features(vdev);
>>>> -    if (nla_put_u64_64bit(msg,
>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES,
>>>> features,
>>>> +    features_driver = vdev->config->get_driver_features(vdev);
>>>> +    if (nla_put_u64_64bit(msg,
>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES,
>>>> features_driver,
>>>> +                  VDPA_ATTR_PAD))
>>>> +        return -EMSGSIZE;
>>>> +
>>>> +    features_device = vdev->config->get_device_features(vdev);
>>>> +
>>>> +    /* report features of a vDPA device through
>>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES */
>>>> +    if (nla_put_u64_64bit(msg,
>>>> VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES, features_device,
>>>>                      VDPA_ATTR_PAD))
>>>>            return -EMSGSIZE;
>>>>    -    return vdpa_dev_net_mq_config_fill(vdev, msg, features,
>>>> &config);
>>>> +    return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver,
>>>> &config);
>>>>    }
>>>>      static int
>>>> diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
>>>> index 25c55cab3d7c..d171b92ef522 100644
>>>> --- a/include/uapi/linux/vdpa.h
>>>> +++ b/include/uapi/linux/vdpa.h
>>>> @@ -46,7 +46,10 @@ enum vdpa_attr {
>>>>          VDPA_ATTR_DEV_NEGOTIATED_FEATURES,    /* u64 */
>>>>        VDPA_ATTR_DEV_MGMTDEV_MAX_VQS,        /* u32 */
>>>> +    /* features of a vDPA management device */
> Say comment as.
>        /* Features of the vDPA management device matching the virtio feature bits 0 to 63 when queried on the mgmt. device.
>          *  When returned on the vdpa device, it indicates virtio feature bits 0 to 63 of the vdpa device
>          */
This still suggests to re-use the attr VDPA_ATTR_DEV_SUPPORTED_FEATURES, 
I think a new attr should be better and easier
for others.
>
>>>>        VDPA_ATTR_DEV_SUPPORTED_FEATURES,    /* u64 */
>>>> +    /* features of a vDPA device, e.g., /dev/vhost-vdpa0 */
>>>> +    VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES,    /* u64 */
>>> Append to the end, please. Otherwise it breaks userspace ABI.
>> OK, will fix it in V2
> I have read Se-Wei comment in the v4.
> However I disagree, we don’t need to continue the past mistake done with the naming.
>
> Please add a comment that VDPA_ATTR_DEV_SUPPORTED_FEATURES like above about the exception.
> We established that there is no race condition either.
> So no need to add new UAPI for some past history.
Yes, no race conditions, the thing is, it is not a good practice to 
re-use a netlink attr,
every attr should has its own purpose, no need to confuse others, I 
think we had this discussion before.

Thanks


^ permalink raw reply

* Re: [PATCH bpf-next v7 4/8] bpf: Introduce cgroup iter
From: Andrii Nakryiko @ 2022-08-16  4:12 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Hao Luo, Alexei Starovoitov, Linux Kernel Mailing List, bpf,
	Cgroups, Networking, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Tejun Heo, Zefan Li, KP Singh, Johannes Weiner, Michal Hocko,
	Benjamin Tissoires, John Fastabend, Michal Koutny, Roman Gushchin,
	David Rientjes, Stanislav Fomichev, Shakeel Butt
In-Reply-To: <CAJD7tkY6ihK9PkaAwrdRr-3QyiVFf8h4WkLXx73zYwNUjS_7pw@mail.gmail.com>

On Thu, Aug 11, 2022 at 7:10 AM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> On Wed, Aug 10, 2022 at 8:10 PM Hao Luo <haoluo@google.com> wrote:
> >
> > On Tue, Aug 9, 2022 at 11:38 AM Hao Luo <haoluo@google.com> wrote:
> > >
> > > On Tue, Aug 9, 2022 at 9:23 AM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Mon, Aug 08, 2022 at 05:56:57PM -0700, Hao Luo wrote:
> > > > > On Mon, Aug 8, 2022 at 5:19 PM Andrii Nakryiko
> > > > > <andrii.nakryiko@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Aug 5, 2022 at 2:49 PM Hao Luo <haoluo@google.com> wrote:
> > > > > > >
> > > > > > > Cgroup_iter is a type of bpf_iter. It walks over cgroups in four modes:
> > > > > > >
> > > > > > >  - walking a cgroup's descendants in pre-order.
> > > > > > >  - walking a cgroup's descendants in post-order.
> > > > > > >  - walking a cgroup's ancestors.
> > > > > > >  - process only the given cgroup.
> > > > > > >
> > > [...]
> > > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > > > index 59a217ca2dfd..4d758b2e70d6 100644
> > > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > > @@ -87,10 +87,37 @@ struct bpf_cgroup_storage_key {
> > > > > > >         __u32   attach_type;            /* program attach type (enum bpf_attach_type) */
> > > > > > >  };
> > > > > > >
> > > > > > > +enum bpf_iter_order {
> > > > > > > +       BPF_ITER_ORDER_DEFAULT = 0,     /* default order. */
> > > > > >
> > > > > > why is this default order necessary? It just adds confusion (I had to
> > > > > > look up source code to know what is default order). I might have
> > > > > > missed some discussion, so if there is some very good reason, then
> > > > > > please document this in commit message. But I'd rather not do some
> > > > > > magical default order instead. We can set 0 to mean invalid and error
> > > > > > out, or just do SELF as the very first value (and if user forgot to
> > > > > > specify more fancy mode, they hopefully will quickly discover this in
> > > > > > their testing).
> > > > > >
> > > > >
> > > > > PRE/POST/UP are tree-specific orders. SELF applies on all iters and
> > > > > yields only a single object. How does task_iter express a non-self
> > > > > order? By non-self, I mean something like "I don't care about the
> > > > > order, just scan _all_ the objects". And this "don't care" order, IMO,
> > > > > may be the common case. I don't think everyone cares about walking
> > > > > order for tasks. The DEFAULT is intentionally put at the first value,
> > > > > so that if users don't care about order, they don't have to specify
> > > > > this field.
> > > > >
> > > > > If that sounds valid, maybe using "UNSPEC" instead of "DEFAULT" is better?
> > > >
> > > > I agree with Andrii.
> > > > This:
> > > > +       if (order == BPF_ITER_ORDER_DEFAULT)
> > > > +               order = BPF_ITER_DESCENDANTS_PRE;
> > > >
> > > > looks like an arbitrary choice.
> > > > imo
> > > > BPF_ITER_DESCENDANTS_PRE = 0,
> > > > would have been more obvious. No need to dig into definition of "default".
> > > >
> > > > UNSPEC = 0
> > > > is fine too if we want user to always be conscious about the order
> > > > and the kernel will error if that field is not initialized.
> > > > That would be my preference, since it will match the rest of uapi/bpf.h
> > > >
> > >
> > > Sounds good. In the next version, will use
> > >
> > > enum bpf_iter_order {
> > >         BPF_ITER_ORDER_UNSPEC = 0,
> > >         BPF_ITER_SELF_ONLY,             /* process only a single object. */
> > >         BPF_ITER_DESCENDANTS_PRE,       /* walk descendants in pre-order. */
> > >         BPF_ITER_DESCENDANTS_POST,      /* walk descendants in post-order. */
> > >         BPF_ITER_ANCESTORS_UP,          /* walk ancestors upward. */
> > > };
> > >
> >
> > Sigh, I find that having UNSPEC=0 and erroring out when seeing UNSPEC
> > doesn't work. Basically, if we have a non-iter prog and a cgroup_iter
> > prog written in the same source file, I can't use
> > bpf_object__attach_skeleton to attach them. Because the default
> > prog_attach_fn for iter initializes `order` to 0 (that is, UNSPEC),
> > which is going to be rejected by the kernel. In order to make
> > bpf_object__attach_skeleton work on cgroup_iter, I think I need to use
> > the following
> >
> > enum bpf_iter_order {

so first of all, this can't be called "bpf_iter_order" as it doesn't
apply to BPF iterators in general. I think this should be called
bpf_iter_cgroup_order (or maybe bpf_cgroup_iter_order) and if/when we
add ability to iterate tasks within cgroups then we'll just reuse enum
bpf_iter_cgroup_order as an extra parameter for task iterator.

And with that future case in mind I do think that we should have 0
being "UNSPEC" case.

> >         BPF_ITER_DESCENDANTS_PRE,       /* walk descendants in pre-order. */
> >         BPF_ITER_DESCENDANTS_POST,      /* walk descendants in post-order. */
> >         BPF_ITER_ANCESTORS_UP,          /* walk ancestors upward. */
> >         BPF_ITER_SELF_ONLY,             /* process only a single object. */
> > };
> >
> > So that when calling bpf_object__attach_skeleton() on cgroup_iter, a
> > link can be generated and the generated link defaults to pre-order
> > walk on the whole hierarchy. Is there a better solution?
> >

I was actually surprised that we specify these additional parameters
at attach (LINK_CREATE) time, and not at bpf_iter_create() call time.
It seems more appropriate to allow to specify such runtime parameters
very late, when we create a specific instance of seq_file. But I guess
this was done because one of the initial motivations for iterators was
to be pinned in BPFFS and read as a file, so it was more convenient to
store such parameters upfront at link creation time to keep
BPF_OBJ_PIN simpler. I guess it makes sense, worst case you'll need to
create multiple bpf_link files, one for each cgroup hierarchy you'd
like to query with the same single BPF program.

But I digress.

As for not being able to auto-attach cgroup iterator. I think that's
sort of expected and is in line with not being able to auto-attach
cgroup programs, as you need cgroup FD at runtime. So even if you had
some reasonable default order, you still would need to specify target
cgroup (either through FD or ID).

So... either don't do skeleton auto-attach, or let's teach libbpf code
to not auto-attach some iter types?

Alternatively, we could teach libbpf to parse some sort of cgroup
iterator spec, like:

SEC("iter/cgroup:/path/to/cgroup:descendants_pre")

But this approach won't work for a bunch of other parameterized
iterators (e.g., task iter, or map elem iter), so I'm hesitant about
adding this to libbpf as a generic functionality.

>
> I think this can be handled by userspace? We can attach the
> cgroup_iter separately first (and maybe we will need to set prog->link
> as well) so that bpf_object__attach_skeleton() doesn't try to attach
> it? I am following this pattern in the selftest in the final patch,
> although I think I might be missing setting prog->link, so I am
> wondering why there are no issues in that selftest which has the same
> scenario that you are talking about.
>
> I think such a pattern will need to be used anyway if the users need
> to set any non-default arguments for the cgroup_iter prog (like the
> selftest), right? The only case we are discussing here is the case
> where the user wants to attach the cgroup_iter with all default
> options (in which case the default order will fail).
> I agree that it might be inconvenient if the default/uninitialized
> options don't work for cgroup_iter, but Alexei pointed out that this
> matches other bpf uapis.
>
> My concern is that in the future we try to reuse enum bpf_iter_order
> to set ordering for other iterators, and then the
> default/uninitialized value (BPF_ITER_DESCENDANTS_PRE) doesn't make
> sense for that iterator (e.g. not a tree). In this case, the same
> problem that we are avoiding for cgroup_iter here will show up for
> that iterator, and we can't easily change it at this point because
> it's uapi.

Yep, valid concern, I agree.

>
>
> > > and explicitly list the values acceptable by cgroup_iter, error out if
> > > UNSPEC is detected.
> > >
> > > Also, following Andrii's comments, will change BPF_ITER_SELF to
> > > BPF_ITER_SELF_ONLY, which does seem a little bit explicit in
> > > comparison.
> > >
> > > > I applied the first 3 patches to ease respin.
> > >
> > > Thanks! This helps!
> > >
> > > > Thanks!

^ permalink raw reply

* [PATCH net-next 5/5] net: wwan: t7xx: Devlink documentation
From: m.chetan.kumar @ 2022-08-16  4:24 UTC (permalink / raw)
  To: netdev
  Cc: kuba, davem, johannes, ryazanov.s.a, loic.poulain, krishna.c.sudi,
	m.chetan.kumar, linuxwwan, Devegowda Chandrashekar

From: M Chetan Kumar <m.chetan.kumar@linux.intel.com>

Document the t7xx devlink commands usage for fw flashing &
coredump collection.

Refer to t7xx.rst file for details.

Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Signed-off-by: Devegowda Chandrashekar <chandrashekar.devegowda@intel.com>
---
 Documentation/networking/devlink/index.rst |   1 +
 Documentation/networking/devlink/t7xx.rst  | 145 +++++++++++++++++++++
 2 files changed, 146 insertions(+)
 create mode 100644 Documentation/networking/devlink/t7xx.rst

diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index 850715512293..ed47de1d3b75 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -66,3 +66,4 @@ parameters, info versions, and other features it supports.
    prestera
    iosm
    octeontx2
+   t7xx
diff --git a/Documentation/networking/devlink/t7xx.rst b/Documentation/networking/devlink/t7xx.rst
new file mode 100644
index 000000000000..c0c83ed2d38b
--- /dev/null
+++ b/Documentation/networking/devlink/t7xx.rst
@@ -0,0 +1,145 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+t7xx devlink support
+====================
+
+This document describes the devlink features implemented by the ``t7xx``
+device driver.
+
+Flash Update
+============
+
+The ``t7xx`` driver implements the flash update using the ``devlink-flash``
+interface.
+
+The driver uses DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT to identify the type of
+firmware image that need to be programmed upon the request by user space application.
+
+The supported list of firmware image types is described below.
+
+.. list-table:: Firmware Image types
+    :widths: 15 85
+
+    * - Name
+      - Description
+    * - ``preloader``
+      - The first-stage bootloader image
+    * - ``loader_ext1``
+      - Preloader extension image
+    * - ``tee1``
+      - ARM trusted firmware and TEE (Trusted Execution Environment) image
+    * - ``lk``
+      - The second-stage bootloader image
+    * - ``spmfw``
+      - MediaTek in-house ASIC for power management image
+    * - ``sspm_1``
+      - MediaTek in-house ASIC for power management under secure world image
+    * - ``mcupm_1``
+      - MediaTek in-house ASIC for cpu power management image
+    * - ``dpm_1``
+      - MediaTek in-house ASIC for dram power management image
+    * - ``boot``
+      - The kernel and dtb image
+    * - ``rootfs``
+      - Root filesystem image
+    * - ``md1img``
+      - Modem image
+    * - ``md1dsp``
+      - Modem DSP image
+    * - ``mcf1``
+      - Modem OTA image (Modem Configuration Framework) for operators
+    * - ``mcf2``
+      - Modem OTA image (Modem Configuration Framework) for OEM vendors
+    * - ``mcf3``
+      - Modem OTA image (other usage) for OEM configurations
+
+``t7xx`` driver uses fastboot protocol for fw flashing. In the fw flashing
+procedure, fastboot command's & response's are exchanged between driver
+and wwan device.
+
+The wwan device is put into fastboot mode via devlink reload command, by
+passing "driver_reinit" action.
+
+$ devlink dev reload pci/0000:$bdf action driver_reinit
+
+Upon completion of fw flashing or coredump collection the wwan device is
+reset to normal mode using devlink reload command, by passing "fw_activate"
+action.
+
+$ devlink dev reload pci/0000:$bdf action fw_activate
+
+Flash Commands:
+===============
+
+$ devlink dev flash pci/0000:$bdf file preloader_k6880v1_mdot2_datacard.bin component "preloader"
+
+$ devlink dev flash pci/0000:$bdf file loader_ext-verified.img component "loader_ext1"
+
+$ devlink dev flash pci/0000:$bdf file tee-verified.img component "tee1"
+
+$ devlink dev flash pci/0000:$bdf file lk-verified.img component "lk"
+
+$ devlink dev flash pci/0000:$bdf file spmfw-verified.img component "spmfw"
+
+$ devlink dev flash pci/0000:$bdf file sspm-verified.img component "sspm_1"
+
+$ devlink dev flash pci/0000:$bdf file mcupm-verified.img component "mcupm_1"
+
+$ devlink dev flash pci/0000:$bdf file dpm-verified.img component "dpm_1"
+
+$ devlink dev flash pci/0000:$bdf file boot-verified.img component "boot"
+
+$ devlink dev flash pci/0000:$bdf file root.squashfs component "rootfs"
+
+$ devlink dev flash pci/0000:$bdf file modem-verified.img component "md1img"
+
+$ devlink dev flash pci/0000:$bdf file dsp-verified.bin component "md1dsp"
+
+$ devlink dev flash pci/0000:$bdf file OP_OTA.img component "mcf1"
+
+$ devlink dev flash pci/0000:$bdf file OEM_OTA.img component "mcf2"
+
+$ devlink dev flash pci/0000:$bdf file DEV_OTA.img component "mcf3"
+
+Note: component "value" represents the partition type to be programmed.
+
+Regions
+=======
+
+The ``t7xx`` driver supports core dump collection when device encounters
+an exception. When wwan device encounters an exception, a snapshot of device
+internal data will be taken by the driver using fastboot commands.
+
+Following regions are accessed for device internal data.
+
+.. list-table:: Regions implemented
+    :widths: 15 85
+
+    * - Name
+      - Description
+    * - ``mr_dump``
+      - The detailed modem components log are captured in this region
+    * - ``lk_dump``
+      - This region dumps the current snapshot of lk
+
+
+Region commands
+===============
+
+$ devlink region show
+
+
+$ devlink region new mr_dump
+
+$ devlink region read mr_dump snapshot 0 address 0 length $len
+
+$ devlink region del mr_dump snapshot 0
+
+$ devlink region new lk_dump
+
+$ devlink region read lk_dump snapshot 0 address 0 length $len
+
+$ devlink region del lk_dump snapshot 0
+
+Note: $len is actual len to be dumped.
-- 
2.34.1


^ permalink raw reply related

* [PATCH net-next 4/5] net: wwan: t7xx: Enable devlink based fw flashing and coredump collection
From: m.chetan.kumar @ 2022-08-16  4:24 UTC (permalink / raw)
  To: netdev
  Cc: kuba, davem, johannes, ryazanov.s.a, loic.poulain, krishna.c.sudi,
	m.chetan.kumar, linuxwwan, Devegowda Chandrashekar,
	Mishra Soumya Prakash

From: M Chetan Kumar <m.chetan.kumar@linux.intel.com>

This patch brings-in support for t7xx wwan device firmware flashing &
coredump collection using devlink.

Driver Registers with Devlink framework.
Implements devlink ops flash_update callback that programs modem firmware.
Creates region & snapshot required for device coredump log collection.
On early detection of wwan device in fastboot mode driver sets up CLDMA0 HW
tx/rx queues for raw data transfer then registers with devlink framework.
Upon receiving firmware image & partition details driver sends fastboot
commands for flashing the firmware.

In this flow the fastboot command & response gets exchanged between driver
and device. Once firmware flashing is success completion status is reported
to user space application.

Below is the devlink command usage for firmware flashing

$devlink dev flash pci/$BDF file ABC.img component ABC

Note: ABC.img is the firmware to be programmed to "ABC" partition.

In case of coredump collection when wwan device encounters an exception
it reboots & stays in fastboot mode for coredump collection by host driver.
On detecting exception state driver collects the core dump, creates the
devlink region & reports an event to user space application for dump
collection. The user space application invokes devlink region read command
for dump collection.

Below are the devlink commands used for coredump collection.

devlink region new pci/$BDF/mr_dump
devlink region read pci/$BDF/mr_dump snapshot $ID address $ADD length $LEN
devlink region del pci/$BDF/mr_dump snapshot $ID

Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Signed-off-by: Devegowda Chandrashekar <chandrashekar.devegowda@intel.com>
Signed-off-by: Mishra Soumya Prakash <soumya.prakash.mishra@intel.com>
---
 drivers/net/wwan/Kconfig                   |   1 +
 drivers/net/wwan/t7xx/Makefile             |   4 +-
 drivers/net/wwan/t7xx/t7xx_pci.c           |  14 +-
 drivers/net/wwan/t7xx/t7xx_pci.h           |   2 +
 drivers/net/wwan/t7xx/t7xx_port.h          |   1 +
 drivers/net/wwan/t7xx/t7xx_port_devlink.c  | 705 +++++++++++++++++++++
 drivers/net/wwan/t7xx/t7xx_port_devlink.h  |  85 +++
 drivers/net/wwan/t7xx/t7xx_port_proxy.c    |   2 +
 drivers/net/wwan/t7xx/t7xx_port_proxy.h    |   1 +
 drivers/net/wwan/t7xx/t7xx_reg.h           |   6 +
 drivers/net/wwan/t7xx/t7xx_state_monitor.c |  36 +-
 drivers/net/wwan/t7xx/t7xx_uevent.c        |  41 ++
 drivers/net/wwan/t7xx/t7xx_uevent.h        |  39 ++
 13 files changed, 932 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/wwan/t7xx/t7xx_port_devlink.c
 create mode 100644 drivers/net/wwan/t7xx/t7xx_port_devlink.h
 create mode 100644 drivers/net/wwan/t7xx/t7xx_uevent.c
 create mode 100644 drivers/net/wwan/t7xx/t7xx_uevent.h

diff --git a/drivers/net/wwan/Kconfig b/drivers/net/wwan/Kconfig
index 3486ffe94ac4..73b8cc1db0bd 100644
--- a/drivers/net/wwan/Kconfig
+++ b/drivers/net/wwan/Kconfig
@@ -108,6 +108,7 @@ config IOSM
 config MTK_T7XX
 	tristate "MediaTek PCIe 5G WWAN modem T7xx device"
 	depends on PCI
+	select NET_DEVLINK
 	help
 	  Enables MediaTek PCIe based 5G WWAN modem (T7xx series) device.
 	  Adapts WWAN framework and provides network interface like wwan0
diff --git a/drivers/net/wwan/t7xx/Makefile b/drivers/net/wwan/t7xx/Makefile
index df42764b3df8..91ecabf29dd1 100644
--- a/drivers/net/wwan/t7xx/Makefile
+++ b/drivers/net/wwan/t7xx/Makefile
@@ -18,4 +18,6 @@ mtk_t7xx-y:=	t7xx_pci.o \
 		t7xx_hif_dpmaif_rx.o  \
 		t7xx_dpmaif.o \
 		t7xx_netdev.o \
-		t7xx_pci_rescan.o
+		t7xx_pci_rescan.o \
+		t7xx_uevent.o \
+		t7xx_port_devlink.o
diff --git a/drivers/net/wwan/t7xx/t7xx_pci.c b/drivers/net/wwan/t7xx/t7xx_pci.c
index 2f5c6fbe601e..14cdf00cac8e 100644
--- a/drivers/net/wwan/t7xx/t7xx_pci.c
+++ b/drivers/net/wwan/t7xx/t7xx_pci.c
@@ -40,6 +40,7 @@
 #include "t7xx_pci.h"
 #include "t7xx_pci_rescan.h"
 #include "t7xx_pcie_mac.h"
+#include "t7xx_port_devlink.h"
 #include "t7xx_reg.h"
 #include "t7xx_state_monitor.h"
 
@@ -704,16 +705,20 @@ static int t7xx_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	t7xx_pci_infracfg_ao_calc(t7xx_dev);
 	t7xx_mhccif_init(t7xx_dev);
 
-	ret = t7xx_md_init(t7xx_dev);
+	ret = t7xx_devlink_register(t7xx_dev);
 	if (ret)
 		return ret;
 
+	ret = t7xx_md_init(t7xx_dev);
+	if (ret)
+		goto err_devlink_unregister;
+
 	t7xx_pcie_mac_interrupts_dis(t7xx_dev);
 
 	ret = t7xx_interrupt_init(t7xx_dev);
 	if (ret) {
 		t7xx_md_exit(t7xx_dev);
-		return ret;
+		goto err_devlink_unregister;
 	}
 
 	t7xx_rescan_done();
@@ -723,6 +728,10 @@ static int t7xx_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		pci_ignore_hotplug(pdev);
 
 	return 0;
+
+err_devlink_unregister:
+	t7xx_devlink_unregister(t7xx_dev);
+	return ret;
 }
 
 static void t7xx_pci_remove(struct pci_dev *pdev)
@@ -732,6 +741,7 @@ static void t7xx_pci_remove(struct pci_dev *pdev)
 
 	t7xx_dev = pci_get_drvdata(pdev);
 	t7xx_md_exit(t7xx_dev);
+	t7xx_devlink_unregister(t7xx_dev);
 
 	for (i = 0; i < EXT_INT_NUM; i++) {
 		if (!t7xx_dev->intr_handler[i])
diff --git a/drivers/net/wwan/t7xx/t7xx_pci.h b/drivers/net/wwan/t7xx/t7xx_pci.h
index a87c4cae94ef..1017d21aad59 100644
--- a/drivers/net/wwan/t7xx/t7xx_pci.h
+++ b/drivers/net/wwan/t7xx/t7xx_pci.h
@@ -59,6 +59,7 @@ typedef irqreturn_t (*t7xx_intr_callback)(int irq, void *param);
  * @md_pm_lock: protects PCIe sleep lock
  * @sleep_disable_count: PCIe L1.2 lock counter
  * @sleep_lock_acquire: indicates that sleep has been disabled
+ * @dl: devlink struct
  */
 struct t7xx_pci_dev {
 	t7xx_intr_callback	intr_handler[EXT_INT_NUM];
@@ -79,6 +80,7 @@ struct t7xx_pci_dev {
 	spinlock_t		md_pm_lock;		/* Protects PCI resource lock */
 	unsigned int		sleep_disable_count;
 	struct completion	sleep_lock_acquire;
+	struct t7xx_devlink	*dl;
 };
 
 enum t7xx_pm_id {
diff --git a/drivers/net/wwan/t7xx/t7xx_port.h b/drivers/net/wwan/t7xx/t7xx_port.h
index 6a96ee6d9449..070097a658d1 100644
--- a/drivers/net/wwan/t7xx/t7xx_port.h
+++ b/drivers/net/wwan/t7xx/t7xx_port.h
@@ -129,6 +129,7 @@ struct t7xx_port {
 	int				rx_length_th;
 	bool				chan_enable;
 	struct task_struct		*thread;
+	struct t7xx_devlink	*dl;
 };
 
 int t7xx_get_port_mtu(struct t7xx_port *port);
diff --git a/drivers/net/wwan/t7xx/t7xx_port_devlink.c b/drivers/net/wwan/t7xx/t7xx_port_devlink.c
new file mode 100644
index 000000000000..026a1db42f69
--- /dev/null
+++ b/drivers/net/wwan/t7xx/t7xx_port_devlink.c
@@ -0,0 +1,705 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022, Intel Corporation.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/debugfs.h>
+#include <linux/vmalloc.h>
+
+#include "t7xx_hif_cldma.h"
+#include "t7xx_pci_rescan.h"
+#include "t7xx_port_devlink.h"
+#include "t7xx_port_proxy.h"
+#include "t7xx_state_monitor.h"
+#include "t7xx_uevent.h"
+
+static struct t7xx_devlink_region_info t7xx_devlink_region_list[T7XX_TOTAL_REGIONS] = {
+	{"mr_dump", T7XX_MRDUMP_SIZE},
+	{"lk_dump", T7XX_LKDUMP_SIZE},
+};
+
+static int t7xx_devlink_port_read(struct t7xx_port *port, char *buf, size_t count)
+{
+	int ret = 0, read_len;
+	struct sk_buff *skb;
+
+	spin_lock_irq(&port->rx_wq.lock);
+	if (skb_queue_empty(&port->rx_skb_list)) {
+		ret = wait_event_interruptible_locked_irq(port->rx_wq,
+							  !skb_queue_empty(&port->rx_skb_list));
+		if (ret == -ERESTARTSYS) {
+			spin_unlock_irq(&port->rx_wq.lock);
+			return -EINTR;
+		}
+	}
+	skb = skb_dequeue(&port->rx_skb_list);
+	spin_unlock_irq(&port->rx_wq.lock);
+
+	read_len = count > skb->len ? skb->len : count;
+	memcpy(buf, skb->data, read_len);
+	dev_kfree_skb(skb);
+
+	return ret ? ret : read_len;
+}
+
+static int t7xx_devlink_port_write(struct t7xx_port *port, const char *buf, size_t count)
+{
+	const struct t7xx_port_conf *port_conf = port->port_conf;
+	size_t actual_count;
+	struct sk_buff *skb;
+	int ret, txq_mtu;
+
+	txq_mtu = t7xx_get_port_mtu(port);
+	if (txq_mtu < 0)
+		return -EINVAL;
+
+	actual_count = count > txq_mtu ? txq_mtu : count;
+	skb = __dev_alloc_skb(actual_count, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	skb_put_data(skb, buf, actual_count);
+	ret = t7xx_port_send_raw_skb(port, skb);
+	if (ret) {
+		dev_err(port->dev, "write error on %s, size: %zu, ret: %d\n",
+			port_conf->name, actual_count, ret);
+		dev_kfree_skb(skb);
+		return ret;
+	}
+
+	return actual_count;
+}
+
+static int t7xx_devlink_fb_handle_response(struct t7xx_port *port, int *data)
+{
+	int ret = 0, index = 0, return_data = 0, read_bytes;
+	char status[T7XX_FB_RESPONSE_SIZE + 1];
+
+	while (index < T7XX_FB_RESP_COUNT) {
+		index++;
+		read_bytes = t7xx_devlink_port_read(port, status, T7XX_FB_RESPONSE_SIZE);
+		if (read_bytes < 0) {
+			dev_err(port->dev, "status read failed");
+			ret = -EIO;
+			break;
+		}
+
+		status[read_bytes] = '\0';
+		if (!strncmp(status, T7XX_FB_RESP_INFO, strlen(T7XX_FB_RESP_INFO))) {
+			break;
+		} else if (!strncmp(status, T7XX_FB_RESP_OKAY, strlen(T7XX_FB_RESP_OKAY))) {
+			break;
+		} else if (!strncmp(status, T7XX_FB_RESP_FAIL, strlen(T7XX_FB_RESP_FAIL))) {
+			ret = -EPROTO;
+			break;
+		} else if (!strncmp(status, T7XX_FB_RESP_DATA, strlen(T7XX_FB_RESP_DATA))) {
+			if (data) {
+				if (!kstrtoint(status + strlen(T7XX_FB_RESP_DATA), 16,
+					       &return_data)) {
+					*data = return_data;
+				} else {
+					dev_err(port->dev, "kstrtoint error!\n");
+					ret = -EPROTO;
+				}
+			}
+			break;
+		}
+	}
+
+	return ret;
+}
+
+static int t7xx_devlink_fb_raw_command(char *cmd, struct t7xx_port *port, int *data)
+{
+	int ret, cmd_size = strlen(cmd);
+
+	if (cmd_size > T7XX_FB_COMMAND_SIZE) {
+		dev_err(port->dev, "command length %d is long\n", cmd_size);
+		return -EINVAL;
+	}
+
+	if (cmd_size != t7xx_devlink_port_write(port, cmd, cmd_size)) {
+		dev_err(port->dev, "raw command = %s write failed\n", cmd);
+		return -EIO;
+	}
+
+	dev_dbg(port->dev, "raw command = %s written to the device\n", cmd);
+	ret = t7xx_devlink_fb_handle_response(port, data);
+	if (ret)
+		dev_err(port->dev, "raw command = %s response FAILURE:%d\n", cmd, ret);
+
+	return ret;
+}
+
+static int t7xx_devlink_fb_send_buffer(struct t7xx_port *port, const u8 *buf, size_t size)
+{
+	size_t remaining = size, offset = 0, len;
+	int write_done;
+
+	if (!size)
+		return -EINVAL;
+
+	while (remaining) {
+		len = min_t(size_t, remaining, CLDMA_DEDICATED_Q_BUFF_SZ);
+		write_done = t7xx_devlink_port_write(port, buf + offset, len);
+
+		if (write_done < 0) {
+			dev_err(port->dev, "write to device failed in %s", __func__);
+			return -EIO;
+		} else if (write_done != len) {
+			dev_err(port->dev, "write Error. Only %d/%zu bytes written",
+				write_done, len);
+			return -EIO;
+		}
+
+		remaining -= len;
+		offset += len;
+	}
+
+	return 0;
+}
+
+static int t7xx_devlink_fb_download_command(struct t7xx_port *port, size_t size)
+{
+	char download_command[T7XX_FB_COMMAND_SIZE];
+
+	snprintf(download_command, sizeof(download_command), "%s:%08zx",
+		 T7XX_FB_CMD_DOWNLOAD, size);
+	return t7xx_devlink_fb_raw_command(download_command, port, NULL);
+}
+
+static int t7xx_devlink_fb_download(struct t7xx_port *port, const u8 *buf, size_t size)
+{
+	int ret;
+
+	if (size <= 0 || size > SIZE_MAX) {
+		dev_err(port->dev, "file is too large to download");
+		return -EINVAL;
+	}
+
+	ret = t7xx_devlink_fb_download_command(port, size);
+	if (ret)
+		return ret;
+
+	ret = t7xx_devlink_fb_send_buffer(port, buf, size);
+	if (ret)
+		return ret;
+
+	return t7xx_devlink_fb_handle_response(port, NULL);
+}
+
+static int t7xx_devlink_fb_flash(const char *cmd, struct t7xx_port *port)
+{
+	char flash_command[T7XX_FB_COMMAND_SIZE];
+
+	snprintf(flash_command, sizeof(flash_command), "%s:%s", T7XX_FB_CMD_FLASH, cmd);
+	return t7xx_devlink_fb_raw_command(flash_command, port, NULL);
+}
+
+static int t7xx_devlink_fb_flash_partition(const char *partition, const u8 *buf,
+					   struct t7xx_port *port, size_t size)
+{
+	int ret;
+
+	ret = t7xx_devlink_fb_download(port, buf, size);
+	if (ret)
+		return ret;
+
+	return t7xx_devlink_fb_flash(partition, port);
+}
+
+static int t7xx_devlink_fb_get_core(struct t7xx_port *port)
+{
+	struct t7xx_devlink_region_info *mrdump_region;
+	char mrdump_complete_event[T7XX_FB_EVENT_SIZE];
+	u32 mrd_mb = T7XX_MRDUMP_SIZE / (1024 * 1024);
+	struct t7xx_devlink *dl = port->dl;
+	int clen, dlen = 0, result = 0;
+	unsigned long long zipsize = 0;
+	char mcmd[T7XX_FB_MCMD_SIZE];
+	size_t offset_dlen = 0;
+	char *mdata;
+
+	set_bit(T7XX_MRDUMP_STATUS, &dl->status);
+	mdata = kmalloc(T7XX_FB_MDATA_SIZE, GFP_KERNEL);
+	if (!mdata) {
+		result = -ENOMEM;
+		goto get_core_exit;
+	}
+
+	mrdump_region = dl->dl_region_info[T7XX_MRDUMP_INDEX];
+	mrdump_region->dump = vmalloc(mrdump_region->default_size);
+	if (!mrdump_region->dump) {
+		kfree(mdata);
+		result = -ENOMEM;
+		goto get_core_exit;
+	}
+
+	result = t7xx_devlink_fb_raw_command(T7XX_FB_CMD_OEM_MRDUMP, port, NULL);
+	if (result) {
+		dev_err(port->dev, "%s command failed\n", T7XX_FB_CMD_OEM_MRDUMP);
+		vfree(mrdump_region->dump);
+		kfree(mdata);
+		goto get_core_exit;
+	}
+
+	while (mrdump_region->default_size > offset_dlen) {
+		clen = t7xx_devlink_port_read(port, mcmd, sizeof(mcmd));
+		if (clen == strlen(T7XX_FB_CMD_RTS) &&
+		    (!strncmp(mcmd, T7XX_FB_CMD_RTS, strlen(T7XX_FB_CMD_RTS)))) {
+			memset(mdata, 0, T7XX_FB_MDATA_SIZE);
+			dlen = 0;
+			memset(mcmd, 0, sizeof(mcmd));
+			clen = snprintf(mcmd, sizeof(mcmd), "%s", T7XX_FB_CMD_CTS);
+
+			if (t7xx_devlink_port_write(port, mcmd, clen) != clen) {
+				dev_err(port->dev, "write for _CTS failed:%d\n", clen);
+				goto get_core_free_mem;
+			}
+
+			dlen = t7xx_devlink_port_read(port, mdata, T7XX_FB_MDATA_SIZE);
+			if (dlen <= 0) {
+				dev_err(port->dev, "read data error(%d)\n", dlen);
+				goto get_core_free_mem;
+			}
+
+			zipsize += (unsigned long long)(dlen);
+			memcpy(mrdump_region->dump + offset_dlen, mdata, dlen);
+			offset_dlen += dlen;
+			memset(mcmd, 0, sizeof(mcmd));
+			clen = snprintf(mcmd, sizeof(mcmd), "%s", T7XX_FB_CMD_FIN);
+			if (t7xx_devlink_port_write(port, mcmd, clen) != clen) {
+				dev_err(port->dev, "%s: _FIN failed, (Read %05d:%05llu)\n",
+					__func__, clen, zipsize);
+				goto get_core_free_mem;
+			}
+		} else if ((clen == strlen(T7XX_FB_RESP_MRDUMP_DONE)) &&
+			  (!strncmp(mcmd, T7XX_FB_RESP_MRDUMP_DONE,
+				    strlen(T7XX_FB_RESP_MRDUMP_DONE)))) {
+			dev_dbg(port->dev, "%s! size:%zd\n", T7XX_FB_RESP_MRDUMP_DONE, offset_dlen);
+			mrdump_region->actual_size = offset_dlen;
+			snprintf(mrdump_complete_event, sizeof(mrdump_complete_event),
+				 "%s size=%zu", T7XX_UEVENT_MRDUMP_READY, offset_dlen);
+			t7xx_uevent_send(dl->dev, mrdump_complete_event);
+			kfree(mdata);
+			result = 0;
+			goto get_core_exit;
+		} else {
+			dev_err(port->dev, "getcore protocol error (read len %05d)\n", clen);
+			goto get_core_free_mem;
+		}
+	}
+
+	dev_err(port->dev, "mrdump exceeds %uMB size. Discarded!", mrd_mb);
+	t7xx_uevent_send(port->dev, T7XX_UEVENT_MRD_DISCD);
+
+get_core_free_mem:
+	kfree(mdata);
+	vfree(mrdump_region->dump);
+	clear_bit(T7XX_MRDUMP_STATUS, &dl->status);
+	return -EPROTO;
+
+get_core_exit:
+	clear_bit(T7XX_MRDUMP_STATUS, &dl->status);
+	return result;
+}
+
+static int t7xx_devlink_fb_dump_log(struct t7xx_port *port)
+{
+	struct t7xx_devlink_region_info *lkdump_region;
+	char lkdump_complete_event[T7XX_FB_EVENT_SIZE];
+	struct t7xx_devlink *dl = port->dl;
+	int dlen, datasize = 0, result;
+	size_t offset_dlen = 0;
+	u8 *data;
+
+	set_bit(T7XX_LKDUMP_STATUS, &dl->status);
+	result = t7xx_devlink_fb_raw_command(T7XX_FB_CMD_OEM_LKDUMP, port, &datasize);
+	if (result) {
+		dev_err(port->dev, "%s command returns failure\n", T7XX_FB_CMD_OEM_LKDUMP);
+		goto lkdump_exit;
+	}
+
+	lkdump_region = dl->dl_region_info[T7XX_LKDUMP_INDEX];
+	if (datasize > lkdump_region->default_size) {
+		dev_err(port->dev, "lkdump size is more than %dKB. Discarded!",
+			T7XX_LKDUMP_SIZE / 1024);
+		t7xx_uevent_send(dl->dev, T7XX_UEVENT_LKD_DISCD);
+		result = -EPROTO;
+		goto lkdump_exit;
+	}
+
+	data = kzalloc(datasize, GFP_KERNEL);
+	if (!data) {
+		result = -ENOMEM;
+		goto lkdump_exit;
+	}
+
+	lkdump_region->dump = vmalloc(lkdump_region->default_size);
+	if (!lkdump_region->dump) {
+		kfree(data);
+		result = -ENOMEM;
+		goto lkdump_exit;
+	}
+
+	while (datasize > 0) {
+		dlen = t7xx_devlink_port_read(port, data, datasize);
+		if (dlen <= 0) {
+			dev_err(port->dev, "lkdump read error ret = %d", dlen);
+			kfree(data);
+			result = -EPROTO;
+			goto lkdump_exit;
+		}
+
+		memcpy(lkdump_region->dump + offset_dlen, data, dlen);
+		datasize -= dlen;
+		offset_dlen += dlen;
+	}
+
+	dev_dbg(port->dev, "LKDUMP DONE! size:%zd\n", offset_dlen);
+	lkdump_region->actual_size = offset_dlen;
+	snprintf(lkdump_complete_event, sizeof(lkdump_complete_event), "%s size=%zu",
+		 T7XX_UEVENT_LKDUMP_READY, offset_dlen);
+	t7xx_uevent_send(dl->dev, lkdump_complete_event);
+	kfree(data);
+	clear_bit(T7XX_LKDUMP_STATUS, &dl->status);
+	return t7xx_devlink_fb_handle_response(port, NULL);
+
+lkdump_exit:
+	clear_bit(T7XX_LKDUMP_STATUS, &dl->status);
+	return result;
+}
+
+static int t7xx_devlink_flash_update(struct devlink *devlink,
+				     struct devlink_flash_update_params *params,
+				     struct netlink_ext_ack *extack)
+{
+	struct t7xx_devlink *dl = devlink_priv(devlink);
+	const char *component = params->component;
+	const struct firmware *fw = params->fw;
+	char flash_event[T7XX_FB_EVENT_SIZE];
+	struct t7xx_port *port;
+	int ret;
+
+	port = dl->port;
+	if (port->dl->mode != T7XX_FB_DL_MODE) {
+		dev_err(port->dev, "Modem is not in fastboot download mode!");
+		ret = -EPERM;
+		goto err_out;
+	}
+
+	if (dl->status != T7XX_DEVLINK_IDLE) {
+		dev_err(port->dev, "Modem is busy!");
+		ret = -EBUSY;
+		goto err_out;
+	}
+
+	if (!component || !fw->data) {
+		ret = -EINVAL;
+		goto err_out;
+	}
+
+	set_bit(T7XX_FLASH_STATUS, &dl->status);
+	dev_dbg(port->dev, "flash partition name:%s binary size:%zu\n", component, fw->size);
+	ret = t7xx_devlink_fb_flash_partition(component, fw->data, port, fw->size);
+	if (ret) {
+		devlink_flash_update_status_notify(devlink, "flashing failure!",
+						   params->component, 0, 0);
+		snprintf(flash_event, sizeof(flash_event), "%s for [%s]",
+			 T7XX_UEVENT_FLASHING_FAILURE, params->component);
+	} else {
+		devlink_flash_update_status_notify(devlink, "flashing success!",
+						   params->component, 0, 0);
+		snprintf(flash_event, sizeof(flash_event), "%s for [%s]",
+			 T7XX_UEVENT_FLASHING_SUCCESS, params->component);
+	}
+
+	t7xx_uevent_send(dl->dev, flash_event);
+
+err_out:
+	clear_bit(T7XX_FLASH_STATUS, &dl->status);
+	return ret;
+}
+
+static int t7xx_devlink_reload_down(struct devlink *devlink, bool netns_change,
+				    enum devlink_reload_action action,
+				    enum devlink_reload_limit limit,
+				    struct netlink_ext_ack *extack)
+{
+	struct t7xx_devlink *dl = devlink_priv(devlink);
+
+	switch (action) {
+	case DEVLINK_RELOAD_ACTION_DRIVER_REINIT:
+		dl->set_fastboot_dl = 1;
+		return 0;
+	case DEVLINK_RELOAD_ACTION_FW_ACTIVATE:
+		return t7xx_devlink_fb_raw_command(T7XX_FB_CMD_REBOOT, dl->port, NULL);
+	default:
+		/* Unsupported action should not get to this function */
+		return -EOPNOTSUPP;
+	}
+}
+
+static int t7xx_devlink_reload_up(struct devlink *devlink,
+				  enum devlink_reload_action action,
+				  enum devlink_reload_limit limit,
+				  u32 *actions_performed,
+				  struct netlink_ext_ack *extack)
+{
+	struct t7xx_devlink *dl = devlink_priv(devlink);
+	*actions_performed = BIT(action);
+	switch (action) {
+	case DEVLINK_RELOAD_ACTION_DRIVER_REINIT:
+	case DEVLINK_RELOAD_ACTION_FW_ACTIVATE:
+		t7xx_rescan_queue_work(dl->mtk_dev->pdev);
+		return 0;
+	default:
+		/* Unsupported action should not get to this function */
+		return -EOPNOTSUPP;
+	}
+}
+
+/* Call back function for devlink ops */
+static const struct devlink_ops devlink_flash_ops = {
+	.supported_flash_update_params = DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT,
+	.flash_update = t7xx_devlink_flash_update,
+	.reload_actions = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
+			  BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE),
+	.reload_down = t7xx_devlink_reload_down,
+	.reload_up = t7xx_devlink_reload_up,
+};
+
+static int t7xx_devlink_region_snapshot(struct devlink *dl, const struct devlink_region_ops *ops,
+					struct netlink_ext_ack *extack, u8 **data)
+{
+	struct t7xx_devlink_region_info *region_info = ops->priv;
+	struct t7xx_devlink *t7xx_dl = devlink_priv(dl);
+	u8 *snapshot_mem;
+
+	if (t7xx_dl->status != T7XX_DEVLINK_IDLE) {
+		dev_err(t7xx_dl->dev, "Modem is busy!");
+		return -EBUSY;
+	}
+
+	dev_dbg(t7xx_dl->dev, "accessed devlink region:%s index:%d", ops->name, region_info->entry);
+	if (!strncmp(ops->name, "mr_dump", strlen("mr_dump"))) {
+		if (!region_info->dump) {
+			dev_err(t7xx_dl->dev, "devlink region:%s dump memory is not valid!",
+				region_info->region_name);
+			return -ENOMEM;
+		}
+
+		snapshot_mem = vmalloc(region_info->default_size);
+		if (!snapshot_mem)
+			return -ENOMEM;
+
+		memcpy(snapshot_mem, region_info->dump, region_info->default_size);
+		*data = snapshot_mem;
+	} else if (!strncmp(ops->name, "lk_dump", strlen("lk_dump"))) {
+		int ret;
+
+		ret = t7xx_devlink_fb_dump_log(t7xx_dl->port);
+		if (ret)
+			return ret;
+
+		*data = region_info->dump;
+	}
+
+	return 0;
+}
+
+/* To create regions for dump files */
+static int t7xx_devlink_create_region(struct t7xx_devlink *dl)
+{
+	struct devlink_region_ops *region_ops;
+	int rc, i;
+
+	region_ops = dl->dl_region_ops;
+	for (i = 0; i < T7XX_TOTAL_REGIONS; i++) {
+		region_ops[i].name = t7xx_devlink_region_list[i].region_name;
+		region_ops[i].snapshot = t7xx_devlink_region_snapshot;
+		region_ops[i].destructor = vfree;
+		dl->dl_region[i] =
+		devlink_region_create(dl->dl_ctx, &region_ops[i], T7XX_MAX_SNAPSHOTS,
+				      t7xx_devlink_region_list[i].default_size);
+
+		if (IS_ERR(dl->dl_region[i])) {
+			rc = PTR_ERR(dl->dl_region[i]);
+			dev_err(dl->dev, "devlink region fail,err %d", rc);
+			for ( ; i >= 0; i--)
+				devlink_region_destroy(dl->dl_region[i]);
+
+			return rc;
+		}
+
+		t7xx_devlink_region_list[i].entry = i;
+		region_ops[i].priv = t7xx_devlink_region_list + i;
+	}
+
+	return 0;
+}
+
+/* To Destroy devlink regions */
+static void t7xx_devlink_destroy_region(struct t7xx_devlink *dl)
+{
+	u8 i;
+
+	for (i = 0; i < T7XX_TOTAL_REGIONS; i++)
+		devlink_region_destroy(dl->dl_region[i]);
+}
+
+int t7xx_devlink_register(struct t7xx_pci_dev *t7xx_dev)
+{
+	struct devlink *dl_ctx;
+
+	dl_ctx = devlink_alloc(&devlink_flash_ops, sizeof(struct t7xx_devlink),
+			       &t7xx_dev->pdev->dev);
+	if (!dl_ctx)
+		return -ENOMEM;
+
+	devlink_set_features(dl_ctx, DEVLINK_F_RELOAD);
+	devlink_register(dl_ctx);
+	t7xx_dev->dl = devlink_priv(dl_ctx);
+	t7xx_dev->dl->dl_ctx = dl_ctx;
+
+	return 0;
+}
+
+void t7xx_devlink_unregister(struct t7xx_pci_dev *t7xx_dev)
+{
+	struct devlink *dl_ctx = priv_to_devlink(t7xx_dev->dl);
+
+	devlink_unregister(dl_ctx);
+	devlink_free(dl_ctx);
+}
+
+/**
+ * t7xx_devlink_region_init - Initialize/register devlink to t7xx driver
+ * @port: Pointer to port structure
+ * @dw: Pointer to devlink work structure
+ * @wq: Pointer to devlink workqueue structure
+ *
+ * Returns: Pointer to t7xx_devlink on success and NULL on failure
+ */
+static struct t7xx_devlink *t7xx_devlink_region_init(struct t7xx_port *port,
+						     struct t7xx_devlink_work *dw,
+						     struct workqueue_struct *wq)
+{
+	struct t7xx_pci_dev *mtk_dev = port->t7xx_dev;
+	struct t7xx_devlink *dl = mtk_dev->dl;
+	int rc, i;
+
+	dl->dl_ctx = mtk_dev->dl->dl_ctx;
+	dl->mtk_dev = mtk_dev;
+	dl->dev = &mtk_dev->pdev->dev;
+	dl->mode = T7XX_FB_NO_MODE;
+	dl->status = T7XX_DEVLINK_IDLE;
+	dl->dl_work = dw;
+	dl->dl_wq = wq;
+	for (i = 0; i < T7XX_TOTAL_REGIONS; i++) {
+		dl->dl_region_info[i] = &t7xx_devlink_region_list[i];
+		dl->dl_region_info[i]->dump = NULL;
+	}
+	dl->port = port;
+	port->dl = dl;
+
+	rc = t7xx_devlink_create_region(dl);
+	if (rc) {
+		dev_err(dl->dev, "devlink region creation failed, rc %d", rc);
+		return NULL;
+	}
+
+	return dl;
+}
+
+/**
+ * t7xx_devlink_region_deinit - To unintialize the devlink from T7XX driver.
+ * @dl:        Devlink instance
+ */
+static void t7xx_devlink_region_deinit(struct t7xx_devlink *dl)
+{
+	dl->mode = T7XX_FB_NO_MODE;
+	t7xx_devlink_destroy_region(dl);
+}
+
+static void t7xx_devlink_work_handler(struct work_struct *data)
+{
+	struct t7xx_devlink_work *dl_work;
+
+	dl_work = container_of(data, struct t7xx_devlink_work, work);
+	t7xx_devlink_fb_get_core(dl_work->port);
+}
+
+static int t7xx_devlink_init(struct t7xx_port *port)
+{
+	struct t7xx_devlink_work *dl_work;
+	struct workqueue_struct *wq;
+
+	dl_work = kmalloc(sizeof(*dl_work), GFP_KERNEL);
+	if (!dl_work)
+		return -ENOMEM;
+
+	wq = create_workqueue("t7xx_devlink");
+	if (!wq) {
+		kfree(dl_work);
+		dev_err(port->dev, "create_workqueue failed\n");
+		return -ENODATA;
+	}
+
+	INIT_WORK(&dl_work->work, t7xx_devlink_work_handler);
+	dl_work->port = port;
+	port->rx_length_th = T7XX_MAX_QUEUE_LENGTH;
+
+	if (!t7xx_devlink_region_init(port, dl_work, wq))
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void t7xx_devlink_uninit(struct t7xx_port *port)
+{
+	struct t7xx_devlink *dl = port->dl;
+	struct sk_buff *skb;
+	unsigned long flags;
+
+	vfree(dl->dl_region_info[T7XX_MRDUMP_INDEX]->dump);
+	if (dl->dl_wq)
+		destroy_workqueue(dl->dl_wq);
+	kfree(dl->dl_work);
+
+	t7xx_devlink_region_deinit(port->dl);
+	spin_lock_irqsave(&port->rx_skb_list.lock, flags);
+	while ((skb = __skb_dequeue(&port->rx_skb_list)) != NULL)
+		dev_kfree_skb(skb);
+	spin_unlock_irqrestore(&port->rx_skb_list.lock, flags);
+}
+
+static int t7xx_devlink_enable_chl(struct t7xx_port *port)
+{
+	spin_lock(&port->port_update_lock);
+	port->chan_enable = true;
+	spin_unlock(&port->port_update_lock);
+
+	if (port->dl->dl_wq && port->dl->mode == T7XX_FB_DUMP_MODE)
+		queue_work(port->dl->dl_wq, &port->dl->dl_work->work);
+
+	return 0;
+}
+
+static int t7xx_devlink_disable_chl(struct t7xx_port *port)
+{
+	spin_lock(&port->port_update_lock);
+	port->chan_enable = false;
+	spin_unlock(&port->port_update_lock);
+
+	return 0;
+}
+
+struct port_ops devlink_port_ops = {
+	.init = &t7xx_devlink_init,
+	.recv_skb = &t7xx_port_enqueue_skb,
+	.uninit = &t7xx_devlink_uninit,
+	.enable_chl = &t7xx_devlink_enable_chl,
+	.disable_chl = &t7xx_devlink_disable_chl,
+};
diff --git a/drivers/net/wwan/t7xx/t7xx_port_devlink.h b/drivers/net/wwan/t7xx/t7xx_port_devlink.h
new file mode 100644
index 000000000000..85384e40519e
--- /dev/null
+++ b/drivers/net/wwan/t7xx/t7xx_port_devlink.h
@@ -0,0 +1,85 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ *
+ * Copyright (c) 2022, Intel Corporation.
+ */
+
+#ifndef __T7XX_PORT_DEVLINK_H__
+#define __T7XX_PORT_DEVLINK_H__
+
+#include <net/devlink.h>
+
+#include "t7xx_pci.h"
+
+#define T7XX_MAX_QUEUE_LENGTH 32
+#define T7XX_FB_COMMAND_SIZE  64
+#define T7XX_FB_RESPONSE_SIZE 64
+#define T7XX_FB_MCMD_SIZE     64
+#define T7XX_FB_MDATA_SIZE    1024
+#define T7XX_FB_RESP_COUNT    30
+
+#define T7XX_FB_CMD_RTS          "_RTS"
+#define T7XX_FB_CMD_CTS          "_CTS"
+#define T7XX_FB_CMD_FIN          "_FIN"
+#define T7XX_FB_CMD_OEM_MRDUMP   "oem mrdump"
+#define T7XX_FB_CMD_OEM_LKDUMP   "oem dump_pllk_log"
+#define T7XX_FB_CMD_DOWNLOAD     "download"
+#define T7XX_FB_CMD_FLASH        "flash"
+#define T7XX_FB_CMD_REBOOT       "reboot"
+#define T7XX_FB_RESP_MRDUMP_DONE "MRDUMP08_DONE"
+#define T7XX_FB_RESP_OKAY        "OKAY"
+#define T7XX_FB_RESP_FAIL        "FAIL"
+#define T7XX_FB_RESP_DATA        "DATA"
+#define T7XX_FB_RESP_INFO        "INFO"
+
+#define T7XX_FB_EVENT_SIZE      50
+
+#define T7XX_MAX_SNAPSHOTS  1
+#define T7XX_MAX_REGION_NAME_LENGTH 20
+#define T7XX_MRDUMP_SIZE    (160 * 1024 * 1024)
+#define T7XX_LKDUMP_SIZE    (256 * 1024)
+#define T7XX_TOTAL_REGIONS  2
+
+#define T7XX_FLASH_STATUS   0
+#define T7XX_MRDUMP_STATUS  1
+#define T7XX_LKDUMP_STATUS  2
+#define T7XX_DEVLINK_IDLE   0
+
+#define T7XX_FB_NO_MODE     0
+#define T7XX_FB_DL_MODE     1
+#define T7XX_FB_DUMP_MODE   2
+
+#define T7XX_MRDUMP_INDEX   0
+#define T7XX_LKDUMP_INDEX   1
+
+struct t7xx_devlink_work {
+	struct work_struct work;
+	struct t7xx_port *port;
+};
+
+struct t7xx_devlink_region_info {
+	char region_name[T7XX_MAX_REGION_NAME_LENGTH];
+	u32 default_size;
+	u32 actual_size;
+	u32 entry;
+	u8 *dump;
+};
+
+struct t7xx_devlink {
+	struct t7xx_pci_dev *mtk_dev;
+	struct t7xx_port *port;
+	struct device *dev;
+	struct devlink *dl_ctx;
+	struct t7xx_devlink_work *dl_work;
+	struct workqueue_struct *dl_wq;
+	struct t7xx_devlink_region_info *dl_region_info[T7XX_TOTAL_REGIONS];
+	struct devlink_region_ops dl_region_ops[T7XX_TOTAL_REGIONS];
+	struct devlink_region *dl_region[T7XX_TOTAL_REGIONS];
+	u8 mode;
+	unsigned long status;
+	int set_fastboot_dl;
+};
+
+int t7xx_devlink_register(struct t7xx_pci_dev *t7xx_dev);
+void t7xx_devlink_unregister(struct t7xx_pci_dev *t7xx_dev);
+
+#endif /*__T7XX_PORT_DEVLINK_H__*/
diff --git a/drivers/net/wwan/t7xx/t7xx_port_proxy.c b/drivers/net/wwan/t7xx/t7xx_port_proxy.c
index 7582777cf94d..fdf0c6e5ed6d 100644
--- a/drivers/net/wwan/t7xx/t7xx_port_proxy.c
+++ b/drivers/net/wwan/t7xx/t7xx_port_proxy.c
@@ -98,6 +98,7 @@ static struct t7xx_port_conf t7xx_early_port_conf[] = {
 		.rxq_exp_index = 1,
 		.path_id = CLDMA_ID_AP,
 		.is_early_port = true,
+		.ops = &devlink_port_ops,
 		.name = "ttyDUMP",
 	},
 };
@@ -493,6 +494,7 @@ static void t7xx_proxy_init_all_ports(struct t7xx_modem *md)
 
 		port->t7xx_dev = md->t7xx_dev;
 		port->dev = &md->t7xx_dev->pdev->dev;
+		port->dl = md->t7xx_dev->dl;
 		spin_lock_init(&port->port_update_lock);
 		port->chan_enable = false;
 
diff --git a/drivers/net/wwan/t7xx/t7xx_port_proxy.h b/drivers/net/wwan/t7xx/t7xx_port_proxy.h
index 33caf85f718a..7298a2d09fa0 100644
--- a/drivers/net/wwan/t7xx/t7xx_port_proxy.h
+++ b/drivers/net/wwan/t7xx/t7xx_port_proxy.h
@@ -93,6 +93,7 @@ struct ctrl_msg_header {
 /* Port operations mapping */
 extern struct port_ops wwan_sub_port_ops;
 extern struct port_ops ctl_port_ops;
+extern struct port_ops devlink_port_ops;
 
 void t7xx_port_proxy_reset(struct port_proxy *port_prox);
 void t7xx_port_proxy_uninit(struct port_proxy *port_prox);
diff --git a/drivers/net/wwan/t7xx/t7xx_reg.h b/drivers/net/wwan/t7xx/t7xx_reg.h
index 60e025e57baa..3a758bf79a4e 100644
--- a/drivers/net/wwan/t7xx/t7xx_reg.h
+++ b/drivers/net/wwan/t7xx/t7xx_reg.h
@@ -101,11 +101,17 @@ enum t7xx_pm_resume_state {
 	PM_RESUME_REG_STATE_L2_EXP,
 };
 
+enum host_event_e {
+	HOST_EVENT_INIT = 0,
+	FASTBOOT_DL_NOTY = 0x3,
+};
+
 #define T7XX_PCIE_MISC_DEV_STATUS		0x0d1c
 #define MISC_RESET_TYPE_FLDR			BIT(27)
 #define MISC_RESET_TYPE_PLDR			BIT(26)
 #define MISC_DEV_STATUS_MASK			GENMASK(15, 0)
 #define LK_EVENT_MASK				GENMASK(11, 8)
+#define HOST_EVENT_MASK			GENMASK(31, 28)
 
 enum lk_event_id {
 	LK_EVENT_NORMAL = 0,
diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.c b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
index 9c222809371b..00e143c8d568 100644
--- a/drivers/net/wwan/t7xx/t7xx_state_monitor.c
+++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
@@ -38,10 +38,12 @@
 #include "t7xx_netdev.h"
 #include "t7xx_pci.h"
 #include "t7xx_pcie_mac.h"
+#include "t7xx_port_devlink.h"
 #include "t7xx_port_proxy.h"
 #include "t7xx_pci_rescan.h"
 #include "t7xx_reg.h"
 #include "t7xx_state_monitor.h"
+#include "t7xx_uevent.h"
 
 #define FSM_DRM_DISABLE_DELAY_MS		200
 #define FSM_EVENT_POLL_INTERVAL_MS		20
@@ -212,6 +214,16 @@ static void fsm_routine_exception(struct t7xx_fsm_ctl *ctl, struct t7xx_fsm_comm
 		fsm_finish_command(ctl, cmd, 0);
 }
 
+static void t7xx_host_event_notify(struct t7xx_modem *md, unsigned int event_id)
+{
+	u32 value;
+
+	value = ioread32(IREG_BASE(md->t7xx_dev) + T7XX_PCIE_MISC_DEV_STATUS);
+	value &= ~HOST_EVENT_MASK;
+	value |= FIELD_PREP(HOST_EVENT_MASK, event_id);
+	iowrite32(value, IREG_BASE(md->t7xx_dev) + T7XX_PCIE_MISC_DEV_STATUS);
+}
+
 static void t7xx_lk_stage_event_handling(struct t7xx_fsm_ctl *ctl, unsigned int dev_status)
 {
 	struct t7xx_modem *md = ctl->md;
@@ -228,6 +240,7 @@ static void t7xx_lk_stage_event_handling(struct t7xx_fsm_ctl *ctl, unsigned int
 		break;
 
 	case LK_EVENT_CREATE_PD_PORT:
+	case LK_EVENT_CREATE_POST_DL_PORT:
 		md_ctrl = md->md_ctrl[CLDMA_ID_AP];
 		t7xx_cldma_hif_hw_init(md_ctrl);
 		t7xx_cldma_stop(md_ctrl);
@@ -239,8 +252,16 @@ static void t7xx_lk_stage_event_handling(struct t7xx_fsm_ctl *ctl, unsigned int
 			return;
 		}
 
+		if (lk_event == LK_EVENT_CREATE_PD_PORT)
+			port->dl->mode = T7XX_FB_DUMP_MODE;
+		else
+			port->dl->mode = T7XX_FB_DL_MODE;
 		port->port_conf->ops->enable_chl(port);
 		t7xx_cldma_start(md_ctrl);
+		if (lk_event == LK_EVENT_CREATE_PD_PORT)
+			t7xx_uevent_send(dev, T7XX_UEVENT_MODEM_FASTBOOT_DUMP_MODE);
+		else
+			t7xx_uevent_send(dev, T7XX_UEVENT_MODEM_FASTBOOT_DL_MODE);
 		break;
 
 	case LK_EVENT_RESET:
@@ -289,13 +310,23 @@ static void fsm_routine_stopping(struct t7xx_fsm_ctl *ctl, struct t7xx_fsm_comma
 	t7xx_cldma_stop(md_ctrl);
 
 	if (!ctl->md->rgu_irq_asserted) {
+		if (t7xx_dev->dl->set_fastboot_dl)
+			t7xx_host_event_notify(ctl->md, FASTBOOT_DL_NOTY);
+
 		t7xx_mhccif_h2d_swint_trigger(t7xx_dev, H2D_CH_DRM_DISABLE_AP);
 		/* Wait for the DRM disable to take effect */
 		msleep(FSM_DRM_DISABLE_DELAY_MS);
 
-		err = t7xx_acpi_fldr_func(t7xx_dev);
-		if (err)
+		if (t7xx_dev->dl->set_fastboot_dl) {
+			/* Do not try fldr because device will always wait for
+			 * MHCCIF bit 13 in fastboot download flow.
+			 */
 			t7xx_mhccif_h2d_swint_trigger(t7xx_dev, H2D_CH_DEVICE_RESET);
+		} else {
+			err = t7xx_acpi_fldr_func(t7xx_dev);
+			if (err)
+				t7xx_mhccif_h2d_swint_trigger(t7xx_dev, H2D_CH_DEVICE_RESET);
+		}
 	}
 
 	fsm_finish_command(ctl, cmd, fsm_stopped_handler(ctl));
@@ -318,6 +349,7 @@ static void fsm_routine_ready(struct t7xx_fsm_ctl *ctl)
 
 	ctl->curr_state = FSM_STATE_READY;
 	t7xx_fsm_broadcast_ready_state(ctl);
+	t7xx_uevent_send(&md->t7xx_dev->pdev->dev, T7XX_UEVENT_MODEM_READY);
 	t7xx_md_event_notify(md, FSM_READY);
 }
 
diff --git a/drivers/net/wwan/t7xx/t7xx_uevent.c b/drivers/net/wwan/t7xx/t7xx_uevent.c
new file mode 100644
index 000000000000..5a320cf3f94b
--- /dev/null
+++ b/drivers/net/wwan/t7xx/t7xx_uevent.c
@@ -0,0 +1,41 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022, Intel Corporation.
+ */
+
+#include <linux/slab.h>
+
+#include "t7xx_uevent.h"
+
+/* Update the uevent in work queue context */
+static void t7xx_uevent_work(struct work_struct *data)
+{
+	struct t7xx_uevent_info *info;
+	char *envp[2] = { NULL, NULL };
+
+	info = container_of(data, struct t7xx_uevent_info, work);
+	envp[0] = info->uevent;
+
+	if (kobject_uevent_env(&info->dev->kobj, KOBJ_CHANGE, envp))
+		pr_err("uevent %s failed to sent", info->uevent);
+
+	kfree(info);
+}
+
+/**
+ * t7xx_uevent_send - Send modem event to user space.
+ * @dev:	Generic device pointer
+ * @uevent:	Uevent information
+ */
+void t7xx_uevent_send(struct device *dev, char *uevent)
+{
+	struct t7xx_uevent_info *info = kzalloc(sizeof(*info), GFP_ATOMIC);
+
+	if (!info)
+		return;
+
+	INIT_WORK(&info->work, t7xx_uevent_work);
+	info->dev = dev;
+	snprintf(info->uevent, T7XX_MAX_UEVENT_LEN, "T7XX_EVENT=%s", uevent);
+	schedule_work(&info->work);
+}
diff --git a/drivers/net/wwan/t7xx/t7xx_uevent.h b/drivers/net/wwan/t7xx/t7xx_uevent.h
new file mode 100644
index 000000000000..e871dc0e9444
--- /dev/null
+++ b/drivers/net/wwan/t7xx/t7xx_uevent.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ *
+ * Copyright (c) 2022, Intel Corporation.
+ */
+
+#ifndef __T7XX_UEVENT_H__
+#define __T7XX_UEVENT_H__
+
+#include <linux/device.h>
+#include <linux/kobject.h>
+
+/* Maximum length of user events */
+#define T7XX_MAX_UEVENT_LEN 64
+
+/* T7XX Host driver uevents */
+#define T7XX_UEVENT_MODEM_READY			"T7XX_MODEM_READY"
+#define T7XX_UEVENT_MODEM_FASTBOOT_DL_MODE	"T7XX_MODEM_FASTBOOT_DL_MODE"
+#define T7XX_UEVENT_MODEM_FASTBOOT_DUMP_MODE	"T7XX_MODEM_FASTBOOT_DUMP_MODE"
+#define T7XX_UEVENT_MRDUMP_READY		"T7XX_MRDUMP_READY"
+#define T7XX_UEVENT_LKDUMP_READY		"T7XX_LKDUMP_READY"
+#define T7XX_UEVENT_MRD_DISCD			"T7XX_MRDUMP_DISCARDED"
+#define T7XX_UEVENT_LKD_DISCD			"T7XX_LKDUMP_DISCARDED"
+#define T7XX_UEVENT_FLASHING_SUCCESS		"T7XX_FLASHING_SUCCESS"
+#define T7XX_UEVENT_FLASHING_FAILURE		"T7XX_FLASHING_FAILURE"
+
+/**
+ * struct t7xx_uevent_info - Uevent information structure.
+ * @dev:	Pointer to device structure
+ * @uevent:	Uevent information
+ * @work:	Uevent work struct
+ */
+struct t7xx_uevent_info {
+	struct device *dev;
+	char uevent[T7XX_MAX_UEVENT_LEN];
+	struct work_struct work;
+};
+
+void t7xx_uevent_send(struct device *dev, char *uevent);
+#endif
-- 
2.34.1


^ permalink raw reply related

* [PATCH net-next 3/5] net: wwan: t7xx: PCIe reset rescan
From: m.chetan.kumar @ 2022-08-16  4:23 UTC (permalink / raw)
  To: netdev
  Cc: kuba, davem, johannes, ryazanov.s.a, loic.poulain, krishna.c.sudi,
	m.chetan.kumar, linuxwwan, Haijun Liu, Madhusmita Sahu,
	Ricardo Martinez, Devegowda Chandrashekar

From: Haijun Liu <haijun.liu@mediatek.com>

PCI rescan module implements "rescan work queue". In firmware flashing
or coredump collection procedure WWAN device is programmed to boot in
fastboot mode and a work item is scheduled for removal & detection.
The WWAN device is reset using APCI call as part driver removal flow.
Work queue rescans pci bus at fixed interval for device detection,
later when device is detect work queue exits.

Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
Co-developed-by: Madhusmita Sahu <madhusmita.sahu@intel.com>
Signed-off-by: Madhusmita Sahu <madhusmita.sahu@intel.com>
Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Signed-off-by: Devegowda Chandrashekar <chandrashekar.devegowda@intel.com>
---
 drivers/net/wwan/t7xx/Makefile             |   3 +-
 drivers/net/wwan/t7xx/t7xx_modem_ops.c     |   5 +
 drivers/net/wwan/t7xx/t7xx_pci.c           |  51 ++++++++-
 drivers/net/wwan/t7xx/t7xx_pci.h           |   1 +
 drivers/net/wwan/t7xx/t7xx_pci_rescan.c    | 117 +++++++++++++++++++++
 drivers/net/wwan/t7xx/t7xx_pci_rescan.h    |  29 +++++
 drivers/net/wwan/t7xx/t7xx_port_wwan.c     |   6 ++
 drivers/net/wwan/t7xx/t7xx_state_monitor.c |   2 +
 8 files changed, 212 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/wwan/t7xx/t7xx_pci_rescan.c
 create mode 100644 drivers/net/wwan/t7xx/t7xx_pci_rescan.h

diff --git a/drivers/net/wwan/t7xx/Makefile b/drivers/net/wwan/t7xx/Makefile
index dc6a7d682c15..df42764b3df8 100644
--- a/drivers/net/wwan/t7xx/Makefile
+++ b/drivers/net/wwan/t7xx/Makefile
@@ -17,4 +17,5 @@ mtk_t7xx-y:=	t7xx_pci.o \
 		t7xx_hif_dpmaif_tx.o \
 		t7xx_hif_dpmaif_rx.o  \
 		t7xx_dpmaif.o \
-		t7xx_netdev.o
+		t7xx_netdev.o \
+		t7xx_pci_rescan.o
diff --git a/drivers/net/wwan/t7xx/t7xx_modem_ops.c b/drivers/net/wwan/t7xx/t7xx_modem_ops.c
index ec2269dfaf2c..fb79d041dbf5 100644
--- a/drivers/net/wwan/t7xx/t7xx_modem_ops.c
+++ b/drivers/net/wwan/t7xx/t7xx_modem_ops.c
@@ -37,6 +37,7 @@
 #include "t7xx_modem_ops.h"
 #include "t7xx_netdev.h"
 #include "t7xx_pci.h"
+#include "t7xx_pci_rescan.h"
 #include "t7xx_pcie_mac.h"
 #include "t7xx_port.h"
 #include "t7xx_port_proxy.h"
@@ -192,6 +193,10 @@ static irqreturn_t t7xx_rgu_isr_thread(int irq, void *data)
 
 	msleep(RGU_RESET_DELAY_MS);
 	t7xx_reset_device_via_pmic(t7xx_dev);
+
+	if (!t7xx_dev->hp_enable)
+		t7xx_rescan_queue_work(t7xx_dev->pdev);
+
 	return IRQ_HANDLED;
 }
 
diff --git a/drivers/net/wwan/t7xx/t7xx_pci.c b/drivers/net/wwan/t7xx/t7xx_pci.c
index 871f2a27a398..2f5c6fbe601e 100644
--- a/drivers/net/wwan/t7xx/t7xx_pci.c
+++ b/drivers/net/wwan/t7xx/t7xx_pci.c
@@ -38,6 +38,7 @@
 #include "t7xx_mhccif.h"
 #include "t7xx_modem_ops.h"
 #include "t7xx_pci.h"
+#include "t7xx_pci_rescan.h"
 #include "t7xx_pcie_mac.h"
 #include "t7xx_reg.h"
 #include "t7xx_state_monitor.h"
@@ -715,8 +716,11 @@ static int t7xx_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return ret;
 	}
 
+	t7xx_rescan_done();
 	t7xx_pcie_mac_set_int(t7xx_dev, MHCCIF_INT);
 	t7xx_pcie_mac_interrupts_en(t7xx_dev);
+	if (!t7xx_dev->hp_enable)
+		pci_ignore_hotplug(pdev);
 
 	return 0;
 }
@@ -754,7 +758,52 @@ static struct pci_driver t7xx_pci_driver = {
 	.shutdown = t7xx_pci_shutdown,
 };
 
-module_pci_driver(t7xx_pci_driver);
+static int __init t7xx_pci_init(void)
+{
+	int ret;
+
+	t7xx_pci_dev_rescan();
+	ret = t7xx_rescan_init();
+	if (ret) {
+		pr_err("Failed to init t7xx rescan work\n");
+		return ret;
+	}
+
+	return pci_register_driver(&t7xx_pci_driver);
+}
+module_init(t7xx_pci_init);
+
+static int t7xx_always_match(struct device *dev, const void *data)
+{
+	return dev->parent->fwnode == data;
+}
+
+static void __exit t7xx_pci_cleanup(void)
+{
+	int remove_flag = 0;
+	struct device *dev;
+
+	dev = driver_find_device(&t7xx_pci_driver.driver, NULL, NULL, t7xx_always_match);
+	if (dev) {
+		pr_debug("unregister t7xx PCIe driver while device is still exist.\n");
+		put_device(dev);
+		remove_flag = 1;
+	} else {
+		pr_debug("no t7xx PCIe driver found.\n");
+	}
+
+	pci_lock_rescan_remove();
+	pci_unregister_driver(&t7xx_pci_driver);
+	pci_unlock_rescan_remove();
+	t7xx_rescan_deinit();
+
+	if (remove_flag) {
+		pr_debug("remove t7xx PCI device\n");
+		pci_stop_and_remove_bus_device_locked(to_pci_dev(dev));
+	}
+}
+
+module_exit(t7xx_pci_cleanup);
 
 MODULE_AUTHOR("MediaTek Inc");
 MODULE_DESCRIPTION("MediaTek PCIe 5G WWAN modem T7xx driver");
diff --git a/drivers/net/wwan/t7xx/t7xx_pci.h b/drivers/net/wwan/t7xx/t7xx_pci.h
index 50b37056ce5a..a87c4cae94ef 100644
--- a/drivers/net/wwan/t7xx/t7xx_pci.h
+++ b/drivers/net/wwan/t7xx/t7xx_pci.h
@@ -69,6 +69,7 @@ struct t7xx_pci_dev {
 	struct t7xx_modem	*md;
 	struct t7xx_ccmni_ctrl	*ccmni_ctlb;
 	bool			rgu_pci_irq_en;
+	bool			hp_enable;
 
 	/* Low Power Items */
 	struct list_head	md_pm_entities;
diff --git a/drivers/net/wwan/t7xx/t7xx_pci_rescan.c b/drivers/net/wwan/t7xx/t7xx_pci_rescan.c
new file mode 100644
index 000000000000..045777d8a843
--- /dev/null
+++ b/drivers/net/wwan/t7xx/t7xx_pci_rescan.c
@@ -0,0 +1,117 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2021, MediaTek Inc.
+ * Copyright (c) 2021-2022, Intel Corporation.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ":t7xx:%s: " fmt, __func__
+#define dev_fmt(fmt) "t7xx: " fmt
+
+#include <linux/delay.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/workqueue.h>
+
+#include "t7xx_pci.h"
+#include "t7xx_pci_rescan.h"
+
+static struct remove_rescan_context g_mtk_rescan_context;
+
+void t7xx_pci_dev_rescan(void)
+{
+	struct pci_bus *b = NULL;
+
+	pci_lock_rescan_remove();
+	while ((b = pci_find_next_bus(b)))
+		pci_rescan_bus(b);
+
+	pci_unlock_rescan_remove();
+}
+
+void t7xx_rescan_done(void)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&g_mtk_rescan_context.dev_lock, flags);
+	if (g_mtk_rescan_context.rescan_done == 0) {
+		pr_debug("this is a rescan probe\n");
+		g_mtk_rescan_context.rescan_done = 1;
+	} else {
+		pr_debug("this is a init probe\n");
+	}
+	spin_unlock_irqrestore(&g_mtk_rescan_context.dev_lock, flags);
+}
+
+static void t7xx_remove_rescan(struct work_struct *work)
+{
+	struct pci_dev *pdev;
+	int num_retries = RESCAN_RETRIES;
+	unsigned long flags;
+
+	spin_lock_irqsave(&g_mtk_rescan_context.dev_lock, flags);
+	g_mtk_rescan_context.rescan_done = 0;
+	pdev = g_mtk_rescan_context.dev;
+	spin_unlock_irqrestore(&g_mtk_rescan_context.dev_lock, flags);
+
+	if (pdev) {
+		pci_stop_and_remove_bus_device_locked(pdev);
+		pr_debug("start remove and rescan flow\n");
+	}
+
+	do {
+		t7xx_pci_dev_rescan();
+		spin_lock_irqsave(&g_mtk_rescan_context.dev_lock, flags);
+		if (g_mtk_rescan_context.rescan_done) {
+			spin_unlock_irqrestore(&g_mtk_rescan_context.dev_lock, flags);
+			break;
+		}
+
+		spin_unlock_irqrestore(&g_mtk_rescan_context.dev_lock, flags);
+		msleep(DELAY_RESCAN_MTIME);
+	} while (num_retries--);
+}
+
+void t7xx_rescan_queue_work(struct pci_dev *pdev)
+{
+	unsigned long flags;
+
+	dev_info(&pdev->dev, "start queue_mtk_rescan_work\n");
+	spin_lock_irqsave(&g_mtk_rescan_context.dev_lock, flags);
+	if (!g_mtk_rescan_context.rescan_done) {
+		dev_err(&pdev->dev, "rescan failed because last rescan undone\n");
+		spin_unlock_irqrestore(&g_mtk_rescan_context.dev_lock, flags);
+		return;
+	}
+
+	g_mtk_rescan_context.dev = pdev;
+	spin_unlock_irqrestore(&g_mtk_rescan_context.dev_lock, flags);
+	queue_work(g_mtk_rescan_context.pcie_rescan_wq, &g_mtk_rescan_context.service_task);
+}
+
+int t7xx_rescan_init(void)
+{
+	spin_lock_init(&g_mtk_rescan_context.dev_lock);
+	g_mtk_rescan_context.rescan_done = 1;
+	g_mtk_rescan_context.dev = NULL;
+	g_mtk_rescan_context.pcie_rescan_wq = create_singlethread_workqueue(MTK_RESCAN_WQ);
+	if (!g_mtk_rescan_context.pcie_rescan_wq) {
+		pr_err("Failed to create workqueue: %s\n", MTK_RESCAN_WQ);
+		return -ENOMEM;
+	}
+
+	INIT_WORK(&g_mtk_rescan_context.service_task, t7xx_remove_rescan);
+
+	return 0;
+}
+
+void t7xx_rescan_deinit(void)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&g_mtk_rescan_context.dev_lock, flags);
+	g_mtk_rescan_context.rescan_done = 0;
+	g_mtk_rescan_context.dev = NULL;
+	spin_unlock_irqrestore(&g_mtk_rescan_context.dev_lock, flags);
+	cancel_work_sync(&g_mtk_rescan_context.service_task);
+	destroy_workqueue(g_mtk_rescan_context.pcie_rescan_wq);
+}
diff --git a/drivers/net/wwan/t7xx/t7xx_pci_rescan.h b/drivers/net/wwan/t7xx/t7xx_pci_rescan.h
new file mode 100644
index 000000000000..de4ca1363bb0
--- /dev/null
+++ b/drivers/net/wwan/t7xx/t7xx_pci_rescan.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ *
+ * Copyright (c) 2021, MediaTek Inc.
+ * Copyright (c) 2021-2022, Intel Corporation.
+ */
+
+#ifndef __T7XX_PCI_RESCAN_H__
+#define __T7XX_PCI_RESCAN_H__
+
+#define MTK_RESCAN_WQ "mtk_rescan_wq"
+
+#define DELAY_RESCAN_MTIME 1000
+#define RESCAN_RETRIES 35
+
+struct remove_rescan_context {
+	struct work_struct	 service_task;
+	struct workqueue_struct *pcie_rescan_wq;
+	spinlock_t		dev_lock; /* protects device */
+	struct pci_dev		*dev;
+	int			rescan_done;
+};
+
+void t7xx_pci_dev_rescan(void);
+void t7xx_rescan_queue_work(struct pci_dev *pdev);
+int t7xx_rescan_init(void);
+void t7xx_rescan_deinit(void);
+void t7xx_rescan_done(void);
+
+#endif	/* __T7XX_PCI_RESCAN_H__ */
diff --git a/drivers/net/wwan/t7xx/t7xx_port_wwan.c b/drivers/net/wwan/t7xx/t7xx_port_wwan.c
index e53651ee2005..dfd7fb487fc0 100644
--- a/drivers/net/wwan/t7xx/t7xx_port_wwan.c
+++ b/drivers/net/wwan/t7xx/t7xx_port_wwan.c
@@ -156,6 +156,12 @@ static void t7xx_port_wwan_md_state_notify(struct t7xx_port *port, unsigned int
 {
 	const struct t7xx_port_conf *port_conf = port->port_conf;
 
+	if (state == MD_STATE_EXCEPTION) {
+		if (port->wwan_port)
+			wwan_port_txoff(port->wwan_port);
+		return;
+	}
+
 	if (state != MD_STATE_READY)
 		return;
 
diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.c b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
index c1789a558c9d..9c222809371b 100644
--- a/drivers/net/wwan/t7xx/t7xx_state_monitor.c
+++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
@@ -35,9 +35,11 @@
 #include "t7xx_hif_cldma.h"
 #include "t7xx_mhccif.h"
 #include "t7xx_modem_ops.h"
+#include "t7xx_netdev.h"
 #include "t7xx_pci.h"
 #include "t7xx_pcie_mac.h"
 #include "t7xx_port_proxy.h"
+#include "t7xx_pci_rescan.h"
 #include "t7xx_reg.h"
 #include "t7xx_state_monitor.h"
 
-- 
2.34.1


^ permalink raw reply related

* [PATCH net-next 2/5] net: wwan: t7xx: Infrastructure for early port configuration
From: m.chetan.kumar @ 2022-08-16  4:23 UTC (permalink / raw)
  To: netdev
  Cc: kuba, davem, johannes, ryazanov.s.a, loic.poulain, krishna.c.sudi,
	m.chetan.kumar, linuxwwan, Haijun Liu, Madhusmita Sahu,
	Ricardo Martinez, Devegowda Chandrashekar

From: Haijun Liu <haijun.liu@mediatek.com>

To support cases such as FW update or Core dump, the t7xx device
is capable of signaling the host that a special port needs
to be created before the handshake phase.

This patch adds the infrastructure required to create the
early ports which also requires a different configuration of
CLDMA queues.

Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
Co-developed-by: Madhusmita Sahu <madhusmita.sahu@intel.com>
Signed-off-by: Madhusmita Sahu <madhusmita.sahu@intel.com>
Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
Signed-off-by: Devegowda Chandrashekar <chandrashekar.devegowda@intel.com>
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
---
 drivers/net/wwan/t7xx/t7xx_hif_cldma.c     |  38 +++++--
 drivers/net/wwan/t7xx/t7xx_hif_cldma.h     |  24 ++++-
 drivers/net/wwan/t7xx/t7xx_modem_ops.c     |   4 +-
 drivers/net/wwan/t7xx/t7xx_port.h          |   3 +
 drivers/net/wwan/t7xx/t7xx_port_proxy.c    | 118 +++++++++++++++++++--
 drivers/net/wwan/t7xx/t7xx_port_proxy.h    |  11 +-
 drivers/net/wwan/t7xx/t7xx_port_wwan.c     |   3 +-
 drivers/net/wwan/t7xx/t7xx_reg.h           |  23 +++-
 drivers/net/wwan/t7xx/t7xx_state_monitor.c | 109 ++++++++++++++++---
 drivers/net/wwan/t7xx/t7xx_state_monitor.h |   2 +
 10 files changed, 294 insertions(+), 41 deletions(-)

diff --git a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
index 37cd63d20b45..f26e6138f187 100644
--- a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
+++ b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
@@ -57,8 +57,6 @@
 #define CHECK_Q_STOP_TIMEOUT_US		1000000
 #define CHECK_Q_STOP_STEP_US		10000
 
-#define CLDMA_JUMBO_BUFF_SZ		(63 * 1024 + sizeof(struct ccci_header))
-
 static void md_cd_queue_struct_reset(struct cldma_queue *queue, struct cldma_ctrl *md_ctrl,
 				     enum mtk_txrx tx_rx, unsigned int index)
 {
@@ -993,6 +991,34 @@ int t7xx_cldma_send_skb(struct cldma_ctrl *md_ctrl, int qno, struct sk_buff *skb
 	return ret;
 }
 
+static void t7xx_cldma_adjust_config(struct cldma_ctrl *md_ctrl, enum cldma_cfg cfg_id)
+{
+	int qno;
+
+	for (qno = 0; qno < CLDMA_RXQ_NUM; qno++) {
+		md_ctrl->rx_ring[qno].pkt_size = CLDMA_SHARED_Q_BUFF_SZ;
+		md_ctrl->rxq[qno].q_type = CLDMA_SHARED_Q;
+	}
+
+	md_ctrl->rx_ring[CLDMA_RXQ_NUM - 1].pkt_size = CLDMA_JUMBO_BUFF_SZ;
+
+	for (qno = 0; qno < CLDMA_TXQ_NUM; qno++) {
+		md_ctrl->tx_ring[qno].pkt_size = CLDMA_SHARED_Q_BUFF_SZ;
+		md_ctrl->txq[qno].q_type = CLDMA_SHARED_Q;
+	}
+
+	if (cfg_id == CLDMA_DEDICATED_Q_CFG) {
+		md_ctrl->rxq[DOWNLOAD_PORT_ID].q_type = CLDMA_DEDICATED_Q;
+		md_ctrl->txq[DOWNLOAD_PORT_ID].q_type = CLDMA_DEDICATED_Q;
+		md_ctrl->tx_ring[DOWNLOAD_PORT_ID].pkt_size = CLDMA_DEDICATED_Q_BUFF_SZ;
+		md_ctrl->rx_ring[DOWNLOAD_PORT_ID].pkt_size = CLDMA_DEDICATED_Q_BUFF_SZ;
+		md_ctrl->rxq[DUMP_PORT_ID].q_type = CLDMA_DEDICATED_Q;
+		md_ctrl->txq[DUMP_PORT_ID].q_type = CLDMA_DEDICATED_Q;
+		md_ctrl->tx_ring[DUMP_PORT_ID].pkt_size = CLDMA_DEDICATED_Q_BUFF_SZ;
+		md_ctrl->rx_ring[DUMP_PORT_ID].pkt_size = CLDMA_DEDICATED_Q_BUFF_SZ;
+	}
+}
+
 static int t7xx_cldma_late_init(struct cldma_ctrl *md_ctrl)
 {
 	char dma_pool_name[32];
@@ -1021,11 +1047,6 @@ static int t7xx_cldma_late_init(struct cldma_ctrl *md_ctrl)
 	}
 
 	for (j = 0; j < CLDMA_RXQ_NUM; j++) {
-		md_ctrl->rx_ring[j].pkt_size = CLDMA_MTU;
-
-		if (j == CLDMA_RXQ_NUM - 1)
-			md_ctrl->rx_ring[j].pkt_size = CLDMA_JUMBO_BUFF_SZ;
-
 		ret = t7xx_cldma_rx_ring_init(md_ctrl, &md_ctrl->rx_ring[j]);
 		if (ret) {
 			dev_err(md_ctrl->dev, "Control RX ring init fail\n");
@@ -1329,9 +1350,10 @@ int t7xx_cldma_init(struct cldma_ctrl *md_ctrl)
 	return -ENOMEM;
 }
 
-void t7xx_cldma_switch_cfg(struct cldma_ctrl *md_ctrl)
+void t7xx_cldma_switch_cfg(struct cldma_ctrl *md_ctrl, enum cldma_cfg cfg_id)
 {
 	t7xx_cldma_late_release(md_ctrl);
+	t7xx_cldma_adjust_config(md_ctrl, cfg_id);
 	t7xx_cldma_late_init(md_ctrl);
 }
 
diff --git a/drivers/net/wwan/t7xx/t7xx_hif_cldma.h b/drivers/net/wwan/t7xx/t7xx_hif_cldma.h
index 4410bac6993a..da3aa21c01eb 100644
--- a/drivers/net/wwan/t7xx/t7xx_hif_cldma.h
+++ b/drivers/net/wwan/t7xx/t7xx_hif_cldma.h
@@ -31,6 +31,10 @@
 #include "t7xx_cldma.h"
 #include "t7xx_pci.h"
 
+#define CLDMA_JUMBO_BUFF_SZ		(63 * 1024 + sizeof(struct ccci_header))
+#define CLDMA_SHARED_Q_BUFF_SZ		3584
+#define CLDMA_DEDICATED_Q_BUFF_SZ	2048
+
 /**
  * enum cldma_id - Identifiers for CLDMA HW units.
  * @CLDMA_ID_MD: Modem control channel.
@@ -55,6 +59,16 @@ struct cldma_gpd {
 	__le16 not_used2;
 };
 
+enum cldma_queue_type {
+	CLDMA_SHARED_Q,
+	CLDMA_DEDICATED_Q,
+};
+
+enum cldma_cfg {
+	CLDMA_SHARED_Q_CFG,
+	CLDMA_DEDICATED_Q_CFG,
+};
+
 struct cldma_request {
 	struct cldma_gpd *gpd;	/* Virtual address for CPU */
 	dma_addr_t gpd_addr;	/* Physical address for DMA */
@@ -77,6 +91,7 @@ struct cldma_queue {
 	struct cldma_request *tr_done;
 	struct cldma_request *rx_refill;
 	struct cldma_request *tx_next;
+	enum cldma_queue_type q_type;
 	int budget;			/* Same as ring buffer size by default */
 	spinlock_t ring_lock;
 	wait_queue_head_t req_wq;	/* Only for TX */
@@ -104,17 +119,20 @@ struct cldma_ctrl {
 	int (*recv_skb)(struct cldma_queue *queue, struct sk_buff *skb);
 };
 
+enum cldma_txq_rxq_port_id {
+	DOWNLOAD_PORT_ID = 0,
+	DUMP_PORT_ID = 1
+};
+
 #define GPD_FLAGS_HWO		BIT(0)
 #define GPD_FLAGS_IOC		BIT(7)
 #define GPD_DMAPOOL_ALIGN	16
 
-#define CLDMA_MTU		3584	/* 3.5kB */
-
 int t7xx_cldma_alloc(enum cldma_id hif_id, struct t7xx_pci_dev *t7xx_dev);
 void t7xx_cldma_hif_hw_init(struct cldma_ctrl *md_ctrl);
 int t7xx_cldma_init(struct cldma_ctrl *md_ctrl);
 void t7xx_cldma_exit(struct cldma_ctrl *md_ctrl);
-void t7xx_cldma_switch_cfg(struct cldma_ctrl *md_ctrl);
+void t7xx_cldma_switch_cfg(struct cldma_ctrl *md_ctrl, enum cldma_cfg cfg_id);
 void t7xx_cldma_start(struct cldma_ctrl *md_ctrl);
 int t7xx_cldma_stop(struct cldma_ctrl *md_ctrl);
 void t7xx_cldma_reset(struct cldma_ctrl *md_ctrl);
diff --git a/drivers/net/wwan/t7xx/t7xx_modem_ops.c b/drivers/net/wwan/t7xx/t7xx_modem_ops.c
index c5a3c95004bd..ec2269dfaf2c 100644
--- a/drivers/net/wwan/t7xx/t7xx_modem_ops.c
+++ b/drivers/net/wwan/t7xx/t7xx_modem_ops.c
@@ -527,7 +527,7 @@ static void t7xx_md_hk_wq(struct work_struct *work)
 
 	/* Clear the HS2 EXIT event appended in core_reset() */
 	t7xx_fsm_clr_event(ctl, FSM_EVENT_MD_HS2_EXIT);
-	t7xx_cldma_switch_cfg(md->md_ctrl[CLDMA_ID_MD]);
+	t7xx_cldma_switch_cfg(md->md_ctrl[CLDMA_ID_MD], CLDMA_SHARED_Q_CFG);
 	t7xx_cldma_start(md->md_ctrl[CLDMA_ID_MD]);
 	t7xx_fsm_broadcast_state(ctl, MD_STATE_WAITING_FOR_HS2);
 	md->core_md.handshake_ongoing = true;
@@ -542,7 +542,7 @@ static void t7xx_ap_hk_wq(struct work_struct *work)
 	 /* Clear the HS2 EXIT event appended in t7xx_core_reset(). */
 	t7xx_fsm_clr_event(ctl, FSM_EVENT_AP_HS2_EXIT);
 	t7xx_cldma_stop(md->md_ctrl[CLDMA_ID_AP]);
-	t7xx_cldma_switch_cfg(md->md_ctrl[CLDMA_ID_AP]);
+	t7xx_cldma_switch_cfg(md->md_ctrl[CLDMA_ID_AP], CLDMA_SHARED_Q_CFG);
 	t7xx_cldma_start(md->md_ctrl[CLDMA_ID_AP]);
 	md->core_ap.handshake_ongoing = true;
 	t7xx_core_hk_handler(md, &md->core_ap, ctl, FSM_EVENT_AP_HS2, FSM_EVENT_AP_HS2_EXIT);
diff --git a/drivers/net/wwan/t7xx/t7xx_port.h b/drivers/net/wwan/t7xx/t7xx_port.h
index 4a29bd04cbe2..6a96ee6d9449 100644
--- a/drivers/net/wwan/t7xx/t7xx_port.h
+++ b/drivers/net/wwan/t7xx/t7xx_port.h
@@ -100,6 +100,7 @@ struct t7xx_port_conf {
 	struct port_ops		*ops;
 	char			*name;
 	enum wwan_port_type	port_type;
+	bool			is_early_port;
 };
 
 struct t7xx_port {
@@ -130,9 +131,11 @@ struct t7xx_port {
 	struct task_struct		*thread;
 };
 
+int t7xx_get_port_mtu(struct t7xx_port *port);
 struct sk_buff *t7xx_port_alloc_skb(int payload);
 struct sk_buff *t7xx_ctrl_alloc_skb(int payload);
 int t7xx_port_enqueue_skb(struct t7xx_port *port, struct sk_buff *skb);
+int t7xx_port_send_raw_skb(struct t7xx_port *port, struct sk_buff *skb);
 int t7xx_port_send_skb(struct t7xx_port *port, struct sk_buff *skb, unsigned int pkt_header,
 		       unsigned int ex_msg);
 int t7xx_port_send_ctl_skb(struct t7xx_port *port, struct sk_buff *skb, unsigned int msg,
diff --git a/drivers/net/wwan/t7xx/t7xx_port_proxy.c b/drivers/net/wwan/t7xx/t7xx_port_proxy.c
index 62305d59da90..7582777cf94d 100644
--- a/drivers/net/wwan/t7xx/t7xx_port_proxy.c
+++ b/drivers/net/wwan/t7xx/t7xx_port_proxy.c
@@ -88,6 +88,20 @@ static const struct t7xx_port_conf t7xx_md_port_conf[] = {
 	},
 };
 
+static struct t7xx_port_conf t7xx_early_port_conf[] = {
+	{
+		.tx_ch = 0xffff,
+		.rx_ch = 0xffff,
+		.txq_index = 1,
+		.rxq_index = 1,
+		.txq_exp_index = 1,
+		.rxq_exp_index = 1,
+		.path_id = CLDMA_ID_AP,
+		.is_early_port = true,
+		.name = "ttyDUMP",
+	},
+};
+
 static struct t7xx_port *t7xx_proxy_get_port_by_ch(struct port_proxy *port_prox, enum port_ch ch)
 {
 	const struct t7xx_port_conf *port_conf;
@@ -202,7 +216,17 @@ int t7xx_port_enqueue_skb(struct t7xx_port *port, struct sk_buff *skb)
 	return 0;
 }
 
-static int t7xx_port_send_raw_skb(struct t7xx_port *port, struct sk_buff *skb)
+int t7xx_get_port_mtu(struct t7xx_port *port)
+{
+	enum cldma_id path_id = port->port_conf->path_id;
+	int tx_qno = t7xx_port_get_queue_no(port);
+	struct cldma_ctrl *md_ctrl;
+
+	md_ctrl = port->t7xx_dev->md->md_ctrl[path_id];
+	return md_ctrl->tx_ring[tx_qno].pkt_size;
+}
+
+int t7xx_port_send_raw_skb(struct t7xx_port *port, struct sk_buff *skb)
 {
 	enum cldma_id path_id = port->port_conf->path_id;
 	struct cldma_ctrl *md_ctrl;
@@ -317,6 +341,26 @@ static void t7xx_proxy_setup_ch_mapping(struct port_proxy *port_prox)
 	}
 }
 
+static int t7xx_port_proxy_recv_skb_from_queue(struct t7xx_pci_dev *t7xx_dev,
+					       struct cldma_queue *queue, struct sk_buff *skb)
+{
+	struct port_proxy *port_prox = t7xx_dev->md->port_prox;
+	const struct t7xx_port_conf *port_conf;
+	struct t7xx_port *port;
+	int ret;
+
+	port = port_prox->ports;
+	port_conf = port->port_conf;
+
+	ret = port_conf->ops->recv_skb(port, skb);
+	if (ret < 0 && ret != -ENOBUFS) {
+		dev_err(port->dev, "drop on RX ch %d, %d\n", port_conf->rx_ch, ret);
+		dev_kfree_skb_any(skb);
+	}
+
+	return ret;
+}
+
 static struct t7xx_port *t7xx_port_proxy_find_port(struct t7xx_pci_dev *t7xx_dev,
 						   struct cldma_queue *queue, u16 channel)
 {
@@ -338,6 +382,22 @@ static struct t7xx_port *t7xx_port_proxy_find_port(struct t7xx_pci_dev *t7xx_dev
 	return NULL;
 }
 
+struct t7xx_port *t7xx_port_proxy_get_port_by_name(struct port_proxy *port_prox, char *port_name)
+{
+	const struct t7xx_port_conf *port_conf;
+	struct t7xx_port *port;
+	int i;
+
+	for_each_proxy_port(i, port, port_prox) {
+		port_conf = port->port_conf;
+
+		if (!strncmp(port_conf->name, port_name, strlen(port_conf->name)))
+			return port;
+	}
+
+	return NULL;
+}
+
 /**
  * t7xx_port_proxy_recv_skb() - Dispatch received skb.
  * @queue: CLDMA queue.
@@ -358,6 +418,9 @@ static int t7xx_port_proxy_recv_skb(struct cldma_queue *queue, struct sk_buff *s
 	u16 seq_num, channel;
 	int ret;
 
+	if (queue->q_type == CLDMA_DEDICATED_Q)
+		return t7xx_port_proxy_recv_skb_from_queue(t7xx_dev, queue, skb);
+
 	channel = FIELD_GET(CCCI_H_CHN_FLD, le32_to_cpu(ccci_h->status));
 	if (t7xx_fsm_get_md_state(ctl) == MD_STATE_INVALID) {
 		dev_err_ratelimited(dev, "Packet drop on channel 0x%x, modem not ready\n", channel);
@@ -372,7 +435,8 @@ static int t7xx_port_proxy_recv_skb(struct cldma_queue *queue, struct sk_buff *s
 
 	seq_num = t7xx_port_next_rx_seq_num(port, ccci_h);
 	port_conf = port->port_conf;
-	skb_pull(skb, sizeof(*ccci_h));
+	if (!port->port_conf->is_early_port)
+		skb_pull(skb, sizeof(*ccci_h));
 
 	ret = port_conf->ops->recv_skb(port, skb);
 	/* Error indicates to try again later */
@@ -439,26 +503,58 @@ static void t7xx_proxy_init_all_ports(struct t7xx_modem *md)
 	t7xx_proxy_setup_ch_mapping(port_prox);
 }
 
+void t7xx_port_proxy_set_cfg(struct t7xx_modem *md, enum port_cfg_id cfg_id)
+{
+	struct port_proxy *port_prox = md->port_prox;
+	const struct t7xx_port_conf *port_conf;
+	struct device *dev = port_prox->dev;
+	unsigned int port_count;
+	struct t7xx_port *port;
+	int i;
+
+	if (port_prox->cfg_id == cfg_id)
+		return;
+
+	if (port_prox->cfg_id != PORT_CFG_ID_INVALID) {
+		for_each_proxy_port(i, port, port_prox)
+			port->port_conf->ops->uninit(port);
+
+		devm_kfree(dev, port_prox->ports);
+	}
+
+	if (cfg_id == PORT_CFG_ID_EARLY) {
+		port_conf = t7xx_early_port_conf;
+		port_count = ARRAY_SIZE(t7xx_early_port_conf);
+	} else {
+		port_conf = t7xx_md_port_conf;
+		port_count = ARRAY_SIZE(t7xx_md_port_conf);
+	}
+
+	port_prox->ports = devm_kzalloc(dev, sizeof(struct t7xx_port) * port_count, GFP_KERNEL);
+	if (!port_prox->ports)
+		return;
+
+	for (i = 0; i < port_count; i++)
+		port_prox->ports[i].port_conf = &port_conf[i];
+
+	port_prox->cfg_id = cfg_id;
+	port_prox->port_count = port_count;
+	t7xx_proxy_init_all_ports(md);
+}
+
 static int t7xx_proxy_alloc(struct t7xx_modem *md)
 {
-	unsigned int port_count = ARRAY_SIZE(t7xx_md_port_conf);
 	struct device *dev = &md->t7xx_dev->pdev->dev;
 	struct port_proxy *port_prox;
-	int i;
 
-	port_prox = devm_kzalloc(dev, sizeof(*port_prox) + sizeof(struct t7xx_port) * port_count,
-				 GFP_KERNEL);
+	port_prox = devm_kzalloc(dev, sizeof(*port_prox), GFP_KERNEL);
 	if (!port_prox)
 		return -ENOMEM;
 
 	md->port_prox = port_prox;
 	port_prox->dev = dev;
+	t7xx_port_proxy_set_cfg(md, PORT_CFG_ID_EARLY);
 
-	for (i = 0; i < port_count; i++)
-		port_prox->ports[i].port_conf = &t7xx_md_port_conf[i];
-
-	port_prox->port_count = port_count;
-	t7xx_proxy_init_all_ports(md);
 	return 0;
 }
 
diff --git a/drivers/net/wwan/t7xx/t7xx_port_proxy.h b/drivers/net/wwan/t7xx/t7xx_port_proxy.h
index bc1ff5c6c700..33caf85f718a 100644
--- a/drivers/net/wwan/t7xx/t7xx_port_proxy.h
+++ b/drivers/net/wwan/t7xx/t7xx_port_proxy.h
@@ -31,12 +31,19 @@
 #define RX_QUEUE_MAXLEN		32
 #define CTRL_QUEUE_MAXLEN	16
 
+enum port_cfg_id {
+	PORT_CFG_ID_INVALID,
+	PORT_CFG_ID_NORMAL,
+	PORT_CFG_ID_EARLY,
+};
+
 struct port_proxy {
 	int			port_count;
 	struct list_head	rx_ch_ports[PORT_CH_ID_MASK + 1];
 	struct list_head	queue_ports[CLDMA_NUM][MTK_QUEUES];
 	struct device		*dev;
-	struct t7xx_port	ports[];
+	enum port_cfg_id	cfg_id;
+	struct t7xx_port	*ports;
 };
 
 struct ccci_header {
@@ -94,5 +101,7 @@ void t7xx_port_proxy_md_status_notify(struct port_proxy *port_prox, unsigned int
 int t7xx_port_enum_msg_handler(struct t7xx_modem *md, void *msg);
 int t7xx_port_proxy_chl_enable_disable(struct port_proxy *port_prox, unsigned int ch_id,
 				       bool en_flag);
+struct t7xx_port *t7xx_port_proxy_get_port_by_name(struct port_proxy *port_prox, char *port_name);
+void t7xx_port_proxy_set_cfg(struct t7xx_modem *md, enum port_cfg_id cfg_id);
 
 #endif /* __T7XX_PORT_PROXY_H__ */
diff --git a/drivers/net/wwan/t7xx/t7xx_port_wwan.c b/drivers/net/wwan/t7xx/t7xx_port_wwan.c
index 33931bfd78fd..e53651ee2005 100644
--- a/drivers/net/wwan/t7xx/t7xx_port_wwan.c
+++ b/drivers/net/wwan/t7xx/t7xx_port_wwan.c
@@ -54,7 +54,7 @@ static void t7xx_port_ctrl_stop(struct wwan_port *port)
 static int t7xx_port_ctrl_tx(struct wwan_port *port, struct sk_buff *skb)
 {
 	struct t7xx_port *port_private = wwan_port_get_drvdata(port);
-	size_t len, offset, chunk_len = 0, txq_mtu = CLDMA_MTU;
+	size_t len, offset, chunk_len = 0, txq_mtu;
 	const struct t7xx_port_conf *port_conf;
 	struct t7xx_fsm_ctl *ctl;
 	enum md_state md_state;
@@ -72,6 +72,7 @@ static int t7xx_port_ctrl_tx(struct wwan_port *port, struct sk_buff *skb)
 		return -ENODEV;
 	}
 
+	txq_mtu = t7xx_get_port_mtu(port_private);
 	for (offset = 0; offset < len; offset += chunk_len) {
 		struct sk_buff *skb_ccci;
 		int ret;
diff --git a/drivers/net/wwan/t7xx/t7xx_reg.h b/drivers/net/wwan/t7xx/t7xx_reg.h
index c41d7d094c08..60e025e57baa 100644
--- a/drivers/net/wwan/t7xx/t7xx_reg.h
+++ b/drivers/net/wwan/t7xx/t7xx_reg.h
@@ -102,10 +102,27 @@ enum t7xx_pm_resume_state {
 };
 
 #define T7XX_PCIE_MISC_DEV_STATUS		0x0d1c
-#define MISC_STAGE_MASK				GENMASK(2, 0)
-#define MISC_RESET_TYPE_PLDR			BIT(26)
 #define MISC_RESET_TYPE_FLDR			BIT(27)
-#define LINUX_STAGE				4
+#define MISC_RESET_TYPE_PLDR			BIT(26)
+#define MISC_DEV_STATUS_MASK			GENMASK(15, 0)
+#define LK_EVENT_MASK				GENMASK(11, 8)
+
+enum lk_event_id {
+	LK_EVENT_NORMAL = 0,
+	LK_EVENT_CREATE_PD_PORT = 1,
+	LK_EVENT_CREATE_POST_DL_PORT = 2,
+	LK_EVENT_RESET = 7,
+};
+
+#define MISC_STAGE_MASK				GENMASK(2, 0)
+
+enum t7xx_device_stage {
+	INIT_STAGE = 0,
+	PRE_BROM_STAGE = 1,
+	POST_BROM_STAGE = 2,
+	LK_STAGE = 3,
+	LINUX_STAGE = 4,
+};
 
 #define T7XX_PCIE_RESOURCE_STATUS		0x0d28
 #define T7XX_PCIE_RESOURCE_STS_MSK		GENMASK(4, 0)
diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.c b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
index 80edb8e75a6a..c1789a558c9d 100644
--- a/drivers/net/wwan/t7xx/t7xx_state_monitor.c
+++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
@@ -47,6 +47,10 @@
 #define FSM_MD_EX_PASS_TIMEOUT_MS		45000
 #define FSM_CMD_TIMEOUT_MS			2000
 
+/* As per MTK, AP to MD Handshake time is ~15s*/
+#define DEVICE_STAGE_POLL_INTERVAL_MS		100
+#define DEVICE_STAGE_POLL_COUNT			150
+
 void t7xx_fsm_notifier_register(struct t7xx_modem *md, struct t7xx_fsm_notifier *notifier)
 {
 	struct t7xx_fsm_ctl *ctl = md->fsm_ctl;
@@ -206,6 +210,46 @@ static void fsm_routine_exception(struct t7xx_fsm_ctl *ctl, struct t7xx_fsm_comm
 		fsm_finish_command(ctl, cmd, 0);
 }
 
+static void t7xx_lk_stage_event_handling(struct t7xx_fsm_ctl *ctl, unsigned int dev_status)
+{
+	struct t7xx_modem *md = ctl->md;
+	struct cldma_ctrl *md_ctrl;
+	enum lk_event_id lk_event;
+	struct t7xx_port *port;
+	struct device *dev;
+
+	dev = &md->t7xx_dev->pdev->dev;
+	lk_event = FIELD_GET(LK_EVENT_MASK, dev_status);
+	dev_info(dev, "Device enter next stage from LK stage/n");
+	switch (lk_event) {
+	case LK_EVENT_NORMAL:
+		break;
+
+	case LK_EVENT_CREATE_PD_PORT:
+		md_ctrl = md->md_ctrl[CLDMA_ID_AP];
+		t7xx_cldma_hif_hw_init(md_ctrl);
+		t7xx_cldma_stop(md_ctrl);
+		t7xx_cldma_switch_cfg(md_ctrl, CLDMA_DEDICATED_Q_CFG);
+		dev_info(dev, "creating the ttyDUMP port\n");
+		port = t7xx_port_proxy_get_port_by_name(md->port_prox, "ttyDUMP");
+		if (!port) {
+			dev_err(dev, "ttyDUMP port not found\n");
+			return;
+		}
+
+		port->port_conf->ops->enable_chl(port);
+		t7xx_cldma_start(md_ctrl);
+		break;
+
+	case LK_EVENT_RESET:
+		break;
+
+	default:
+		dev_err(dev, "Invalid BROM event\n");
+		break;
+	}
+}
+
 static int fsm_stopped_handler(struct t7xx_fsm_ctl *ctl)
 {
 	ctl->curr_state = FSM_STATE_STOPPED;
@@ -317,8 +361,10 @@ static int fsm_routine_starting(struct t7xx_fsm_ctl *ctl)
 static void fsm_routine_start(struct t7xx_fsm_ctl *ctl, struct t7xx_fsm_command *cmd)
 {
 	struct t7xx_modem *md = ctl->md;
+	unsigned int device_stage;
+	struct device *dev;
 	u32 dev_status;
-	int ret;
+	int ret = 0;
 
 	if (!md)
 		return;
@@ -329,23 +375,60 @@ static void fsm_routine_start(struct t7xx_fsm_ctl *ctl, struct t7xx_fsm_command
 		return;
 	}
 
+	dev = &md->t7xx_dev->pdev->dev;
+	dev_status = ioread32(IREG_BASE(md->t7xx_dev) + T7XX_PCIE_MISC_DEV_STATUS);
+	dev_status &= MISC_DEV_STATUS_MASK;
+	dev_dbg(dev, "dev_status = %x modem state = %d\n", dev_status, ctl->md_state);
+
+	if (dev_status == MISC_DEV_STATUS_MASK) {
+		dev_err(dev, "invalid device status\n");
+		ret = -EINVAL;
+		goto finish_command;
+	}
+
 	ctl->curr_state = FSM_STATE_PRE_START;
 	t7xx_md_event_notify(md, FSM_PRE_START);
 
-	ret = read_poll_timeout(ioread32, dev_status,
-				(dev_status & MISC_STAGE_MASK) == LINUX_STAGE, 20000, 2000000,
-				false, IREG_BASE(md->t7xx_dev) + T7XX_PCIE_MISC_DEV_STATUS);
-	if (ret) {
-		struct device *dev = &md->t7xx_dev->pdev->dev;
+	device_stage = FIELD_GET(MISC_STAGE_MASK, dev_status);
+	if (dev_status == ctl->prev_dev_status) {
+		if (ctl->device_stage_check_cnt++ >= DEVICE_STAGE_POLL_COUNT) {
+			dev_err(dev, "Timeout at device stage 0x%x\n", device_stage);
+			ctl->device_stage_check_cnt = 0;
+			ret = -ETIMEDOUT;
+		} else {
+			msleep(DEVICE_STAGE_POLL_INTERVAL_MS);
+			ret = t7xx_fsm_append_cmd(ctl, FSM_CMD_START, 0);
+		}
 
-		fsm_finish_command(ctl, cmd, -ETIMEDOUT);
-		dev_err(dev, "Invalid device status 0x%lx\n", dev_status & MISC_STAGE_MASK);
-		return;
+		goto finish_command;
+	}
+
+	switch (device_stage) {
+	case INIT_STAGE:
+	case PRE_BROM_STAGE:
+	case POST_BROM_STAGE:
+		ret = t7xx_fsm_append_cmd(ctl, FSM_CMD_START, 0);
+		break;
+
+	case LK_STAGE:
+		dev_info(dev, "LK_STAGE Entered");
+		t7xx_lk_stage_event_handling(ctl, dev_status);
+		break;
+
+	case LINUX_STAGE:
+		t7xx_cldma_hif_hw_init(md->md_ctrl[CLDMA_ID_AP]);
+		t7xx_cldma_hif_hw_init(md->md_ctrl[CLDMA_ID_MD]);
+		t7xx_port_proxy_set_cfg(md, PORT_CFG_ID_NORMAL);
+		ret = fsm_routine_starting(ctl);
+		break;
+
+	default:
+		break;
 	}
 
-	t7xx_cldma_hif_hw_init(md->md_ctrl[CLDMA_ID_AP]);
-	t7xx_cldma_hif_hw_init(md->md_ctrl[CLDMA_ID_MD]);
-	fsm_finish_command(ctl, cmd, fsm_routine_starting(ctl));
+finish_command:
+	ctl->prev_dev_status = dev_status;
+	fsm_finish_command(ctl, cmd, ret);
 }
 
 static int fsm_main_thread(void *data)
@@ -516,6 +599,8 @@ void t7xx_fsm_reset(struct t7xx_modem *md)
 	fsm_flush_event_cmd_qs(ctl);
 	ctl->curr_state = FSM_STATE_STOPPED;
 	ctl->exp_flg = false;
+	ctl->prev_dev_status = 0;
+	ctl->device_stage_check_cnt = 0;
 }
 
 int t7xx_fsm_init(struct t7xx_modem *md)
diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.h b/drivers/net/wwan/t7xx/t7xx_state_monitor.h
index b6e76f3903c8..b2459bd58624 100644
--- a/drivers/net/wwan/t7xx/t7xx_state_monitor.h
+++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.h
@@ -96,6 +96,8 @@ struct t7xx_fsm_ctl {
 	bool			exp_flg;
 	spinlock_t		notifier_lock;		/* Protects notifier list */
 	struct list_head	notifier_list;
+	u32                     prev_dev_status;
+	unsigned int		device_stage_check_cnt;
 };
 
 struct t7xx_fsm_event {
-- 
2.34.1


^ permalink raw reply related

* [PATCH net-next 1/5] net: wwan: t7xx: Add AP CLDMA
From: m.chetan.kumar @ 2022-08-16  4:23 UTC (permalink / raw)
  To: netdev
  Cc: kuba, davem, johannes, ryazanov.s.a, loic.poulain, krishna.c.sudi,
	m.chetan.kumar, linuxwwan, Haijun Liu, Madhusmita Sahu,
	Moises Veleta, Devegowda Chandrashekar, Ilpo Järvinen

From: Haijun Liu <haijun.liu@mediatek.com>

The t7xx device contains two Cross Layer DMA (CLDMA) interfaces to
communicate with AP and Modem processors respectively. So far only
MD-CLDMA was being used, this patch enables AP-CLDMA.

Rename small Application Processor (sAP) to AP.

Signed-off-by: Haijun Liu <haijun.liu@mediatek.com>
Co-developed-by: Madhusmita Sahu <madhusmita.sahu@intel.com>
Signed-off-by: Madhusmita Sahu <madhusmita.sahu@intel.com>
Signed-off-by: Moises Veleta <moises.veleta@linux.intel.com>
Signed-off-by: Devegowda Chandrashekar <chandrashekar.devegowda@intel.com>
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/net/wwan/t7xx/t7xx_hif_cldma.c     | 17 +++--
 drivers/net/wwan/t7xx/t7xx_hif_cldma.h     |  2 +-
 drivers/net/wwan/t7xx/t7xx_mhccif.h        |  1 +
 drivers/net/wwan/t7xx/t7xx_modem_ops.c     | 85 ++++++++++++++++++----
 drivers/net/wwan/t7xx/t7xx_modem_ops.h     |  3 +
 drivers/net/wwan/t7xx/t7xx_port.h          |  8 +-
 drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c |  8 +-
 drivers/net/wwan/t7xx/t7xx_port_proxy.c    | 12 +++
 drivers/net/wwan/t7xx/t7xx_reg.h           |  2 +-
 drivers/net/wwan/t7xx/t7xx_state_monitor.c | 13 +++-
 drivers/net/wwan/t7xx/t7xx_state_monitor.h |  2 +
 11 files changed, 125 insertions(+), 28 deletions(-)

diff --git a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
index 6ff30cb8eb16..37cd63d20b45 100644
--- a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
+++ b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
@@ -1064,13 +1064,18 @@ static void t7xx_hw_info_init(struct cldma_ctrl *md_ctrl)
 	struct t7xx_cldma_hw *hw_info = &md_ctrl->hw_info;
 	u32 phy_ao_base, phy_pd_base;
 
-	if (md_ctrl->hif_id != CLDMA_ID_MD)
-		return;
-
-	phy_ao_base = CLDMA1_AO_BASE;
-	phy_pd_base = CLDMA1_PD_BASE;
-	hw_info->phy_interrupt_id = CLDMA1_INT;
 	hw_info->hw_mode = MODE_BIT_64;
+
+	if (md_ctrl->hif_id == CLDMA_ID_MD) {
+		phy_ao_base = CLDMA1_AO_BASE;
+		phy_pd_base = CLDMA1_PD_BASE;
+		hw_info->phy_interrupt_id = CLDMA1_INT;
+	} else {
+		phy_ao_base = CLDMA0_AO_BASE;
+		phy_pd_base = CLDMA0_PD_BASE;
+		hw_info->phy_interrupt_id = CLDMA0_INT;
+	}
+
 	hw_info->ap_ao_base = t7xx_pcie_addr_transfer(pbase->pcie_ext_reg_base,
 						      pbase->pcie_dev_reg_trsl_addr, phy_ao_base);
 	hw_info->ap_pdn_base = t7xx_pcie_addr_transfer(pbase->pcie_ext_reg_base,
diff --git a/drivers/net/wwan/t7xx/t7xx_hif_cldma.h b/drivers/net/wwan/t7xx/t7xx_hif_cldma.h
index 47a35e552da7..4410bac6993a 100644
--- a/drivers/net/wwan/t7xx/t7xx_hif_cldma.h
+++ b/drivers/net/wwan/t7xx/t7xx_hif_cldma.h
@@ -34,7 +34,7 @@
 /**
  * enum cldma_id - Identifiers for CLDMA HW units.
  * @CLDMA_ID_MD: Modem control channel.
- * @CLDMA_ID_AP: Application Processor control channel (not used at the moment).
+ * @CLDMA_ID_AP: Application Processor control channel.
  * @CLDMA_NUM:   Number of CLDMA HW units available.
  */
 enum cldma_id {
diff --git a/drivers/net/wwan/t7xx/t7xx_mhccif.h b/drivers/net/wwan/t7xx/t7xx_mhccif.h
index 209b386bc088..20c50dce9fc3 100644
--- a/drivers/net/wwan/t7xx/t7xx_mhccif.h
+++ b/drivers/net/wwan/t7xx/t7xx_mhccif.h
@@ -25,6 +25,7 @@
 			 D2H_INT_EXCEPTION_CLEARQ_DONE |	\
 			 D2H_INT_EXCEPTION_ALLQ_RESET |		\
 			 D2H_INT_PORT_ENUM |			\
+			 D2H_INT_ASYNC_AP_HK |			\
 			 D2H_INT_ASYNC_MD_HK)
 
 void t7xx_mhccif_mask_set(struct t7xx_pci_dev *t7xx_dev, u32 val);
diff --git a/drivers/net/wwan/t7xx/t7xx_modem_ops.c b/drivers/net/wwan/t7xx/t7xx_modem_ops.c
index 3458af31e864..c5a3c95004bd 100644
--- a/drivers/net/wwan/t7xx/t7xx_modem_ops.c
+++ b/drivers/net/wwan/t7xx/t7xx_modem_ops.c
@@ -44,6 +44,7 @@
 #include "t7xx_state_monitor.h"
 
 #define RT_ID_MD_PORT_ENUM	0
+#define RT_ID_AP_PORT_ENUM	1
 /* Modem feature query identification code - "ICCC" */
 #define MD_FEATURE_QUERY_ID	0x49434343
 
@@ -296,6 +297,7 @@ static void t7xx_md_exception(struct t7xx_modem *md, enum hif_ex_stage stage)
 	}
 
 	t7xx_cldma_exception(md->md_ctrl[CLDMA_ID_MD], stage);
+	t7xx_cldma_exception(md->md_ctrl[CLDMA_ID_AP], stage);
 
 	if (stage == HIF_EX_INIT)
 		t7xx_mhccif_h2d_swint_trigger(t7xx_dev, H2D_CH_EXCEPTION_ACK);
@@ -424,7 +426,7 @@ static int t7xx_parse_host_rt_data(struct t7xx_fsm_ctl *ctl, struct t7xx_sys_inf
 		if (ft_spt_st != MTK_FEATURE_MUST_BE_SUPPORTED)
 			return -EINVAL;
 
-		if (i == RT_ID_MD_PORT_ENUM)
+		if (i == RT_ID_MD_PORT_ENUM || i == RT_ID_AP_PORT_ENUM)
 			t7xx_port_enum_msg_handler(ctl->md, rt_feature->data);
 	}
 
@@ -454,12 +456,12 @@ static int t7xx_core_reset(struct t7xx_modem *md)
 	return 0;
 }
 
-static void t7xx_core_hk_handler(struct t7xx_modem *md, struct t7xx_fsm_ctl *ctl,
+static void t7xx_core_hk_handler(struct t7xx_modem *md, struct t7xx_sys_info *core_info,
+				 struct t7xx_fsm_ctl *ctl,
 				 enum t7xx_fsm_event_state event_id,
 				 enum t7xx_fsm_event_state err_detect)
 {
 	struct t7xx_fsm_event *event = NULL, *event_next;
-	struct t7xx_sys_info *core_info = &md->core_md;
 	struct device *dev = &md->t7xx_dev->pdev->dev;
 	unsigned long flags;
 	int ret;
@@ -529,19 +531,33 @@ static void t7xx_md_hk_wq(struct work_struct *work)
 	t7xx_cldma_start(md->md_ctrl[CLDMA_ID_MD]);
 	t7xx_fsm_broadcast_state(ctl, MD_STATE_WAITING_FOR_HS2);
 	md->core_md.handshake_ongoing = true;
-	t7xx_core_hk_handler(md, ctl, FSM_EVENT_MD_HS2, FSM_EVENT_MD_HS2_EXIT);
+	t7xx_core_hk_handler(md, &md->core_md, ctl, FSM_EVENT_MD_HS2, FSM_EVENT_MD_HS2_EXIT);
+}
+
+static void t7xx_ap_hk_wq(struct work_struct *work)
+{
+	struct t7xx_modem *md = container_of(work, struct t7xx_modem, ap_handshake_work);
+	struct t7xx_fsm_ctl *ctl = md->fsm_ctl;
+
+	 /* Clear the HS2 EXIT event appended in t7xx_core_reset(). */
+	t7xx_fsm_clr_event(ctl, FSM_EVENT_AP_HS2_EXIT);
+	t7xx_cldma_stop(md->md_ctrl[CLDMA_ID_AP]);
+	t7xx_cldma_switch_cfg(md->md_ctrl[CLDMA_ID_AP]);
+	t7xx_cldma_start(md->md_ctrl[CLDMA_ID_AP]);
+	md->core_ap.handshake_ongoing = true;
+	t7xx_core_hk_handler(md, &md->core_ap, ctl, FSM_EVENT_AP_HS2, FSM_EVENT_AP_HS2_EXIT);
 }
 
 void t7xx_md_event_notify(struct t7xx_modem *md, enum md_event_id evt_id)
 {
 	struct t7xx_fsm_ctl *ctl = md->fsm_ctl;
-	void __iomem *mhccif_base;
 	unsigned int int_sta;
 	unsigned long flags;
 
 	switch (evt_id) {
 	case FSM_PRE_START:
-		t7xx_mhccif_mask_clr(md->t7xx_dev, D2H_INT_PORT_ENUM);
+		t7xx_mhccif_mask_clr(md->t7xx_dev, D2H_INT_PORT_ENUM | D2H_INT_ASYNC_MD_HK |
+						   D2H_INT_ASYNC_AP_HK);
 		break;
 
 	case FSM_START:
@@ -554,16 +570,26 @@ void t7xx_md_event_notify(struct t7xx_modem *md, enum md_event_id evt_id)
 			ctl->exp_flg = true;
 			md->exp_id &= ~D2H_INT_EXCEPTION_INIT;
 			md->exp_id &= ~D2H_INT_ASYNC_MD_HK;
+			md->exp_id &= ~D2H_INT_ASYNC_AP_HK;
 		} else if (ctl->exp_flg) {
 			md->exp_id &= ~D2H_INT_ASYNC_MD_HK;
-		} else if (md->exp_id & D2H_INT_ASYNC_MD_HK) {
-			queue_work(md->handshake_wq, &md->handshake_work);
-			md->exp_id &= ~D2H_INT_ASYNC_MD_HK;
-			mhccif_base = md->t7xx_dev->base_addr.mhccif_rc_base;
-			iowrite32(D2H_INT_ASYNC_MD_HK, mhccif_base + REG_EP2RC_SW_INT_ACK);
-			t7xx_mhccif_mask_set(md->t7xx_dev, D2H_INT_ASYNC_MD_HK);
+			md->exp_id &= ~D2H_INT_ASYNC_AP_HK;
 		} else {
-			t7xx_mhccif_mask_clr(md->t7xx_dev, D2H_INT_ASYNC_MD_HK);
+			void __iomem *mhccif_base = md->t7xx_dev->base_addr.mhccif_rc_base;
+
+			if (md->exp_id & D2H_INT_ASYNC_MD_HK) {
+				queue_work(md->handshake_wq, &md->handshake_work);
+				md->exp_id &= ~D2H_INT_ASYNC_MD_HK;
+				iowrite32(D2H_INT_ASYNC_MD_HK, mhccif_base + REG_EP2RC_SW_INT_ACK);
+				t7xx_mhccif_mask_set(md->t7xx_dev, D2H_INT_ASYNC_MD_HK);
+			}
+
+			if (md->exp_id & D2H_INT_ASYNC_AP_HK) {
+				queue_work(md->ap_handshake_wq, &md->ap_handshake_work);
+				md->exp_id &= ~D2H_INT_ASYNC_AP_HK;
+				iowrite32(D2H_INT_ASYNC_AP_HK, mhccif_base + REG_EP2RC_SW_INT_ACK);
+				t7xx_mhccif_mask_set(md->t7xx_dev, D2H_INT_ASYNC_AP_HK);
+			}
 		}
 		spin_unlock_irqrestore(&md->exp_lock, flags);
 
@@ -576,6 +602,7 @@ void t7xx_md_event_notify(struct t7xx_modem *md, enum md_event_id evt_id)
 
 	case FSM_READY:
 		t7xx_mhccif_mask_set(md->t7xx_dev, D2H_INT_ASYNC_MD_HK);
+		t7xx_mhccif_mask_set(md->t7xx_dev, D2H_INT_ASYNC_AP_HK);
 		break;
 
 	default:
@@ -627,6 +654,19 @@ static struct t7xx_modem *t7xx_md_alloc(struct t7xx_pci_dev *t7xx_dev)
 	md->core_md.feature_set[RT_ID_MD_PORT_ENUM] &= ~FEATURE_MSK;
 	md->core_md.feature_set[RT_ID_MD_PORT_ENUM] |=
 		FIELD_PREP(FEATURE_MSK, MTK_FEATURE_MUST_BE_SUPPORTED);
+
+	md->ap_handshake_wq = alloc_workqueue("%s", WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_HIGHPRI,
+					      0, "ap_hk_wq");
+	if (!md->ap_handshake_wq) {
+		destroy_workqueue(md->handshake_wq);
+		return NULL;
+	}
+
+	INIT_WORK(&md->ap_handshake_work, t7xx_ap_hk_wq);
+	md->core_ap.feature_set[RT_ID_AP_PORT_ENUM] &= ~FEATURE_MSK;
+	md->core_ap.feature_set[RT_ID_AP_PORT_ENUM] |=
+		FIELD_PREP(FEATURE_MSK, MTK_FEATURE_MUST_BE_SUPPORTED);
+
 	return md;
 }
 
@@ -638,6 +678,7 @@ int t7xx_md_reset(struct t7xx_pci_dev *t7xx_dev)
 	md->exp_id = 0;
 	t7xx_fsm_reset(md);
 	t7xx_cldma_reset(md->md_ctrl[CLDMA_ID_MD]);
+	t7xx_cldma_reset(md->md_ctrl[CLDMA_ID_AP]);
 	t7xx_port_proxy_reset(md->port_prox);
 	md->md_init_finish = true;
 	return t7xx_core_reset(md);
@@ -667,6 +708,10 @@ int t7xx_md_init(struct t7xx_pci_dev *t7xx_dev)
 	if (ret)
 		goto err_destroy_hswq;
 
+	ret = t7xx_cldma_alloc(CLDMA_ID_AP, t7xx_dev);
+	if (ret)
+		goto err_destroy_hswq;
+
 	ret = t7xx_fsm_init(md);
 	if (ret)
 		goto err_destroy_hswq;
@@ -679,12 +724,16 @@ int t7xx_md_init(struct t7xx_pci_dev *t7xx_dev)
 	if (ret)
 		goto err_uninit_ccmni;
 
-	ret = t7xx_port_proxy_init(md);
+	ret = t7xx_cldma_init(md->md_ctrl[CLDMA_ID_AP]);
 	if (ret)
 		goto err_uninit_md_cldma;
 
+	ret = t7xx_port_proxy_init(md);
+	if (ret)
+		goto err_uninit_ap_cldma;
+
 	ret = t7xx_fsm_append_cmd(md->fsm_ctl, FSM_CMD_START, 0);
-	if (ret) /* fsm_uninit flushes cmd queue */
+	if (ret) /* t7xx_fsm_uninit() flushes cmd queue */
 		goto err_uninit_proxy;
 
 	t7xx_md_sys_sw_init(t7xx_dev);
@@ -694,6 +743,9 @@ int t7xx_md_init(struct t7xx_pci_dev *t7xx_dev)
 err_uninit_proxy:
 	t7xx_port_proxy_uninit(md->port_prox);
 
+err_uninit_ap_cldma:
+	t7xx_cldma_exit(md->md_ctrl[CLDMA_ID_AP]);
+
 err_uninit_md_cldma:
 	t7xx_cldma_exit(md->md_ctrl[CLDMA_ID_MD]);
 
@@ -705,6 +757,7 @@ int t7xx_md_init(struct t7xx_pci_dev *t7xx_dev)
 
 err_destroy_hswq:
 	destroy_workqueue(md->handshake_wq);
+	destroy_workqueue(md->ap_handshake_wq);
 	dev_err(&t7xx_dev->pdev->dev, "Modem init failed\n");
 	return ret;
 }
@@ -720,8 +773,10 @@ void t7xx_md_exit(struct t7xx_pci_dev *t7xx_dev)
 
 	t7xx_fsm_append_cmd(md->fsm_ctl, FSM_CMD_PRE_STOP, FSM_CMD_FLAG_WAIT_FOR_COMPLETION);
 	t7xx_port_proxy_uninit(md->port_prox);
+	t7xx_cldma_exit(md->md_ctrl[CLDMA_ID_AP]);
 	t7xx_cldma_exit(md->md_ctrl[CLDMA_ID_MD]);
 	t7xx_ccmni_exit(t7xx_dev);
 	t7xx_fsm_uninit(md);
 	destroy_workqueue(md->handshake_wq);
+	destroy_workqueue(md->ap_handshake_wq);
 }
diff --git a/drivers/net/wwan/t7xx/t7xx_modem_ops.h b/drivers/net/wwan/t7xx/t7xx_modem_ops.h
index 7469ed636ae8..c93e870ce696 100644
--- a/drivers/net/wwan/t7xx/t7xx_modem_ops.h
+++ b/drivers/net/wwan/t7xx/t7xx_modem_ops.h
@@ -66,10 +66,13 @@ struct t7xx_modem {
 	struct cldma_ctrl		*md_ctrl[CLDMA_NUM];
 	struct t7xx_pci_dev		*t7xx_dev;
 	struct t7xx_sys_info		core_md;
+	struct t7xx_sys_info		core_ap;
 	bool				md_init_finish;
 	bool				rgu_irq_asserted;
 	struct workqueue_struct		*handshake_wq;
 	struct work_struct		handshake_work;
+	struct workqueue_struct		*ap_handshake_wq;
+	struct work_struct		ap_handshake_work;
 	struct t7xx_fsm_ctl		*fsm_ctl;
 	struct port_proxy		*port_prox;
 	unsigned int			exp_id;
diff --git a/drivers/net/wwan/t7xx/t7xx_port.h b/drivers/net/wwan/t7xx/t7xx_port.h
index dc4133eb433a..4a29bd04cbe2 100644
--- a/drivers/net/wwan/t7xx/t7xx_port.h
+++ b/drivers/net/wwan/t7xx/t7xx_port.h
@@ -36,9 +36,15 @@
 /* Channel ID and Message ID definitions.
  * The channel number consists of peer_id(15:12) , channel_id(11:0)
  * peer_id:
- * 0:reserved, 1: to sAP, 2: to MD
+ * 0:reserved, 1: to AP, 2: to MD
  */
 enum port_ch {
+	/* to AP */
+	PORT_CH_AP_CONTROL_RX = 0x1000,
+	PORT_CH_AP_CONTROL_TX = 0x1001,
+	PORT_CH_AP_LOG_RX = 0x1008,
+	PORT_CH_AP_LOG_TX = 0x1009,
+
 	/* to MD */
 	PORT_CH_CONTROL_RX = 0x2000,
 	PORT_CH_CONTROL_TX = 0x2001,
diff --git a/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c b/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c
index 68430b130a67..ae632ef96698 100644
--- a/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c
+++ b/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c
@@ -167,8 +167,12 @@ static int control_msg_handler(struct t7xx_port *port, struct sk_buff *skb)
 	case CTL_ID_HS2_MSG:
 		skb_pull(skb, sizeof(*ctrl_msg_h));
 
-		if (port_conf->rx_ch == PORT_CH_CONTROL_RX) {
-			ret = t7xx_fsm_append_event(ctl, FSM_EVENT_MD_HS2, skb->data,
+		if (port_conf->rx_ch == PORT_CH_CONTROL_RX ||
+		    port_conf->rx_ch == PORT_CH_AP_CONTROL_RX) {
+			int event = port_conf->rx_ch == PORT_CH_CONTROL_RX ?
+				    FSM_EVENT_MD_HS2 : FSM_EVENT_AP_HS2;
+
+			ret = t7xx_fsm_append_event(ctl, event, skb->data,
 						    le32_to_cpu(ctrl_msg_h->data_length));
 			if (ret)
 				dev_err(port->dev, "Failed to append Handshake 2 event");
diff --git a/drivers/net/wwan/t7xx/t7xx_port_proxy.c b/drivers/net/wwan/t7xx/t7xx_port_proxy.c
index d4de047ff0d4..62305d59da90 100644
--- a/drivers/net/wwan/t7xx/t7xx_port_proxy.c
+++ b/drivers/net/wwan/t7xx/t7xx_port_proxy.c
@@ -77,6 +77,14 @@ static const struct t7xx_port_conf t7xx_md_port_conf[] = {
 		.path_id = CLDMA_ID_MD,
 		.ops = &ctl_port_ops,
 		.name = "t7xx_ctrl",
+	}, {
+		.tx_ch = PORT_CH_AP_CONTROL_TX,
+		.rx_ch = PORT_CH_AP_CONTROL_RX,
+		.txq_index = Q_IDX_CTRL,
+		.rxq_index = Q_IDX_CTRL,
+		.path_id = CLDMA_ID_AP,
+		.ops = &ctl_port_ops,
+		.name = "t7xx_ap_ctrl",
 	},
 };
 
@@ -416,6 +424,9 @@ static void t7xx_proxy_init_all_ports(struct t7xx_modem *md)
 		if (port_conf->tx_ch == PORT_CH_CONTROL_TX)
 			md->core_md.ctl_port = port;
 
+		if (port_conf->tx_ch == PORT_CH_AP_CONTROL_TX)
+			md->core_ap.ctl_port = port;
+
 		port->t7xx_dev = md->t7xx_dev;
 		port->dev = &md->t7xx_dev->pdev->dev;
 		spin_lock_init(&port->port_update_lock);
@@ -469,6 +480,7 @@ int t7xx_port_proxy_init(struct t7xx_modem *md)
 	if (ret)
 		return ret;
 
+	t7xx_cldma_set_recv_skb(md->md_ctrl[CLDMA_ID_AP], t7xx_port_proxy_recv_skb);
 	t7xx_cldma_set_recv_skb(md->md_ctrl[CLDMA_ID_MD], t7xx_port_proxy_recv_skb);
 	return 0;
 }
diff --git a/drivers/net/wwan/t7xx/t7xx_reg.h b/drivers/net/wwan/t7xx/t7xx_reg.h
index 7c1b81091a0f..c41d7d094c08 100644
--- a/drivers/net/wwan/t7xx/t7xx_reg.h
+++ b/drivers/net/wwan/t7xx/t7xx_reg.h
@@ -56,7 +56,7 @@
 #define D2H_INT_RESUME_ACK			BIT(12)
 #define D2H_INT_SUSPEND_ACK_AP			BIT(13)
 #define D2H_INT_RESUME_ACK_AP			BIT(14)
-#define D2H_INT_ASYNC_SAP_HK			BIT(15)
+#define D2H_INT_ASYNC_AP_HK			BIT(15)
 #define D2H_INT_ASYNC_MD_HK			BIT(16)
 
 /* Register base */
diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.c b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
index 0bcca08ff2bd..80edb8e75a6a 100644
--- a/drivers/net/wwan/t7xx/t7xx_state_monitor.c
+++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
@@ -285,8 +285,9 @@ static int fsm_routine_starting(struct t7xx_fsm_ctl *ctl)
 	t7xx_fsm_broadcast_state(ctl, MD_STATE_WAITING_FOR_HS1);
 	t7xx_md_event_notify(md, FSM_START);
 
-	wait_event_interruptible_timeout(ctl->async_hk_wq, md->core_md.ready || ctl->exp_flg,
-					 HZ * 60);
+	wait_event_interruptible_timeout(ctl->async_hk_wq,
+					 (md->core_md.ready && md->core_ap.ready) ||
+					  ctl->exp_flg, HZ * 60);
 	dev = &md->t7xx_dev->pdev->dev;
 
 	if (ctl->exp_flg)
@@ -297,6 +298,13 @@ static int fsm_routine_starting(struct t7xx_fsm_ctl *ctl)
 		if (md->core_md.handshake_ongoing)
 			t7xx_fsm_append_event(ctl, FSM_EVENT_MD_HS2_EXIT, NULL, 0);
 
+		fsm_routine_exception(ctl, NULL, EXCEPTION_HS_TIMEOUT);
+		return -ETIMEDOUT;
+	} else if (!md->core_ap.ready) {
+		dev_err(dev, "AP handshake timeout\n");
+		if (md->core_ap.handshake_ongoing)
+			t7xx_fsm_append_event(ctl, FSM_EVENT_AP_HS2_EXIT, NULL, 0);
+
 		fsm_routine_exception(ctl, NULL, EXCEPTION_HS_TIMEOUT);
 		return -ETIMEDOUT;
 	}
@@ -335,6 +343,7 @@ static void fsm_routine_start(struct t7xx_fsm_ctl *ctl, struct t7xx_fsm_command
 		return;
 	}
 
+	t7xx_cldma_hif_hw_init(md->md_ctrl[CLDMA_ID_AP]);
 	t7xx_cldma_hif_hw_init(md->md_ctrl[CLDMA_ID_MD]);
 	fsm_finish_command(ctl, cmd, fsm_routine_starting(ctl));
 }
diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.h b/drivers/net/wwan/t7xx/t7xx_state_monitor.h
index b1af0259d4c5..b6e76f3903c8 100644
--- a/drivers/net/wwan/t7xx/t7xx_state_monitor.h
+++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.h
@@ -38,10 +38,12 @@ enum t7xx_fsm_state {
 enum t7xx_fsm_event_state {
 	FSM_EVENT_INVALID,
 	FSM_EVENT_MD_HS2,
+	FSM_EVENT_AP_HS2,
 	FSM_EVENT_MD_EX,
 	FSM_EVENT_MD_EX_REC_OK,
 	FSM_EVENT_MD_EX_PASS,
 	FSM_EVENT_MD_HS2_EXIT,
+	FSM_EVENT_AP_HS2_EXIT,
 	FSM_EVENT_MAX
 };
 
-- 
2.34.1


^ permalink raw reply related

* [PATCH net-next 0/5] net: wwan: t7xx: fw flashing & coredump support
From: m.chetan.kumar @ 2022-08-16  4:23 UTC (permalink / raw)
  To: netdev
  Cc: kuba, davem, johannes, ryazanov.s.a, loic.poulain, krishna.c.sudi,
	m.chetan.kumar, linuxwwan

From: M Chetan Kumar <m.chetan.kumar@linux.intel.com>

This patch series brings-in the support for FM350 wwan device firmware
flashing & coredump collection using devlink interface.

Below is the high level description of individual patches.
Refer to individual patch commit message for details.

PATCH1:  Enables AP CLDMA communication for firmware flashing &
coredump collection.

PATCH2: Enables the infrastructure & queue configuration required
for early ports enumeration.

PATCH3: Implements device reset and rescan logic required to enter
or exit fastboot mode.

PATCH4: Implements devlink interface & uses the fastboot protocol for
fw flashing and coredump collection.

PATCH5: t7xx devlink commands documentation.

Haijun Liu (3):
  net: wwan: t7xx: Add AP CLDMA
  net: wwan: t7xx: Infrastructure for early port configuration
  net: wwan: t7xx: PCIe reset rescan

M Chetan Kumar (2):
  net: wwan: t7xx: Enable devlink based fw flashing and coredump
    collection
  net: wwan: t7xx: Devlink documentation

 Documentation/networking/devlink/index.rst |   1 +
 Documentation/networking/devlink/t7xx.rst  | 145 +++++
 drivers/net/wwan/Kconfig                   |   1 +
 drivers/net/wwan/t7xx/Makefile             |   5 +-
 drivers/net/wwan/t7xx/t7xx_hif_cldma.c     |  55 +-
 drivers/net/wwan/t7xx/t7xx_hif_cldma.h     |  26 +-
 drivers/net/wwan/t7xx/t7xx_mhccif.h        |   1 +
 drivers/net/wwan/t7xx/t7xx_modem_ops.c     |  92 ++-
 drivers/net/wwan/t7xx/t7xx_modem_ops.h     |   3 +
 drivers/net/wwan/t7xx/t7xx_pci.c           |  65 +-
 drivers/net/wwan/t7xx/t7xx_pci.h           |   3 +
 drivers/net/wwan/t7xx/t7xx_pci_rescan.c    | 117 ++++
 drivers/net/wwan/t7xx/t7xx_pci_rescan.h    |  29 +
 drivers/net/wwan/t7xx/t7xx_port.h          |  12 +-
 drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c |   8 +-
 drivers/net/wwan/t7xx/t7xx_port_devlink.c  | 705 +++++++++++++++++++++
 drivers/net/wwan/t7xx/t7xx_port_devlink.h  |  85 +++
 drivers/net/wwan/t7xx/t7xx_port_proxy.c    | 132 +++-
 drivers/net/wwan/t7xx/t7xx_port_proxy.h    |  12 +-
 drivers/net/wwan/t7xx/t7xx_port_wwan.c     |   9 +-
 drivers/net/wwan/t7xx/t7xx_reg.h           |  31 +-
 drivers/net/wwan/t7xx/t7xx_state_monitor.c | 158 ++++-
 drivers/net/wwan/t7xx/t7xx_state_monitor.h |   4 +
 drivers/net/wwan/t7xx/t7xx_uevent.c        |  41 ++
 drivers/net/wwan/t7xx/t7xx_uevent.h        |  39 ++
 25 files changed, 1706 insertions(+), 73 deletions(-)
 create mode 100644 Documentation/networking/devlink/t7xx.rst
 create mode 100644 drivers/net/wwan/t7xx/t7xx_pci_rescan.c
 create mode 100644 drivers/net/wwan/t7xx/t7xx_pci_rescan.h
 create mode 100644 drivers/net/wwan/t7xx/t7xx_port_devlink.c
 create mode 100644 drivers/net/wwan/t7xx/t7xx_port_devlink.h
 create mode 100644 drivers/net/wwan/t7xx/t7xx_uevent.c
 create mode 100644 drivers/net/wwan/t7xx/t7xx_uevent.h

--
2.34.1


^ permalink raw reply

* Re: [PATCH net 0/4][pull request] Intel Wired LAN Driver Updates 2022-08-12 (iavf)
From: patchwork-bot+netdevbpf @ 2022-08-16  4:00 UTC (permalink / raw)
  To: Tony Nguyen; +Cc: davem, kuba, pabeni, edumazet, netdev
In-Reply-To: <20220812172309.853230-1-anthony.l.nguyen@intel.com>

Hello:

This series was applied to netdev/net.git (master)
by Tony Nguyen <anthony.l.nguyen@intel.com>:

On Fri, 12 Aug 2022 10:23:05 -0700 you wrote:
> This series contains updates to iavf driver only.
> 
> Przemyslaw frees memory for admin queues in initialization error paths,
> prevents freeing of vf_res which is causing null pointer dereference,
> and adjusts calls in error path of reset to avoid iavf_close() which
> could cause deadlock.
> 
> [...]

Here is the summary with links:
  - [net,1/4] iavf: Fix adminq error handling
    https://git.kernel.org/netdev/net/c/419831617ed3
  - [net,2/4] iavf: Fix NULL pointer dereference in iavf_get_link_ksettings
    https://git.kernel.org/netdev/net/c/541a1af451b0
  - [net,3/4] iavf: Fix reset error handling
    https://git.kernel.org/netdev/net/c/31071173771e
  - [net,4/4] iavf: Fix deadlock in initialization
    https://git.kernel.org/netdev/net/c/cbe9e5112630

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox