netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module
@ 2025-03-09 13:42 Taehee Yoo
  2025-03-09 13:42 ` [PATCH v3 net 1/8] eth: bnxt: fix truesize for mb-xdp-pass case Taehee Yoo
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Taehee Yoo @ 2025-03-09 13:42 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest
  Cc: almasrymina, asml.silence, willemb, kaiyuanz, skhawaja, sdf,
	gospo, somnath.kotur, dw, amritha.nambiar, xuanzhuo, ap420073

The first fixes setting incorrect skb->truesize.
When xdp-mb prog returns XDP_PASS, skb is allocated and initialized.
Currently, The truesize is calculated as BNXT_RX_PAGE_SIZE *
sinfo->nr_frags, but sinfo->nr_frags is flushed by napi_build_skb().
So, it stores sinfo before calling napi_build_skb() and then use it
for calculate truesize.

The second fixes kernel panic in the bnxt_queue_mem_alloc().
The bnxt_queue_mem_alloc() accesses rx ring descriptor.
rx ring descriptors are allocated when the interface is up and it's
freed when the interface is down.
So, if bnxt_queue_mem_alloc() is called when the interface is down,
kernel panic occurs.
This patch makes the bnxt_queue_mem_alloc() return -ENETDOWN if rx ring
descriptors are not allocated.

The third patch fixes kernel panic in the bnxt_queue_{start | stop}().
When a queue is restarted bnxt_queue_{start | stop}() are called.
These functions set MRU to 0 to stop packet flow and then to set up the
remaining things.
MRU variable is a member of vnic_info[] the first vnic_info is for
default and the second is for ntuple.
The first vnic_info is always allocated when interface is up, but the
second is allocated only when ntuple is enabled.
(ethtool -K eth0 ntuple <on | off>).
Currently, the bnxt_queue_{start | stop}() access
vnic_info[BNXT_VNIC_NTUPLE] regardless of whether ntuple is enabled or
not.
So kernel panic occurs.
This patch make the bnxt_queue_{start | stop}() use bp->nr_vnics instead
of BNXT_VNIC_NTUPLE.

The fourth patch fixes a warning due to checksum state.
The bnxt_rx_pkt() checks whether skb->ip_summed is not CHECKSUM_NONE
before updating ip_summed. if ip_summed is not CHECKSUM_NONE, it WARNS
about it. However, the bnxt_xdp_build_skb() is called in XDP-MB-PASS
path and it updates ip_summed earlier than bnxt_rx_pkt().
So, in the XDP-MB-PASS path, the bnxt_rx_pkt() always warns about
checksum.
Updating ip_summed at the bnxt_xdp_build_skb() is unnecessary and
duplicate, so it is removed.

The fifth patch fixes a kernel panic in the
bnxt_get_queue_stats{rx | tx}().
The bnxt_get_queue_stats{rx | tx}() callback functions are called when
a queue is resetting.
These internally access rx and tx rings without null check, but rings
are allocated and initialized when the interface is up.
So, these functions are called when the interface is down, it
occurs a kernel panic.

The sixth patch fixes memory leak in queue reset logic.
When a queue is resetting, tpa_info is allocated for the new queue and
tpa_info for an old queue is not used anymore.
So it should be freed, but not.

The seventh patch makes net_devmem_unbind_dmabuf() ignore -ENETDOWN.
When devmem socket is closed, net_devmem_unbind_dmabuf() is called to
unbind/release resources.
If interface is down, the driver returns -ENETDOWN.
The -ENETDOWN return value is not an actual error, because the interface
will release resources when the interface is down.
So, net_devmem_unbind_dmabuf() needs to ignore -ENETDOWN.

The last patch adds XDP testcases to
tools/testing/selftests/drivers/net/ping.py.

v3:
 - Copy nr_frags instead of full copy. (1/8)
 - Add Review tags from Somnath. (3/8)
 - Add new patch for fixing kernel panic in the
   bnxt_get_queue_stats{rx | tx}(). (5/8)
 - Add new patch for fixing memory leak in queue reset. (6/8)

v2:
 - Do not use num_frags in the bnxt_xdp_build_skb(). (1/6)
 - Add Review tags from Somnath and Jakub. (2/6)
 - Add new patch for fixing checksum warning. (4/6)
 - Add new patch for fixing warning in net_devmem_unbind_dmabuf(). (5/6)
 - Add new XDP testcases to ping.py (6/6)

Taehee Yoo (8):
  eth: bnxt: fix truesize for mb-xdp-pass case
  eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc()
  eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue
    restart logic
  eth: bnxt: do not update checksum in bnxt_xdp_build_skb()
  eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx}
  eth: bnxt: fix memory leak in queue reset
  net: devmem: do not WARN conditionally after netdev_rx_queue_restart()
  selftests: drv-net: add xdp cases for ping.py

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  25 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |  13 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h |   3 +-
 net/core/devmem.c                             |   4 +-
 tools/testing/selftests/drivers/net/ping.py   | 200 ++++++++++++++++--
 .../testing/selftests/net/lib/xdp_dummy.bpf.c |   6 +
 6 files changed, 220 insertions(+), 31 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 net 1/8] eth: bnxt: fix truesize for mb-xdp-pass case
  2025-03-09 13:42 [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module Taehee Yoo
@ 2025-03-09 13:42 ` Taehee Yoo
  2025-03-09 13:42 ` [PATCH v3 net 2/8] eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc() Taehee Yoo
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Taehee Yoo @ 2025-03-09 13:42 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest
  Cc: almasrymina, asml.silence, willemb, kaiyuanz, skhawaja, sdf,
	gospo, somnath.kotur, dw, amritha.nambiar, xuanzhuo, ap420073

When mb-xdp is set and return is XDP_PASS, packet is converted from
xdp_buff to sk_buff with xdp_update_skb_shared_info() in
bnxt_xdp_build_skb().
bnxt_xdp_build_skb() passes incorrect truesize argument to
xdp_update_skb_shared_info().
The truesize is calculated as BNXT_RX_PAGE_SIZE * sinfo->nr_frags but
the skb_shared_info was wiped by napi_build_skb() before.
So it stores sinfo->nr_frags before bnxt_xdp_build_skb() and use it
instead of getting skb_shared_info from xdp_get_shared_info_from_buff().

Splat looks like:
 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 0 at net/core/skbuff.c:6072 skb_try_coalesce+0x504/0x590
 Modules linked in: xt_nat xt_tcpudp veth af_packet xt_conntrack nft_chain_nat xt_MASQUERADE nf_conntrack_netlink xfrm_user xt_addrtype nft_coms
 CPU: 2 UID: 0 PID: 0 Comm: swapper/2 Not tainted 6.14.0-rc2+ #3
 RIP: 0010:skb_try_coalesce+0x504/0x590
 Code: 4b fd ff ff 49 8b 34 24 40 80 e6 40 0f 84 3d fd ff ff 49 8b 74 24 48 40 f6 c6 01 0f 84 2e fd ff ff 48 8d 4e ff e9 25 fd ff ff <0f> 0b e99
 RSP: 0018:ffffb62c4120caa8 EFLAGS: 00010287
 RAX: 0000000000000003 RBX: ffffb62c4120cb14 RCX: 0000000000000ec0
 RDX: 0000000000001000 RSI: ffffa06e5d7dc000 RDI: 0000000000000003
 RBP: ffffa06e5d7ddec0 R08: ffffa06e6120a800 R09: ffffa06e7a119900
 R10: 0000000000002310 R11: ffffa06e5d7dcec0 R12: ffffe4360575f740
 R13: ffffe43600000000 R14: 0000000000000002 R15: 0000000000000002
 FS:  0000000000000000(0000) GS:ffffa0755f700000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f147b76b0f8 CR3: 00000001615d4000 CR4: 00000000007506f0
 PKRU: 55555554
 Call Trace:
  <IRQ>
  ? __warn+0x84/0x130
  ? skb_try_coalesce+0x504/0x590
  ? report_bug+0x18a/0x1a0
  ? handle_bug+0x53/0x90
  ? exc_invalid_op+0x14/0x70
  ? asm_exc_invalid_op+0x16/0x20
  ? skb_try_coalesce+0x504/0x590
  inet_frag_reasm_finish+0x11f/0x2e0
  ip_defrag+0x37a/0x900
  ip_local_deliver+0x51/0x120
  ip_sublist_rcv_finish+0x64/0x70
  ip_sublist_rcv+0x179/0x210
  ip_list_rcv+0xf9/0x130

How to reproduce:
<Node A>
ip link set $interface1 xdp obj xdp_pass.o
ip link set $interface1 mtu 9000 up
ip a a 10.0.0.1/24 dev $interface1
<Node B>
ip link set $interfac2 mtu 9000 up
ip a a 10.0.0.2/24 dev $interface2
ping 10.0.0.1 -s 65000

Following ping.py patch adds xdp-mb-pass case. so ping.py is going to be
able to reproduce this issue.

Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v3:
 - Copy nr_frags instead of full copy.

v2:
 - Do not use num_frags in the bnxt_xdp_build_skb().

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 8 ++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 7b8b5b39c7bb..6b5fe4ee7a99 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2038,6 +2038,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 	struct rx_cmp_ext *rxcmp1;
 	u32 tmp_raw_cons = *raw_cons;
 	u16 cons, prod, cp_cons = RING_CMP(tmp_raw_cons);
+	struct skb_shared_info *sinfo;
 	struct bnxt_sw_rx_bd *rx_buf;
 	unsigned int len;
 	u8 *data_ptr, agg_bufs, cmp_type;
@@ -2164,6 +2165,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 							     false);
 			if (!frag_len)
 				goto oom_next_rx;
+
 		}
 		xdp_active = true;
 	}
@@ -2173,6 +2175,12 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 			rc = 1;
 			goto next_rx;
 		}
+		if (xdp_buff_has_frags(&xdp)) {
+			sinfo = xdp_get_shared_info_from_buff(&xdp);
+			agg_bufs = sinfo->nr_frags;
+		} else {
+			agg_bufs = 0;
+		}
 	}
 
 	if (len <= bp->rx_copybreak) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index e6c64e4bd66c..e9b49cb5b735 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -476,7 +476,7 @@ bnxt_xdp_build_skb(struct bnxt *bp, struct sk_buff *skb, u8 num_frags,
 	}
 	xdp_update_skb_shared_info(skb, num_frags,
 				   sinfo->xdp_frags_size,
-				   BNXT_RX_PAGE_SIZE * sinfo->nr_frags,
+				   BNXT_RX_PAGE_SIZE * num_frags,
 				   xdp_buff_is_frag_pfmemalloc(xdp));
 	return skb;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 net 2/8] eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc()
  2025-03-09 13:42 [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module Taehee Yoo
  2025-03-09 13:42 ` [PATCH v3 net 1/8] eth: bnxt: fix truesize for mb-xdp-pass case Taehee Yoo
@ 2025-03-09 13:42 ` Taehee Yoo
  2025-03-09 21:44   ` Mina Almasry
  2025-03-09 13:42 ` [PATCH v3 net 3/8] eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic Taehee Yoo
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Taehee Yoo @ 2025-03-09 13:42 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest
  Cc: almasrymina, asml.silence, willemb, kaiyuanz, skhawaja, sdf,
	gospo, somnath.kotur, dw, amritha.nambiar, xuanzhuo, ap420073

The bnxt_queue_mem_alloc() is called to allocate new queue memory when
a queue is restarted.
It internally accesses rx buffer descriptor corresponding to the index.
The rx buffer descriptor is allocated and set when the interface is up
and it's freed when the interface is down.
So, if queue is restarted if interface is down, kernel panic occurs.

Splat looks like:
 BUG: unable to handle page fault for address: 000000000000b240
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 3 UID: 0 PID: 1563 Comm: ncdevmem2 Not tainted 6.14.0-rc2+ #9 844ddba6e7c459cafd0bf4db9a3198e
 Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
 RIP: 0010:bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en]
 Code: 41 54 4d 89 c4 4d 69 c0 c0 05 00 00 55 48 89 f5 53 48 89 fb 4c 8d b5 40 05 00 00 48 83 ec 15
 RSP: 0018:ffff9dcc83fef9e8 EFLAGS: 00010202
 RAX: ffffffffc0457720 RBX: ffff934ed8d40000 RCX: 0000000000000000
 RDX: 000000000000001f RSI: ffff934ea508f800 RDI: ffff934ea508f808
 RBP: ffff934ea508f800 R08: 000000000000b240 R09: ffff934e84f4b000
 R10: ffff9dcc83fefa30 R11: ffff934e84f4b000 R12: 000000000000001f
 R13: ffff934ed8d40ac0 R14: ffff934ea508fd40 R15: ffff934e84f4b000
 FS:  00007fa73888c740(0000) GS:ffff93559f780000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000000000000b240 CR3: 0000000145a2e000 CR4: 00000000007506f0
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? __die+0x20/0x70
  ? page_fault_oops+0x15a/0x460
  ? exc_page_fault+0x6e/0x180
  ? asm_exc_page_fault+0x22/0x30
  ? __pfx_bnxt_queue_mem_alloc+0x10/0x10 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
  ? bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
  netdev_rx_queue_restart+0xc5/0x240
  net_devmem_bind_dmabuf_to_queue+0xf8/0x200
  netdev_nl_bind_rx_doit+0x3a7/0x450
  genl_family_rcv_msg_doit+0xd9/0x130
  genl_rcv_msg+0x184/0x2b0
  ? __pfx_netdev_nl_bind_rx_doit+0x10/0x10
  ? __pfx_genl_rcv_msg+0x10/0x10
  netlink_rcv_skb+0x54/0x100
  genl_rcv+0x24/0x40
...

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 2d694c27d32e ("bnxt_en: implement netdev_queue_mgmt_ops")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v3:
 - No changes.

v2:
 - Add Review tags from Somnath and Jakub.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 6b5fe4ee7a99..acb9500ef930 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -15447,6 +15447,9 @@ static int bnxt_queue_mem_alloc(struct net_device *dev, void *qmem, int idx)
 	struct bnxt_ring_struct *ring;
 	int rc;
 
+	if (!bp->rx_ring)
+		return -ENETDOWN;
+
 	rxr = &bp->rx_ring[idx];
 	clone = qmem;
 	memcpy(clone, rxr, sizeof(*rxr));
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 net 3/8] eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic
  2025-03-09 13:42 [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module Taehee Yoo
  2025-03-09 13:42 ` [PATCH v3 net 1/8] eth: bnxt: fix truesize for mb-xdp-pass case Taehee Yoo
  2025-03-09 13:42 ` [PATCH v3 net 2/8] eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc() Taehee Yoo
@ 2025-03-09 13:42 ` Taehee Yoo
  2025-03-09 13:42 ` [PATCH v3 net 4/8] eth: bnxt: do not update checksum in bnxt_xdp_build_skb() Taehee Yoo
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Taehee Yoo @ 2025-03-09 13:42 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest
  Cc: almasrymina, asml.silence, willemb, kaiyuanz, skhawaja, sdf,
	gospo, somnath.kotur, dw, amritha.nambiar, xuanzhuo, ap420073

When a queue is restarted, it sets MRU to 0 for stopping packet flow.
MRU variable is a member of vnic_info[], the first vnic_info is default
and the second is ntuple.
Only when ntuple is enabled(ethtool -K eth0 ntuple on), vnic_info for
ntuple is allocated in init logic.
The bp->nr_vnics indicates how many vnic_info are allocated.
However bnxt_queue_{start | stop}() accesses vnic_info[BNXT_VNIC_NTUPLE]
regardless of ntuple state.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Fixes: b9d2956e869c ("bnxt_en: stop packet flow during bnxt_queue_stop/start")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v3:
 - Add Review tags from Somnath.

v2:
 - No changes.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index acb9500ef930..218109ee1c23 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -15643,7 +15643,7 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
 	cpr = &rxr->bnapi->cp_ring;
 	cpr->sw_stats->rx.rx_resets++;
 
-	for (i = 0; i <= BNXT_VNIC_NTUPLE; i++) {
+	for (i = 0; i <= bp->nr_vnics; i++) {
 		vnic = &bp->vnic_info[i];
 
 		rc = bnxt_hwrm_vnic_set_rss_p5(bp, vnic, true);
@@ -15671,7 +15671,7 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
 	struct bnxt_vnic_info *vnic;
 	int i;
 
-	for (i = 0; i <= BNXT_VNIC_NTUPLE; i++) {
+	for (i = 0; i <= bp->nr_vnics; i++) {
 		vnic = &bp->vnic_info[i];
 		vnic->mru = 0;
 		bnxt_hwrm_vnic_update(bp, vnic,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 net 4/8] eth: bnxt: do not update checksum in bnxt_xdp_build_skb()
  2025-03-09 13:42 [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module Taehee Yoo
                   ` (2 preceding siblings ...)
  2025-03-09 13:42 ` [PATCH v3 net 3/8] eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic Taehee Yoo
@ 2025-03-09 13:42 ` Taehee Yoo
  2025-03-10  2:58   ` Somnath Kotur
  2025-03-09 13:42 ` [PATCH v3 net 5/8] eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx} Taehee Yoo
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Taehee Yoo @ 2025-03-09 13:42 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest
  Cc: almasrymina, asml.silence, willemb, kaiyuanz, skhawaja, sdf,
	gospo, somnath.kotur, dw, amritha.nambiar, xuanzhuo, ap420073

The bnxt_rx_pkt() updates ip_summed value at the end if checksum offload
is enabled.
When the XDP-MB program is attached and it returns XDP_PASS, the
bnxt_xdp_build_skb() is called to update skb_shared_info.
The main purpose of bnxt_xdp_build_skb() is to update skb_shared_info,
but it updates ip_summed value too if checksum offload is enabled.
This is actually duplicate work.

When the bnxt_rx_pkt() updates ip_summed value, it checks if ip_summed
is CHECKSUM_NONE or not.
It means that ip_summed should be CHECKSUM_NONE at this moment.
But ip_summed may already be updated to CHECKSUM_UNNECESSARY in the
XDP-MB-PASS path.
So the by skb_checksum_none_assert() WARNS about it.

This is duplicate work and updating ip_summed in the
bnxt_xdp_build_skb() is not needed.

Splat looks like:
WARNING: CPU: 3 PID: 5782 at ./include/linux/skbuff.h:5155 bnxt_rx_pkt+0x479b/0x7610 [bnxt_en]
Modules linked in: bnxt_re bnxt_en rdma_ucm rdma_cm iw_cm ib_cm ib_uverbs veth xt_nat xt_tcpudp xt_conntrack nft_chain_nat xt_MASQUERADE nf_]
CPU: 3 UID: 0 PID: 5782 Comm: socat Tainted: G        W          6.14.0-rc4+ #27
Tainted: [W]=WARN
Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
RIP: 0010:bnxt_rx_pkt+0x479b/0x7610 [bnxt_en]
Code: 54 24 0c 4c 89 f1 4c 89 ff c1 ea 1f ff d3 0f 1f 00 49 89 c6 48 85 c0 0f 84 4c e5 ff ff 48 89 c7 e8 ca 3d a0 c8 e9 8f f4 ff ff <0f> 0b f
RSP: 0018:ffff88881ba09928 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00000000c7590303 RCX: 0000000000000000
RDX: 1ffff1104e7d1610 RSI: 0000000000000001 RDI: ffff8881c91300b8
RBP: ffff88881ba09b28 R08: ffff888273e8b0d0 R09: ffff888273e8b070
R10: ffff888273e8b010 R11: ffff888278b0f000 R12: ffff888273e8b080
R13: ffff8881c9130e00 R14: ffff8881505d3800 R15: ffff888273e8b000
FS:  00007f5a2e7be080(0000) GS:ffff88881ba00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff2e708ff8 CR3: 000000013e3b0000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
 <IRQ>
 ? __warn+0xcd/0x2f0
 ? bnxt_rx_pkt+0x479b/0x7610
 ? report_bug+0x326/0x3c0
 ? handle_bug+0x53/0xa0
 ? exc_invalid_op+0x14/0x50
 ? asm_exc_invalid_op+0x16/0x20
 ? bnxt_rx_pkt+0x479b/0x7610
 ? bnxt_rx_pkt+0x3e41/0x7610
 ? __pfx_bnxt_rx_pkt+0x10/0x10
 ? napi_complete_done+0x2cf/0x7d0
 __bnxt_poll_work+0x4e8/0x1220
 ? __pfx___bnxt_poll_work+0x10/0x10
 ? __pfx_mark_lock.part.0+0x10/0x10
 bnxt_poll_p5+0x36a/0xfa0
 ? __pfx_bnxt_poll_p5+0x10/0x10
 __napi_poll.constprop.0+0xa0/0x440
 net_rx_action+0x899/0xd00
...

Following ping.py patch adds xdp-mb-pass case. so ping.py is going
to be able to reproduce this issue.

Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v3:
 - No changes.

v2:
 - Patch added.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  3 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 11 ++---------
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h |  3 +--
 3 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 218109ee1c23..9afb2c5072b1 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2218,7 +2218,8 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 			if (!skb)
 				goto oom_next_rx;
 		} else {
-			skb = bnxt_xdp_build_skb(bp, skb, agg_bufs, rxr->page_pool, &xdp, rxcmp1);
+			skb = bnxt_xdp_build_skb(bp, skb, agg_bufs,
+						 rxr->page_pool, &xdp);
 			if (!skb) {
 				/* we should be able to free the old skb here */
 				bnxt_xdp_buff_frags_free(rxr, &xdp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index e9b49cb5b735..299822cacca4 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -460,20 +460,13 @@ int bnxt_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 
 struct sk_buff *
 bnxt_xdp_build_skb(struct bnxt *bp, struct sk_buff *skb, u8 num_frags,
-		   struct page_pool *pool, struct xdp_buff *xdp,
-		   struct rx_cmp_ext *rxcmp1)
+		   struct page_pool *pool, struct xdp_buff *xdp)
 {
 	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
 
 	if (!skb)
 		return NULL;
-	skb_checksum_none_assert(skb);
-	if (RX_CMP_L4_CS_OK(rxcmp1)) {
-		if (bp->dev->features & NETIF_F_RXCSUM) {
-			skb->ip_summed = CHECKSUM_UNNECESSARY;
-			skb->csum_level = RX_CMP_ENCAP(rxcmp1);
-		}
-	}
+
 	xdp_update_skb_shared_info(skb, num_frags,
 				   sinfo->xdp_frags_size,
 				   BNXT_RX_PAGE_SIZE * num_frags,
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
index 0122782400b8..220285e190fc 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
@@ -33,6 +33,5 @@ void bnxt_xdp_buff_frags_free(struct bnxt_rx_ring_info *rxr,
 			      struct xdp_buff *xdp);
 struct sk_buff *bnxt_xdp_build_skb(struct bnxt *bp, struct sk_buff *skb,
 				   u8 num_frags, struct page_pool *pool,
-				   struct xdp_buff *xdp,
-				   struct rx_cmp_ext *rxcmp1);
+				   struct xdp_buff *xdp);
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 net 5/8] eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx}
  2025-03-09 13:42 [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module Taehee Yoo
                   ` (3 preceding siblings ...)
  2025-03-09 13:42 ` [PATCH v3 net 4/8] eth: bnxt: do not update checksum in bnxt_xdp_build_skb() Taehee Yoo
@ 2025-03-09 13:42 ` Taehee Yoo
  2025-03-10  2:59   ` Somnath Kotur
  2025-03-09 13:42 ` [PATCH v3 net 6/8] eth: bnxt: fix memory leak in queue reset Taehee Yoo
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Taehee Yoo @ 2025-03-09 13:42 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest
  Cc: almasrymina, asml.silence, willemb, kaiyuanz, skhawaja, sdf,
	gospo, somnath.kotur, dw, amritha.nambiar, xuanzhuo, ap420073

When qstats-get operation is executed, callbacks of netdev_stats_ops
are called. The bnxt_get_queue_stats{rx | tx} collect per-queue stats
from sw_stats in the rings.
But {rx | tx | cp}_ring are allocated when the interface is up.
So, these rings are not allocated when the interface is down.

The qstats-get is allowed even if the interface is down. However,
the bnxt_get_queue_stats{rx | tx}() accesses cp_ring and tx_ring
without null check.
So, it needs to avoid accessing rings if the interface is down.

Reproducer:
 ip link set $interface down
 ./cli.py --spec netdev.yaml --dump qstats-get
OR
 ip link set $interface down
 python ./stats.py

Splat looks like:
 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 1680fa067 P4D 1680fa067 PUD 16be3b067 PMD 0
 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 0 UID: 0 PID: 1495 Comm: python3 Not tainted 6.14.0-rc4+ #32 5cd0f999d5a15c574ac72b3e4b907341
 Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
 RIP: 0010:bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en]
 Code: c6 87 b5 18 00 00 02 eb a2 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 01
 RSP: 0018:ffffabef43cdb7e0 EFLAGS: 00010282
 RAX: 0000000000000000 RBX: ffffffffc04c8710 RCX: 0000000000000000
 RDX: ffffabef43cdb858 RSI: 0000000000000000 RDI: ffff8d504e850000
 RBP: ffff8d506c9f9c00 R08: 0000000000000004 R09: ffff8d506bcd901c
 R10: 0000000000000015 R11: ffff8d506bcd9000 R12: 0000000000000000
 R13: ffffabef43cdb8c0 R14: ffff8d504e850000 R15: 0000000000000000
 FS:  00007f2c5462b080(0000) GS:ffff8d575f600000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000167fd0000 CR4: 00000000007506f0
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? __die+0x20/0x70
  ? page_fault_oops+0x15a/0x460
  ? sched_balance_find_src_group+0x58d/0xd10
  ? exc_page_fault+0x6e/0x180
  ? asm_exc_page_fault+0x22/0x30
  ? bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en cdd546fd48563c280cfd30e9647efa420db07bf1]
  netdev_nl_stats_by_netdev+0x2b1/0x4e0
  ? xas_load+0x9/0xb0
  ? xas_find+0x183/0x1d0
  ? xa_find+0x8b/0xe0
  netdev_nl_qstats_get_dumpit+0xbf/0x1e0
  genl_dumpit+0x31/0x90
  netlink_dump+0x1a8/0x360

Fixes: af7b3b4adda5 ("eth: bnxt: support per-queue statistics")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v3:
 - Patch added.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 9afb2c5072b1..bee12d9b57ab 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -15384,6 +15384,9 @@ static void bnxt_get_queue_stats_rx(struct net_device *dev, int i,
 	struct bnxt_cp_ring_info *cpr;
 	u64 *sw;
 
+	if (!bp->bnapi)
+		return;
+
 	cpr = &bp->bnapi[i]->cp_ring;
 	sw = cpr->stats.sw_stats;
 
@@ -15407,6 +15410,9 @@ static void bnxt_get_queue_stats_tx(struct net_device *dev, int i,
 	struct bnxt_napi *bnapi;
 	u64 *sw;
 
+	if (!bp->tx_ring)
+		return;
+
 	bnapi = bp->tx_ring[bp->tx_ring_map[i]].bnapi;
 	sw = bnapi->cp_ring.stats.sw_stats;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 net 6/8] eth: bnxt: fix memory leak in queue reset
  2025-03-09 13:42 [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module Taehee Yoo
                   ` (4 preceding siblings ...)
  2025-03-09 13:42 ` [PATCH v3 net 5/8] eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx} Taehee Yoo
@ 2025-03-09 13:42 ` Taehee Yoo
  2025-03-09 13:42 ` [PATCH v3 net 7/8] net: devmem: do not WARN conditionally after netdev_rx_queue_restart() Taehee Yoo
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Taehee Yoo @ 2025-03-09 13:42 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest
  Cc: almasrymina, asml.silence, willemb, kaiyuanz, skhawaja, sdf,
	gospo, somnath.kotur, dw, amritha.nambiar, xuanzhuo, ap420073

When the queue is reset, the bnxt_alloc_one_tpa_info() is called to
allocate tpa_info for the new queue.
And then the old queue's tpa_info should be removed by the
bnxt_free_one_tpa_info(), but it is not called.
So memory leak occurs.
It adds the bnxt_free_one_tpa_info() in the bnxt_queue_mem_free().

unreferenced object 0xffff888293cc0000 (size 16384):
  comm "ncdevmem", pid 2076, jiffies 4296604081
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 40 75 78 93 82 88 ff ff  ........@ux.....
    40 75 78 93 02 00 00 00 00 00 00 00 00 00 00 00  @ux.............
  backtrace (crc 5d7d4798):
    ___kmalloc_large_node+0x10d/0x1b0
    __kmalloc_large_node_noprof+0x17/0x60
    __kmalloc_noprof+0x3f6/0x520
    bnxt_alloc_one_tpa_info+0x5f/0x300 [bnxt_en]
    bnxt_queue_mem_alloc+0x8e8/0x14f0 [bnxt_en]
    netdev_rx_queue_restart+0x233/0x620
    net_devmem_bind_dmabuf_to_queue+0x2a3/0x600
    netdev_nl_bind_rx_doit+0xc00/0x10a0
    genl_family_rcv_msg_doit+0x1d4/0x2b0
    genl_rcv_msg+0x3fb/0x6c0
    netlink_rcv_skb+0x12c/0x360
    genl_rcv+0x24/0x40
    netlink_unicast+0x447/0x710
    netlink_sendmsg+0x712/0xbc0
    __sys_sendto+0x3fd/0x4d0
    __x64_sys_sendto+0xdc/0x1b0

Fixes: 2d694c27d32e ("bnxt_en: implement netdev_queue_mgmt_ops")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v3:
 - Patch added.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index bee12d9b57ab..55f553debd3b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -15539,6 +15539,7 @@ static void bnxt_queue_mem_free(struct net_device *dev, void *qmem)
 	struct bnxt_ring_struct *ring;
 
 	bnxt_free_one_rx_ring_skbs(bp, rxr);
+	bnxt_free_one_tpa_info(bp, rxr);
 
 	xdp_rxq_info_unreg(&rxr->xdp_rxq);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 net 7/8] net: devmem: do not WARN conditionally after netdev_rx_queue_restart()
  2025-03-09 13:42 [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module Taehee Yoo
                   ` (5 preceding siblings ...)
  2025-03-09 13:42 ` [PATCH v3 net 6/8] eth: bnxt: fix memory leak in queue reset Taehee Yoo
@ 2025-03-09 13:42 ` Taehee Yoo
  2025-03-09 21:45   ` Mina Almasry
  2025-03-09 13:42 ` [PATCH v3 net 8/8] selftests: drv-net: add xdp cases for ping.py Taehee Yoo
  2025-03-10 20:40 ` [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module patchwork-bot+netdevbpf
  8 siblings, 1 reply; 16+ messages in thread
From: Taehee Yoo @ 2025-03-09 13:42 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest
  Cc: almasrymina, asml.silence, willemb, kaiyuanz, skhawaja, sdf,
	gospo, somnath.kotur, dw, amritha.nambiar, xuanzhuo, ap420073

When devmem socket is closed, netdev_rx_queue_restart() is called to
reset queue by the net_devmem_unbind_dmabuf(). But callback may return
-ENETDOWN if the interface is down because queues are already freed
when the interface is down so queue reset is not needed.
So, it should not warn if the return value is -ENETDOWN.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v3:
 - No changes.

v2:
 - Patch added.

 net/core/devmem.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/devmem.c b/net/core/devmem.c
index 3bba3f018df0..0e5a2c672efd 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -109,6 +109,7 @@ void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding)
 	struct netdev_rx_queue *rxq;
 	unsigned long xa_idx;
 	unsigned int rxq_idx;
+	int err;
 
 	if (binding->list.next)
 		list_del(&binding->list);
@@ -120,7 +121,8 @@ void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding)
 
 		rxq_idx = get_netdev_rx_queue_index(rxq);
 
-		WARN_ON(netdev_rx_queue_restart(binding->dev, rxq_idx));
+		err = netdev_rx_queue_restart(binding->dev, rxq_idx);
+		WARN_ON(err && err != -ENETDOWN);
 	}
 
 	xa_erase(&net_devmem_dmabuf_bindings, binding->id);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 net 8/8] selftests: drv-net: add xdp cases for ping.py
  2025-03-09 13:42 [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module Taehee Yoo
                   ` (6 preceding siblings ...)
  2025-03-09 13:42 ` [PATCH v3 net 7/8] net: devmem: do not WARN conditionally after netdev_rx_queue_restart() Taehee Yoo
@ 2025-03-09 13:42 ` Taehee Yoo
  2025-03-10 20:40 ` [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module patchwork-bot+netdevbpf
  8 siblings, 0 replies; 16+ messages in thread
From: Taehee Yoo @ 2025-03-09 13:42 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest
  Cc: almasrymina, asml.silence, willemb, kaiyuanz, skhawaja, sdf,
	gospo, somnath.kotur, dw, amritha.nambiar, xuanzhuo, ap420073

ping.py has 3 cases, test_v4, test_v6 and test_tcp.
But these cases are not executed on the XDP environment.
So, it adds XDP environment, existing tests(test_v4, test_v6, and
test_tcp) are executed too on the below XDP environment.
So, it adds XDP cases.
1. xdp-generic + single-buffer
2. xdp-generic + multi-buffer
3. xdp-native + single-buffer
4. xdp-native + multi-buffer
5. xdp-offload

It also makes test_{v4 | v6 | tcp} sending large size packets. this may
help to check whether multi-buffer is working or not.

Note that the physical interface may be down and then up when xdp is
attached or detached.
This takes some period to activate traffic. So sleep(10) is
added if the test interface is the physical interface.
netdevsim and veth type interfaces skip sleep.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v3:
 - No changes.

v2:
 - Patch added.

 tools/testing/selftests/drivers/net/ping.py   | 200 ++++++++++++++++--
 .../testing/selftests/net/lib/xdp_dummy.bpf.c |   6 +
 2 files changed, 191 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/ping.py b/tools/testing/selftests/drivers/net/ping.py
index eb83e7b48797..93f4b411b378 100755
--- a/tools/testing/selftests/drivers/net/ping.py
+++ b/tools/testing/selftests/drivers/net/ping.py
@@ -1,49 +1,219 @@
 #!/usr/bin/env python3
 # SPDX-License-Identifier: GPL-2.0
 
+import os
+import random, string, time
 from lib.py import ksft_run, ksft_exit
-from lib.py import ksft_eq
-from lib.py import NetDrvEpEnv
+from lib.py import ksft_eq, KsftSkipEx, KsftFailEx
+from lib.py import EthtoolFamily, NetDrvEpEnv
 from lib.py import bkg, cmd, wait_port_listen, rand_port
+from lib.py import ethtool, ip
 
+remote_ifname=""
+no_sleep=False
 
-def test_v4(cfg) -> None:
+def _test_v4(cfg) -> None:
     cfg.require_v4()
 
     cmd(f"ping -c 1 -W0.5 {cfg.remote_v4}")
     cmd(f"ping -c 1 -W0.5 {cfg.v4}", host=cfg.remote)
+    cmd(f"ping -s 65000 -c 1 -W0.5 {cfg.remote_v4}")
+    cmd(f"ping -s 65000 -c 1 -W0.5 {cfg.v4}", host=cfg.remote)
 
-
-def test_v6(cfg) -> None:
+def _test_v6(cfg) -> None:
     cfg.require_v6()
 
-    cmd(f"ping -c 1 -W0.5 {cfg.remote_v6}")
-    cmd(f"ping -c 1 -W0.5 {cfg.v6}", host=cfg.remote)
-
+    cmd(f"ping -c 1 -W5 {cfg.remote_v6}")
+    cmd(f"ping -c 1 -W5 {cfg.v6}", host=cfg.remote)
+    cmd(f"ping -s 65000 -c 1 -W0.5 {cfg.remote_v6}")
+    cmd(f"ping -s 65000 -c 1 -W0.5 {cfg.v6}", host=cfg.remote)
 
-def test_tcp(cfg) -> None:
+def _test_tcp(cfg) -> None:
     cfg.require_cmd("socat", remote=True)
 
     port = rand_port()
     listen_cmd = f"socat -{cfg.addr_ipver} -t 2 -u TCP-LISTEN:{port},reuseport STDOUT"
 
+    test_string = ''.join(random.choice(string.ascii_lowercase) for _ in range(65536))
     with bkg(listen_cmd, exit_wait=True) as nc:
         wait_port_listen(port)
 
-        cmd(f"echo ping | socat -t 2 -u STDIN TCP:{cfg.baddr}:{port}",
+        cmd(f"echo {test_string} | socat -t 2 -u STDIN TCP:{cfg.baddr}:{port}",
             shell=True, host=cfg.remote)
-    ksft_eq(nc.stdout.strip(), "ping")
+    ksft_eq(nc.stdout.strip(), test_string)
 
+    test_string = ''.join(random.choice(string.ascii_lowercase) for _ in range(65536))
     with bkg(listen_cmd, host=cfg.remote, exit_wait=True) as nc:
         wait_port_listen(port, host=cfg.remote)
 
-        cmd(f"echo ping | socat -t 2 -u STDIN TCP:{cfg.remote_baddr}:{port}", shell=True)
-    ksft_eq(nc.stdout.strip(), "ping")
-
+        cmd(f"echo {test_string} | socat -t 2 -u STDIN TCP:{cfg.remote_baddr}:{port}", shell=True)
+    ksft_eq(nc.stdout.strip(), test_string)
+
+def _set_offload_checksum(cfg, netnl, on) -> None:
+    try:
+        ethtool(f" -K {cfg.ifname} rx {on} tx {on} ")
+    except:
+        return
+
+def _set_xdp_generic_sb_on(cfg) -> None:
+    test_dir = os.path.dirname(os.path.realpath(__file__))
+    prog = test_dir + "/../../net/lib/xdp_dummy.bpf.o"
+    cmd(f"ip link set dev {remote_ifname} mtu 1500", shell=True, host=cfg.remote)
+    cmd(f"ip link set dev {cfg.ifname} mtu 1500 xdpgeneric obj {prog} sec xdp", shell=True)
+
+    if no_sleep != True:
+        time.sleep(10)
+
+def _set_xdp_generic_mb_on(cfg) -> None:
+    test_dir = os.path.dirname(os.path.realpath(__file__))
+    prog = test_dir + "/../../net/lib/xdp_dummy.bpf.o"
+    cmd(f"ip link set dev {remote_ifname} mtu 9000", shell=True, host=cfg.remote)
+    ip("link set dev %s mtu 9000 xdpgeneric obj %s sec xdp.frags" % (cfg.ifname, prog))
+
+    if no_sleep != True:
+        time.sleep(10)
+
+def _set_xdp_native_sb_on(cfg) -> None:
+    test_dir = os.path.dirname(os.path.realpath(__file__))
+    prog = test_dir + "/../../net/lib/xdp_dummy.bpf.o"
+    cmd(f"ip link set dev {remote_ifname} mtu 1500", shell=True, host=cfg.remote)
+    cmd(f"ip -j link set dev {cfg.ifname} mtu 1500 xdp obj {prog} sec xdp", shell=True)
+    xdp_info = ip("-d link show %s" % (cfg.ifname), json=True)[0]
+    if xdp_info['xdp']['mode'] != 1:
+        """
+        If the interface doesn't support native-mode, it falls back to generic mode.
+        The mode value 1 is native and 2 is generic.
+        So it raises an exception if mode is not 1(native mode).
+        """
+        raise KsftSkipEx('device does not support native-XDP')
+
+    if no_sleep != True:
+        time.sleep(10)
+
+def _set_xdp_native_mb_on(cfg) -> None:
+    test_dir = os.path.dirname(os.path.realpath(__file__))
+    prog = test_dir + "/../../net/lib/xdp_dummy.bpf.o"
+    cmd(f"ip link set dev {remote_ifname} mtu 9000", shell=True, host=cfg.remote)
+    try:
+        cmd(f"ip link set dev {cfg.ifname} mtu 9000 xdp obj {prog} sec xdp.frags", shell=True)
+    except Exception as e:
+        cmd(f"ip link set dev {remote_ifname} mtu 1500", shell=True, host=cfg.remote)
+        raise KsftSkipEx('device does not support native-multi-buffer XDP')
+
+    if no_sleep != True:
+        time.sleep(10)
+
+def _set_xdp_offload_on(cfg) -> None:
+    test_dir = os.path.dirname(os.path.realpath(__file__))
+    prog = test_dir + "/../../net/lib/xdp_dummy.bpf.o"
+    cmd(f"ip link set dev {cfg.ifname} mtu 1500", shell=True)
+    try:
+        cmd(f"ip link set dev {cfg.ifname} xdpoffload obj {prog} sec xdp", shell=True)
+    except Exception as e:
+        raise KsftSkipEx('device does not support offloaded XDP')
+    cmd(f"ip link set dev {remote_ifname} mtu 1500", shell=True, host=cfg.remote)
+
+    if no_sleep != True:
+        time.sleep(10)
+
+def get_interface_info(cfg) -> None:
+    global remote_ifname
+    global no_sleep
+
+    remote_info = cmd(f"ip -4 -o addr show to {cfg.remote_v4} | awk '{{print $2}}'", shell=True, host=cfg.remote).stdout
+    remote_ifname = remote_info.rstrip('\n')
+    if remote_ifname == "":
+        raise KsftFailEx('Can not get remote interface')
+    local_info = ip("-d link show %s" % (cfg.ifname), json=True)[0]
+    if 'parentbus' in local_info and local_info['parentbus'] == "netdevsim":
+        no_sleep=True
+    if 'linkinfo' in local_info and local_info['linkinfo']['info_kind'] == "veth":
+        no_sleep=True
+
+def set_interface_init(cfg) -> None:
+    cmd(f"ip link set dev {cfg.ifname} mtu 1500", shell=True)
+    cmd(f"ip link set dev {cfg.ifname} xdp off ", shell=True)
+    cmd(f"ip link set dev {cfg.ifname} xdpgeneric off ", shell=True)
+    cmd(f"ip link set dev {cfg.ifname} xdpoffload off", shell=True)
+    cmd(f"ip link set dev {remote_ifname} mtu 1500", shell=True, host=cfg.remote)
+
+def test_default(cfg, netnl) -> None:
+    _set_offload_checksum(cfg, netnl, "off")
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+    _set_offload_checksum(cfg, netnl, "on")
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+
+def test_xdp_generic_sb(cfg, netnl) -> None:
+    _set_xdp_generic_sb_on(cfg)
+    _set_offload_checksum(cfg, netnl, "off")
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+    _set_offload_checksum(cfg, netnl, "on")
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+    ip("link set dev %s xdpgeneric off" % cfg.ifname)
+
+def test_xdp_generic_mb(cfg, netnl) -> None:
+    _set_xdp_generic_mb_on(cfg)
+    _set_offload_checksum(cfg, netnl, "off")
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+    _set_offload_checksum(cfg, netnl, "on")
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+    ip("link set dev %s xdpgeneric off" % cfg.ifname)
+
+def test_xdp_native_sb(cfg, netnl) -> None:
+    _set_xdp_native_sb_on(cfg)
+    _set_offload_checksum(cfg, netnl, "off")
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+    _set_offload_checksum(cfg, netnl, "on")
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+    ip("link set dev %s xdp off" % cfg.ifname)
+
+def test_xdp_native_mb(cfg, netnl) -> None:
+    _set_xdp_native_mb_on(cfg)
+    _set_offload_checksum(cfg, netnl, "off")
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+    _set_offload_checksum(cfg, netnl, "on")
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+    ip("link set dev %s xdp off" % cfg.ifname)
+
+def test_xdp_offload(cfg, netnl) -> None:
+    _set_xdp_offload_on(cfg)
+    _test_v4(cfg)
+    _test_v6(cfg)
+    _test_tcp(cfg)
+    ip("link set dev %s xdpoffload off" % cfg.ifname)
 
 def main() -> None:
     with NetDrvEpEnv(__file__) as cfg:
-        ksft_run(globs=globals(), case_pfx={"test_"}, args=(cfg, ))
+        get_interface_info(cfg)
+        set_interface_init(cfg)
+        ksft_run([test_default,
+                  test_xdp_generic_sb,
+                  test_xdp_generic_mb,
+                  test_xdp_native_sb,
+                  test_xdp_native_mb,
+                  test_xdp_offload],
+                 args=(cfg, EthtoolFamily()))
+        set_interface_init(cfg)
     ksft_exit()
 
 
diff --git a/tools/testing/selftests/net/lib/xdp_dummy.bpf.c b/tools/testing/selftests/net/lib/xdp_dummy.bpf.c
index d988b2e0cee8..e73fab3edd9f 100644
--- a/tools/testing/selftests/net/lib/xdp_dummy.bpf.c
+++ b/tools/testing/selftests/net/lib/xdp_dummy.bpf.c
@@ -10,4 +10,10 @@ int xdp_dummy_prog(struct xdp_md *ctx)
 	return XDP_PASS;
 }
 
+SEC("xdp.frags")
+int xdp_dummy_prog_frags(struct xdp_md *ctx)
+{
+	return XDP_PASS;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 net 2/8] eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc()
  2025-03-09 13:42 ` [PATCH v3 net 2/8] eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc() Taehee Yoo
@ 2025-03-09 21:44   ` Mina Almasry
  2025-03-10  2:10     ` Taehee Yoo
  0 siblings, 1 reply; 16+ messages in thread
From: Mina Almasry @ 2025-03-09 21:44 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest, asml.silence,
	willemb, kaiyuanz, skhawaja, sdf, gospo, somnath.kotur, dw,
	amritha.nambiar, xuanzhuo

On Sun, Mar 9, 2025 at 6:43 AM Taehee Yoo <ap420073@gmail.com> wrote:
>
> The bnxt_queue_mem_alloc() is called to allocate new queue memory when
> a queue is restarted.
> It internally accesses rx buffer descriptor corresponding to the index.
> The rx buffer descriptor is allocated and set when the interface is up
> and it's freed when the interface is down.
> So, if queue is restarted if interface is down, kernel panic occurs.
>
> Splat looks like:
>  BUG: unable to handle page fault for address: 000000000000b240
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 0 P4D 0
>  Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
>  CPU: 3 UID: 0 PID: 1563 Comm: ncdevmem2 Not tainted 6.14.0-rc2+ #9 844ddba6e7c459cafd0bf4db9a3198e
>  Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
>  RIP: 0010:bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en]
>  Code: 41 54 4d 89 c4 4d 69 c0 c0 05 00 00 55 48 89 f5 53 48 89 fb 4c 8d b5 40 05 00 00 48 83 ec 15
>  RSP: 0018:ffff9dcc83fef9e8 EFLAGS: 00010202
>  RAX: ffffffffc0457720 RBX: ffff934ed8d40000 RCX: 0000000000000000
>  RDX: 000000000000001f RSI: ffff934ea508f800 RDI: ffff934ea508f808
>  RBP: ffff934ea508f800 R08: 000000000000b240 R09: ffff934e84f4b000
>  R10: ffff9dcc83fefa30 R11: ffff934e84f4b000 R12: 000000000000001f
>  R13: ffff934ed8d40ac0 R14: ffff934ea508fd40 R15: ffff934e84f4b000
>  FS:  00007fa73888c740(0000) GS:ffff93559f780000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 000000000000b240 CR3: 0000000145a2e000 CR4: 00000000007506f0
>  PKRU: 55555554
>  Call Trace:
>   <TASK>
>   ? __die+0x20/0x70
>   ? page_fault_oops+0x15a/0x460
>   ? exc_page_fault+0x6e/0x180
>   ? asm_exc_page_fault+0x22/0x30
>   ? __pfx_bnxt_queue_mem_alloc+0x10/0x10 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
>   ? bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
>   netdev_rx_queue_restart+0xc5/0x240
>   net_devmem_bind_dmabuf_to_queue+0xf8/0x200
>   netdev_nl_bind_rx_doit+0x3a7/0x450
>   genl_family_rcv_msg_doit+0xd9/0x130
>   genl_rcv_msg+0x184/0x2b0
>   ? __pfx_netdev_nl_bind_rx_doit+0x10/0x10
>   ? __pfx_genl_rcv_msg+0x10/0x10
>   netlink_rcv_skb+0x54/0x100
>   genl_rcv+0x24/0x40
> ...
>
> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Reviewed-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Mina Almasry <almasrymina@google.com>

Although I wonder if you wanna return -ENETDOWN from other queue API
ops, if your driver doesn't handle them.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 net 7/8] net: devmem: do not WARN conditionally after netdev_rx_queue_restart()
  2025-03-09 13:42 ` [PATCH v3 net 7/8] net: devmem: do not WARN conditionally after netdev_rx_queue_restart() Taehee Yoo
@ 2025-03-09 21:45   ` Mina Almasry
  0 siblings, 0 replies; 16+ messages in thread
From: Mina Almasry @ 2025-03-09 21:45 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest, asml.silence,
	willemb, kaiyuanz, skhawaja, sdf, gospo, somnath.kotur, dw,
	amritha.nambiar, xuanzhuo

On Sun, Mar 9, 2025 at 6:43 AM Taehee Yoo <ap420073@gmail.com> wrote:
>
> When devmem socket is closed, netdev_rx_queue_restart() is called to
> reset queue by the net_devmem_unbind_dmabuf(). But callback may return
> -ENETDOWN if the interface is down because queues are already freed
> when the interface is down so queue reset is not needed.
> So, it should not warn if the return value is -ENETDOWN.
>
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>

Reviewed-by: Mina Almasry <almasrymina@google.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 net 2/8] eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc()
  2025-03-09 21:44   ` Mina Almasry
@ 2025-03-10  2:10     ` Taehee Yoo
  2025-03-10 20:30       ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Taehee Yoo @ 2025-03-10  2:10 UTC (permalink / raw)
  To: Mina Almasry
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest, asml.silence,
	willemb, kaiyuanz, skhawaja, sdf, gospo, somnath.kotur, dw,
	amritha.nambiar, xuanzhuo

On Mon, Mar 10, 2025 at 6:44 AM Mina Almasry <almasrymina@google.com> wrote:
>

Hi Mina,
Thanks a lot for the review!

> On Sun, Mar 9, 2025 at 6:43 AM Taehee Yoo <ap420073@gmail.com> wrote:
> >
> > The bnxt_queue_mem_alloc() is called to allocate new queue memory when
> > a queue is restarted.
> > It internally accesses rx buffer descriptor corresponding to the index.
> > The rx buffer descriptor is allocated and set when the interface is up
> > and it's freed when the interface is down.
> > So, if queue is restarted if interface is down, kernel panic occurs.
> >
> > Splat looks like:
> >  BUG: unable to handle page fault for address: 000000000000b240
> >  #PF: supervisor read access in kernel mode
> >  #PF: error_code(0x0000) - not-present page
> >  PGD 0 P4D 0
> >  Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> >  CPU: 3 UID: 0 PID: 1563 Comm: ncdevmem2 Not tainted 6.14.0-rc2+ #9 844ddba6e7c459cafd0bf4db9a3198e
> >  Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
> >  RIP: 0010:bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en]
> >  Code: 41 54 4d 89 c4 4d 69 c0 c0 05 00 00 55 48 89 f5 53 48 89 fb 4c 8d b5 40 05 00 00 48 83 ec 15
> >  RSP: 0018:ffff9dcc83fef9e8 EFLAGS: 00010202
> >  RAX: ffffffffc0457720 RBX: ffff934ed8d40000 RCX: 0000000000000000
> >  RDX: 000000000000001f RSI: ffff934ea508f800 RDI: ffff934ea508f808
> >  RBP: ffff934ea508f800 R08: 000000000000b240 R09: ffff934e84f4b000
> >  R10: ffff9dcc83fefa30 R11: ffff934e84f4b000 R12: 000000000000001f
> >  R13: ffff934ed8d40ac0 R14: ffff934ea508fd40 R15: ffff934e84f4b000
> >  FS:  00007fa73888c740(0000) GS:ffff93559f780000(0000) knlGS:0000000000000000
> >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >  CR2: 000000000000b240 CR3: 0000000145a2e000 CR4: 00000000007506f0
> >  PKRU: 55555554
> >  Call Trace:
> >   <TASK>
> >   ? __die+0x20/0x70
> >   ? page_fault_oops+0x15a/0x460
> >   ? exc_page_fault+0x6e/0x180
> >   ? asm_exc_page_fault+0x22/0x30
> >   ? __pfx_bnxt_queue_mem_alloc+0x10/0x10 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
> >   ? bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
> >   netdev_rx_queue_restart+0xc5/0x240
> >   net_devmem_bind_dmabuf_to_queue+0xf8/0x200
> >   netdev_nl_bind_rx_doit+0x3a7/0x450
> >   genl_family_rcv_msg_doit+0xd9/0x130
> >   genl_rcv_msg+0x184/0x2b0
> >   ? __pfx_netdev_nl_bind_rx_doit+0x10/0x10
> >   ? __pfx_genl_rcv_msg+0x10/0x10
> >   netlink_rcv_skb+0x54/0x100
> >   genl_rcv+0x24/0x40
> > ...
> >
> > Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> > Reviewed-by: Jakub Kicinski <kuba@kernel.org>
>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
>
> Although I wonder if you wanna return -ENETDOWN from other queue API
> ops, if your driver doesn't handle them.

Okay, I will add -ENETDOWN return to the other queue API in the next
version.

Thanks a lot!
Taehee Yoo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 net 4/8] eth: bnxt: do not update checksum in bnxt_xdp_build_skb()
  2025-03-09 13:42 ` [PATCH v3 net 4/8] eth: bnxt: do not update checksum in bnxt_xdp_build_skb() Taehee Yoo
@ 2025-03-10  2:58   ` Somnath Kotur
  0 siblings, 0 replies; 16+ messages in thread
From: Somnath Kotur @ 2025-03-10  2:58 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest, almasrymina,
	asml.silence, willemb, kaiyuanz, skhawaja, sdf, gospo, dw,
	amritha.nambiar, xuanzhuo

[-- Attachment #1: Type: text/plain, Size: 6322 bytes --]

On Sun, Mar 9, 2025 at 7:13 PM Taehee Yoo <ap420073@gmail.com> wrote:
>
> The bnxt_rx_pkt() updates ip_summed value at the end if checksum offload
> is enabled.
> When the XDP-MB program is attached and it returns XDP_PASS, the
> bnxt_xdp_build_skb() is called to update skb_shared_info.
> The main purpose of bnxt_xdp_build_skb() is to update skb_shared_info,
> but it updates ip_summed value too if checksum offload is enabled.
> This is actually duplicate work.
>
> When the bnxt_rx_pkt() updates ip_summed value, it checks if ip_summed
> is CHECKSUM_NONE or not.
> It means that ip_summed should be CHECKSUM_NONE at this moment.
> But ip_summed may already be updated to CHECKSUM_UNNECESSARY in the
> XDP-MB-PASS path.
> So the by skb_checksum_none_assert() WARNS about it.
>
> This is duplicate work and updating ip_summed in the
> bnxt_xdp_build_skb() is not needed.
>
> Splat looks like:
> WARNING: CPU: 3 PID: 5782 at ./include/linux/skbuff.h:5155 bnxt_rx_pkt+0x479b/0x7610 [bnxt_en]
> Modules linked in: bnxt_re bnxt_en rdma_ucm rdma_cm iw_cm ib_cm ib_uverbs veth xt_nat xt_tcpudp xt_conntrack nft_chain_nat xt_MASQUERADE nf_]
> CPU: 3 UID: 0 PID: 5782 Comm: socat Tainted: G        W          6.14.0-rc4+ #27
> Tainted: [W]=WARN
> Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
> RIP: 0010:bnxt_rx_pkt+0x479b/0x7610 [bnxt_en]
> Code: 54 24 0c 4c 89 f1 4c 89 ff c1 ea 1f ff d3 0f 1f 00 49 89 c6 48 85 c0 0f 84 4c e5 ff ff 48 89 c7 e8 ca 3d a0 c8 e9 8f f4 ff ff <0f> 0b f
> RSP: 0018:ffff88881ba09928 EFLAGS: 00010202
> RAX: 0000000000000000 RBX: 00000000c7590303 RCX: 0000000000000000
> RDX: 1ffff1104e7d1610 RSI: 0000000000000001 RDI: ffff8881c91300b8
> RBP: ffff88881ba09b28 R08: ffff888273e8b0d0 R09: ffff888273e8b070
> R10: ffff888273e8b010 R11: ffff888278b0f000 R12: ffff888273e8b080
> R13: ffff8881c9130e00 R14: ffff8881505d3800 R15: ffff888273e8b000
> FS:  00007f5a2e7be080(0000) GS:ffff88881ba00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fff2e708ff8 CR3: 000000013e3b0000 CR4: 00000000007506f0
> PKRU: 55555554
> Call Trace:
>  <IRQ>
>  ? __warn+0xcd/0x2f0
>  ? bnxt_rx_pkt+0x479b/0x7610
>  ? report_bug+0x326/0x3c0
>  ? handle_bug+0x53/0xa0
>  ? exc_invalid_op+0x14/0x50
>  ? asm_exc_invalid_op+0x16/0x20
>  ? bnxt_rx_pkt+0x479b/0x7610
>  ? bnxt_rx_pkt+0x3e41/0x7610
>  ? __pfx_bnxt_rx_pkt+0x10/0x10
>  ? napi_complete_done+0x2cf/0x7d0
>  __bnxt_poll_work+0x4e8/0x1220
>  ? __pfx___bnxt_poll_work+0x10/0x10
>  ? __pfx_mark_lock.part.0+0x10/0x10
>  bnxt_poll_p5+0x36a/0xfa0
>  ? __pfx_bnxt_poll_p5+0x10/0x10
>  __napi_poll.constprop.0+0xa0/0x440
>  net_rx_action+0x899/0xd00
> ...
>
> Following ping.py patch adds xdp-mb-pass case. so ping.py is going
> to be able to reproduce this issue.
>
> Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff")
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> ---
>
> v3:
>  - No changes.
>
> v2:
>  - Patch added.
>
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  3 ++-
>  drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 11 ++---------
>  drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h |  3 +--
>  3 files changed, 5 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 218109ee1c23..9afb2c5072b1 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -2218,7 +2218,8 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
>                         if (!skb)
>                                 goto oom_next_rx;
>                 } else {
> -                       skb = bnxt_xdp_build_skb(bp, skb, agg_bufs, rxr->page_pool, &xdp, rxcmp1);
> +                       skb = bnxt_xdp_build_skb(bp, skb, agg_bufs,
> +                                                rxr->page_pool, &xdp);
>                         if (!skb) {
>                                 /* we should be able to free the old skb here */
>                                 bnxt_xdp_buff_frags_free(rxr, &xdp);
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
> index e9b49cb5b735..299822cacca4 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
> @@ -460,20 +460,13 @@ int bnxt_xdp(struct net_device *dev, struct netdev_bpf *xdp)
>
>  struct sk_buff *
>  bnxt_xdp_build_skb(struct bnxt *bp, struct sk_buff *skb, u8 num_frags,
> -                  struct page_pool *pool, struct xdp_buff *xdp,
> -                  struct rx_cmp_ext *rxcmp1)
> +                  struct page_pool *pool, struct xdp_buff *xdp)
>  {
>         struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
>
>         if (!skb)
>                 return NULL;
> -       skb_checksum_none_assert(skb);
> -       if (RX_CMP_L4_CS_OK(rxcmp1)) {
> -               if (bp->dev->features & NETIF_F_RXCSUM) {
> -                       skb->ip_summed = CHECKSUM_UNNECESSARY;
> -                       skb->csum_level = RX_CMP_ENCAP(rxcmp1);
> -               }
> -       }
> +
>         xdp_update_skb_shared_info(skb, num_frags,
>                                    sinfo->xdp_frags_size,
>                                    BNXT_RX_PAGE_SIZE * num_frags,
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
> index 0122782400b8..220285e190fc 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
> @@ -33,6 +33,5 @@ void bnxt_xdp_buff_frags_free(struct bnxt_rx_ring_info *rxr,
>                               struct xdp_buff *xdp);
>  struct sk_buff *bnxt_xdp_build_skb(struct bnxt *bp, struct sk_buff *skb,
>                                    u8 num_frags, struct page_pool *pool,
> -                                  struct xdp_buff *xdp,
> -                                  struct rx_cmp_ext *rxcmp1);
> +                                  struct xdp_buff *xdp);
>  #endif
> --
> 2.34.1
>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4199 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 net 5/8] eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx}
  2025-03-09 13:42 ` [PATCH v3 net 5/8] eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx} Taehee Yoo
@ 2025-03-10  2:59   ` Somnath Kotur
  0 siblings, 0 replies; 16+ messages in thread
From: Somnath Kotur @ 2025-03-10  2:59 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest, almasrymina,
	asml.silence, willemb, kaiyuanz, skhawaja, sdf, gospo, dw,
	amritha.nambiar, xuanzhuo

[-- Attachment #1: Type: text/plain, Size: 3738 bytes --]

On Sun, Mar 9, 2025 at 7:13 PM Taehee Yoo <ap420073@gmail.com> wrote:
>
> When qstats-get operation is executed, callbacks of netdev_stats_ops
> are called. The bnxt_get_queue_stats{rx | tx} collect per-queue stats
> from sw_stats in the rings.
> But {rx | tx | cp}_ring are allocated when the interface is up.
> So, these rings are not allocated when the interface is down.
>
> The qstats-get is allowed even if the interface is down. However,
> the bnxt_get_queue_stats{rx | tx}() accesses cp_ring and tx_ring
> without null check.
> So, it needs to avoid accessing rings if the interface is down.
>
> Reproducer:
>  ip link set $interface down
>  ./cli.py --spec netdev.yaml --dump qstats-get
> OR
>  ip link set $interface down
>  python ./stats.py
>
> Splat looks like:
>  BUG: kernel NULL pointer dereference, address: 0000000000000000
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 1680fa067 P4D 1680fa067 PUD 16be3b067 PMD 0
>  Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
>  CPU: 0 UID: 0 PID: 1495 Comm: python3 Not tainted 6.14.0-rc4+ #32 5cd0f999d5a15c574ac72b3e4b907341
>  Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
>  RIP: 0010:bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en]
>  Code: c6 87 b5 18 00 00 02 eb a2 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 01
>  RSP: 0018:ffffabef43cdb7e0 EFLAGS: 00010282
>  RAX: 0000000000000000 RBX: ffffffffc04c8710 RCX: 0000000000000000
>  RDX: ffffabef43cdb858 RSI: 0000000000000000 RDI: ffff8d504e850000
>  RBP: ffff8d506c9f9c00 R08: 0000000000000004 R09: ffff8d506bcd901c
>  R10: 0000000000000015 R11: ffff8d506bcd9000 R12: 0000000000000000
>  R13: ffffabef43cdb8c0 R14: ffff8d504e850000 R15: 0000000000000000
>  FS:  00007f2c5462b080(0000) GS:ffff8d575f600000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000000000 CR3: 0000000167fd0000 CR4: 00000000007506f0
>  PKRU: 55555554
>  Call Trace:
>   <TASK>
>   ? __die+0x20/0x70
>   ? page_fault_oops+0x15a/0x460
>   ? sched_balance_find_src_group+0x58d/0xd10
>   ? exc_page_fault+0x6e/0x180
>   ? asm_exc_page_fault+0x22/0x30
>   ? bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en cdd546fd48563c280cfd30e9647efa420db07bf1]
>   netdev_nl_stats_by_netdev+0x2b1/0x4e0
>   ? xas_load+0x9/0xb0
>   ? xas_find+0x183/0x1d0
>   ? xa_find+0x8b/0xe0
>   netdev_nl_qstats_get_dumpit+0xbf/0x1e0
>   genl_dumpit+0x31/0x90
>   netlink_dump+0x1a8/0x360
>
> Fixes: af7b3b4adda5 ("eth: bnxt: support per-queue statistics")
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> ---
>
> v3:
>  - Patch added.
>
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 9afb2c5072b1..bee12d9b57ab 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -15384,6 +15384,9 @@ static void bnxt_get_queue_stats_rx(struct net_device *dev, int i,
>         struct bnxt_cp_ring_info *cpr;
>         u64 *sw;
>
> +       if (!bp->bnapi)
> +               return;
> +
>         cpr = &bp->bnapi[i]->cp_ring;
>         sw = cpr->stats.sw_stats;
>
> @@ -15407,6 +15410,9 @@ static void bnxt_get_queue_stats_tx(struct net_device *dev, int i,
>         struct bnxt_napi *bnapi;
>         u64 *sw;
>
> +       if (!bp->tx_ring)
> +               return;
> +
>         bnapi = bp->tx_ring[bp->tx_ring_map[i]].bnapi;
>         sw = bnapi->cp_ring.stats.sw_stats;
>
> --
> 2.34.1
>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4199 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 net 2/8] eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc()
  2025-03-10  2:10     ` Taehee Yoo
@ 2025-03-10 20:30       ` Jakub Kicinski
  0 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2025-03-10 20:30 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: Mina Almasry, davem, pabeni, edumazet, andrew+netdev,
	michael.chan, pavan.chebbi, horms, shuah, netdev, linux-kselftest,
	asml.silence, willemb, kaiyuanz, skhawaja, sdf, gospo,
	somnath.kotur, dw, amritha.nambiar, xuanzhuo

On Mon, 10 Mar 2025 11:10:42 +0900 Taehee Yoo wrote:
> > Although I wonder if you wanna return -ENETDOWN from other queue API
> > ops, if your driver doesn't handle them.  
> 
> Okay, I will add -ENETDOWN return to the other queue API in the next
> version.

I prefer the current version. If queues can't be allocated while down,
how can free be called..

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module
  2025-03-09 13:42 [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module Taehee Yoo
                   ` (7 preceding siblings ...)
  2025-03-09 13:42 ` [PATCH v3 net 8/8] selftests: drv-net: add xdp cases for ping.py Taehee Yoo
@ 2025-03-10 20:40 ` patchwork-bot+netdevbpf
  8 siblings, 0 replies; 16+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-03-10 20:40 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: davem, kuba, pabeni, edumazet, andrew+netdev, michael.chan,
	pavan.chebbi, horms, shuah, netdev, linux-kselftest, almasrymina,
	asml.silence, willemb, kaiyuanz, skhawaja, sdf, gospo,
	somnath.kotur, dw, amritha.nambiar, xuanzhuo

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Sun,  9 Mar 2025 13:42:11 +0000 you wrote:
> The first fixes setting incorrect skb->truesize.
> When xdp-mb prog returns XDP_PASS, skb is allocated and initialized.
> Currently, The truesize is calculated as BNXT_RX_PAGE_SIZE *
> sinfo->nr_frags, but sinfo->nr_frags is flushed by napi_build_skb().
> So, it stores sinfo before calling napi_build_skb() and then use it
> for calculate truesize.
> 
> [...]

Here is the summary with links:
  - [v3,net,1/8] eth: bnxt: fix truesize for mb-xdp-pass case
    https://git.kernel.org/netdev/net/c/9f7b2aa5034e
  - [v3,net,2/8] eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc()
    https://git.kernel.org/netdev/net/c/ca2456e07395
  - [v3,net,3/8] eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic
    https://git.kernel.org/netdev/net/c/661958552eda
  - [v3,net,4/8] eth: bnxt: do not update checksum in bnxt_xdp_build_skb()
    https://git.kernel.org/netdev/net/c/c03e7d05aa0e
  - [v3,net,5/8] eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx}
    https://git.kernel.org/netdev/net/c/f09af5fdfbd9
  - [v3,net,6/8] eth: bnxt: fix memory leak in queue reset
    https://git.kernel.org/netdev/net/c/87dd2850835d
  - [v3,net,7/8] net: devmem: do not WARN conditionally after netdev_rx_queue_restart()
    https://git.kernel.org/netdev/net/c/a70f891e0fa0
  - [v3,net,8/8] selftests: drv-net: add xdp cases for ping.py
    https://git.kernel.org/netdev/net/c/75cc19c8ff89

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2025-03-10 20:40 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-09 13:42 [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module Taehee Yoo
2025-03-09 13:42 ` [PATCH v3 net 1/8] eth: bnxt: fix truesize for mb-xdp-pass case Taehee Yoo
2025-03-09 13:42 ` [PATCH v3 net 2/8] eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc() Taehee Yoo
2025-03-09 21:44   ` Mina Almasry
2025-03-10  2:10     ` Taehee Yoo
2025-03-10 20:30       ` Jakub Kicinski
2025-03-09 13:42 ` [PATCH v3 net 3/8] eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic Taehee Yoo
2025-03-09 13:42 ` [PATCH v3 net 4/8] eth: bnxt: do not update checksum in bnxt_xdp_build_skb() Taehee Yoo
2025-03-10  2:58   ` Somnath Kotur
2025-03-09 13:42 ` [PATCH v3 net 5/8] eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx} Taehee Yoo
2025-03-10  2:59   ` Somnath Kotur
2025-03-09 13:42 ` [PATCH v3 net 6/8] eth: bnxt: fix memory leak in queue reset Taehee Yoo
2025-03-09 13:42 ` [PATCH v3 net 7/8] net: devmem: do not WARN conditionally after netdev_rx_queue_restart() Taehee Yoo
2025-03-09 21:45   ` Mina Almasry
2025-03-09 13:42 ` [PATCH v3 net 8/8] selftests: drv-net: add xdp cases for ping.py Taehee Yoo
2025-03-10 20:40 ` [PATCH v3 net 0/8] eth: bnxt: fix several bugs in the bnxt module patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).