netdev.vger.kernel.org archive mirror
* [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance
@ 2024-07-29 18:29 Elad Yifee
  2024-07-29 18:29 ` [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods Elad Yifee
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Elad Yifee @ 2024-07-29 18:29 UTC (permalink / raw)
  To: Felix Fietkau, Sean Wang, Mark Lee, Lorenzo Bianconi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Matthias Brugger, AngeloGioacchino Del Regno, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek
  Cc: Elad Yifee, Daniel Golle, Joe Damato

This small series includes two short and simple patches to improve RX performance
on this driver.

iperf3 result without these patches:
	[ ID] Interval           Transfer     Bandwidth
	[  4]   0.00-1.00   sec   563 MBytes  4.72 Gbits/sec
	[  4]   1.00-2.00   sec   563 MBytes  4.73 Gbits/sec
	[  4]   2.00-3.00   sec   552 MBytes  4.63 Gbits/sec
	[  4]   3.00-4.00   sec   561 MBytes  4.70 Gbits/sec
	[  4]   4.00-5.00   sec   562 MBytes  4.71 Gbits/sec
	[  4]   5.00-6.00   sec   565 MBytes  4.74 Gbits/sec
	[  4]   6.00-7.00   sec   563 MBytes  4.72 Gbits/sec
	[  4]   7.00-8.00   sec   565 MBytes  4.74 Gbits/sec
	[  4]   8.00-9.00   sec   562 MBytes  4.71 Gbits/sec
	[  4]   9.00-10.00  sec   558 MBytes  4.68 Gbits/sec
	- - - - - - - - - - - - - - - - - - - - - - - - -
	[ ID] Interval           Transfer     Bandwidth
	[  4]   0.00-10.00  sec  5.48 GBytes  4.71 Gbits/sec                  sender
	[  4]   0.00-10.00  sec  5.48 GBytes  4.71 Gbits/sec                  receiver

iperf3 result with "use prefetch methods" patch:
	[ ID] Interval           Transfer     Bandwidth
	[  4]   0.00-1.00   sec   598 MBytes  5.02 Gbits/sec
	[  4]   1.00-2.00   sec   588 MBytes  4.94 Gbits/sec
	[  4]   2.00-3.00   sec   592 MBytes  4.97 Gbits/sec
	[  4]   3.00-4.00   sec   594 MBytes  4.98 Gbits/sec
	[  4]   4.00-5.00   sec   590 MBytes  4.95 Gbits/sec
	[  4]   5.00-6.00   sec   594 MBytes  4.98 Gbits/sec
	[  4]   6.00-7.00   sec   594 MBytes  4.98 Gbits/sec
	[  4]   7.00-8.00   sec   593 MBytes  4.98 Gbits/sec
	[  4]   8.00-9.00   sec   593 MBytes  4.98 Gbits/sec
	[  4]   9.00-10.00  sec   594 MBytes  4.98 Gbits/sec
	- - - - - - - - - - - - - - - - - - - - - - - - -
	[ ID] Interval           Transfer     Bandwidth
	[  4]   0.00-10.00  sec  5.79 GBytes  4.98 Gbits/sec                  sender
	[  4]   0.00-10.00  sec  5.79 GBytes  4.98 Gbits/sec                  receiver

iperf3 result with "use PP exclusively for XDP programs" patch:
	[ ID] Interval           Transfer     Bandwidth
	[  4]   0.00-1.00   sec   635 MBytes  5.33 Gbits/sec
	[  4]   1.00-2.00   sec   636 MBytes  5.33 Gbits/sec
	[  4]   2.00-3.00   sec   637 MBytes  5.34 Gbits/sec
	[  4]   3.00-4.00   sec   636 MBytes  5.34 Gbits/sec
	[  4]   4.00-5.00   sec   637 MBytes  5.34 Gbits/sec
	[  4]   5.00-6.00   sec   637 MBytes  5.35 Gbits/sec
	[  4]   6.00-7.00   sec   637 MBytes  5.34 Gbits/sec
	[  4]   7.00-8.00   sec   636 MBytes  5.33 Gbits/sec
	[  4]   8.00-9.00   sec   634 MBytes  5.32 Gbits/sec
	[  4]   9.00-10.00  sec   637 MBytes  5.34 Gbits/sec
	- - - - - - - - - - - - - - - - - - - - - - - - -
	[ ID] Interval           Transfer     Bandwidth
	[  4]   0.00-10.00  sec  6.21 GBytes  5.34 Gbits/sec                  sender
	[  4]   0.00-10.00  sec  6.21 GBytes  5.34 Gbits/sec                  receiver

iperf3 result with both patches:
	[ ID] Interval           Transfer     Bandwidth
	[  4]   0.00-1.00   sec   652 MBytes  5.47 Gbits/sec
	[  4]   1.00-2.00   sec   653 MBytes  5.47 Gbits/sec
	[  4]   2.00-3.00   sec   654 MBytes  5.48 Gbits/sec
	[  4]   3.00-4.00   sec   654 MBytes  5.49 Gbits/sec
	[  4]   4.00-5.00   sec   653 MBytes  5.48 Gbits/sec
	[  4]   5.00-6.00   sec   653 MBytes  5.48 Gbits/sec
	[  4]   6.00-7.00   sec   653 MBytes  5.48 Gbits/sec
	[  4]   7.00-8.00   sec   653 MBytes  5.48 Gbits/sec
	[  4]   8.00-9.00   sec   653 MBytes  5.48 Gbits/sec
	[  4]   9.00-10.00  sec   654 MBytes  5.48 Gbits/sec
	- - - - - - - - - - - - - - - - - - - - - - - - -
	[ ID] Interval           Transfer     Bandwidth
	[  4]   0.00-10.00  sec  6.38 GBytes  5.48 Gbits/sec                  sender
	[  4]   0.00-10.00  sec  6.38 GBytes  5.48 Gbits/sec                  receiver

About 16% more packets/sec without an XDP program loaded,
and about 5% more packets/sec when using PP.
Tested on a Banana Pi BPI-R4 (MT7988A).

---
Technically, this is version 2 of the “use prefetch methods” patch.
Initially, I submitted it as a single patch for review (RFC),
but later I decided to include a second patch, resulting in this series.
Changes in v2:
	- Add "use PP exclusively for XDP programs" patch and create this series
---
Elad Yifee (2):
  net: ethernet: mtk_eth_soc: use prefetch methods
  net: ethernet: mtk_eth_soc: use PP exclusively for XDP programs

 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

-- 
2.45.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods
  2024-07-29 18:29 [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance Elad Yifee
@ 2024-07-29 18:29 ` Elad Yifee
  2024-07-30  8:59   ` Joe Damato
  2025-01-06 14:28   ` Shengyu Qu
  2024-07-29 18:29 ` [PATCH net-next v2 2/2] net: ethernet: mtk_eth_soc: use PP exclusively for XDP programs Elad Yifee
  2024-07-29 19:10 ` [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance Lorenzo Bianconi
  2 siblings, 2 replies; 16+ messages in thread
From: Elad Yifee @ 2024-07-29 18:29 UTC (permalink / raw)
  To: Felix Fietkau, Sean Wang, Mark Lee, Lorenzo Bianconi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Matthias Brugger, AngeloGioacchino Del Regno, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek
  Cc: Elad Yifee, Daniel Golle, Joe Damato

Utilize kernel prefetch methods for faster cache line access.
This change boosts driver performance,
allowing the CPU to handle about 5% more packets/sec.
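
For reference, the net_prefetch()/net_prefetchw() helpers from
include/linux/netdevice.h differ from a bare prefetch() in that they cover
up to 128 bytes (two cache lines on CPUs with 64-byte lines), and the "w"
variants hint that the line is about to be written. Roughly, paraphrased
here for illustration only (see the header for the exact definitions):

	static inline void net_prefetch(void *p)
	{
		prefetch(p);
	#if L1_CACHE_BYTES < 128
		prefetch((u8 *)p + L1_CACHE_BYTES);
	#endif
	}

	static inline void net_prefetchw(void *p)
	{
		prefetchw(p);
	#if L1_CACHE_BYTES < 128
		prefetchw((u8 *)p + L1_CACHE_BYTES);
	#endif
	}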

Signed-off-by: Elad Yifee <eladwf@gmail.com>
---
Changes in v2:
	- use net_prefetchw as suggested by Joe Damato
	- add (NET_SKB_PAD + eth->ip_align) offset to prefetched data
	- use eth->ip_align instead of NET_IP_ALIGN as it could be 0,
	depending on the platform 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 16ca427cf4c3..4d0052dbe3f4 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1963,6 +1963,7 @@ static u32 mtk_xdp_run(struct mtk_eth *eth, struct mtk_rx_ring *ring,
 	if (!prog)
 		goto out;
 
+	net_prefetchw(xdp->data_hard_start);
 	act = bpf_prog_run_xdp(prog, xdp);
 	switch (act) {
 	case XDP_PASS:
@@ -2038,6 +2039,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
 
 		idx = NEXT_DESP_IDX(ring->calc_idx, ring->dma_size);
 		rxd = ring->dma + idx * eth->soc->rx.desc_size;
+		prefetch(rxd);
 		data = ring->data[idx];
 
 		if (!mtk_rx_get_desc(eth, &trxd, rxd))
@@ -2105,6 +2107,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
 			if (ret != XDP_PASS)
 				goto skip_rx;
 
+			net_prefetch(xdp.data_meta);
 			skb = build_skb(data, PAGE_SIZE);
 			if (unlikely(!skb)) {
 				page_pool_put_full_page(ring->page_pool,
@@ -2113,6 +2116,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
 				goto skip_rx;
 			}
 
+			net_prefetchw(skb->data);
 			skb_reserve(skb, xdp.data - xdp.data_hard_start);
 			skb_put(skb, xdp.data_end - xdp.data);
 			skb_mark_for_recycle(skb);
@@ -2143,6 +2147,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
 			dma_unmap_single(eth->dma_dev, ((u64)trxd.rxd1 | addr64),
 					 ring->buf_size, DMA_FROM_DEVICE);
 
+			net_prefetch(data + NET_SKB_PAD + eth->ip_align);
 			skb = build_skb(data, ring->frag_size);
 			if (unlikely(!skb)) {
 				netdev->stats.rx_dropped++;
@@ -2150,7 +2155,8 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
 				goto skip_rx;
 			}
 
-			skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
+			net_prefetchw(skb->data);
+			skb_reserve(skb, NET_SKB_PAD + eth->ip_align);
 			skb_put(skb, pktlen);
 		}
 
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next v2 2/2] net: ethernet: mtk_eth_soc: use PP exclusively for XDP programs
  2024-07-29 18:29 [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance Elad Yifee
  2024-07-29 18:29 ` [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods Elad Yifee
@ 2024-07-29 18:29 ` Elad Yifee
  2024-07-29 19:10 ` [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance Lorenzo Bianconi
  2 siblings, 0 replies; 16+ messages in thread
From: Elad Yifee @ 2024-07-29 18:29 UTC (permalink / raw)
  To: Felix Fietkau, Sean Wang, Mark Lee, Lorenzo Bianconi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Matthias Brugger, AngeloGioacchino Del Regno, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek
  Cc: Elad Yifee, Daniel Golle, Joe Damato

PP allocations and XDP code path traversal are unnecessary
when no XDP program is loaded.
Prevent that by simply not creating the pool.
This change boosts driver performance for this use case,
allowing the CPU to handle about 13% more packets/sec.
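
For readers less familiar with the driver, the RX loop effectively branches
on the pool pointer, so skipping pool creation also bypasses the whole
XDP/page_pool branch. Simplified sketch of that structure (not the literal
code):

	if (ring->page_pool) {
		/* page_pool buffer: run the XDP program, then build_skb()
		 * from the page on XDP_PASS
		 */
	} else {
		/* frag-allocated buffer: dma_unmap_single() followed by a
		 * plain build_skb()
		 */
	}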

Signed-off-by: Elad Yifee <eladwf@gmail.com>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 4d0052dbe3f4..2d1a48287c73 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -2644,7 +2644,7 @@ static int mtk_rx_alloc(struct mtk_eth *eth, int ring_no, int rx_flag)
 	if (!ring->data)
 		return -ENOMEM;
 
-	if (mtk_page_pool_enabled(eth)) {
+	if (mtk_page_pool_enabled(eth) && rcu_access_pointer(eth->prog)) {
 		struct page_pool *pp;
 
 		pp = mtk_create_page_pool(eth, &ring->xdp_q, ring_no,
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance
  2024-07-29 18:29 [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance Elad Yifee
  2024-07-29 18:29 ` [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods Elad Yifee
  2024-07-29 18:29 ` [PATCH net-next v2 2/2] net: ethernet: mtk_eth_soc: use PP exclusively for XDP programs Elad Yifee
@ 2024-07-29 19:10 ` Lorenzo Bianconi
  2024-07-30  5:29   ` Elad Yifee
  2 siblings, 1 reply; 16+ messages in thread
From: Lorenzo Bianconi @ 2024-07-29 19:10 UTC (permalink / raw)
  To: Elad Yifee
  Cc: Felix Fietkau, Sean Wang, Mark Lee, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Matthias Brugger,
	AngeloGioacchino Del Regno, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek, Daniel Golle, Joe Damato

[-- Attachment #1: Type: text/plain, Size: 5012 bytes --]

> This small series includes two short and simple patches to improve RX performance
> on this driver.

Hi Elad,

What is the chip revision you are running?
If you are using a device that does not support HW-LRO (e.g. MT7986 or
MT7988), I guess we can try to use the page_pool_dev_alloc_frag() API and
request a 2048B buffer. Doing so, we can use a single page for two
rx buffers, improving recycling with page_pool. What do you think?
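
Something along these lines, just to illustrate the idea (sketch only: the
helper name and error handling are made up, only page_pool_dev_alloc_frag()
and page_pool_get_dma_addr() are existing API):

	/* Illustrative sketch: carve a 2048B RX buffer out of a shared page */
	static dma_addr_t mtk_rx_alloc_frag(struct page_pool *pp, void **buf)
	{
		unsigned int offset;
		struct page *page;

		page = page_pool_dev_alloc_frag(pp, &offset, 2048);
		if (!page)
			return DMA_MAPPING_ERROR;

		*buf = page_address(page) + offset;
		return page_pool_get_dma_addr(page) + offset;
	}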

Regards,
Lorenzo

> 
> iperf3 result without these patches:
> 	[ ID] Interval           Transfer     Bandwidth
> 	[  4]   0.00-1.00   sec   563 MBytes  4.72 Gbits/sec
> 	[  4]   1.00-2.00   sec   563 MBytes  4.73 Gbits/sec
> 	[  4]   2.00-3.00   sec   552 MBytes  4.63 Gbits/sec
> 	[  4]   3.00-4.00   sec   561 MBytes  4.70 Gbits/sec
> 	[  4]   4.00-5.00   sec   562 MBytes  4.71 Gbits/sec
> 	[  4]   5.00-6.00   sec   565 MBytes  4.74 Gbits/sec
> 	[  4]   6.00-7.00   sec   563 MBytes  4.72 Gbits/sec
> 	[  4]   7.00-8.00   sec   565 MBytes  4.74 Gbits/sec
> 	[  4]   8.00-9.00   sec   562 MBytes  4.71 Gbits/sec
> 	[  4]   9.00-10.00  sec   558 MBytes  4.68 Gbits/sec
> 	- - - - - - - - - - - - - - - - - - - - - - - - -
> 	[ ID] Interval           Transfer     Bandwidth
> 	[  4]   0.00-10.00  sec  5.48 GBytes  4.71 Gbits/sec                  sender
> 	[  4]   0.00-10.00  sec  5.48 GBytes  4.71 Gbits/sec                  receiver
> 
> iperf3 result with "use prefetch methods" patch:
> 	[ ID] Interval           Transfer     Bandwidth
> 	[  4]   0.00-1.00   sec   598 MBytes  5.02 Gbits/sec
> 	[  4]   1.00-2.00   sec   588 MBytes  4.94 Gbits/sec
> 	[  4]   2.00-3.00   sec   592 MBytes  4.97 Gbits/sec
> 	[  4]   3.00-4.00   sec   594 MBytes  4.98 Gbits/sec
> 	[  4]   4.00-5.00   sec   590 MBytes  4.95 Gbits/sec
> 	[  4]   5.00-6.00   sec   594 MBytes  4.98 Gbits/sec
> 	[  4]   6.00-7.00   sec   594 MBytes  4.98 Gbits/sec
> 	[  4]   7.00-8.00   sec   593 MBytes  4.98 Gbits/sec
> 	[  4]   8.00-9.00   sec   593 MBytes  4.98 Gbits/sec
> 	[  4]   9.00-10.00  sec   594 MBytes  4.98 Gbits/sec
> 	- - - - - - - - - - - - - - - - - - - - - - - - -
> 	[ ID] Interval           Transfer     Bandwidth
> 	[  4]   0.00-10.00  sec  5.79 GBytes  4.98 Gbits/sec                  sender
> 	[  4]   0.00-10.00  sec  5.79 GBytes  4.98 Gbits/sec                  receiver
> 
> iperf3 result with "use PP exclusively for XDP programs" patch:
> 	[ ID] Interval           Transfer     Bandwidth
> 	[  4]   0.00-1.00   sec   635 MBytes  5.33 Gbits/sec
> 	[  4]   1.00-2.00   sec   636 MBytes  5.33 Gbits/sec
> 	[  4]   2.00-3.00   sec   637 MBytes  5.34 Gbits/sec
> 	[  4]   3.00-4.00   sec   636 MBytes  5.34 Gbits/sec
> 	[  4]   4.00-5.00   sec   637 MBytes  5.34 Gbits/sec
> 	[  4]   5.00-6.00   sec   637 MBytes  5.35 Gbits/sec
> 	[  4]   6.00-7.00   sec   637 MBytes  5.34 Gbits/sec
> 	[  4]   7.00-8.00   sec   636 MBytes  5.33 Gbits/sec
> 	[  4]   8.00-9.00   sec   634 MBytes  5.32 Gbits/sec
> 	[  4]   9.00-10.00  sec   637 MBytes  5.34 Gbits/sec
> 	- - - - - - - - - - - - - - - - - - - - - - - - -
> 	[ ID] Interval           Transfer     Bandwidth
> 	[  4]   0.00-10.00  sec  6.21 GBytes  5.34 Gbits/sec                  sender
> 	[  4]   0.00-10.00  sec  6.21 GBytes  5.34 Gbits/sec                  receiver
> 
> iperf3 result with both patches:
> 	[ ID] Interval           Transfer     Bandwidth
> 	[  4]   0.00-1.00   sec   652 MBytes  5.47 Gbits/sec
> 	[  4]   1.00-2.00   sec   653 MBytes  5.47 Gbits/sec
> 	[  4]   2.00-3.00   sec   654 MBytes  5.48 Gbits/sec
> 	[  4]   3.00-4.00   sec   654 MBytes  5.49 Gbits/sec
> 	[  4]   4.00-5.00   sec   653 MBytes  5.48 Gbits/sec
> 	[  4]   5.00-6.00   sec   653 MBytes  5.48 Gbits/sec
> 	[  4]   6.00-7.00   sec   653 MBytes  5.48 Gbits/sec
> 	[  4]   7.00-8.00   sec   653 MBytes  5.48 Gbits/sec
> 	[  4]   8.00-9.00   sec   653 MBytes  5.48 Gbits/sec
> 	[  4]   9.00-10.00  sec   654 MBytes  5.48 Gbits/sec
> 	- - - - - - - - - - - - - - - - - - - - - - - - -
> 	[ ID] Interval           Transfer     Bandwidth
> 	[  4]   0.00-10.00  sec  6.38 GBytes  5.48 Gbits/sec                  sender
> 	[  4]   0.00-10.00  sec  6.38 GBytes  5.48 Gbits/sec                  receiver
> 
> About 16% more packets/sec without an XDP program loaded,
> and about 5% more packets/sec when using PP.
> Tested on a Banana Pi BPI-R4 (MT7988A).
> 
> ---
> Technically, this is version 2 of the “use prefetch methods” patch.
> Initially, I submitted it as a single patch for review (RFC),
> but later I decided to include a second patch, resulting in this series.
> Changes in v2:
> 	- Add "use PP exclusively for XDP programs" patch and create this series
> ---
> Elad Yifee (2):
>   net: ethernet: mtk_eth_soc: use prefetch methods
>   net: ethernet: mtk_eth_soc: use PP exclusively for XDP programs
> 
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> -- 
> 2.45.2
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance
  2024-07-29 19:10 ` [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance Lorenzo Bianconi
@ 2024-07-30  5:29   ` Elad Yifee
  2024-08-01  1:37     ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Elad Yifee @ 2024-07-30  5:29 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Felix Fietkau, Sean Wang, Mark Lee, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Matthias Brugger,
	AngeloGioacchino Del Regno, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek, Daniel Golle, Joe Damato

On Mon, Jul 29, 2024 at 10:10 PM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> > This small series includes two short and simple patches to improve RX performance
> > on this driver.
>
> Hi Elad,
>
> What is the chip revision you are running?
> If you are using a device that does not support HW-LRO (e.g. MT7986 or
> MT7988), I guess we can try to use the page_pool_dev_alloc_frag() API and
> request a 2048B buffer. Doing so, we can use a single page for two
> rx buffers, improving recycling with page_pool. What do you think?
>
> Regards,
> Lorenzo
>
Hey Lorenzo,
It's Rev0. Why, do you have any info on the revisions?
Since allocating full pages every time is probably the reason for the
performance hit, I think your suggestion would improve the performance
and probably match the napi_alloc_frag path.
I'll give it a try when I have time.
You also mentioned HW-LRO, which makes me think we also need the second patch
if we want to allow HW-LRO to coexist with XDP on NETSYS2/3 devices.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods
  2024-07-29 18:29 ` [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods Elad Yifee
@ 2024-07-30  8:59   ` Joe Damato
  2024-07-30 18:35     ` Elad Yifee
  2025-01-06 14:28   ` Shengyu Qu
  1 sibling, 1 reply; 16+ messages in thread
From: Joe Damato @ 2024-07-30  8:59 UTC (permalink / raw)
  To: Elad Yifee
  Cc: Felix Fietkau, Sean Wang, Mark Lee, Lorenzo Bianconi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Matthias Brugger, AngeloGioacchino Del Regno, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek, Daniel Golle

On Mon, Jul 29, 2024 at 09:29:54PM +0300, Elad Yifee wrote:
> Utilize kernel prefetch methods for faster cache line access.
> This change boosts driver performance,
> allowing the CPU to handle about 5% more packets/sec.
> 
> Signed-off-by: Elad Yifee <eladwf@gmail.com>
> ---
> Changes in v2:
> 	- use net_prefetchw as suggested by Joe Damato
> 	- add (NET_SKB_PAD + eth->ip_align) offset to prefetched data
> 	- use eth->ip_align instead of NET_IP_ALIGN as it could be 0,
> 	depending on the platform 
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index 16ca427cf4c3..4d0052dbe3f4 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c

[...]

> @@ -2143,6 +2147,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
>  			dma_unmap_single(eth->dma_dev, ((u64)trxd.rxd1 | addr64),
>  					 ring->buf_size, DMA_FROM_DEVICE);
>  
> +			net_prefetch(data + NET_SKB_PAD + eth->ip_align);
>  			skb = build_skb(data, ring->frag_size);
>  			if (unlikely(!skb)) {
>  				netdev->stats.rx_dropped++;
> @@ -2150,7 +2155,8 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
>  				goto skip_rx;
>  			}
>  
> -			skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
> +			net_prefetchw(skb->data);
> +			skb_reserve(skb, NET_SKB_PAD + eth->ip_align);

Based on the code in mtk_probe, I am guessing that only
MTK_SOC_MT7628 can DMA to unaligned addresses, because for
everything else eth->ip_align would be 0.

Is that right?

I am asking because the documentation in
Documentation/core-api/unaligned-memory-access.rst refers to the
case you mention, NET_IP_ALIGN = 0, suggesting that this is
intentional for performance reasons on powerpc:

  One notable exception here is powerpc which defines NET_IP_ALIGN to
  0 because DMA to unaligned addresses can be very expensive and dwarf
  the cost of unaligned loads.

It goes on to explain that some devices cannot DMA to unaligned
addresses and I assume that for your driver that is everything which
is not MTK_SOC_MT7628 ?
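
For reference, the mtk_probe() logic I am referring to is roughly the
following (paraphrased; treat the exact condition as an assumption to be
checked against the source):

	/* Only the MT7628 path sets a non-zero ip_align; every other SoC
	 * leaves eth->ip_align at 0, so the DMA address stays aligned.
	 */
	if (MTK_HAS_CAPS(eth->soc->caps, MTK_SOC_MT7628))
		eth->ip_align = NET_IP_ALIGN;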

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods
  2024-07-30  8:59   ` Joe Damato
@ 2024-07-30 18:35     ` Elad Yifee
  2024-08-01  7:09       ` Stefan Roese
  0 siblings, 1 reply; 16+ messages in thread
From: Elad Yifee @ 2024-07-30 18:35 UTC (permalink / raw)
  To: Joe Damato, Elad Yifee, Felix Fietkau, Sean Wang, Mark Lee,
	Lorenzo Bianconi, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Matthias Brugger, AngeloGioacchino Del Regno, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek, Daniel Golle
  Cc: Stefan Roese

On Tue, Jul 30, 2024 at 11:59 AM Joe Damato <jdamato@fastly.com> wrote:
>
> Based on the code in mtk_probe, I am guessing that only
> MTK_SOC_MT7628 can DMA to unaligned addresses, because for
> everything else eth->ip_align would be 0.
>
> Is that right?
>
> I am asking because the documentation in
> Documentation/core-api/unaligned-memory-access.rst refers to the
> case you mention, NET_IP_ALIGN = 0, suggesting that this is
> intentional for performance reasons on powerpc:
>
>   One notable exception here is powerpc which defines NET_IP_ALIGN to
>   0 because DMA to unaligned addresses can be very expensive and dwarf
>   the cost of unaligned loads.
>
> It goes on to explain that some devices cannot DMA to unaligned
> addresses and I assume that for your driver that is everything which
> is not MTK_SOC_MT7628 ?

I have no explanation for this partial use of 'eth->ip_align'; it
could be a mistake, or maybe I'm missing something.
Perhaps Stefan Roese, who wrote this part, has an explanation.
(adding Stefan to CC)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance
  2024-07-30  5:29   ` Elad Yifee
@ 2024-08-01  1:37     ` Jakub Kicinski
  2024-08-01  3:53       ` Elad Yifee
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2024-08-01  1:37 UTC (permalink / raw)
  To: Elad Yifee
  Cc: Lorenzo Bianconi, Felix Fietkau, Sean Wang, Mark Lee,
	David S. Miller, Eric Dumazet, Paolo Abeni, Matthias Brugger,
	AngeloGioacchino Del Regno, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek, Daniel Golle, Joe Damato

On Tue, 30 Jul 2024 08:29:58 +0300 Elad Yifee wrote:
> Since allocating full pages every time is probably the reason for the
> performance hit, I think your suggestion would improve the performance
> and probably match the napi_alloc_frag path.
> I'll give it a try when I have time.

This is a better direction than disabling PP.
Feel free to repost patch 1 separately.
-- 
pw-bot: cr

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance
  2024-08-01  1:37     ` Jakub Kicinski
@ 2024-08-01  3:53       ` Elad Yifee
  2024-08-01  7:30         ` Lorenzo Bianconi
  0 siblings, 1 reply; 16+ messages in thread
From: Elad Yifee @ 2024-08-01  3:53 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Lorenzo Bianconi, Felix Fietkau, Sean Wang, Mark Lee,
	David S. Miller, Eric Dumazet, Paolo Abeni, Matthias Brugger,
	AngeloGioacchino Del Regno, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek, Daniel Golle, Joe Damato

On Thu, Aug 1, 2024 at 4:37 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 30 Jul 2024 08:29:58 +0300 Elad Yifee wrote:
> > Since allocating full pages every time is probably the reason for the
> > performance hit, I think your suggestion would improve the performance
> > and probably match the napi_alloc_frag path.
> > I'll give it a try when I have time.
>
> This is a better direction than disabling PP.
> Feel free to repost patch 1 separately.
> --
> pw-bot: cr
In this driver, the existence of the PP is the condition for executing all
the XDP-related operations, which aren't necessary on this hot path, so we
wouldn't want that anyway. On XDP program setup the rings are reallocated
and the PP would be created.
Other than that, for HWLRO we need contiguous pages of a different order
than the PP uses, so the creation of the PP basically prevents the use of
HWLRO.
So we solve this LRO problem and get a performance boost with this
simple change.

Lorenzo's suggestion would probably improve the performance of the XDP
path and we should try that nonetheless.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods
  2024-07-30 18:35     ` Elad Yifee
@ 2024-08-01  7:09       ` Stefan Roese
  2024-08-01 13:14         ` Joe Damato
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Roese @ 2024-08-01  7:09 UTC (permalink / raw)
  To: Elad Yifee, Joe Damato, Felix Fietkau, Sean Wang, Mark Lee,
	Lorenzo Bianconi, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Matthias Brugger, AngeloGioacchino Del Regno, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek, Daniel Golle

On 7/30/24 20:35, Elad Yifee wrote:
> On Tue, Jul 30, 2024 at 11:59 AM Joe Damato <jdamato@fastly.com> wrote:
>>
>> Based on the code in mtk_probe, I am guessing that only
>> MTK_SOC_MT7628 can DMA to unaligned addresses, because for
>> everything else eth->ip_align would be 0.
>>
>> Is that right?
>>
>> I am asking because the documentation in
>> Documentation/core-api/unaligned-memory-access.rst refers to the
>> case you mention, NET_IP_ALIGN = 0, suggesting that this is
>> intentional for performance reasons on powerpc:
>>
>>    One notable exception here is powerpc which defines NET_IP_ALIGN to
>>    0 because DMA to unaligned addresses can be very expensive and dwarf
>>    the cost of unaligned loads.
>>
>> It goes on to explain that some devices cannot DMA to unaligned
>> addresses and I assume that for your driver that is everything which
>> is not MTK_SOC_MT7628 ?
> 
> I have no explanation for this partial use of 'eth->ip_align', it
> could be a mistake
> or maybe I'm missing something.
> Perhaps Stefan Roese, who wrote this part, has an explanation.
> (adding Stefan to CC)

Sorry, I can't answer this w/o digging deeper into this driver and
SoC again, and I haven't used it for a few years now. It might be a
mistake.

Thanks,
Stefan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance
  2024-08-01  3:53       ` Elad Yifee
@ 2024-08-01  7:30         ` Lorenzo Bianconi
  2024-08-01  8:01           ` Elad Yifee
  0 siblings, 1 reply; 16+ messages in thread
From: Lorenzo Bianconi @ 2024-08-01  7:30 UTC (permalink / raw)
  To: Elad Yifee
  Cc: Jakub Kicinski, Felix Fietkau, Sean Wang, Mark Lee,
	David S. Miller, Eric Dumazet, Paolo Abeni, Matthias Brugger,
	AngeloGioacchino Del Regno, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek, Daniel Golle, Joe Damato

[-- Attachment #1: Type: text/plain, Size: 1509 bytes --]

> On Thu, Aug 1, 2024 at 4:37 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Tue, 30 Jul 2024 08:29:58 +0300 Elad Yifee wrote:
> > > Since allocating full pages every time is probably the reason for the
> > > performance hit, I think your suggestion would improve the performance
> > > and probably match the napi_alloc_frag path.
> > > I'll give it a try when I have time.
> >
> > This is a better direction than disabling PP.
> > Feel free to repost patch 1 separately.
> > --
> > pw-bot: cr
> In this driver, the existence of the PP is the condition for executing all
> the XDP-related operations, which aren't necessary on this hot path, so we
> wouldn't want that anyway. On XDP program setup the rings are reallocated
> and the PP would be created.

Nope, I added page_pool support even for non-XDP mode for hw that does
not support HW-LRO. I guess mtk folks can correct me if I am wrong, but
IIRC there were some hw limitations on mt7986/mt7988 for HW-LRO, so I am
not sure if it can be supported.

> Other than that, for HWLRO we need contiguous pages of a different order
> than the PP uses, so the creation of the PP basically prevents the use of
> HWLRO.
> So we solve this LRO problem and get a performance boost with this
> simple change.
> 
> Lorenzo's suggestion would probably improve the performance of the XDP
> path and we should try that nonetheless.

Nope, I mean to improve performance even for the non-XDP case with the
page_pool frag APIs.

Regards,
Lorenzo

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance
  2024-08-01  7:30         ` Lorenzo Bianconi
@ 2024-08-01  8:01           ` Elad Yifee
  2024-08-01  8:15             ` Lorenzo Bianconi
  0 siblings, 1 reply; 16+ messages in thread
From: Elad Yifee @ 2024-08-01  8:01 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Jakub Kicinski, Felix Fietkau, Sean Wang, Mark Lee,
	David S. Miller, Eric Dumazet, Paolo Abeni, Matthias Brugger,
	AngeloGioacchino Del Regno, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek, Daniel Golle, Joe Damato

On Thu, Aug 1, 2024 at 10:30 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> Nope, I added page_pool support even for non-XDP mode for hw that does
> not support HW-LRO. I guess mtk folks can correct me if I am wrong, but
> IIRC there were some hw limitations on mt7986/mt7988 for HW-LRO, so I am
> not sure if it can be supported.
I know, but if we want to add support for HWLRO alongside XDP on NETSYS2/3,
we need to prevent the PP use (for HWLRO allocations) and enable it
only when there's
an XDP program.
I've been told HWLRO works on the MTK SDK version.

> > Other than that, for HWLRO we need contiguous pages of different order
> > than the PP, so the creation of PP
> > basically prevents the use of HWLRO.
> > So we solve this LRO problem and get a performance boost with this
> > simple change.
> >
> > Lorenzo's suggestion would probably improve the performance of the XDP
> > path and we should try that nonetheless.
>
> nope, I mean to improve peformances even for non-XDP case with page_pool frag
> APIs.
>
> Regards,
> Lorenzo
Yes, of course it would improve the non-XDP case if we still used PP
for non-XDP, but my point is that we shouldn't, mainly because of HWLRO,
but also because of the extra unnecessary code.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance
  2024-08-01  8:01           ` Elad Yifee
@ 2024-08-01  8:15             ` Lorenzo Bianconi
  0 siblings, 0 replies; 16+ messages in thread
From: Lorenzo Bianconi @ 2024-08-01  8:15 UTC (permalink / raw)
  To: Elad Yifee
  Cc: Jakub Kicinski, Felix Fietkau, Sean Wang, Mark Lee,
	David S. Miller, Eric Dumazet, Paolo Abeni, Matthias Brugger,
	AngeloGioacchino Del Regno, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek, Daniel Golle, Joe Damato

[-- Attachment #1: Type: text/plain, Size: 1608 bytes --]

> On Thu, Aug 1, 2024 at 10:30 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> >
> > Nope, I added page_pool support even for non-XDP mode for hw that does
> > not support HW-LRO. I guess mtk folks can correct me if I am wrong, but
> > IIRC there were some hw limitations on mt7986/mt7988 for HW-LRO, so I am
> > not sure if it can be supported.
> I know, but if we want to add support for HWLRO alongside XDP on NETSYS2/3,
> we need to prevent the PP use (for HWLRO allocations) and enable it
> only when there's
> an XDP program.
> I've been told HWLRO works on the MTK SDK version.

Ack, but in this case please provide the HW-LRO support as well in the same
series. Moreover, I am not sure whether it is performant enough or not; we
could increase the page_pool order.
I also guess we should make sure HW-LRO works on all NETSYS2/3 hw
revisions.
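
(For illustration only: the buffer size a pool hands out is fixed at
creation time through struct page_pool_params, so "increase the page_pool
order" would look roughly like the values below; they are made up, not a
proposal.)

	struct page_pool_params pp_params = {
		.order		= 1,	/* 8KB chunks with 4KB pages */
		.pool_size	= 1024,
		.dma_dir	= DMA_FROM_DEVICE,
	};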

Regards,
Lorenzo

> 
> > > Other than that, for HWLRO we need contiguous pages of a different order
> > > than the PP uses, so the creation of the PP basically prevents the use of
> > > HWLRO.
> > > So we solve this LRO problem and get a performance boost with this
> > > simple change.
> > >
> > > Lorenzo's suggestion would probably improve the performance of the XDP
> > > path and we should try that nonetheless.
> >
> > Nope, I mean to improve performance even for the non-XDP case with the
> > page_pool frag APIs.
> >
> > Regards,
> > Lorenzo
> Yes, of course it would improve the non-XDP case if we still used PP
> for non-XDP, but my point is that we shouldn't, mainly because of HWLRO,
> but also because of the extra unnecessary code.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods
  2024-08-01  7:09       ` Stefan Roese
@ 2024-08-01 13:14         ` Joe Damato
  0 siblings, 0 replies; 16+ messages in thread
From: Joe Damato @ 2024-08-01 13:14 UTC (permalink / raw)
  To: Stefan Roese
  Cc: Elad Yifee, Felix Fietkau, Sean Wang, Mark Lee, Lorenzo Bianconi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Matthias Brugger, AngeloGioacchino Del Regno, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek, Daniel Golle

On Thu, Aug 01, 2024 at 09:09:27AM +0200, Stefan Roese wrote:
> On 7/30/24 20:35, Elad Yifee wrote:
> > On Tue, Jul 30, 2024 at 11:59 AM Joe Damato <jdamato@fastly.com> wrote:
> > > 
> > > Based on the code in mtk_probe, I am guessing that only
> > > MTK_SOC_MT7628 can DMA to unaligned addresses, because for
> > > everything else eth->ip_align would be 0.
> > > 
> > > Is that right?
> > > 
> > > I am asking because the documentation in
> > > Documentation/core-api/unaligned-memory-access.rst refers to the
> > > case you mention, NET_IP_ALIGN = 0, suggesting that this is
> > > intentional for performance reasons on powerpc:
> > > 
> > >    One notable exception here is powerpc which defines NET_IP_ALIGN to
> > >    0 because DMA to unaligned addresses can be very expensive and dwarf
> > >    the cost of unaligned loads.
> > > 
> > > It goes on to explain that some devices cannot DMA to unaligned
> > > addresses and I assume that for your driver that is everything which
> > > is not MTK_SOC_MT7628 ?
> > 
> > I have no explanation for this partial use of 'eth->ip_align', it
> > could be a mistake
> > or maybe I'm missing something.
> > Perhaps Stefan Roese, who wrote this part, has an explanation.
> > (adding Stefan to CC)
> 
> Sorry, I can't answer this w/o digging deeper into this driver and
> SoC again, and I haven't used it for a few years now. It might be a
> mistake.

I asked about it because it was added in v2 of the patch, see the
changelog from the patch:

  - use eth->ip_align instead of NET_IP_ALIGN as it could be 0,
  depending on the platform 

It seemed from the changelog that someone decided adding that made
sense, and I was just confirming the reasoning above.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods
  2024-07-29 18:29 ` [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods Elad Yifee
  2024-07-30  8:59   ` Joe Damato
@ 2025-01-06 14:28   ` Shengyu Qu
  2025-01-21 23:50     ` Andrew Lunn
  1 sibling, 1 reply; 16+ messages in thread
From: Shengyu Qu @ 2025-01-06 14:28 UTC (permalink / raw)
  To: Elad Yifee, Felix Fietkau, Sean Wang, Mark Lee, Lorenzo Bianconi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Matthias Brugger, AngeloGioacchino Del Regno, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek
  Cc: wiagn233, Daniel Golle, Joe Damato, sr


[-- Attachment #1.1.1: Type: text/plain, Size: 2851 bytes --]

Hello,

Sorry to bother you, but what happened to this patch? Has it been given up
or something?

Best regards,
Shengyu

On 2024/7/30 2:29, Elad Yifee wrote:
> Utilize kernel prefetch methods for faster cache line access.
> This change boosts driver performance,
> allowing the CPU to handle about 5% more packets/sec.
> 
> Signed-off-by: Elad Yifee <eladwf@gmail.com>
> ---
> Changes in v2:
> 	- use net_prefetchw as suggested by Joe Damato
> 	- add (NET_SKB_PAD + eth->ip_align) offset to prefetched data
> 	- use eth->ip_align instead of NET_IP_ALIGN as it could be 0,
> 	depending on the platform
> ---
>   drivers/net/ethernet/mediatek/mtk_eth_soc.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index 16ca427cf4c3..4d0052dbe3f4 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -1963,6 +1963,7 @@ static u32 mtk_xdp_run(struct mtk_eth *eth, struct mtk_rx_ring *ring,
>   	if (!prog)
>   		goto out;
>   
> +	net_prefetchw(xdp->data_hard_start);
>   	act = bpf_prog_run_xdp(prog, xdp);
>   	switch (act) {
>   	case XDP_PASS:
> @@ -2038,6 +2039,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
>   
>   		idx = NEXT_DESP_IDX(ring->calc_idx, ring->dma_size);
>   		rxd = ring->dma + idx * eth->soc->rx.desc_size;
> +		prefetch(rxd);
>   		data = ring->data[idx];
>   
>   		if (!mtk_rx_get_desc(eth, &trxd, rxd))
> @@ -2105,6 +2107,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
>   			if (ret != XDP_PASS)
>   				goto skip_rx;
>   
> +			net_prefetch(xdp.data_meta);
>   			skb = build_skb(data, PAGE_SIZE);
>   			if (unlikely(!skb)) {
>   				page_pool_put_full_page(ring->page_pool,
> @@ -2113,6 +2116,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
>   				goto skip_rx;
>   			}
>   
> +			net_prefetchw(skb->data);
>   			skb_reserve(skb, xdp.data - xdp.data_hard_start);
>   			skb_put(skb, xdp.data_end - xdp.data);
>   			skb_mark_for_recycle(skb);
> @@ -2143,6 +2147,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
>   			dma_unmap_single(eth->dma_dev, ((u64)trxd.rxd1 | addr64),
>   					 ring->buf_size, DMA_FROM_DEVICE);
>   
> +			net_prefetch(data + NET_SKB_PAD + eth->ip_align);
>   			skb = build_skb(data, ring->frag_size);
>   			if (unlikely(!skb)) {
>   				netdev->stats.rx_dropped++;
> @@ -2150,7 +2155,8 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
>   				goto skip_rx;
>   			}
>   
> -			skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
> +			net_prefetchw(skb->data);
> +			skb_reserve(skb, NET_SKB_PAD + eth->ip_align);
>   			skb_put(skb, pktlen);
>   		}
>   


[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 6977 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods
  2025-01-06 14:28   ` Shengyu Qu
@ 2025-01-21 23:50     ` Andrew Lunn
  0 siblings, 0 replies; 16+ messages in thread
From: Andrew Lunn @ 2025-01-21 23:50 UTC (permalink / raw)
  To: Shengyu Qu
  Cc: Elad Yifee, Felix Fietkau, Sean Wang, Mark Lee, Lorenzo Bianconi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Matthias Brugger, AngeloGioacchino Del Regno, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek, Daniel Golle,
	Joe Damato, sr

On Mon, Jan 06, 2025 at 10:28:46PM +0800, Shengyu Qu wrote:
> Hello,
> 
> Sorry to bother, but what happened to this patch? Is it given up or
> something?

There appear to be open questions about it. Those questions need
answering. Also, net-next is closed at the moment for the merge
window, so a new version will need to be submitted once net-next
reopens in two weeks' time.


    Andrew

---
pw-bot: cr

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2025-01-21 23:50 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-29 18:29 [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance Elad Yifee
2024-07-29 18:29 ` [PATCH net-next v2 1/2] net: ethernet: mtk_eth_soc: use prefetch methods Elad Yifee
2024-07-30  8:59   ` Joe Damato
2024-07-30 18:35     ` Elad Yifee
2024-08-01  7:09       ` Stefan Roese
2024-08-01 13:14         ` Joe Damato
2025-01-06 14:28   ` Shengyu Qu
2025-01-21 23:50     ` Andrew Lunn
2024-07-29 18:29 ` [PATCH net-next v2 2/2] net: ethernet: mtk_eth_soc: use PP exclusively for XDP programs Elad Yifee
2024-07-29 19:10 ` [PATCH net-next v2 0/2] net: ethernet: mtk_eth_soc: improve RX performance Lorenzo Bianconi
2024-07-30  5:29   ` Elad Yifee
2024-08-01  1:37     ` Jakub Kicinski
2024-08-01  3:53       ` Elad Yifee
2024-08-01  7:30         ` Lorenzo Bianconi
2024-08-01  8:01           ` Elad Yifee
2024-08-01  8:15             ` Lorenzo Bianconi
