* [Intel-wired-lan] [PATCH v3 iwl-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic
@ 2025-01-20 15:50 ` Maciej Fijalkowski
0 siblings, 0 replies; 15+ messages in thread
From: Maciej Fijalkowski @ 2025-01-20 15:50 UTC (permalink / raw)
To: intel-wired-lan
Cc: Maciej Fijalkowski, netdev, xudu, anthony.l.nguyen,
przemyslaw.kitszel, jacob.e.keller, jmaxwell, magnus.karlsson
v2->v3:
s/intel/iwl in patch subjects
v1->v2:
* pass ntc to ice_put_rx_mbuf() (pointed out by Petr Oros) in patch 1
* add review tags from Przemek Kitszel (thanks!)
* make sure patches compile and work ;)
Hello in 2025,
this patchset fixes a pretty nasty issue that was reported by Red Hat
folks which occurred after ~30 minutes (this value varied, just trying
here to state that it was not observed immediately but rather after a
considerably longer amount of time) when ice driver was tortured with
jumbo frames via mix of iperf traffic executed simultaneously with
wrk/nginx on client/server sides (HTTP and TCP workloads basically).
The reported splats were spanning across all the bad things that can
happen to the state of page - refcount underflow, use-after-free, etc.
One of these looked as follows:
[ 2084.019891] BUG: Bad page state in process swapper/34 pfn:97fcd0
[ 2084.025990] page:00000000a60ee772 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x97fcd0
[ 2084.035462] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 2084.041990] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
[ 2084.049730] raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
[ 2084.057468] page dumped because: nonzero _refcount
[ 2084.062260] Modules linked in: bonding tls sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mgag200 irqd
[ 2084.137829] CPU: 34 PID: 0 Comm: swapper/34 Kdump: loaded Not tainted 5.14.0-427.37.1.el9_4.x86_64 #1
[ 2084.147039] Hardware name: Dell Inc. PowerEdge R750/0216NK, BIOS 1.13.2 12/19/2023
[ 2084.154604] Call Trace:
[ 2084.157058] <IRQ>
[ 2084.159080] dump_stack_lvl+0x34/0x48
[ 2084.162752] bad_page.cold+0x63/0x94
[ 2084.166333] check_new_pages+0xb3/0xe0
[ 2084.170083] rmqueue_bulk+0x2d2/0x9e0
[ 2084.173749] ? ktime_get+0x35/0xa0
[ 2084.177159] rmqueue_pcplist+0x13b/0x210
[ 2084.181081] rmqueue+0x7d3/0xd40
[ 2084.184316] ? xas_load+0x9/0xa0
[ 2084.187547] ? xas_find+0x183/0x1d0
[ 2084.191041] ? xa_find_after+0xd0/0x130
[ 2084.194879] ? intel_iommu_iotlb_sync_map+0x89/0xe0
[ 2084.199759] get_page_from_freelist+0x11f/0x530
[ 2084.204291] __alloc_pages+0xf2/0x250
[ 2084.207958] ice_alloc_rx_bufs+0xcc/0x1c0 [ice]
[ 2084.212543] ice_clean_rx_irq+0x631/0xa20 [ice]
[ 2084.217111] ice_napi_poll+0xdf/0x2a0 [ice]
[ 2084.221330] __napi_poll+0x27/0x170
[ 2084.224824] net_rx_action+0x233/0x2f0
[ 2084.228575] __do_softirq+0xc7/0x2ac
[ 2084.232155] __irq_exit_rcu+0xa1/0xc0
[ 2084.235821] common_interrupt+0x80/0xa0
[ 2084.239662] </IRQ>
[ 2084.241768] <TASK>
The fix is mostly about reverting what was done in commit 1dc1a7e7f410
("ice: Centrallize Rx buffer recycling"), followed by fixing the timing of
page_count() storage and then removing the ice_rx_buf::act related logic
(which was mostly introduced for the purposes of the cited commit).
Special thanks to Xu Du for providing reproducer and Jacob Keller for
initial extensive analysis.
Thanks,
Maciej
Maciej Fijalkowski (3):
ice: put Rx buffers after being done with current frame
ice: gather page_count()'s of each frag right before XDP prog call
ice: stop storing XDP verdict within ice_rx_buf
drivers/net/ethernet/intel/ice/ice_txrx.c | 128 +++++++++++-------
drivers/net/ethernet/intel/ice/ice_txrx.h | 1 -
drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 43 ------
3 files changed, 82 insertions(+), 90 deletions(-)
--
2.43.0
* [Intel-wired-lan] [PATCH v3 iwl-net 1/3] ice: put Rx buffers after being done with current frame
2025-01-20 15:50 ` Maciej Fijalkowski
@ 2025-01-20 15:50 ` Maciej Fijalkowski
-1 siblings, 0 replies; 15+ messages in thread
From: Maciej Fijalkowski @ 2025-01-20 15:50 UTC (permalink / raw)
To: intel-wired-lan
Cc: Maciej Fijalkowski, netdev, xudu, anthony.l.nguyen,
przemyslaw.kitszel, jacob.e.keller, jmaxwell, magnus.karlsson

Introduce a new helper ice_put_rx_mbuf() that will go through gathered
frags from current frame and will call ice_put_rx_buf() on them. Current
logic that was supposed to simplify and optimize the driver where we go
through a batch of all buffers processed in current NAPI instance turned
out to be broken for jumbo frames and very heavy load that was coming
from both multi-thread iperf and nginx/wrk pair between server and
client. The delay introduced by approach that we are dropping is simply
too big and we need to take the decision regarding page
recycling/releasing as quick as we can.

While at it, address an error path of ice_add_xdp_frag() - we were
missing buffer putting from day 1 there.

As a nice side effect we get rid of annoying and repetetive three-liner:

xdp->data = NULL;
rx_ring->first_desc = ntc;
rx_ring->nr_frags = 0;

by embedding it within introduced routine.

Fixes: 1dc1a7e7f410 ("ice: Centrallize Rx buffer recycling")
Reported-and-tested-by: Xu Du <xudu@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c | 68 +++++++++++++----------
 1 file changed, 39 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 5d2d7736fd5f..f2134ad57ead 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1103,6 +1103,38 @@ ice_put_rx_buf(struct ice_rx_ring *rx_ring, struct ice_rx_buf *rx_buf)
 	rx_buf->page = NULL;
 }
 
+static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
+			    u32 *xdp_xmit, u32 ntc)
+{
+	u32 nr_frags = rx_ring->nr_frags + 1;
+	u32 idx = rx_ring->first_desc;
+	u32 cnt = rx_ring->count;
+	struct ice_rx_buf *buf;
+	int i;
+
+	for (i = 0; i < nr_frags; i++) {
+		buf = &rx_ring->rx_buf[idx];
+
+		if (buf->act & (ICE_XDP_TX | ICE_XDP_REDIR)) {
+			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
+			*xdp_xmit |= buf->act;
+		} else if (buf->act & ICE_XDP_CONSUMED) {
+			buf->pagecnt_bias++;
+		} else if (buf->act == ICE_XDP_PASS) {
+			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
+		}
+
+		ice_put_rx_buf(rx_ring, buf);
+
+		if (++idx == cnt)
+			idx = 0;
+	}
+
+	xdp->data = NULL;
+	rx_ring->first_desc = ntc;
+	rx_ring->nr_frags = 0;
+}
+
 /**
  * ice_clean_rx_irq - Clean completed descriptors from Rx ring - bounce buf
  * @rx_ring: Rx descriptor ring to transact packets on
@@ -1120,7 +1152,6 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 	unsigned int total_rx_bytes = 0, total_rx_pkts = 0;
 	unsigned int offset = rx_ring->rx_offset;
 	struct xdp_buff *xdp = &rx_ring->xdp;
-	u32 cached_ntc = rx_ring->first_desc;
 	struct ice_tx_ring *xdp_ring = NULL;
 	struct bpf_prog *xdp_prog = NULL;
 	u32 ntc = rx_ring->next_to_clean;
@@ -1128,7 +1159,6 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 	u32 xdp_xmit = 0;
 	u32 cached_ntu;
 	bool failure;
-	u32 first;
 
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
 	if (xdp_prog) {
@@ -1190,6 +1220,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 			xdp_prepare_buff(xdp, hard_start, offset, size, !!offset);
 			xdp_buff_clear_frags_flag(xdp);
 		} else if (ice_add_xdp_frag(rx_ring, xdp, rx_buf, size)) {
+			ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc);
 			break;
 		}
 		if (++ntc == cnt)
@@ -1205,9 +1236,8 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		total_rx_bytes += xdp_get_buff_len(xdp);
 		total_rx_pkts++;
 
-		xdp->data = NULL;
-		rx_ring->first_desc = ntc;
-		rx_ring->nr_frags = 0;
+		ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc);
+
 		continue;
 construct_skb:
 		if (likely(ice_ring_uses_build_skb(rx_ring)))
@@ -1221,14 +1251,11 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 			if (unlikely(xdp_buff_has_frags(xdp)))
 				ice_set_rx_bufs_act(xdp, rx_ring,
 						    ICE_XDP_CONSUMED);
-			xdp->data = NULL;
-			rx_ring->first_desc = ntc;
-			rx_ring->nr_frags = 0;
-			break;
 		}
-		xdp->data = NULL;
-		rx_ring->first_desc = ntc;
-		rx_ring->nr_frags = 0;
+		ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc);
+
+		if (!skb)
+			break;
 
 		stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
 		if (unlikely(ice_test_staterr(rx_desc->wb.status_error0,
@@ -1257,23 +1284,6 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		total_rx_pkts++;
 	}
 
-	first = rx_ring->first_desc;
-	while (cached_ntc != first) {
-		struct ice_rx_buf *buf = &rx_ring->rx_buf[cached_ntc];
-
-		if (buf->act & (ICE_XDP_TX | ICE_XDP_REDIR)) {
-			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
-			xdp_xmit |= buf->act;
-		} else if (buf->act & ICE_XDP_CONSUMED) {
-			buf->pagecnt_bias++;
-		} else if (buf->act == ICE_XDP_PASS) {
-			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
-		}
-
-		ice_put_rx_buf(rx_ring, buf);
-		if (++cached_ntc >= cnt)
-			cached_ntc = 0;
-	}
 	rx_ring->next_to_clean = ntc;
 	/* return up to cleaned_count buffers to hardware */
 	failure = ice_alloc_rx_bufs(rx_ring, ICE_RX_DESC_UNUSED(rx_ring));
-- 
2.43.0
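For readers following along without the driver source at hand, the per-frame walk that ice_put_rx_mbuf() does in the patch above can be modelled in user space. This is only a rough sketch, not driver code: struct toy_ring, toy_put_frame() and the puts[] counter are hypothetical stand-ins made up for illustration, and the "put" is reduced to bumping a counter. What it tries to show is the indexing: start at first_desc, visit nr_frags + 1 buffers, wrap at the end of the ring, then reset the per-frame state to ntc.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u

/* Hypothetical stand-in for the driver's Rx ring; not the real ice_rx_ring. */
struct toy_ring {
	uint32_t count;      /* number of descriptors in the ring */
	uint32_t first_desc; /* index of the first buffer of the current frame */
	uint32_t nr_frags;   /* fragments gathered beyond the first buffer */
	uint32_t puts[TOY_RING_SIZE]; /* how many times each slot was "put" */
};

/*
 * Put every buffer of the current frame (first buffer plus nr_frags
 * fragments), wrapping at the end of the ring, then reset the per-frame
 * state so that the next frame starts at ntc.
 */
static void toy_put_frame(struct toy_ring *ring, uint32_t ntc)
{
	uint32_t nr_bufs = ring->nr_frags + 1;
	uint32_t idx = ring->first_desc;
	uint32_t i;

	/* Sanity check for the toy model only. */
	assert(ring->count <= TOY_RING_SIZE && idx < ring->count);

	for (i = 0; i < nr_bufs; i++) {
		ring->puts[idx]++; /* stands in for the real buffer put */
		if (++idx == ring->count)
			idx = 0;
	}

	ring->first_desc = ntc;
	ring->nr_frags = 0;
}
```

With an 8-entry ring, first_desc = 6 and nr_frags = 3, a call to toy_put_frame(&ring, 2) touches slots 6, 7, 0 and 1 exactly once and leaves first_desc at 2, which is the point of doing the put per frame rather than once per NAPI poll.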
* Re: [Intel-wired-lan] [PATCH v3 iwl-net 1/3] ice: put Rx buffers after being done with current frame
2025-01-20 15:50 ` Maciej Fijalkowski
@ 2025-01-20 16:38 ` Simon Horman
-1 siblings, 0 replies; 15+ messages in thread
From: Simon Horman @ 2025-01-20 16:38 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: netdev, xudu, anthony.l.nguyen, przemyslaw.kitszel,
jacob.e.keller, intel-wired-lan, jmaxwell, magnus.karlsson

On Mon, Jan 20, 2025 at 04:50:14PM +0100, Maciej Fijalkowski wrote:
> Introduce a new helper ice_put_rx_mbuf() that will go through gathered
> frags from current frame and will call ice_put_rx_buf() on them. Current
> logic that was supposed to simplify and optimize the driver where we go
> through a batch of all buffers processed in current NAPI instance turned
> out to be broken for jumbo frames and very heavy load that was coming
> from both multi-thread iperf and nginx/wrk pair between server and
> client. The delay introduced by approach that we are dropping is simply
> too big and we need to take the decision regarding page
> recycling/releasing as quick as we can.
>
> While at it, address an error path of ice_add_xdp_frag() - we were
> missing buffer putting from day 1 there.
>
> As a nice side effect we get rid of annoying and repetetive three-liner:

nit: repetitive

...
* [Intel-wired-lan] [PATCH v3 iwl-net 2/3] ice: gather page_count()'s of each frag right before XDP prog call
2025-01-20 15:50 ` Maciej Fijalkowski
@ 2025-01-20 15:50 ` Maciej Fijalkowski
-1 siblings, 0 replies; 15+ messages in thread
From: Maciej Fijalkowski @ 2025-01-20 15:50 UTC (permalink / raw)
To: intel-wired-lan
Cc: Maciej Fijalkowski, netdev, xudu, anthony.l.nguyen,
przemyslaw.kitszel, jacob.e.keller, jmaxwell, magnus.karlsson

If we store the pgcnt on few fragments while being in the middle of
gathering the whole frame and we stumbled upon DD bit not being set, we
terminate the NAPI Rx processing loop and come back later on. Then on
next NAPI execution we work on previously stored pgcnt.

Imagine that second half of page was used actively by networking stack
and by the time we came back, stack is not busy with this page anymore
and decremented the refcnt. The page reuse algorithm in this case should
be good to reuse the page but given the old refcnt it will not do so and
attempt to release the page via page_frag_cache_drain() with
pagecnt_bias used as an arg. This in turn will result in negative refcnt
on struct page, which was initially observed by Xu Du.

Therefore, move the page count storage from ice_get_rx_buf() to a place
where we are sure that whole frame has been collected, but before
calling XDP program as it internally can also change the page count of
fragments belonging to xdp_buff.

Fixes: ac0753391195 ("ice: Store page count inside ice_rx_buf")
Reported-and-tested-by: Xu Du <xudu@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index f2134ad57ead..9aa53ad2d8f2 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -924,7 +924,6 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size,
 	struct ice_rx_buf *rx_buf;
 
 	rx_buf = &rx_ring->rx_buf[ntc];
-	rx_buf->pgcnt = page_count(rx_buf->page);
 	prefetchw(rx_buf->page);
 
 	if (!size)
@@ -940,6 +939,22 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size,
 	return rx_buf;
 }
 
+static void ice_get_pgcnts(struct ice_rx_ring *rx_ring)
+{
+	u32 nr_frags = rx_ring->nr_frags + 1;
+	u32 idx = rx_ring->first_desc;
+	struct ice_rx_buf *rx_buf;
+	u32 cnt = rx_ring->count;
+
+	for (int i = 0; i < nr_frags; i++) {
+		rx_buf = &rx_ring->rx_buf[idx];
+		rx_buf->pgcnt = page_count(rx_buf->page);
+
+		if (++idx == cnt)
+			idx = 0;
+	}
+}
+
 /**
  * ice_build_skb - Build skb around an existing buffer
  * @rx_ring: Rx descriptor ring to transact packets on
@@ -1230,6 +1245,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		if (ice_is_non_eop(rx_ring, rx_desc))
 			continue;
 
+		ice_get_pgcnts(rx_ring);
 		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
 		if (rx_buf->act == ICE_XDP_PASS)
 			goto construct_skb;
-- 
2.43.0
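The stale-snapshot problem described in the commit message above can be illustrated with a small user-space toy. Everything here is an assumption made for illustration: struct toy_page, struct toy_rx_buf, toy_snapshot_pgcnt() and toy_can_reuse() are hypothetical names, and the reuse rule is a simplified model (page is reusable when the snapshot shows at most one reference beyond the driver's bias), not a copy of the driver's check. The sketch only shows that the same buffer gets a different reuse verdict depending on when the count was sampled; it does not attempt to reproduce the refcount underflow itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; not the kernel's struct page / ice_rx_buf. */
struct toy_page {
	int refcount;
};

struct toy_rx_buf {
	struct toy_page *page;
	int pgcnt;        /* snapshot of page->refcount */
	int pagecnt_bias; /* references the driver still holds for itself */
};

/* Record the page reference count as it is at the moment of the call. */
static void toy_snapshot_pgcnt(struct toy_rx_buf *buf)
{
	buf->pgcnt = buf->page->refcount;
}

/*
 * Simplified reuse test (an assumption, see above): the page is reusable
 * only if, according to the snapshot, at most one reference beyond the
 * driver's bias is outstanding.
 */
static bool toy_can_reuse(const struct toy_rx_buf *buf)
{
	/* Sanity check for the toy model only. */
	assert(buf->pgcnt >= buf->pagecnt_bias);
	return buf->pgcnt - buf->pagecnt_bias <= 1;
}
```

For example, with refcount 10 and pagecnt_bias 8, a snapshot taken while the stack still holds the other half of the page records 10. If the stack then drops its reference (refcount 9) before the frame is complete, toy_can_reuse() on the old snapshot says "no" (10 - 8 = 2), while a snapshot taken at that later point says "yes" (9 - 8 = 1). Sampling right before the XDP program runs, as the patch does, corresponds to the second case.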
* [Intel-wired-lan] [PATCH v3 iwl-net 3/3] ice: stop storing XDP verdict within ice_rx_buf
2025-01-20 15:50 ` Maciej Fijalkowski
@ 2025-01-20 15:50 ` Maciej Fijalkowski
-1 siblings, 0 replies; 15+ messages in thread
From: Maciej Fijalkowski @ 2025-01-20 15:50 UTC (permalink / raw)
To: intel-wired-lan
Cc: Maciej Fijalkowski, netdev, xudu, anthony.l.nguyen,
przemyslaw.kitszel, jacob.e.keller, jmaxwell, magnus.karlsson

Idea behind having ice_rx_buf::act was to simplify and speed up the Rx
data path by walking through buffers that were representing cleaned HW
Rx descriptors. Since it caused us a major headache recently and we
rolled back to old approach that 'puts' Rx buffers right after running
XDP prog/creating skb, this is useless now and should be removed.

Get rid of ice_rx_buf::act and related logic. We still need to take care
of a corner case where XDP program releases a particular fragment.

Make ice_run_xdp() to return its result and use it within
ice_put_rx_mbuf().

Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c     | 60 +++++++++++--------
 drivers/net/ethernet/intel/ice/ice_txrx.h     |  1 -
 drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 43 -------------
 3 files changed, 35 insertions(+), 69 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 9aa53ad2d8f2..77d75664c14d 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -532,10 +532,10 @@ int ice_setup_rx_ring(struct ice_rx_ring *rx_ring)
  *
  * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
  */
-static void
+static u32
 ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
-	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
+	    union ice_32b_rx_flex_desc *eop_desc)
 {
 	unsigned int ret = ICE_XDP_PASS;
 	u32 act;
@@ -574,7 +574,7 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 		ret = ICE_XDP_CONSUMED;
 	}
 exit:
-	ice_set_rx_bufs_act(xdp, rx_ring, ret);
+	return ret;
 }
 
 /**
@@ -860,10 +860,8 @@ ice_add_xdp_frag(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 		xdp_buff_set_frags_flag(xdp);
 	}
 
-	if (unlikely(sinfo->nr_frags == MAX_SKB_FRAGS)) {
-		ice_set_rx_bufs_act(xdp, rx_ring, ICE_XDP_CONSUMED);
+	if (unlikely(sinfo->nr_frags == MAX_SKB_FRAGS))
 		return -ENOMEM;
-	}
 
 	__skb_fill_page_desc_noacc(sinfo, sinfo->nr_frags++, rx_buf->page,
 				   rx_buf->page_offset, size);
@@ -1066,12 +1064,12 @@ ice_construct_skb(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp)
 				rx_buf->page_offset + headlen, size,
 				xdp->frame_sz);
 	} else {
-		/* buffer is unused, change the act that should be taken later
-		 * on; data was copied onto skb's linear part so there's no
+		/* buffer is unused, restore biased page count in Rx buffer;
+		 * data was copied onto skb's linear part so there's no
 		 * need for adjusting page offset and we can reuse this buffer
 		 * as-is
 		 */
-		rx_buf->act = ICE_SKB_CONSUMED;
+		rx_buf->pagecnt_bias++;
 	}
 
 	if (unlikely(xdp_buff_has_frags(xdp))) {
@@ -1119,23 +1117,27 @@ ice_put_rx_buf(struct ice_rx_ring *rx_ring, struct ice_rx_buf *rx_buf)
 }
 
 static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
-			    u32 *xdp_xmit, u32 ntc)
+			    u32 *xdp_xmit, u32 ntc, u32 verdict)
 {
 	u32 nr_frags = rx_ring->nr_frags + 1;
 	u32 idx = rx_ring->first_desc;
 	u32 cnt = rx_ring->count;
+	u32 post_xdp_frags = 1;
 	struct ice_rx_buf *buf;
 	int i;
 
-	for (i = 0; i < nr_frags; i++) {
+	if (unlikely(xdp_buff_has_frags(xdp)))
+		post_xdp_frags += xdp_get_shared_info_from_buff(xdp)->nr_frags;
+
+	for (i = 0; i < post_xdp_frags; i++) {
 		buf = &rx_ring->rx_buf[idx];
 
-		if (buf->act & (ICE_XDP_TX | ICE_XDP_REDIR)) {
+		if (verdict & (ICE_XDP_TX | ICE_XDP_REDIR)) {
 			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
-			*xdp_xmit |= buf->act;
-		} else if (buf->act & ICE_XDP_CONSUMED) {
+			*xdp_xmit |= verdict;
+		} else if (verdict & ICE_XDP_CONSUMED) {
 			buf->pagecnt_bias++;
-		} else if (buf->act == ICE_XDP_PASS) {
+		} else if (verdict == ICE_XDP_PASS) {
 			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
 		}
 
@@ -1144,6 +1146,17 @@ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 		if (++idx == cnt)
 			idx = 0;
 	}
+	/* handle buffers that represented frags released by XDP prog;
+	 * for these we keep pagecnt_bias as-is; refcount from struct page
+	 * has been decremented within XDP prog and we do not have to increase
+	 * the biased refcnt
+	 */
+	for (; i < nr_frags; i++) {
+		buf = &rx_ring->rx_buf[idx];
+		ice_put_rx_buf(rx_ring, buf);
+		if (++idx == cnt)
+			idx = 0;
+	}
 
 	xdp->data = NULL;
 	rx_ring->first_desc = ntc;
@@ -1170,9 +1183,9 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 	struct ice_tx_ring *xdp_ring = NULL;
 	struct bpf_prog *xdp_prog = NULL;
 	u32 ntc = rx_ring->next_to_clean;
+	u32 cached_ntu, xdp_verdict;
 	u32 cnt = rx_ring->count;
 	u32 xdp_xmit = 0;
-	u32 cached_ntu;
 	bool failure;
 
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
@@ -1235,7 +1248,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 			xdp_prepare_buff(xdp, hard_start, offset, size, !!offset);
 			xdp_buff_clear_frags_flag(xdp);
 		} else if (ice_add_xdp_frag(rx_ring, xdp, rx_buf, size)) {
-			ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc);
+			ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc, ICE_XDP_CONSUMED);
 			break;
 		}
 		if (++ntc == cnt)
@@ -1246,13 +1259,13 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 			continue;
 
 		ice_get_pgcnts(rx_ring);
-		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
-		if (rx_buf->act == ICE_XDP_PASS)
+		xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_desc);
+		if (xdp_verdict == ICE_XDP_PASS)
 			goto construct_skb;
 		total_rx_bytes += xdp_get_buff_len(xdp);
 		total_rx_pkts++;
 
-		ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc);
+		ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
 
 		continue;
 construct_skb:
@@ -1263,12 +1276,9 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		/* exit if we failed to retrieve a buffer */
 		if (!skb) {
 			rx_ring->ring_stats->rx_stats.alloc_page_failed++;
-			rx_buf->act = ICE_XDP_CONSUMED;
-			if (unlikely(xdp_buff_has_frags(xdp)))
-				ice_set_rx_bufs_act(xdp, rx_ring,
-						    ICE_XDP_CONSUMED);
+			xdp_verdict = ICE_XDP_CONSUMED;
 		}
-		ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc);
+		ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
 
 		if (!skb)
 			break;
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index cb347c852ba9..806bce701df3 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -201,7 +201,6 @@ struct ice_rx_buf {
 	struct page *page;
 	unsigned int page_offset;
 	unsigned int pgcnt;
-	unsigned int act;
 	unsigned int pagecnt_bias;
 };
 
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
index 79f960c6680d..6cf32b404127 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
@@ -5,49 +5,6 @@
 #define _ICE_TXRX_LIB_H_
 #include "ice.h"
 
-/**
- * ice_set_rx_bufs_act - propagate Rx buffer action to frags
- * @xdp: XDP buffer representing frame (linear and frags part)
- * @rx_ring: Rx ring struct
- * act: action to store onto Rx buffers related to XDP buffer parts
- *
- * Set action that should be taken before putting Rx buffer from first frag
- * to the last.
- */ -static inline void -ice_set_rx_bufs_act(struct xdp_buff *xdp, const struct ice_rx_ring *rx_ring, - const unsigned int act) -{ - u32 sinfo_frags = xdp_get_shared_info_from_buff(xdp)->nr_frags; - u32 nr_frags = rx_ring->nr_frags + 1; - u32 idx = rx_ring->first_desc; - u32 cnt = rx_ring->count; - struct ice_rx_buf *buf; - - for (int i = 0; i < nr_frags; i++) { - buf = &rx_ring->rx_buf[idx]; - buf->act = act; - - if (++idx == cnt) - idx = 0; - } - - /* adjust pagecnt_bias on frags freed by XDP prog */ - if (sinfo_frags < rx_ring->nr_frags && act == ICE_XDP_CONSUMED) { - u32 delta = rx_ring->nr_frags - sinfo_frags; - - while (delta) { - if (idx == 0) - idx = cnt - 1; - else - idx--; - buf = &rx_ring->rx_buf[idx]; - buf->pagecnt_bias--; - delta--; - } - } -} - /** * ice_test_staterr - tests bits in Rx descriptor status and error fields * @status_err_n: Rx descriptor status_error0 or status_error1 bits -- 2.43.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
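For readers following the ice_put_rx_mbuf() change, here is a rough user-space sketch of the two-loop walk the patch introduces. It is only a toy model under stated assumptions: every `toy_`/`TOY_` name is hypothetical, the verdict bit values are picked for the sketch rather than taken from the driver headers, the XOR on `page_offset` merely stands in for ice_rx_buf_adjust_pg_offset(), and `kept_frags` plays the role of the `nr_frags` value read from the shared info after the XDP program ran. Unlike the driver, the sketch advances `first_desc` itself instead of taking `ntc` from the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical verdict bits, chosen for this sketch only. */
enum {
	TOY_XDP_PASS     = 0,
	TOY_XDP_CONSUMED = 1u << 0,
	TOY_XDP_TX       = 1u << 1,
	TOY_XDP_REDIR    = 1u << 2,
};

struct toy_rx_buf {
	unsigned int page_offset;
	unsigned int pagecnt_bias;
};

struct toy_rx_ring {
	struct toy_rx_buf *rx_buf;
	uint32_t count;      /* number of entries in the ring */
	uint32_t first_desc; /* index of the frame's first buffer */
	uint32_t nr_frags;   /* frags received from HW, head buffer excluded */
};

/*
 * Put all buffers of one frame. The single verdict is applied to the
 * buffers still attached to the frame (head + kept_frags); buffers whose
 * frags were released by the XDP program are only stepped over, so their
 * pagecnt_bias stays as-is.
 */
static void toy_put_rx_mbuf(struct toy_rx_ring *ring, uint32_t kept_frags,
			    uint32_t frame_sz, uint32_t verdict,
			    uint32_t *xdp_xmit)
{
	uint32_t nr_bufs = ring->nr_frags + 1;
	uint32_t post_xdp_bufs = kept_frags + 1;
	uint32_t idx = ring->first_desc;
	uint32_t i;

	assert(ring->count > 0 && post_xdp_bufs <= nr_bufs);

	for (i = 0; i < post_xdp_bufs; i++) {
		struct toy_rx_buf *buf = &ring->rx_buf[idx];

		if (verdict & (TOY_XDP_TX | TOY_XDP_REDIR)) {
			buf->page_offset ^= frame_sz; /* stand-in for offset flip */
			if (xdp_xmit)
				*xdp_xmit |= verdict;
		} else if (verdict & TOY_XDP_CONSUMED) {
			buf->pagecnt_bias++; /* give the reference back */
		} else if (verdict == TOY_XDP_PASS) {
			buf->page_offset ^= frame_sz;
		}

		if (++idx == ring->count)
			idx = 0;
	}

	/* frags released by the XDP program: no bias adjustment */
	for (; i < nr_bufs; i++) {
		if (++idx == ring->count)
			idx = 0;
	}

	ring->first_desc = idx;
	ring->nr_frags = 0;
}
```

With a four-entry ring, a frame starting at index 3 with two frags of which one was released, and a consumed verdict, only the buffers at indices 3 and 0 get their bias restored; the buffer at index 1 is left alone and `first_desc` ends up at 2.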
* Re: [Intel-wired-lan] [PATCH v3 iwl-net 3/3] ice: stop storing XDP verdict within ice_rx_buf
  2025-01-20 15:50 ` Maciej Fijalkowski
@ 2025-01-20 16:37 ` Simon Horman
-1 siblings, 0 replies; 15+ messages in thread
From: Simon Horman @ 2025-01-20 16:37 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: netdev, xudu, anthony.l.nguyen, przemyslaw.kitszel,
jacob.e.keller, intel-wired-lan, jmaxwell, magnus.karlsson

On Mon, Jan 20, 2025 at 04:50:16PM +0100, Maciej Fijalkowski wrote:
> Idea behind having ice_rx_buf::act was to simplify and speed up the Rx
> data path by walking through buffers that were representing cleaned HW
> Rx descriptors. Since it caused us a major headache recently and we
> rolled back to old approach that 'puts' Rx buffers right after running
> XDP prog/creating skb, this is useless now and should be removed.
>
> Get rid of ice_rx_buf::act and related logic. We still need to take care
> of a corner case where XDP program releases a particular fragment.
>
> Make ice_run_xdp() to return its result and use it within
> ice_put_rx_mbuf().
>
> Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_txrx.c     | 60 +++++++++++--------
>  drivers/net/ethernet/intel/ice/ice_txrx.h     |  1 -
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 43 -------------
>  3 files changed, 35 insertions(+), 69 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index 9aa53ad2d8f2..77d75664c14d 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -532,10 +532,10 @@ int ice_setup_rx_ring(struct ice_rx_ring *rx_ring)
>   *
>   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
>   */
> -static void
> +static u32
>  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
>  	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> -	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
> +	    union ice_32b_rx_flex_desc *eop_desc)
>  {
>  	unsigned int ret = ICE_XDP_PASS;
>  	u32 act;

nit: The Kernel doc for ice_run_xdp should also be updated to no
     longer document the rx_buf parameter.

...
* Re: [Intel-wired-lan] [PATCH v3 iwl-net 3/3] ice: stop storing XDP verdict within ice_rx_buf
  2025-01-20 16:37 ` Simon Horman
@ 2025-01-22 12:50 ` Maciej Fijalkowski
-1 siblings, 0 replies; 15+ messages in thread
From: Maciej Fijalkowski @ 2025-01-22 12:50 UTC (permalink / raw)
To: Simon Horman
Cc: netdev, xudu, anthony.l.nguyen, przemyslaw.kitszel,
jacob.e.keller, intel-wired-lan, jmaxwell, magnus.karlsson

On Mon, Jan 20, 2025 at 04:37:55PM +0000, Simon Horman wrote:
> On Mon, Jan 20, 2025 at 04:50:16PM +0100, Maciej Fijalkowski wrote:
> > Idea behind having ice_rx_buf::act was to simplify and speed up the Rx
> > data path by walking through buffers that were representing cleaned HW
> > Rx descriptors. Since it caused us a major headache recently and we
> > rolled back to old approach that 'puts' Rx buffers right after running
> > XDP prog/creating skb, this is useless now and should be removed.
> >
> > Get rid of ice_rx_buf::act and related logic. We still need to take care
> > of a corner case where XDP program releases a particular fragment.
> >
> > Make ice_run_xdp() to return its result and use it within
> > ice_put_rx_mbuf().
> >
> > Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
> > Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_txrx.c     | 60 +++++++++++--------
> >  drivers/net/ethernet/intel/ice/ice_txrx.h     |  1 -
> >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 43 -------------
> >  3 files changed, 35 insertions(+), 69 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > index 9aa53ad2d8f2..77d75664c14d 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > @@ -532,10 +532,10 @@ int ice_setup_rx_ring(struct ice_rx_ring *rx_ring)
> >   *
> >   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
> >   */
> > -static void
> > +static u32
> >  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> >  	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> > -	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
> > +	    union ice_32b_rx_flex_desc *eop_desc)
> >  {
> >  	unsigned int ret = ICE_XDP_PASS;
> >  	u32 act;
>
> nit: The Kernel doc for ice_run_xdp should also be updated to no
>      longer document the rx_buf parameter.

Heh - but after making it to return the verdict again the return
description is valid:D I have been missing the kdoc descriptions for
introduced functions in this patchset so let me add them as well.

Thanks for review!

>
> ...
* Re: [Intel-wired-lan] [PATCH v3 iwl-net 3/3] ice: stop storing XDP verdict within ice_rx_buf
  2025-01-20 15:50 ` Maciej Fijalkowski
  (?)
  (?)
@ 2025-01-20 21:23 ` kernel test robot
-1 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2025-01-20 21:23 UTC (permalink / raw)
To: Maciej Fijalkowski, intel-wired-lan
Cc: oe-kbuild-all, Maciej Fijalkowski, netdev, xudu, anthony.l.nguyen,
przemyslaw.kitszel, jacob.e.keller, jmaxwell, magnus.karlsson

Hi Maciej,

kernel test robot noticed the following build warnings:

[auto build test WARNING on tnguy-net-queue/dev-queue]

url:    https://github.com/intel-lab-lkp/linux/commits/Maciej-Fijalkowski/ice-put-Rx-buffers-after-being-done-with-current-frame/20250120-235320
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue.git dev-queue
patch link:    https://lore.kernel.org/r/20250120155016.556735-4-maciej.fijalkowski%40intel.com
patch subject: [Intel-wired-lan] [PATCH v3 iwl-net 3/3] ice: stop storing XDP verdict within ice_rx_buf
config: arc-randconfig-001-20250121 (https://download.01.org/0day-ci/archive/20250121/202501210750.KInYtrPt-lkp@intel.com/config)
compiler: arceb-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250121/202501210750.KInYtrPt-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501210750.KInYtrPt-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/net/ethernet/intel/ice/ice_txrx.c:539: warning: Excess function parameter 'rx_buf' description in 'ice_run_xdp'

vim +539 drivers/net/ethernet/intel/ice/ice_txrx.c

cdedef59deb020 Anirudh Venkataramanan 2018-03-20  523  
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  524  /**
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  525   * ice_run_xdp - Executes an XDP program on initialized xdp_buff
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  526   * @rx_ring: Rx ring
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  527   * @xdp: xdp_buff used as input to the XDP program
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  528   * @xdp_prog: XDP program to run
eb087cd828648d Maciej Fijalkowski     2021-08-19  529   * @xdp_ring: ring to be used for XDP_TX action
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  530   * @rx_buf: Rx buffer to store the XDP action
d951c14ad237b0 Larysa Zaremba         2023-12-05  531   * @eop_desc: Last descriptor in packet to read metadata from
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  532   *
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  533   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  534   */
55a1a17189d7a5 Maciej Fijalkowski     2025-01-20  535  static u32
e72bba21355dbb Maciej Fijalkowski     2021-08-19  536  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  537  	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
55a1a17189d7a5 Maciej Fijalkowski     2025-01-20  538  	    union ice_32b_rx_flex_desc *eop_desc)
efc2214b6047b6 Maciej Fijalkowski     2019-11-04 @539  {
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  540  	unsigned int ret = ICE_XDP_PASS;
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  541  	u32 act;
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  542  
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  543  	if (!xdp_prog)
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  544  		goto exit;
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  545  
d951c14ad237b0 Larysa Zaremba         2023-12-05  546  	ice_xdp_meta_set_desc(xdp, eop_desc);
d951c14ad237b0 Larysa Zaremba         2023-12-05  547  
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  548  	act = bpf_prog_run_xdp(xdp_prog, xdp);
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  549  	switch (act) {
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  550  	case XDP_PASS:
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  551  		break;
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  552  	case XDP_TX:
22bf877e528f68 Maciej Fijalkowski     2021-08-19  553  		if (static_branch_unlikely(&ice_xdp_locking_key))
22bf877e528f68 Maciej Fijalkowski     2021-08-19  554  			spin_lock(&xdp_ring->tx_lock);
055d0920685e53 Alexander Lobakin      2023-02-10  555  		ret = __ice_xmit_xdp_ring(xdp, xdp_ring, false);
22bf877e528f68 Maciej Fijalkowski     2021-08-19  556  		if (static_branch_unlikely(&ice_xdp_locking_key))
22bf877e528f68 Maciej Fijalkowski     2021-08-19  557  			spin_unlock(&xdp_ring->tx_lock);
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  558  		if (ret == ICE_XDP_CONSUMED)
89d65df024c599 Magnus Karlsson        2021-05-10  559  			goto out_failure;
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  560  		break;
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  561  	case XDP_REDIRECT:
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  562  		if (xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog))
89d65df024c599 Magnus Karlsson        2021-05-10  563  			goto out_failure;
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  564  		ret = ICE_XDP_REDIR;
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  565  		break;
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  566  	default:
c8064e5b4adac5 Paolo Abeni            2021-11-30  567  		bpf_warn_invalid_xdp_action(rx_ring->netdev, xdp_prog, act);
4e83fc934e3a04 Bruce Allan            2020-01-22  568  		fallthrough;
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  569  	case XDP_ABORTED:
89d65df024c599 Magnus Karlsson        2021-05-10  570  out_failure:
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  571  		trace_xdp_exception(rx_ring->netdev, xdp_prog, act);
4e83fc934e3a04 Bruce Allan            2020-01-22  572  		fallthrough;
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  573  	case XDP_DROP:
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  574  		ret = ICE_XDP_CONSUMED;
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  575  	}
1dc1a7e7f4108b Maciej Fijalkowski     2023-01-31  576  exit:
55a1a17189d7a5 Maciej Fijalkowski     2025-01-20  577  	return ret;
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  578  }
efc2214b6047b6 Maciej Fijalkowski     2019-11-04  579  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki