* [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic
@ 2025-01-17 15:18 Maciej Fijalkowski
2025-01-17 15:18 ` [PATCH v2 intel-net 1/3] ice: put Rx buffers after being done with current frame Maciej Fijalkowski
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Maciej Fijalkowski @ 2025-01-17 15:18 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, jacob.e.keller, xudu,
mschmidt, jmaxwell, poros, przemyslaw.kitszel, Maciej Fijalkowski
v1->v2:
* pass ntc to ice_put_rx_mbuf() (pointed out by Petr Oros) in patch 1
* add review tags from Przemek Kitszel (thanks!)
* make sure patches compile and work ;)
Hello in 2025,
this patchset fixes a pretty nasty issue that was reported by RedHat
folks which occured after ~30 minutes (this value varied, just trying
here to state that it was not observed immediately but rather after a
considerable longer amount of time) when ice driver was tortured with
jumbo frames via mix of iperf traffic executed simultaneously with
wrk/nginx on client/server sides (HTTP and TCP workloads basically).
The reported splats were spanning across all the bad things that can
happen to the state of page - refcount underflow, use-after-free, etc.
One of these looked as follows:
[ 2084.019891] BUG: Bad page state in process swapper/34 pfn:97fcd0
[ 2084.025990] page:00000000a60ee772 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x97fcd0
[ 2084.035462] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 2084.041990] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
[ 2084.049730] raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
[ 2084.057468] page dumped because: nonzero _refcount
[ 2084.062260] Modules linked in: bonding tls sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mgag200 irqd
[ 2084.137829] CPU: 34 PID: 0 Comm: swapper/34 Kdump: loaded Not tainted 5.14.0-427.37.1.el9_4.x86_64 #1
[ 2084.147039] Hardware name: Dell Inc. PowerEdge R750/0216NK, BIOS 1.13.2 12/19/2023
[ 2084.154604] Call Trace:
[ 2084.157058] <IRQ>
[ 2084.159080] dump_stack_lvl+0x34/0x48
[ 2084.162752] bad_page.cold+0x63/0x94
[ 2084.166333] check_new_pages+0xb3/0xe0
[ 2084.170083] rmqueue_bulk+0x2d2/0x9e0
[ 2084.173749] ? ktime_get+0x35/0xa0
[ 2084.177159] rmqueue_pcplist+0x13b/0x210
[ 2084.181081] rmqueue+0x7d3/0xd40
[ 2084.184316] ? xas_load+0x9/0xa0
[ 2084.187547] ? xas_find+0x183/0x1d0
[ 2084.191041] ? xa_find_after+0xd0/0x130
[ 2084.194879] ? intel_iommu_iotlb_sync_map+0x89/0xe0
[ 2084.199759] get_page_from_freelist+0x11f/0x530
[ 2084.204291] __alloc_pages+0xf2/0x250
[ 2084.207958] ice_alloc_rx_bufs+0xcc/0x1c0 [ice]
[ 2084.212543] ice_clean_rx_irq+0x631/0xa20 [ice]
[ 2084.217111] ice_napi_poll+0xdf/0x2a0 [ice]
[ 2084.221330] __napi_poll+0x27/0x170
[ 2084.224824] net_rx_action+0x233/0x2f0
[ 2084.228575] __do_softirq+0xc7/0x2ac
[ 2084.232155] __irq_exit_rcu+0xa1/0xc0
[ 2084.235821] common_interrupt+0x80/0xa0
[ 2084.239662] </IRQ>
[ 2084.241768] <TASK>
The fix is mostly about reverting what was done in commit 1dc1a7e7f410
("ice: Centrallize Rx buffer recycling") followed by proper timing on
page_count() storage and then removing the ice_rx_buf::act related logic
(which was mostly introduced for purposes from cited commit).
Special thanks to Xu Du for providing reproducer and Jacob Keller for
initial extensive analysis.
Thanks,
Maciej
Maciej Fijalkowski (3):
ice: put Rx buffers after being done with current frame
ice: gather page_count()'s of each frag right before XDP prog call
ice: stop storing XDP verdict within ice_rx_buf
drivers/net/ethernet/intel/ice/ice_txrx.c | 128 +++++++++++-------
drivers/net/ethernet/intel/ice/ice_txrx.h | 1 -
drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 43 ------
3 files changed, 82 insertions(+), 90 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2 intel-net 1/3] ice: put Rx buffers after being done with current frame
2025-01-17 15:18 [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Maciej Fijalkowski
@ 2025-01-17 15:18 ` Maciej Fijalkowski
2025-01-17 15:18 ` [PATCH v2 intel-net 2/3] ice: gather page_count()'s of each frag right before XDP prog call Maciej Fijalkowski
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Maciej Fijalkowski @ 2025-01-17 15:18 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, jacob.e.keller, xudu,
mschmidt, jmaxwell, poros, przemyslaw.kitszel, Maciej Fijalkowski
Introduce a new helper ice_put_rx_mbuf() that will go through gathered
frags from current frame and will call ice_put_rx_buf() on them. Current
logic that was supposed to simplify and optimize the driver where we go
through a batch of all buffers processed in current NAPI instance turned
out to be broken for jumbo frames and very heavy load that was coming
from both multi-thread iperf and nginx/wrk pair between server and
client. The delay introduced by approach that we are dropping is simply
too big and we need to take the decision regarding page
recycling/releasing as quick as we can.
While at it, address an error path of ice_add_xdp_frag() - we were
missing buffer putting from day 1 there.
As a nice side effect we get rid of annoying and repetetive three-liner:
xdp->data = NULL;
rx_ring->first_desc = ntc;
rx_ring->nr_frags = 0;
by embedding it within introduced routine.
Fixes: 1dc1a7e7f410 ("ice: Centrallize Rx buffer recycling")
Reported-and-tested-by: Xu Du <xudu@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_txrx.c | 68 +++++++++++++----------
1 file changed, 39 insertions(+), 29 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 5d2d7736fd5f..f2134ad57ead 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1103,6 +1103,38 @@ ice_put_rx_buf(struct ice_rx_ring *rx_ring, struct ice_rx_buf *rx_buf)
rx_buf->page = NULL;
}
+static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
+ u32 *xdp_xmit, u32 ntc)
+{
+ u32 nr_frags = rx_ring->nr_frags + 1;
+ u32 idx = rx_ring->first_desc;
+ u32 cnt = rx_ring->count;
+ struct ice_rx_buf *buf;
+ int i;
+
+ for (i = 0; i < nr_frags; i++) {
+ buf = &rx_ring->rx_buf[idx];
+
+ if (buf->act & (ICE_XDP_TX | ICE_XDP_REDIR)) {
+ ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
+ *xdp_xmit |= buf->act;
+ } else if (buf->act & ICE_XDP_CONSUMED) {
+ buf->pagecnt_bias++;
+ } else if (buf->act == ICE_XDP_PASS) {
+ ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
+ }
+
+ ice_put_rx_buf(rx_ring, buf);
+
+ if (++idx == cnt)
+ idx = 0;
+ }
+
+ xdp->data = NULL;
+ rx_ring->first_desc = ntc;
+ rx_ring->nr_frags = 0;
+}
+
/**
* ice_clean_rx_irq - Clean completed descriptors from Rx ring - bounce buf
* @rx_ring: Rx descriptor ring to transact packets on
@@ -1120,7 +1152,6 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
unsigned int total_rx_bytes = 0, total_rx_pkts = 0;
unsigned int offset = rx_ring->rx_offset;
struct xdp_buff *xdp = &rx_ring->xdp;
- u32 cached_ntc = rx_ring->first_desc;
struct ice_tx_ring *xdp_ring = NULL;
struct bpf_prog *xdp_prog = NULL;
u32 ntc = rx_ring->next_to_clean;
@@ -1128,7 +1159,6 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
u32 xdp_xmit = 0;
u32 cached_ntu;
bool failure;
- u32 first;
xdp_prog = READ_ONCE(rx_ring->xdp_prog);
if (xdp_prog) {
@@ -1190,6 +1220,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
xdp_prepare_buff(xdp, hard_start, offset, size, !!offset);
xdp_buff_clear_frags_flag(xdp);
} else if (ice_add_xdp_frag(rx_ring, xdp, rx_buf, size)) {
+ ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc);
break;
}
if (++ntc == cnt)
@@ -1205,9 +1236,8 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
total_rx_bytes += xdp_get_buff_len(xdp);
total_rx_pkts++;
- xdp->data = NULL;
- rx_ring->first_desc = ntc;
- rx_ring->nr_frags = 0;
+ ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc);
+
continue;
construct_skb:
if (likely(ice_ring_uses_build_skb(rx_ring)))
@@ -1221,14 +1251,11 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
if (unlikely(xdp_buff_has_frags(xdp)))
ice_set_rx_bufs_act(xdp, rx_ring,
ICE_XDP_CONSUMED);
- xdp->data = NULL;
- rx_ring->first_desc = ntc;
- rx_ring->nr_frags = 0;
- break;
}
- xdp->data = NULL;
- rx_ring->first_desc = ntc;
- rx_ring->nr_frags = 0;
+ ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc);
+
+ if (!skb)
+ break;
stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
if (unlikely(ice_test_staterr(rx_desc->wb.status_error0,
@@ -1257,23 +1284,6 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
total_rx_pkts++;
}
- first = rx_ring->first_desc;
- while (cached_ntc != first) {
- struct ice_rx_buf *buf = &rx_ring->rx_buf[cached_ntc];
-
- if (buf->act & (ICE_XDP_TX | ICE_XDP_REDIR)) {
- ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
- xdp_xmit |= buf->act;
- } else if (buf->act & ICE_XDP_CONSUMED) {
- buf->pagecnt_bias++;
- } else if (buf->act == ICE_XDP_PASS) {
- ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
- }
-
- ice_put_rx_buf(rx_ring, buf);
- if (++cached_ntc >= cnt)
- cached_ntc = 0;
- }
rx_ring->next_to_clean = ntc;
/* return up to cleaned_count buffers to hardware */
failure = ice_alloc_rx_bufs(rx_ring, ICE_RX_DESC_UNUSED(rx_ring));
--
2.43.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 intel-net 2/3] ice: gather page_count()'s of each frag right before XDP prog call
2025-01-17 15:18 [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Maciej Fijalkowski
2025-01-17 15:18 ` [PATCH v2 intel-net 1/3] ice: put Rx buffers after being done with current frame Maciej Fijalkowski
@ 2025-01-17 15:18 ` Maciej Fijalkowski
2025-01-17 15:19 ` [PATCH v2 intel-net 3/3] ice: stop storing XDP verdict within ice_rx_buf Maciej Fijalkowski
2025-01-17 21:22 ` [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Jakub Kicinski
3 siblings, 0 replies; 6+ messages in thread
From: Maciej Fijalkowski @ 2025-01-17 15:18 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, jacob.e.keller, xudu,
mschmidt, jmaxwell, poros, przemyslaw.kitszel, Maciej Fijalkowski
If we store the pgcnt on few fragments while being in the middle of
gathering the whole frame and we stumbled upon DD bit not being set, we
terminate the NAPI Rx processing loop and come back later on. Then on
next NAPI execution we work on previously stored pgcnt.
Imagine that second half of page was used actively by networking stack
and by the time we came back, stack is not busy with this page anymore
and decremented the refcnt. The page reuse algorithm in this case should
be good to reuse the page but given the old refcnt it will not do so and
attempt to release the page via page_frag_cache_drain() with
pagecnt_bias used as an arg. This in turn will result in negative refcnt
on struct page, which was initially observed by Xu Du.
Therefore, move the page count storage from ice_get_rx_buf() to a place
where we are sure that whole frame has been collected, but before
calling XDP program as it internally can also change the page count of
fragments belonging to xdp_buff.
Fixes: ac0753391195 ("ice: Store page count inside ice_rx_buf")
Reported-and-tested-by: Xu Du <xudu@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_txrx.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index f2134ad57ead..9aa53ad2d8f2 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -924,7 +924,6 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size,
struct ice_rx_buf *rx_buf;
rx_buf = &rx_ring->rx_buf[ntc];
- rx_buf->pgcnt = page_count(rx_buf->page);
prefetchw(rx_buf->page);
if (!size)
@@ -940,6 +939,22 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size,
return rx_buf;
}
+static void ice_get_pgcnts(struct ice_rx_ring *rx_ring)
+{
+ u32 nr_frags = rx_ring->nr_frags + 1;
+ u32 idx = rx_ring->first_desc;
+ struct ice_rx_buf *rx_buf;
+ u32 cnt = rx_ring->count;
+
+ for (int i = 0; i < nr_frags; i++) {
+ rx_buf = &rx_ring->rx_buf[idx];
+ rx_buf->pgcnt = page_count(rx_buf->page);
+
+ if (++idx == cnt)
+ idx = 0;
+ }
+}
+
/**
* ice_build_skb - Build skb around an existing buffer
* @rx_ring: Rx descriptor ring to transact packets on
@@ -1230,6 +1245,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
if (ice_is_non_eop(rx_ring, rx_desc))
continue;
+ ice_get_pgcnts(rx_ring);
ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
if (rx_buf->act == ICE_XDP_PASS)
goto construct_skb;
--
2.43.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 intel-net 3/3] ice: stop storing XDP verdict within ice_rx_buf
2025-01-17 15:18 [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Maciej Fijalkowski
2025-01-17 15:18 ` [PATCH v2 intel-net 1/3] ice: put Rx buffers after being done with current frame Maciej Fijalkowski
2025-01-17 15:18 ` [PATCH v2 intel-net 2/3] ice: gather page_count()'s of each frag right before XDP prog call Maciej Fijalkowski
@ 2025-01-17 15:19 ` Maciej Fijalkowski
2025-01-17 21:22 ` [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Jakub Kicinski
3 siblings, 0 replies; 6+ messages in thread
From: Maciej Fijalkowski @ 2025-01-17 15:19 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, jacob.e.keller, xudu,
mschmidt, jmaxwell, poros, przemyslaw.kitszel, Maciej Fijalkowski
Idea behind having ice_rx_buf::act was to simplify and speed up the Rx
data path by walking through buffers that were representing cleaned HW
Rx descriptors. Since it caused us a major headache recently and we
rolled back to old approach that 'puts' Rx buffers right after running
XDP prog/creating skb, this is useless now and should be removed.
Get rid of ice_rx_buf::act and related logic. We still need to take care
of a corner case where XDP program releases a particular fragment.
Make ice_run_xdp() to return its result and use it within
ice_put_rx_mbuf().
Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_txrx.c | 60 +++++++++++--------
drivers/net/ethernet/intel/ice/ice_txrx.h | 1 -
drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 43 -------------
3 files changed, 35 insertions(+), 69 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 9aa53ad2d8f2..77d75664c14d 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -532,10 +532,10 @@ int ice_setup_rx_ring(struct ice_rx_ring *rx_ring)
*
* Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
*/
-static void
+static u32
ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
- struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
+ union ice_32b_rx_flex_desc *eop_desc)
{
unsigned int ret = ICE_XDP_PASS;
u32 act;
@@ -574,7 +574,7 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
ret = ICE_XDP_CONSUMED;
}
exit:
- ice_set_rx_bufs_act(xdp, rx_ring, ret);
+ return ret;
}
/**
@@ -860,10 +860,8 @@ ice_add_xdp_frag(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
xdp_buff_set_frags_flag(xdp);
}
- if (unlikely(sinfo->nr_frags == MAX_SKB_FRAGS)) {
- ice_set_rx_bufs_act(xdp, rx_ring, ICE_XDP_CONSUMED);
+ if (unlikely(sinfo->nr_frags == MAX_SKB_FRAGS))
return -ENOMEM;
- }
__skb_fill_page_desc_noacc(sinfo, sinfo->nr_frags++, rx_buf->page,
rx_buf->page_offset, size);
@@ -1066,12 +1064,12 @@ ice_construct_skb(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp)
rx_buf->page_offset + headlen, size,
xdp->frame_sz);
} else {
- /* buffer is unused, change the act that should be taken later
- * on; data was copied onto skb's linear part so there's no
+ /* buffer is unused, restore biased page count in Rx buffer;
+ * data was copied onto skb's linear part so there's no
* need for adjusting page offset and we can reuse this buffer
* as-is
*/
- rx_buf->act = ICE_SKB_CONSUMED;
+ rx_buf->pagecnt_bias++;
}
if (unlikely(xdp_buff_has_frags(xdp))) {
@@ -1119,23 +1117,27 @@ ice_put_rx_buf(struct ice_rx_ring *rx_ring, struct ice_rx_buf *rx_buf)
}
static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
- u32 *xdp_xmit, u32 ntc)
+ u32 *xdp_xmit, u32 ntc, u32 verdict)
{
u32 nr_frags = rx_ring->nr_frags + 1;
u32 idx = rx_ring->first_desc;
u32 cnt = rx_ring->count;
+ u32 post_xdp_frags = 1;
struct ice_rx_buf *buf;
int i;
- for (i = 0; i < nr_frags; i++) {
+ if (unlikely(xdp_buff_has_frags(xdp)))
+ post_xdp_frags += xdp_get_shared_info_from_buff(xdp)->nr_frags;
+
+ for (i = 0; i < post_xdp_frags; i++) {
buf = &rx_ring->rx_buf[idx];
- if (buf->act & (ICE_XDP_TX | ICE_XDP_REDIR)) {
+ if (verdict & (ICE_XDP_TX | ICE_XDP_REDIR)) {
ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
- *xdp_xmit |= buf->act;
- } else if (buf->act & ICE_XDP_CONSUMED) {
+ *xdp_xmit |= verdict;
+ } else if (verdict & ICE_XDP_CONSUMED) {
buf->pagecnt_bias++;
- } else if (buf->act == ICE_XDP_PASS) {
+ } else if (verdict == ICE_XDP_PASS) {
ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
}
@@ -1144,6 +1146,17 @@ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
if (++idx == cnt)
idx = 0;
}
+ /* handle buffers that represented frags released by XDP prog;
+ * for these we keep pagecnt_bias as-is; refcount from struct page
+ * has been decremented within XDP prog and we do not have to increase
+ * the biased refcnt
+ */
+ for (; i < nr_frags; i++) {
+ buf = &rx_ring->rx_buf[idx];
+ ice_put_rx_buf(rx_ring, buf);
+ if (++idx == cnt)
+ idx = 0;
+ }
xdp->data = NULL;
rx_ring->first_desc = ntc;
@@ -1170,9 +1183,9 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
struct ice_tx_ring *xdp_ring = NULL;
struct bpf_prog *xdp_prog = NULL;
u32 ntc = rx_ring->next_to_clean;
+ u32 cached_ntu, xdp_verdict;
u32 cnt = rx_ring->count;
u32 xdp_xmit = 0;
- u32 cached_ntu;
bool failure;
xdp_prog = READ_ONCE(rx_ring->xdp_prog);
@@ -1235,7 +1248,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
xdp_prepare_buff(xdp, hard_start, offset, size, !!offset);
xdp_buff_clear_frags_flag(xdp);
} else if (ice_add_xdp_frag(rx_ring, xdp, rx_buf, size)) {
- ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc);
+ ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc, ICE_XDP_CONSUMED);
break;
}
if (++ntc == cnt)
@@ -1246,13 +1259,13 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
continue;
ice_get_pgcnts(rx_ring);
- ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
- if (rx_buf->act == ICE_XDP_PASS)
+ xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_desc);
+ if (xdp_verdict == ICE_XDP_PASS)
goto construct_skb;
total_rx_bytes += xdp_get_buff_len(xdp);
total_rx_pkts++;
- ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc);
+ ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
continue;
construct_skb:
@@ -1263,12 +1276,9 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
/* exit if we failed to retrieve a buffer */
if (!skb) {
rx_ring->ring_stats->rx_stats.alloc_page_failed++;
- rx_buf->act = ICE_XDP_CONSUMED;
- if (unlikely(xdp_buff_has_frags(xdp)))
- ice_set_rx_bufs_act(xdp, rx_ring,
- ICE_XDP_CONSUMED);
+ xdp_verdict = ICE_XDP_CONSUMED;
}
- ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc);
+ ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
if (!skb)
break;
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index cb347c852ba9..806bce701df3 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -201,7 +201,6 @@ struct ice_rx_buf {
struct page *page;
unsigned int page_offset;
unsigned int pgcnt;
- unsigned int act;
unsigned int pagecnt_bias;
};
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
index 79f960c6680d..6cf32b404127 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
@@ -5,49 +5,6 @@
#define _ICE_TXRX_LIB_H_
#include "ice.h"
-/**
- * ice_set_rx_bufs_act - propagate Rx buffer action to frags
- * @xdp: XDP buffer representing frame (linear and frags part)
- * @rx_ring: Rx ring struct
- * act: action to store onto Rx buffers related to XDP buffer parts
- *
- * Set action that should be taken before putting Rx buffer from first frag
- * to the last.
- */
-static inline void
-ice_set_rx_bufs_act(struct xdp_buff *xdp, const struct ice_rx_ring *rx_ring,
- const unsigned int act)
-{
- u32 sinfo_frags = xdp_get_shared_info_from_buff(xdp)->nr_frags;
- u32 nr_frags = rx_ring->nr_frags + 1;
- u32 idx = rx_ring->first_desc;
- u32 cnt = rx_ring->count;
- struct ice_rx_buf *buf;
-
- for (int i = 0; i < nr_frags; i++) {
- buf = &rx_ring->rx_buf[idx];
- buf->act = act;
-
- if (++idx == cnt)
- idx = 0;
- }
-
- /* adjust pagecnt_bias on frags freed by XDP prog */
- if (sinfo_frags < rx_ring->nr_frags && act == ICE_XDP_CONSUMED) {
- u32 delta = rx_ring->nr_frags - sinfo_frags;
-
- while (delta) {
- if (idx == 0)
- idx = cnt - 1;
- else
- idx--;
- buf = &rx_ring->rx_buf[idx];
- buf->pagecnt_bias--;
- delta--;
- }
- }
-}
-
/**
* ice_test_staterr - tests bits in Rx descriptor status and error fields
* @status_err_n: Rx descriptor status_error0 or status_error1 bits
--
2.43.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic
2025-01-17 15:18 [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Maciej Fijalkowski
` (2 preceding siblings ...)
2025-01-17 15:19 ` [PATCH v2 intel-net 3/3] ice: stop storing XDP verdict within ice_rx_buf Maciej Fijalkowski
@ 2025-01-17 21:22 ` Jakub Kicinski
2025-01-20 14:48 ` Maciej Fijalkowski
3 siblings, 1 reply; 6+ messages in thread
From: Jakub Kicinski @ 2025-01-17 21:22 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: intel-wired-lan, netdev, anthony.l.nguyen, magnus.karlsson,
jacob.e.keller, xudu, mschmidt, jmaxwell, poros,
przemyslaw.kitszel
On Fri, 17 Jan 2025 16:18:57 +0100 Maciej Fijalkowski wrote:
> Subject: [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic
nit: could you use iwl-net and iwl-next as the tree names?
That's what we match on in NIPA to categorize Intel patches.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic
2025-01-17 21:22 ` [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Jakub Kicinski
@ 2025-01-20 14:48 ` Maciej Fijalkowski
0 siblings, 0 replies; 6+ messages in thread
From: Maciej Fijalkowski @ 2025-01-20 14:48 UTC (permalink / raw)
To: Jakub Kicinski
Cc: intel-wired-lan, netdev, anthony.l.nguyen, magnus.karlsson,
jacob.e.keller, xudu, mschmidt, jmaxwell, poros,
przemyslaw.kitszel
On Fri, Jan 17, 2025 at 01:22:08PM -0800, Jakub Kicinski wrote:
> On Fri, 17 Jan 2025 16:18:57 +0100 Maciej Fijalkowski wrote:
> > Subject: [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic
>
> nit: could you use iwl-net and iwl-next as the tree names?
> That's what we match on in NIPA to categorize Intel patches.
Not sure how I came up with my very own tree target. Will send a v3 then.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-01-20 14:48 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-17 15:18 [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Maciej Fijalkowski
2025-01-17 15:18 ` [PATCH v2 intel-net 1/3] ice: put Rx buffers after being done with current frame Maciej Fijalkowski
2025-01-17 15:18 ` [PATCH v2 intel-net 2/3] ice: gather page_count()'s of each frag right before XDP prog call Maciej Fijalkowski
2025-01-17 15:19 ` [PATCH v2 intel-net 3/3] ice: stop storing XDP verdict within ice_rx_buf Maciej Fijalkowski
2025-01-17 21:22 ` [PATCH v2 intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Jakub Kicinski
2025-01-20 14:48 ` Maciej Fijalkowski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).