From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org, anthony.l.nguyen@intel.com,
magnus.karlsson@intel.com, jacob.e.keller@intel.com,
xudu@redhat.com, mschmidt@redhat.com, jmaxwell@redhat.com,
poros@redhat.com, przemyslaw.kitszel@intel.com,
Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Subject: [PATCH intel-net 2/3] ice: gather page_count()'s of each frag right before XDP prog call
Date: Thu, 16 Jan 2025 16:39:07 +0100 [thread overview]
Message-ID: <20250116153908.515848-3-maciej.fijalkowski@intel.com> (raw)
In-Reply-To: <20250116153908.515848-1-maciej.fijalkowski@intel.com>
Intel networking drivers have a page recycling scheme that is nicely
explained by Bjorn Topel in commit 75aab4e10ae6 ("i40e: avoid premature
Rx buffer reuse"). Please refer to it for better understanding what is
described below.
If we store the page count on a bunch of fragments while being in the
middle of gathering the whole frame and we stumbled upon DD bit not
being set, we terminate the NAPI Rx processing loop and come back later
on. Then, on next NAPI execution, the page recycling algorithm will work
on previously stored refcount of underlying page.
Imagine that second half of page was used actively by networking stack
and by the time we came back, stack is not busy with this page anymore
and decremented the refcount. The page reuse logic in this case should
be good to reuse the page but given the old refcount it will not do so
and attempt to release the page via page_frag_cache_drain() with
pagecnt_bias used as an arg. This in turn will result in negative
refcount on struct page, which was initially observed by Xu Du.
Therefore to fix this, move the page count storage from ice_get_rx_buf()
to a place where we are sure that whole frame has been collected and
processing of this frame will happen under current NAPI instance, but
before calling XDP program as it internally can also change the page
count of fragments belonging to xdp_buff.
Fixes: ac0753391195 ("ice: Store page count inside ice_rx_buf")
Reported-and-tested-by: Xu Du <xudu@redhat.com>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_txrx.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 501df1bc881d..7cd07e757d3c 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -924,7 +924,6 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size,
struct ice_rx_buf *rx_buf;
rx_buf = &rx_ring->rx_buf[ntc];
- rx_buf->pgcnt = page_count(rx_buf->page);
prefetchw(rx_buf->page);
if (!size)
@@ -940,6 +939,22 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size,
return rx_buf;
}
+static void ice_get_pgcnts(struct ice_rx_ring *rx_ring)
+{
+ u32 nr_frags = rx_ring->nr_frags + 1;
+ u32 idx = rx_ring->first_desc;
+ struct ice_rx_buf *rx_buf;
+ u32 cnt = rx_ring->count;
+
+ for (int i = 0; i < nr_frags; i++) {
+ rx_buf = &rx_ring->rx_buf[idx];
+ rx_buf->pgcnt = page_count(rx_buf->page);
+
+ if (++idx == cnt)
+ idx = 0;
+ }
+}
+
/**
* ice_build_skb - Build skb around an existing buffer
* @rx_ring: Rx descriptor ring to transact packets on
@@ -1229,6 +1244,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
if (ice_is_non_eop(rx_ring, rx_desc))
continue;
+ ice_get_pgcnts(rx_ring);
ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
if (rx_buf->act == ICE_XDP_PASS)
goto construct_skb;
--
2.43.0
next prev parent reply other threads:[~2025-01-16 15:39 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-16 15:39 [PATCH intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Maciej Fijalkowski
2025-01-16 15:39 ` [PATCH intel-net 1/3] ice: put Rx buffers after being done with current frame Maciej Fijalkowski
2025-01-16 17:10 ` Petr Oros
2025-01-16 17:13 ` Maciej Fijalkowski
2025-01-16 15:39 ` Maciej Fijalkowski [this message]
2025-01-16 15:39 ` [PATCH intel-net 3/3] ice: stop storing XDP verdict within ice_rx_buf Maciej Fijalkowski
2025-01-16 16:19 ` [PATCH intel-net 0/3] ice: fix Rx data path for heavy 9k MTU traffic Przemek Kitszel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250116153908.515848-3-maciej.fijalkowski@intel.com \
--to=maciej.fijalkowski@intel.com \
--cc=anthony.l.nguyen@intel.com \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=jacob.e.keller@intel.com \
--cc=jmaxwell@redhat.com \
--cc=magnus.karlsson@intel.com \
--cc=mschmidt@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=poros@redhat.com \
--cc=przemyslaw.kitszel@intel.com \
--cc=xudu@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox