From: Liang Chen <liangchen.linux@gmail.com>
To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, linyunsheng@huawei.com
Cc: hawk@kernel.org, ilias.apalodimas@linaro.org,
daniel@iogearbox.net, ast@kernel.org, netdev@vger.kernel.org,
liangchen.linux@gmail.com
Subject: [RFC PATCH net-next v2 2/2] net: veth: Improving page pool pages recycling
Date: Tue, 1 Aug 2023 14:19:32 +0800
Message-ID: <20230801061932.10335-2-liangchen.linux@gmail.com>
In-Reply-To: <20230801061932.10335-1-liangchen.linux@gmail.com>
Page pool is supported for veth, but for the XDP_TX and XDP_REDIRECT
cases the pages are not effectively recycled, as the "ethtool -S" page
pool statistics show:
NIC statistics:
    rx_pp_alloc_fast: 18041186
    rx_pp_alloc_slow: 286369
    rx_pp_recycle_ring: 0
    rx_pp_recycle_released_ref: 18327555
This failure to recycle page pool pages is a result of the code snippet
below, which converts page pool pages into regular pages and releases
the skb data structure:
    veth_xdp_get(xdp);
    consume_skb(skb);
The reason is that some skbs received from the veth peer are not backed
by page pool pages, and remain so after conversion to an xdp frame. To
avoid confusing __xdp_return with a mix of regular pages and page pool
pages, all of them are converted to regular pages, so registering the
xdp memory model as MEM_TYPE_PAGE_SHARED is sufficient.
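For reference, this is roughly how __xdp_return() in net/core/xdp.c
dispatches on the registered memory type (a condensed, illustrative
sketch, not part of this patch; exact details vary across kernel
versions). It shows why pages returned through a MEM_TYPE_PAGE_SHARED
model go down the page_frag_free() path and never reach the page
pool's recycle logic:

    /* Illustrative sketch of the __xdp_return() dispatch */
    static void xdp_return_sketch(void *data, struct xdp_mem_info *mem,
                                  bool napi_direct)
    {
            struct page *page;

            switch (mem->type) {
            case MEM_TYPE_PAGE_SHARED:
                    page_frag_free(data);   /* regular page-frag page */
                    break;
            case MEM_TYPE_PAGE_POOL:
                    page = virt_to_head_page(data);
                    /* recycled into the pool the page belongs to */
                    page_pool_put_full_page(page->pp, page, napi_direct);
                    break;
            case MEM_TYPE_PAGE_ORDER0:
                    put_page(virt_to_page(data));   /* plain page */
                    break;
            }
    }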
If we replace the veth_xdp_get()/consume_skb() pair above with
kfree_skb_partial, releasing only the skb data structure, we can retain
the original page pool page behavior. However, simply switching the xdp
memory model to MEM_TYPE_PAGE_POOL is not a solution, as explained
above. Therefore, we introduce an additional MEM_TYPE_PAGE_POOL model
for each rq.
In addition, to avoid mixing pages from page pools with different
xdp_mem_id values, page pool pages coming directly from the peer are
still converted into regular pages. This is uncommon: most of the time
they are reallocated in veth_convert_skb_to_xdp_buff.
The following tests were conducted, using pktgen to generate traffic,
to evaluate the performance improvement once page pool pages are
successfully recycled in scenarios involving XDP_TX, XDP_REDIRECT, and
AF_XDP.
Test environment setup:

    ns1              ns2
    veth0  <-peer->  veth1
    veth2  <-peer->  veth3
Test results:

pktgen -> veth1 -> veth0(XDP_TX) -> veth1(XDP_DROP)
    without PP recycle: 1,780,392
    with PP recycle:    1,984,680
    improvement:        ~10%

pktgen -> veth1 -> veth0(XDP_TX) -> veth1(XDP_PASS)
    without PP recycle: 1,433,491
    with PP recycle:    1,511,680
    improvement:        5~6%

pktgen -> veth1 -> veth0(XDP_REDIRECT) -> veth2 -> veth3(XDP_DROP)
    without PP recycle: 1,527,708
    with PP recycle:    1,672,101
    improvement:        ~10%

pktgen -> veth1 -> veth0(XDP_REDIRECT) -> veth2 -> veth3(XDP_PASS)
    without PP recycle: 1,325,804
    with PP recycle:    1,392,704
    improvement:        ~5.5%

pktgen -> veth1 -> veth0(AF_XDP) -> user space(DROP)
    without PP recycle: 1,607,609
    with PP recycle:    1,736,957
    improvement:        ~8%
Additionally, the performance improvement was measured for the case
where converting to an xdp_buff does not require a buffer copy and the
original skb uses regular pages, i.e. page pool recycling is not
involved. This still gives around a 2% improvement, attributed to the
change from consume_skb to kfree_skb_partial.
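For reference, kfree_skb_partial() with head_stolen == true releases
only the skb's state and returns the sk_buff itself to its slab cache,
leaving the data buffer untouched, whereas consume_skb() would release
the data as well. A condensed sketch of the function from
net/core/skbuff.c (illustrative only; the slab cache name varies across
kernel versions):

    void kfree_skb_partial(struct sk_buff *skb, bool head_stolen)
    {
            if (head_stolen) {
                    /* caller keeps the head: free only skb state */
                    skb_release_head_state(skb);
                    kmem_cache_free(skbuff_cache, skb);
            } else {
                    __kfree_skb(skb);   /* frees the data as well */
            }
    }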
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
---
Changes from v1:
- pp pages from the peers are still converted into regular pages.
---
drivers/net/veth.c | 48 ++++++++++++++++++++++++++++++++++++++++------
1 file changed, 42 insertions(+), 6 deletions(-)
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 509e901da41d..ea1b344e5db4 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -62,6 +62,7 @@ struct veth_rq {
struct net_device *dev;
struct bpf_prog __rcu *xdp_prog;
struct xdp_mem_info xdp_mem;
+ struct xdp_mem_info xdp_mem_pp;
struct veth_rq_stats stats;
bool rx_notify_masked;
struct ptr_ring xdp_ring;
@@ -836,6 +837,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
struct bpf_prog *xdp_prog;
struct veth_xdp_buff vxbuf;
struct xdp_buff *xdp = &vxbuf.xdp;
+ struct sk_buff *skb_orig;
u32 act, metalen;
int off;
@@ -848,6 +850,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
goto out;
}
+ skb_orig = skb;
__skb_push(skb, skb->data - skb_mac_header(skb));
if (veth_convert_skb_to_xdp_buff(rq, xdp, &skb))
goto drop;
@@ -862,9 +865,18 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
case XDP_PASS:
break;
case XDP_TX:
- veth_xdp_get(xdp);
- consume_skb(skb);
- xdp->rxq->mem = rq->xdp_mem;
+ if (skb != skb_orig) {
+ xdp->rxq->mem = rq->xdp_mem_pp;
+ kfree_skb_partial(skb, true);
+ } else if (!skb->pp_recycle) {
+ xdp->rxq->mem = rq->xdp_mem;
+ kfree_skb_partial(skb, true);
+ } else {
+ veth_xdp_get(xdp);
+ consume_skb(skb);
+ xdp->rxq->mem = rq->xdp_mem;
+ }
+
if (unlikely(veth_xdp_tx(rq, xdp, bq) < 0)) {
trace_xdp_exception(rq->dev, xdp_prog, act);
stats->rx_drops++;
@@ -874,9 +886,18 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
rcu_read_unlock();
goto xdp_xmit;
case XDP_REDIRECT:
- veth_xdp_get(xdp);
- consume_skb(skb);
- xdp->rxq->mem = rq->xdp_mem;
+ if (skb != skb_orig) {
+ xdp->rxq->mem = rq->xdp_mem_pp;
+ kfree_skb_partial(skb, true);
+ } else if (!skb->pp_recycle) {
+ xdp->rxq->mem = rq->xdp_mem;
+ kfree_skb_partial(skb, true);
+ } else {
+ veth_xdp_get(xdp);
+ consume_skb(skb);
+ xdp->rxq->mem = rq->xdp_mem;
+ }
+
if (xdp_do_redirect(rq->dev, xdp, xdp_prog)) {
stats->rx_drops++;
goto err_xdp;
@@ -1061,6 +1082,14 @@ static int __veth_napi_enable_range(struct net_device *dev, int start, int end)
goto err_page_pool;
}
+ for (i = start; i < end; i++) {
+ err = xdp_reg_mem_model(&priv->rq[i].xdp_mem_pp,
+ MEM_TYPE_PAGE_POOL,
+ priv->rq[i].page_pool);
+ if (err)
+ goto err_reg_mem;
+ }
+
for (i = start; i < end; i++) {
struct veth_rq *rq = &priv->rq[i];
@@ -1082,6 +1111,10 @@ static int __veth_napi_enable_range(struct net_device *dev, int start, int end)
for (i--; i >= start; i--)
ptr_ring_cleanup(&priv->rq[i].xdp_ring, veth_ptr_free);
i = end;
+err_reg_mem:
+ for (i--; i >= start; i--)
+ xdp_unreg_mem_model(&priv->rq[i].xdp_mem_pp);
+ i = end;
err_page_pool:
for (i--; i >= start; i--) {
page_pool_destroy(priv->rq[i].page_pool);
@@ -1117,6 +1150,9 @@ static void veth_napi_del_range(struct net_device *dev, int start, int end)
ptr_ring_cleanup(&rq->xdp_ring, veth_ptr_free);
}
+ for (i = start; i < end; i++)
+ xdp_unreg_mem_model(&priv->rq[i].xdp_mem_pp);
+
for (i = start; i < end; i++) {
page_pool_destroy(priv->rq[i].page_pool);
priv->rq[i].page_pool = NULL;
--
2.40.1