Re: [RFC net-next] net: veth: reduce page_pool memory footprint using half page per-buffer

From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
To: Yunsheng Lin <linyunsheng@huawei.com>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>,
	Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	netdev@vger.kernel.org, bpf@vger.kernel.org, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	ast@kernel.org, daniel@iogearbox.net, hawk@kernel.org,
	john.fastabend@gmail.com
Subject: Re: [RFC net-next] net: veth: reduce page_pool memory footprint using half page per-buffer
Date: Wed, 17 May 2023 16:17:59 +0200
Message-ID: <ZGTiF+B46FA3TOj6@lore-desk>
In-Reply-To: <d6348bf0-0da8-c0ae-ce78-7f4620837f66@huawei.com>

> On 2023/5/17 6:52, Lorenzo Bianconi wrote:
> >> On Mon, May 15, 2023 at 01:24:20PM +0200, Lorenzo Bianconi wrote:
> >>>> On 2023/5/12 21:08, Lorenzo Bianconi wrote:
> >>>>> In order to reduce page_pool memory footprint, rely on
> >>>>> page_pool_dev_alloc_frag routine and reduce buffer size
> >>>>> (VETH_PAGE_POOL_FRAG_SIZE) to PAGE_SIZE / 2 in order to consume one page
> >>>>
> >>>> Is there any performance improvement besides the memory saving? As it
> >>>> should reduce TLB misses, I wonder if the reduction in TLB misses can even
> >>>> out the cost of the extra frag reference count handling for the
> >>>> frag support?
> >>>
> >>> Reducing the requested headroom to 192 (from 256), we have a nice improvement in
> >>> the 1500B frame case, while it is mostly the same in the case of paged skb
> >>> (e.g. MTU 8000B).
> >>
> >> Can you define 'nice improvement' ? ;)
> >> Show us numbers or improvement in %.
> > 
> > I am testing this RFC patch in the scenario reported below:
> > 
> > iperf tcp tx --> veth0 --> veth1 (xdp_pass) --> iperf tcp rx
> > 
> > - 6.4.0-rc1 net-next:
> >   MTU 1500B: ~ 7.07 Gbps
> >   MTU 8000B: ~ 14.7 Gbps
> > 
> > - 6.4.0-rc1 net-next + page_pool frag support in veth:
> >   MTU 1500B: ~ 8.57 Gbps
> >   MTU 8000B: ~ 14.5 Gbps
> > 
> 
> Thanks for sharing the data.
> Maybe using the new frag interface introduced in [1] brings
> back the performance for the MTU 8000B case.
> 
> 1. https://patchwork.kernel.org/project/netdevbpf/cover/20230516124801.2465-1-linyunsheng@huawei.com/
> 
> 
> I drafted a patch for veth to use the new frag interface, maybe that
> will show how veth can make use of it. Would you give it a try to see
> if there is any performance improvement for the MTU 8000B case? Thanks.
> 
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -737,8 +737,8 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
>             skb_shinfo(skb)->nr_frags ||
>             skb_headroom(skb) < XDP_PACKET_HEADROOM) {
>                 u32 size, len, max_head_size, off;
> +               struct page_pool_frag *pp_frag;
>                 struct sk_buff *nskb;
> -               struct page *page;
>                 int i, head_off;
> 
>                 /* We need a private copy of the skb and data buffers since
> @@ -752,14 +752,20 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
>                 if (skb->len > PAGE_SIZE * MAX_SKB_FRAGS + max_head_size)
>                         goto drop;
> 
> +               size = min_t(u32, skb->len, max_head_size);
> +               size += VETH_XDP_HEADROOM;
> +               size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> +
>                 /* Allocate skb head */
> -               page = page_pool_dev_alloc_pages(rq->page_pool);
> -               if (!page)
> +               pp_frag = page_pool_dev_alloc_frag(rq->page_pool, size);
> +               if (!pp_frag)
>                         goto drop;
> 
> -               nskb = napi_build_skb(page_address(page), PAGE_SIZE);
> +               nskb = napi_build_skb(page_address(pp_frag->page) + pp_frag->offset,
> +                                     pp_frag->truesize);
>                 if (!nskb) {
> -                       page_pool_put_full_page(rq->page_pool, page, true);
> +                       page_pool_put_full_page(rq->page_pool, pp_frag->page,
> +                                               true);
>                         goto drop;
>                 }
> 
> @@ -782,16 +788,18 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
>                 len = skb->len - off;
> 
>                 for (i = 0; i < MAX_SKB_FRAGS && off < skb->len; i++) {
> -                       page = page_pool_dev_alloc_pages(rq->page_pool);
> -                       if (!page) {
> +                       size = min_t(u32, len, PAGE_SIZE);
> +
> +                       pp_frag = page_pool_dev_alloc_frag(rq->page_pool, size);
> +                       if (!pp_frag) {
>                                 consume_skb(nskb);
>                                 goto drop;
>                         }
> 
> -                       size = min_t(u32, len, PAGE_SIZE);
> -                       skb_add_rx_frag(nskb, i, page, 0, size, PAGE_SIZE);
> -                       if (skb_copy_bits(skb, off, page_address(page),
> -                                         size)) {
> +                       skb_add_rx_frag(nskb, i, pp_frag->page, pp_frag->offset,
> +                                       size, pp_frag->truesize);
> +                       if (skb_copy_bits(skb, off, page_address(pp_frag->page) +
> +                                         pp_frag->offset, size)) {
>                                 consume_skb(nskb);
>                                 goto drop;
>                         }
> @@ -1047,6 +1055,8 @@ static int veth_create_page_pool(struct veth_rq *rq)
>                 return err;
>         }

IIUC, the code here uses a variable length for the linear part (at most one page),
while it always uses a full page (except for the last fragment) for the paged
area, correct? I have not tested it yet, but I do not think we will get a significant
improvement: with MTU set to 8000B, my tests show mostly the same throughput whether
we use page_pool fragments (14.5 Gbps) or page_pool full pages (14.7 Gbps).
Am I missing something?
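
To put the throughput figures quoted above in relative terms, a small userspace
helper along these lines could be used. It is only a sketch for the arithmetic,
not kernel code, and pct_change() is a name made up for this illustration:

```c
#include <assert.h>

/*
 * Relative change of `after` with respect to `before`, in percent.
 * Hypothetical helper for illustration only; not part of veth or page_pool.
 */
double pct_change(double before, double after)
{
	assert(before > 0.0);	/* a zero baseline has no relative change */
	return (after - before) / before * 100.0;
}
```

With the numbers from the test above, pct_change(7.07, 8.57) is roughly +21% for
MTU 1500B and pct_change(14.7, 14.5) is roughly -1.4% for MTU 8000B.
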
What we are discussing with Jesper is to try allocating an order-3 page from the pool and
rely on page_pool fragments, similar to what page_frag_cache is doing. I will look into it if
there are no strong 'red flags'.
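
As a rough illustration of the order-3 idea, the userspace model below carves
fixed-size buffers out of one larger backing block and counts how many blocks
are consumed. It is a sketch under stated assumptions, not the page_pool or
page_frag_cache implementation: the frag_model names are hypothetical, a 4 KiB
base page is assumed, and reference counting and recycling are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB base pages */

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct frag_model {
	size_t block_size;	/* bytes per backing block: page size << order */
	size_t offset;		/* next free byte inside the current block */
	unsigned int blocks;	/* backing blocks consumed so far */
};

void frag_model_init(struct frag_model *m, unsigned int order)
{
	m->block_size = (size_t)MODEL_PAGE_SIZE << order;
	m->offset = m->block_size;	/* forces a new block on first alloc */
	m->blocks = 0;
}

/*
 * Hand out `size` bytes from the current block and return the offset of
 * the buffer inside it. A new backing block is started when the request
 * does not fit in the space that is left.
 */
size_t frag_model_alloc(struct frag_model *m, size_t size)
{
	size_t off;

	assert(size > 0 && size <= m->block_size);

	if (m->offset + size > m->block_size) {
		m->blocks++;
		m->offset = 0;
	}

	off = m->offset;
	m->offset += size;
	return off;
}
```

In this model, with order 3 and PAGE_SIZE / 2 buffers, sixteen buffers fit in
one backing block, against two per block with order 0.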

Regards,
Lorenzo

> 
> +       page_pool_set_max_frag_size(rq->page_pool, PAGE_SIZE / 2);
> +
>         return 0;
>  }
> 
