Re: [EXT] Re: [PATCH 1/1] net: fec: add initial XDP support

From: Jesper Dangaard Brouer <jbrouer@redhat.com>
To: Shenwei Wang <shenwei.wang@nxp.com>,
	Jesper Dangaard Brouer <jbrouer@redhat.com>,
	Andrew Lunn <andrew@lunn.ch>
Cc: brouer@redhat.com, "Joakim Zhang" <qiangqing.zhang@nxp.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"imx@lists.linux.dev" <imx@lists.linux.dev>,
	"Magnus Karlsson" <magnus.karlsson@gmail.com>,
	"Björn Töpel" <bjorn@kernel.org>,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>
Subject: Re: [EXT] Re: [PATCH 1/1] net: fec: add initial XDP support
Date: Tue, 4 Oct 2022 13:21:49 +0200
Message-ID: <4f7cf74d-95ca-f93f-7328-e0386348a06e@redhat.com>
In-Reply-To: <PAXPR04MB9185743919EC6DDA54FAC3B7895B9@PAXPR04MB9185.eurprd04.prod.outlook.com>


On 03/10/2022 14.49, Shenwei Wang wrote:
> Hi Jesper,
> 
>>>> On mvneta driver/platform we saw huge speedup replacing:
>>>>
>>>>      page_pool_release_page(rxq->page_pool, page); with
>>>>      skb_mark_for_recycle(skb);
>>>>
> 
> After replacing the page_pool_release_page with the
> skb_mark_for_recycle, I found something confused me a little in the
> testing result.
>
> I tested with the sample app of "xdpsock" under two modes: 
>  1. Native (xdpsock -i eth0). 
>  2. Skb-mode (xdpsock -S -i eth0).

Great that you are also testing AF_XDP, but do you have a particular
use-case that needs AF_XDP on this board?

What packet size was used in the results below?

> The following are the testing result:
>
>       With page_pool_release_page (pps)  With skb_mark_for_recycle (pps)
> 
>   SKB-Mode                          90K                            200K
>   Native                           190K                            190K
> 
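
To put the table above in per-packet terms, here is a small sketch that
converts each rate into average time per packet. The rates are the rounded
figures from the table; the function name is only illustrative, and the
result is a wall-clock average, not measured CPU time.

```python
# Packet rates (packets per second) copied from the table above.
RATES_PPS = {
    ("SKB-Mode", "page_pool_release_page"): 90_000,
    ("SKB-Mode", "skb_mark_for_recycle"): 200_000,
    ("Native", "page_pool_release_page"): 190_000,
    ("Native", "skb_mark_for_recycle"): 190_000,
}


def per_packet_us(pps: int) -> float:
    """Average time per packet, in microseconds, at the given packet rate."""
    return 1_000_000 / pps


if __name__ == "__main__":
    for (mode, variant), pps in RATES_PPS.items():
        print(f"{mode:8s} {variant:24s} {pps:>7d} pps "
              f"{per_packet_us(pps):6.2f} us/packet")
```

In SKB-mode this goes from roughly 11.1 us to 5.0 us per packet, i.e. about
6.1 us less per packet, while native stays at about 5.26 us.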

The default AF_XDP test with xdpsock is rxdrop IIRC.

Can you test the normal XDP code path and do an XDP_DROP test via the
samples tool 'xdp_rxq_info' with this command line:

   sudo ./xdp_rxq_info --dev eth42 --act XDP_DROP --read

And then the same with --skb-mode.

> The skb_mark_for_recycle solution boosted the performance of SKB-Mode
> to 200K+ PPS. That is even higher than the performance of Native
> solution.  Is this result reasonable? Do you have any clue why the
> SKB-Mode performance can go higher than that of Native one?

I might be able to explain this (Cc'ing the AF_XDP maintainers to keep
me honest).

When you say "native" *AF_XDP*, that isn't zero-copy AF_XDP.

Sure, XDP runs in native driver mode and redirects the raw frames into
the AF_XDP socket, but this isn't zero-copy AF_XDP. Thus, the packets
need to be copied into the AF_XDP buffers.

As soon as the frame or SKB (for generic XDP) has been copied, it is
released/freed by the AF_XDP/xsk code (either via xdp_return_buff() or
consume_skb()). Thus, it looks like it really pays off to recycle the
frame via page_pool, also for the SKB consume_skb() case.
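
A toy userspace model of why recycling helps, assuming the expensive part
is the DMA map/unmap pair. This is not kernel code and not the page_pool
API: ToyPagePool, alloc(), packet_done() and simulate() are hypothetical
names, and the model keeps only one page in flight at a time. It just
counts how often a page would have to be mapped or unmapped.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPagePool:
    """Userspace toy model; this is NOT the kernel page_pool API."""

    recycle: bool                      # True: keep pages (and mappings) in the pool
    free_pages: list = field(default_factory=list)
    next_page_id: int = 0
    dma_maps: int = 0                  # stand-in for DMA map operations
    dma_unmaps: int = 0                # stand-in for DMA unmap operations

    def alloc(self) -> int:
        """Hand out a page for RX, mapping a fresh one only if none is cached."""
        if self.free_pages:
            return self.free_pages.pop()   # recycled page, mapping still intact
        self.next_page_id += 1
        self.dma_maps += 1
        return self.next_page_id

    def packet_done(self, page: int) -> None:
        """Called once the payload has been copied and the frame/SKB is freed."""
        if self.recycle:
            self.free_pages.append(page)   # back to the pool, no unmap
        else:
            self.dma_unmaps += 1           # released from the pool, so unmapped


def simulate(recycle: bool, packets: int) -> ToyPagePool:
    """Receive `packets` frames one at a time and return the pool with its counters."""
    pool = ToyPagePool(recycle=recycle)
    for _ in range(packets):
        pool.packet_done(pool.alloc())
    return pool


if __name__ == "__main__":
    for recycle in (False, True):
        pool = simulate(recycle, packets=1000)
        print(f"recycle={recycle}: maps={pool.dma_maps} unmaps={pool.dma_unmaps}")
```

For 1000 packets the release variant maps and unmaps 1000 times each, while
the recycle variant maps once and never unmaps. How much that saves in
practice depends on how costly map/unmap is on the platform.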

I am still a little surprised that it can be faster than native AF_XDP,
as the SKB-mode ("XDP-generic") needs to call through a lot more software
layers and convert the SKB to look like an xdp_buff.

--Jesper



>> -----Original Message-----
>> From: Jesper Dangaard Brouer <jbrouer@redhat.com>
>> Sent: Thursday, September 29, 2022 1:55 PM
>> To: Shenwei Wang <shenwei.wang@nxp.com>; Jesper Dangaard Brouer
>> <jbrouer@redhat.com>; Andrew Lunn <andrew@lunn.ch>
>> Cc: brouer@redhat.com; Joakim Zhang <qiangqing.zhang@nxp.com>; David S.
>> Miller <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub
>> Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; Alexei
>> Starovoitov <ast@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>;
>> Jesper Dangaard Brouer <hawk@kernel.org>; John Fastabend
>> <john.fastabend@gmail.com>; netdev@vger.kernel.org; linux-
>> kernel@vger.kernel.org; imx@lists.linux.dev
>> Subject: Re: [EXT] Re: [PATCH 1/1] net: fec: add initial XDP support
>>
>> On 29/09/2022 17.52, Shenwei Wang wrote:
>>>
>>>> From: Jesper Dangaard Brouer <jbrouer@redhat.com>
>>>>
>>>> On 29/09/2022 15.26, Shenwei Wang wrote:
>>>>>
>>>>>> From: Andrew Lunn <andrew@lunn.ch>
>>>>>> Sent: Thursday, September 29, 2022 8:23 AM
>>>> [...]
>>>>>>
>>>>>>> I actually did some compare testing regarding the page pool for
>>>>>>> normal traffic.  So far I don't see significant improvement in the
>>>>>>> current implementation. The performance for large packets improves
>>>>>>> a little, and the performance for small packets get a little worse.
>>>>>>
>>>>>> What hardware was this for? imx51? imx6? imx7 Vybrid? These all use
>>>>>> the FEC.
>>>>>
>>>>> I tested on imx8qxp platform. It is ARM64.
>>>>
>>>> On mvneta driver/platform we saw huge speedup replacing:
>>>>
>>>>      page_pool_release_page(rxq->page_pool, page); with
>>>>      skb_mark_for_recycle(skb);
>>>>
>>>> As I mentioned: Today page_pool has SKB recycle support (you might
>>>> have looked at drivers that didn't utilize this yet), thus you don't
>>>> need to release the page (page_pool_release_page) here.  Instead you
>>>> could simply mark the SKB for recycling, unless the driver does some
>>>> page refcnt tricks I didn't notice.
>>>>
>>>> On the mvneta driver/platform the DMA unmap (in
>>>> page_pool_release_page) was very expensive. This imx8qxp platform
>>>> might have faster DMA unmap in case it is cache-coherent.
>>>>
>>>> I would be very interested in knowing if skb_mark_for_recycle() helps
>>>> on this platform, for normal network stack performance.
>>>>
>>>
>>> Did a quick compare testing for the following 3 scenarios:
>>
>> Thanks for doing this! :-)
>>
>>> 1. original implementation
>>>
>>> shenwei@5810:~$ iperf -c 10.81.16.245 -w 2m -i 1
>>> ------------------------------------------------------------
>>> Client connecting to 10.81.16.245, TCP port 5001
>>> TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
>>> ------------------------------------------------------------
>>> [  1] local 10.81.17.20 port 49154 connected with 10.81.16.245 port 5001
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  1] 0.0000-1.0000 sec   104 MBytes   868 Mbits/sec
>>> [  1] 1.0000-2.0000 sec   105 MBytes   878 Mbits/sec
>>> [  1] 2.0000-3.0000 sec   105 MBytes   881 Mbits/sec
>>> [  1] 3.0000-4.0000 sec   105 MBytes   879 Mbits/sec
>>> [  1] 4.0000-5.0000 sec   105 MBytes   878 Mbits/sec
>>> [  1] 5.0000-6.0000 sec   105 MBytes   878 Mbits/sec
>>> [  1] 6.0000-7.0000 sec   104 MBytes   875 Mbits/sec
>>> [  1] 7.0000-8.0000 sec   104 MBytes   875 Mbits/sec
>>> [  1] 8.0000-9.0000 sec   104 MBytes   873 Mbits/sec
>>> [  1] 9.0000-10.0000 sec   104 MBytes   875 Mbits/sec
>>> [  1] 0.0000-10.0073 sec  1.02 GBytes   875 Mbits/sec
>>>
>>> 2. Page pool with page_pool_release_page
>>>
>>> shenwei@5810:~$ iperf -c 10.81.16.245 -w 2m -i 1
>>> ------------------------------------------------------------
>>> Client connecting to 10.81.16.245, TCP port 5001
>>> TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
>>> ------------------------------------------------------------
>>> [  1] local 10.81.17.20 port 35924 connected with 10.81.16.245 port 5001
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  1] 0.0000-1.0000 sec   101 MBytes   849 Mbits/sec
>>> [  1] 1.0000-2.0000 sec   102 MBytes   860 Mbits/sec
>>> [  1] 2.0000-3.0000 sec   102 MBytes   860 Mbits/sec
>>> [  1] 3.0000-4.0000 sec   102 MBytes   859 Mbits/sec
>>> [  1] 4.0000-5.0000 sec   103 MBytes   863 Mbits/sec
>>> [  1] 5.0000-6.0000 sec   103 MBytes   864 Mbits/sec
>>> [  1] 6.0000-7.0000 sec   103 MBytes   863 Mbits/sec
>>> [  1] 7.0000-8.0000 sec   103 MBytes   865 Mbits/sec
>>> [  1] 8.0000-9.0000 sec   103 MBytes   862 Mbits/sec
>>> [  1] 9.0000-10.0000 sec   102 MBytes   856 Mbits/sec
>>> [  1] 0.0000-10.0246 sec  1.00 GBytes   858 Mbits/sec
>>>
>>>
>>> 3. page pool with skb_mark_for_recycle
>>>
>>> shenwei@5810:~$ iperf -c 10.81.16.245 -w 2m -i 1
>>> ------------------------------------------------------------
>>> Client connecting to 10.81.16.245, TCP port 5001
>>> TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
>>> ------------------------------------------------------------
>>> [  1] local 10.81.17.20 port 42724 connected with 10.81.16.245 port 5001
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  1] 0.0000-1.0000 sec   111 MBytes   931 Mbits/sec
>>> [  1] 1.0000-2.0000 sec   112 MBytes   935 Mbits/sec
>>> [  1] 2.0000-3.0000 sec   111 MBytes   934 Mbits/sec
>>> [  1] 3.0000-4.0000 sec   111 MBytes   934 Mbits/sec
>>> [  1] 4.0000-5.0000 sec   111 MBytes   934 Mbits/sec
>>> [  1] 5.0000-6.0000 sec   112 MBytes   935 Mbits/sec
>>> [  1] 6.0000-7.0000 sec   111 MBytes   934 Mbits/sec
>>> [  1] 7.0000-8.0000 sec   111 MBytes   933 Mbits/sec
>>> [  1] 8.0000-9.0000 sec   112 MBytes   935 Mbits/sec
>>> [  1] 9.0000-10.0000 sec   111 MBytes   933 Mbits/sec
>>> [  1] 0.0000-10.0069 sec  1.09 GBytes   934 Mbits/sec
>>
>> This is a very significant performance improvement (page pool with
>> skb_mark_for_recycle).  This is very close to the max goodput for a 1Gbit/s link.
>>
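
For reference, a back-of-the-envelope calculation of the TCP goodput ceiling
on a 1 Gbit/s link. The assumptions are mine and not from the test setup:
standard 1500-byte MTU, IPv4, no VLAN tag, and optionally the 12-byte TCP
timestamp option. The function name is illustrative.

```python
LINK_MBIT = 1000                 # 1 Gbit/s link rate, in Mbit/s
MTU = 1500                       # assumed standard Ethernet MTU
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble/SFD + header + FCS + inter-frame gap
IPV4_HDR = 20                    # IPv4 header without options
TCP_HDR = 20                     # TCP header without options
TCP_TIMESTAMP_OPT = 12           # assumed: TCP timestamps enabled


def max_goodput_mbit(tcp_options: int) -> float:
    """Upper bound on TCP payload rate (Mbit/s) for full-sized frames."""
    payload = MTU - IPV4_HDR - TCP_HDR - tcp_options
    wire_bytes = MTU + ETH_OVERHEAD      # bytes occupied on the wire per frame
    return LINK_MBIT * payload / wire_bytes


if __name__ == "__main__":
    print(f"with timestamps:    {max_goodput_mbit(TCP_TIMESTAMP_OPT):.1f} Mbit/s")
    print(f"without timestamps: {max_goodput_mbit(0):.1f} Mbit/s")
```

Under these assumptions the ceiling is about 941 Mbit/s with timestamps and
about 949 Mbit/s without, so the measured 934 Mbit/s is within about 1% of
the ceiling in the timestamp case.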
>>
>>> For small packet size (64 bytes), all three cases have almost the same result:
>>>
>>
>> To me this indicates that the DMA map/unmap operations on this platform are
>> indeed more expensive on larger packets.  That fits with what page_pool does:
>> it keeps the DMA mapping intact when recycling.
>>
>> The driver still needs DMA-sync, although I notice you set the page_pool
>> feature flag PP_FLAG_DMA_SYNC_DEV. This is good, as page_pool will try to
>> reduce the sync size where possible. E.g. in this SKB case it will reduce the
>> DMA-sync to max_len=FEC_ENET_RX_FRSIZE, which should also help performance.
>>
>>
>>> shenwei@5810:~$ iperf -c 10.81.16.245 -w 2m -i 1 -l 64
>>> ------------------------------------------------------------
>>> Client connecting to 10.81.16.245, TCP port 5001
>>> TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
>>> ------------------------------------------------------------
>>> [  1] local 10.81.17.20 port 58204 connected with 10.81.16.245 port 5001
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  1] 0.0000-1.0000 sec  36.9 MBytes   309 Mbits/sec
>>> [  1] 1.0000-2.0000 sec  36.6 MBytes   307 Mbits/sec
>>> [  1] 2.0000-3.0000 sec  36.6 MBytes   307 Mbits/sec
>>> [  1] 3.0000-4.0000 sec  36.5 MBytes   307 Mbits/sec
>>> [  1] 4.0000-5.0000 sec  37.1 MBytes   311 Mbits/sec
>>> [  1] 5.0000-6.0000 sec  37.2 MBytes   312 Mbits/sec
>>> [  1] 6.0000-7.0000 sec  37.1 MBytes   311 Mbits/sec
>>> [  1] 7.0000-8.0000 sec  37.1 MBytes   311 Mbits/sec
>>> [  1] 8.0000-9.0000 sec  37.1 MBytes   312 Mbits/sec
>>> [  1] 9.0000-10.0000 sec  37.2 MBytes   312 Mbits/sec
>>> [  1] 0.0000-10.0097 sec   369 MBytes   310 Mbits/sec
>>>
>>> Regards,
>>> Shenwei
>>>
>>>
>>>>>> By small packets, do you mean those under the copybreak limit?
>>>>>>
>>>>>> Please provide some benchmark numbers with your next patchset.
>>>>>
>>>>> Yes, the packet size is 64 bytes and it is under the copybreak limit.
>>>>> As the impact is not significant, I would prefer to remove the
>>>>> copybreak  logic.
>>>>
>>>> +1 to removing this logic if possible, due to maintenance cost.
>>>>
>>>> --Jesper
>>>
>
