From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "Björn Töpel" <bjorn.topel@gmail.com>
Cc: magnus.karlsson@intel.com, alexander.h.duyck@intel.com,
	alexander.duyck@gmail.com, john.fastabend@gmail.com, ast@fb.com,
	willemdebruijn.kernel@gmail.com, daniel@iogearbox.net,
	netdev@vger.kernel.org, "Björn Töpel" <bjorn.topel@intel.com>,
	michael.lundkvist@ericsson.com, jesse.brandeburg@intel.com,
	anjali.singhai@intel.com, jeffrey.b.shaw@intel.com,
	ferruh.yigit@intel.com, qi.z.zhang@intel.com, brouer@redhat.com,
	"Saeed Mahameed" <saeedm@mellanox.com>
Subject: Re: [RFC PATCH 00/24] Introducing AF_XDP support
Date: Thu, 1 Feb 2018 17:42:40 +0100
Message-ID: <20180201174240.6368bc66@redhat.com>
In-Reply-To: <20180131135356.19134-1-bjorn.topel@gmail.com>



On Wed, 31 Jan 2018 14:53:32 +0100 Björn Töpel <bjorn.topel@gmail.com> wrote:

> * In this RFC, do not use an XDP_REDIRECT action other than
>   bpf_xdpsk_redirect for XDP_DRV_ZC. This is because a zero-copy
>   allocated buffer will then be sent to a cpu id / queue_pair through
>   ndo_xdp_xmit that does not know this has been ZC allocated. It will
>   then do a page_free on it and you will get a crash. How to extend
>   ndo_xdp_xmit with some free/completion function that could be called
>   instead of page_free?  Hopefully, the same solution can be used here
>   as in the first problem item in this section.

I'm prototyping an extension of ndo_xdp_xmit with a free/completion
function call that looks at the xdp_rxq_info to determine which
allocator type the RX-NIC used (info per RXq), and invokes the
appropriate callback.
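
A minimal userspace sketch of that kind of dispatch, not the prototype
itself.  All names here (mem_type, rxq_info, frame, register_return_fn,
return_frame, count_return) are hypothetical stand-ins, and the real
xdp_rxq_info layout is not assumed.  It only shows the idea: the RX side
records an allocator type per RX queue, and TX completion looks that
type up and invokes whichever callback was registered for it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical allocator types; one is recorded per RX queue. */
enum mem_type {
	MEM_TYPE_PAGE,		/* plain page allocator */
	MEM_TYPE_PAGE_POOL,	/* recycled through a page_pool-like cache */
	MEM_TYPE_ZERO_COPY,	/* e.g. a zero-copy AF_XDP buffer */
	MEM_TYPE_MAX
};

/* Stand-in for the per-RXq info that the RX driver fills in. */
struct rxq_info {
	enum mem_type type;
	void *allocator;	/* opaque handle handed back to the callback */
};

/* Stand-in for a frame that is handed to another device for TX. */
struct frame {
	void *data;
	const struct rxq_info *rxq;	/* where the buffer came from */
};

typedef void (*return_fn)(void *allocator, void *data);

/* One free/completion callback slot per allocator type. */
static return_fn return_ops[MEM_TYPE_MAX];

/* Register the free/completion callback for one allocator type. */
int register_return_fn(enum mem_type type, return_fn fn)
{
	if ((unsigned int)type >= MEM_TYPE_MAX || !fn)
		return -1;
	return_ops[type] = fn;
	return 0;
}

/* Called at TX completion instead of an unconditional page free. */
int return_frame(const struct frame *f)
{
	return_fn fn;

	assert(f && f->rxq);
	if ((unsigned int)f->rxq->type >= MEM_TYPE_MAX)
		return -1;
	fn = return_ops[f->rxq->type];
	if (!fn)
		return -1;	/* nothing registered for this type */
	fn(f->rxq->allocator, f->data);
	return 0;
}

/* Demo callback: count returns in an int the allocator handle points to. */
void count_return(void *allocator, void *data)
{
	(void)data;
	++*(int *)allocator;
}
```

In this model a new buffer owner only needs one more enum value and one
register_return_fn() call; the TX driver code calling return_frame()
stays unchanged.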

I dusted off my old page_pool implementation (modifying it to run
outside the page allocator).  I implemented XDP_REDIRECT for mlx5,
extended xdp_rxq_info, and stored the needed info in ixgbe for DMA TX
completion.  I also disabled the mlx5 page cache and use the page_pool
instead.

It worked surprisingly well.  The test is: pktgen on an mlx5 100Gbit/s
NIC, and XDP_REDIRECT with the xdp_redirect_map sample, out a 10G ixgbe
NIC.

Performance is surprisingly good.  I'm testing DMA-TX completion on
ixgbe, which calls "xdp_return_frame", which is mapped to
page_pool_put_page(pool, page).  Here DMA-TX completion runs on CPU#3
and mlx5 RX runs on CPU#0.  (Internally page_pool uses ptr_ring, which
is what gives the good cross-CPU performance.)
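
To illustrate why a pointer ring suits this cross-CPU hand-off, here is
a simplified userspace model.  It is not the kernel ptr_ring API and
not the page_pool code: struct ring, ring_put and ring_get are
hypothetical names, and the model assumes exactly one producer (the TX
completion side) and one consumer (the RX side).  Like ptr_ring, it
marks an empty slot with NULL, so each side keeps its own index private
and the two CPUs only share the slot contents.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 4	/* tiny on purpose, to exercise the full case */

/*
 * Simplified single-producer/single-consumer pointer ring.
 * A NULL slot means "empty"; a non-NULL slot holds a returned page.
 */
struct ring {
	_Atomic(void *) slot[RING_SIZE];
	size_t prod;	/* touched only by the producer (TX completion) */
	size_t cons;	/* touched only by the consumer (RX refill) */
};

/* Producer side: hand a page back.  Returns -1 if the ring is full. */
int ring_put(struct ring *r, void *page)
{
	assert(page != NULL);	/* NULL is reserved as the empty marker */
	if (atomic_load_explicit(&r->slot[r->prod], memory_order_acquire))
		return -1;	/* consumer has not drained this slot yet */
	atomic_store_explicit(&r->slot[r->prod], page, memory_order_release);
	r->prod = (r->prod + 1) % RING_SIZE;
	return 0;
}

/* Consumer side: take a recycled page, or NULL if none is available. */
void *ring_get(struct ring *r)
{
	void *page = atomic_load_explicit(&r->slot[r->cons],
					  memory_order_acquire);

	if (!page)
		return NULL;
	/* Mark the slot empty so the producer may reuse it. */
	atomic_store_explicit(&r->slot[r->cons], (void *)NULL,
			      memory_order_release);
	r->cons = (r->cons + 1) % RING_SIZE;
	return page;
}
```

A zero-initialized struct ring is an empty ring.  When ring_put() fails
the caller has to fall back to some other way of freeing the page, and
when ring_get() returns NULL the RX side has to allocate a fresh one.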

Show adapter(s) (ixgbe2 mlx5p2) statistics (ONLY that changed!)
Ethtool(ixgbe2  ) stat:    810562253 (    810,562,253) <= tx_bytes /sec
Ethtool(ixgbe2  ) stat:    864600261 (    864,600,261) <= tx_bytes_nic /sec
Ethtool(ixgbe2  ) stat:     13509371 (     13,509,371) <= tx_packets /sec
Ethtool(ixgbe2  ) stat:     13509380 (     13,509,380) <= tx_pkts_nic /sec
Ethtool(mlx5p2  ) stat:     36827369 (     36,827,369) <= rx_64_bytes_phy /sec
Ethtool(mlx5p2  ) stat:   2356953271 (  2,356,953,271) <= rx_bytes_phy /sec
Ethtool(mlx5p2  ) stat:     23313782 (     23,313,782) <= rx_discards_phy /sec
Ethtool(mlx5p2  ) stat:         3019 (          3,019) <= rx_out_of_buffer /sec
Ethtool(mlx5p2  ) stat:     36827395 (     36,827,395) <= rx_packets_phy /sec
Ethtool(mlx5p2  ) stat:   2356924099 (  2,356,924,099) <= rx_prio0_bytes /sec
Ethtool(mlx5p2  ) stat:     13513560 (     13,513,560) <= rx_prio0_packets /sec
Ethtool(mlx5p2  ) stat:    810820253 (    810,820,253) <= rx_vport_unicast_bytes /sec
Ethtool(mlx5p2  ) stat:     13513672 (     13,513,672) <= rx_vport_unicast_packets /sec
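
As a sanity check on the counters above, the per-second byte and packet
rates can be divided to get an average packet size, and the TX rate can
be compared with the RX rate on the wire.  The helpers below are
hypothetical names written for this check only.  With the numbers
above, the driver-level counters come out at 60 bytes per packet and
the NIC/PHY counters at 64 bytes; the 4-byte difference is presumably
the Ethernet FCS, which is an assumption and not stated in the output.

```c
#include <assert.h>
#include <stdint.h>

/* Average bytes per packet, rounded to the nearest integer. */
uint64_t avg_pkt_bytes(uint64_t bytes_per_sec, uint64_t pkts_per_sec)
{
	assert(pkts_per_sec > 0);
	return (bytes_per_sec + pkts_per_sec / 2) / pkts_per_sec;
}

/* Share of "whole" made up by "part", in parts per thousand (truncated). */
uint64_t permille(uint64_t part, uint64_t whole)
{
	assert(whole > 0);
	return part * 1000 / whole;
}
```

For example, avg_pkt_bytes(810562253, 13509371) gives 60 for the ixgbe
tx_bytes and tx_packets counters, and permille(13509371, 36827395)
gives 366, i.e. roughly 37% of the packets arriving at the mlx5 PHY are
transmitted out the ixgbe port in this test.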

If I only disabled the mlx5 page cache (no page_pool), then single-flow
performance was 6Mpps, and when I started two flows the collective
performance dropped to 4Mpps, because we hit the page allocator lock
(further negative scaling occurs).

If I keep the mlx5 cache, I see between 7 and 11Mpps, varying with the
ixgbe TX-ring size and DMA-completion interrupt levels.


For AF_XDP, we just register another free/completion callback function.
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
