netdev.vger.kernel.org archive mirror
From: Simon Horman <horms@kernel.org>
To: "Asbjørn Sloth Tønnesen" <ast@fiberby.net>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jiri Pirko <jiri@resnulli.us>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Vlad Buslov <vladbu@nvidia.com>,
	Marcelo Ricardo Leitner <mleitner@redhat.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	llu@fiberby.dk
Subject: Re: [PATCH net-next v4 3/3] net: sched: make skip_sw actually skip software
Date: Wed, 27 Mar 2024 13:52:21 +0000
Message-ID: <20240327135221.GG403975@kernel.org>
In-Reply-To: <20240325204740.1393349-4-ast@fiberby.net>

On Mon, Mar 25, 2024 at 08:47:36PM +0000, Asbjørn Sloth Tønnesen wrote:
> TC filters come in 3 variants:
> - no flag (try to process in hardware, but fall back to software)
> - skip_hw (do not process filter by hardware)
> - skip_sw (do not process filter by software)
> 
> However, skip_sw is implemented so that the skip_sw
> flag can only be checked after the filter has been matched.
> 
> IMHO, when skip_sw is used, it is commonly used on all rules.
> 
> So if all filters in a block are skip_sw filters, then
> we can bail out early, and thus avoid having to match
> the filters just to check for the skip_sw flag.
> 
> This patch adds a bypass for when only TC skip_sw rules
> are used. The bypass is guarded by a static key to avoid
> harming other workloads.
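
For readers following the description above, here is a minimal userspace
sketch of the bypass idea. It is not the patch's code. It assumes the
block keeps a filter counter and a skip_sw counter (as the titles of
patches 1/3 and 2/3 suggest), and it models the static key as a plain
global counter. All names (struct block, block_add_filter,
block_del_filter, block_bypass_sw, bypass_check_needed) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are taken from the patch. */
struct block {
	unsigned int filter_cnt;  /* all filters attached to the block */
	unsigned int skip_sw_cnt; /* filters attached with skip_sw set */
};

/* Plain counter standing in for the static key: the number of blocks whose
 * filters are all skip_sw. A real static key patches the branch itself. */
static unsigned int bypass_check_needed;

static bool block_only_skip_sw(const struct block *b)
{
	return b->filter_cnt > 0 && b->filter_cnt == b->skip_sw_cnt;
}

/* Re-evaluate the block after a counter change and adjust the key stand-in. */
static void block_update_key(const struct block *b, bool was_bypassable)
{
	bool now_bypassable = block_only_skip_sw(b);

	if (now_bypassable && !was_bypassable) {
		bypass_check_needed++;
	} else if (!now_bypassable && was_bypassable) {
		assert(bypass_check_needed > 0);
		bypass_check_needed--;
	}
}

void block_add_filter(struct block *b, bool skip_sw)
{
	bool was = block_only_skip_sw(b);

	b->filter_cnt++;
	if (skip_sw)
		b->skip_sw_cnt++;
	block_update_key(b, was);
}

void block_del_filter(struct block *b, bool skip_sw)
{
	bool was = block_only_skip_sw(b);

	assert(b->filter_cnt > 0);
	b->filter_cnt--;
	if (skip_sw) {
		assert(b->skip_sw_cnt > 0);
		b->skip_sw_cnt--;
	}
	block_update_key(b, was);
}

/* Returns true when software classification can be skipped for this block. */
bool block_bypass_sw(const struct block *b)
{
	if (bypass_check_needed == 0) /* cheap global check first */
		return false;
	return block_only_skip_sw(b);
}
```

In this model a caller would test block_bypass_sw() before walking the
filter chain. Workloads with no all-skip_sw block only pay for the
global check, which is the role the static key plays in the description.
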
> 
> There are 3 ways that a packet from a skip_sw ruleset can
> end up in the kernel path. Although the case of sending packets
> to a non-existent chain is only improved by a few percent,
> I believe it's worth optimizing the trap and fall-through
> use-cases.
> 
>  +----------------------------+--------+--------+--------+
>  | Test description           | Pre-   | Post-  | Rel.   |
>  |                            | kpps   | kpps   | chg.   |
>  +----------------------------+--------+--------+--------+
>  | basic forwarding + notrack | 3589.3 | 3587.9 |  1.00x |
>  | switch to eswitch mode     | 3081.8 | 3094.7 |  1.00x |
>  | add ingress qdisc          | 3042.9 | 3063.6 |  1.01x |
>  | tc forward in hw / skip_sw |37024.7 |37028.4 |  1.00x |
>  | tc forward in sw / skip_hw | 3245.0 | 3245.3 |  1.00x |
>  +----------------------------+--------+--------+--------+
>  | tests with only skip_sw rules below:                  |
>  +----------------------------+--------+--------+--------+
>  | 1 non-matching rule        | 2694.7 | 3058.7 |  1.14x |
>  | 1 n-m rule, match trap     | 2611.2 | 3323.1 |  1.27x |
>  | 1 n-m rule, goto non-chain | 2886.8 | 2945.9 |  1.02x |
>  | 5 non-matching rules       | 1958.2 | 3061.3 |  1.56x |
>  | 5 n-m rules, match trap    | 1911.9 | 3327.0 |  1.74x |
>  | 5 n-m rules, goto non-chain| 2883.1 | 2947.5 |  1.02x |
>  | 10 non-matching rules      | 1466.3 | 3062.8 |  2.09x |
>  | 10 n-m rules, match trap   | 1444.3 | 3317.9 |  2.30x |
>  | 10 n-m rules,goto non-chain| 2883.1 | 2939.5 |  1.02x |
>  | 25 non-matching rules      |  838.5 | 3058.9 |  3.65x |
>  | 25 n-m rules, match trap   |  824.5 | 3323.0 |  4.03x |
>  | 25 n-m rules,goto non-chain| 2875.8 | 2944.7 |  1.02x |
>  | 50 non-matching rules      |  488.1 | 3054.7 |  6.26x |
>  | 50 n-m rules, match trap   |  484.9 | 3318.5 |  6.84x |
>  | 50 n-m rules,goto non-chain| 2884.1 | 2939.7 |  1.02x |
>  +----------------------------+--------+--------+--------+
> 
> perf top (25 n-m skip_sw rules - pre patch):
>   20.39%  [kernel]  [k] __skb_flow_dissect
>   16.43%  [kernel]  [k] rhashtable_jhash2
>   10.58%  [kernel]  [k] fl_classify
>   10.23%  [kernel]  [k] fl_mask_lookup
>    4.79%  [kernel]  [k] memset_orig
>    2.58%  [kernel]  [k] tcf_classify
>    1.47%  [kernel]  [k] __x86_indirect_thunk_rax
>    1.42%  [kernel]  [k] __dev_queue_xmit
>    1.36%  [kernel]  [k] nft_do_chain
>    1.21%  [kernel]  [k] __rcu_read_lock
> 
> perf top (25 n-m skip_sw rules - post patch):
>    5.12%  [kernel]  [k] __dev_queue_xmit
>    4.77%  [kernel]  [k] nft_do_chain
>    3.65%  [kernel]  [k] dev_gro_receive
>    3.41%  [kernel]  [k] check_preemption_disabled
>    3.14%  [kernel]  [k] mlx5e_skb_from_cqe_mpwrq_nonlinear
>    2.88%  [kernel]  [k] __netif_receive_skb_core.constprop.0
>    2.49%  [kernel]  [k] mlx5e_xmit
>    2.15%  [kernel]  [k] ip_forward
>    1.95%  [kernel]  [k] mlx5e_tc_restore_tunnel
>    1.92%  [kernel]  [k] vlan_gro_receive
> 
> Test setup:
>  DUT: Intel Xeon D-1518 (2.20GHz) w/ Nvidia/Mellanox ConnectX-6 Dx 2x100G
>  Data rate measured on switch (Extreme X690), and DUT connected as
>  a router on a stick, with pktgen and pktsink as VLANs.
>  Pktgen-dpdk was in range 36.6-37.7 Mpps 64B packets across all tests.
>  Full test data at https://files.fiberby.net/ast/2024/tc_skip_sw/v2_tests/
> 
> Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>

Reviewed-by: Simon Horman <horms@kernel.org>


Thread overview: 11+ messages
2024-03-25 20:47 [PATCH net-next v4 0/3] make skip_sw actually skip software Asbjørn Sloth Tønnesen
2024-03-25 20:47 ` [PATCH net-next v4 1/3] net: sched: cls_api: add skip_sw counter Asbjørn Sloth Tønnesen
2024-03-27 13:51   ` Simon Horman
2024-03-28  0:46   ` Marcelo Ricardo Leitner
2024-03-25 20:47 ` [PATCH net-next v4 2/3] net: sched: cls_api: add filter counter Asbjørn Sloth Tønnesen
2024-03-27 13:51   ` Simon Horman
2024-03-28  0:46   ` Marcelo Ricardo Leitner
2024-03-25 20:47 ` [PATCH net-next v4 3/3] net: sched: make skip_sw actually skip software Asbjørn Sloth Tønnesen
2024-03-27 13:52   ` Simon Horman [this message]
2024-03-28  0:46   ` Marcelo Ricardo Leitner
2024-03-29  9:50 ` [PATCH net-next v4 0/3] " patchwork-bot+netdevbpf