From: Simon Horman <horms@kernel.org>
To: "Asbjørn Sloth Tønnesen" <ast@fiberby.net>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
    Cong Wang <xiyou.wangcong@gmail.com>,
    Jiri Pirko <jiri@resnulli.us>,
    Daniel Borkmann <daniel@iogearbox.net>,
    "David S. Miller" <davem@davemloft.net>,
    Eric Dumazet <edumazet@google.com>,
    Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
    Vlad Buslov <vladbu@nvidia.com>,
    Marcelo Ricardo Leitner <mleitner@redhat.com>,
    netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    llu@fiberby.dk
Subject: Re: [PATCH net-next v4 3/3] net: sched: make skip_sw actually skip software
Date: Wed, 27 Mar 2024 13:52:21 +0000
Message-ID: <20240327135221.GG403975@kernel.org>
In-Reply-To: <20240325204740.1393349-4-ast@fiberby.net>

On Mon, Mar 25, 2024 at 08:47:36PM +0000, Asbjørn Sloth Tønnesen wrote:
> TC filters come in 3 variants:
> - no flag (try to process in hardware, but fall back to software)
> - skip_hw (do not process filter by hardware)
> - skip_sw (do not process filter by software)
>
> However, skip_sw is implemented so that the skip_sw
> flag can only be checked after the filter has been matched.
>
> IMHO, when using skip_sw, it is common to use it on all rules.
>
> So if all filters in a block are skip_sw filters, then
> we can bail out early and avoid having to match
> the filters just to check for the skip_sw flag.
>
> This patch adds a bypass for when only TC skip_sw rules
> are used. The bypass is guarded by a static key to avoid
> harming other workloads.
>
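To illustrate the bypass condition described above, here is a minimal userspace sketch. It is not the patch's code: the struct, field, and function names are hypothetical, and the static key that guards the real check is deliberately left out. It only assumes that a block tracks how many filters it holds and how many of them are skip_sw.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a TC block; names are illustrative only. */
struct demo_block {
	unsigned int filter_cnt;   /* all filters attached to the block */
	unsigned int skip_sw_cnt;  /* filters carrying the skip_sw flag */
};

/* Account for a newly added filter. */
static void demo_add_filter(struct demo_block *b, bool skip_sw)
{
	b->filter_cnt++;
	if (skip_sw)
		b->skip_sw_cnt++;
}

/* Account for a removed filter. */
static void demo_del_filter(struct demo_block *b, bool skip_sw)
{
	b->filter_cnt--;
	if (skip_sw)
		b->skip_sw_cnt--;
}

/* Software classification can be skipped only if every filter is skip_sw. */
static bool demo_block_bypass_sw(const struct demo_block *b)
{
	return b->filter_cnt > 0 && b->filter_cnt == b->skip_sw_cnt;
}

/* Number of filters the software path would still have to match against. */
static unsigned int demo_sw_match_attempts(const struct demo_block *b)
{
	if (demo_block_bypass_sw(b))
		return 0;	/* bail out before matching anything */
	return b->filter_cnt;
}
```

With only skip_sw filters in the block, demo_sw_match_attempts() returns 0; adding a single filter without the flag turns the bypass off again, so mixed rulesets keep the existing behaviour.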
> There are 3 ways that a packet from a skip_sw ruleset can
> end up in the kernel path. Although the case of sending packets
> to a non-existent chain only improves by a few percent,
> I believe it's worth optimizing the trap and fall-through
> use-cases.
>
> +----------------------------+--------+--------+--------+
> | Test description           | Pre-   | Post-  | Rel.   |
> |                            | kpps   | kpps   | chg.   |
> +----------------------------+--------+--------+--------+
> | basic forwarding + notrack | 3589.3 | 3587.9 |  1.00x |
> | switch to eswitch mode     | 3081.8 | 3094.7 |  1.00x |
> | add ingress qdisc          | 3042.9 | 3063.6 |  1.01x |
> | tc forward in hw / skip_sw |37024.7 |37028.4 |  1.00x |
> | tc forward in sw / skip_hw | 3245.0 | 3245.3 |  1.00x |
> +----------------------------+--------+--------+--------+
> | tests with only skip_sw rules below:                  |
> +----------------------------+--------+--------+--------+
> | 1 non-matching rule        | 2694.7 | 3058.7 |  1.14x |
> | 1 n-m rule, match trap     | 2611.2 | 3323.1 |  1.27x |
> | 1 n-m rule, goto non-chain | 2886.8 | 2945.9 |  1.02x |
> | 5 non-matching rules       | 1958.2 | 3061.3 |  1.56x |
> | 5 n-m rules, match trap    | 1911.9 | 3327.0 |  1.74x |
> | 5 n-m rules, goto non-chain| 2883.1 | 2947.5 |  1.02x |
> | 10 non-matching rules      | 1466.3 | 3062.8 |  2.09x |
> | 10 n-m rules, match trap   | 1444.3 | 3317.9 |  2.30x |
> | 10 n-m rules,goto non-chain| 2883.1 | 2939.5 |  1.02x |
> | 25 non-matching rules      |  838.5 | 3058.9 |  3.65x |
> | 25 n-m rules, match trap   |  824.5 | 3323.0 |  4.03x |
> | 25 n-m rules,goto non-chain| 2875.8 | 2944.7 |  1.02x |
> | 50 non-matching rules      |  488.1 | 3054.7 |  6.26x |
> | 50 n-m rules, match trap   |  484.9 | 3318.5 |  6.84x |
> | 50 n-m rules,goto non-chain| 2884.1 | 2939.7 |  1.02x |
> +----------------------------+--------+--------+--------+
>
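For reading the table above: the "Rel. chg." column appears to be the post-patch rate divided by the pre-patch rate, rounded to two decimals. The small sketch below (helper names are hypothetical) reproduces that arithmetic under this assumption.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed definition of the "Rel. chg." column: post-patch / pre-patch. */
static double rel_change(double pre_kpps, double post_kpps)
{
	return post_kpps / pre_kpps;
}

/* Format the ratio the way the table prints it, e.g. "1.02x". */
static int format_rel_change(char *buf, size_t len,
			     double pre_kpps, double post_kpps)
{
	return snprintf(buf, len, "%.2fx", rel_change(pre_kpps, post_kpps));
}
```

For the "25 non-matching rules" row, 3058.9 / 838.5 is roughly 3.648, which rounds to the 3.65x shown in the table.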
> perf top (25 n-m skip_sw rules - pre patch):
> 20.39% [kernel] [k] __skb_flow_dissect
> 16.43% [kernel] [k] rhashtable_jhash2
> 10.58% [kernel] [k] fl_classify
> 10.23% [kernel] [k] fl_mask_lookup
> 4.79% [kernel] [k] memset_orig
> 2.58% [kernel] [k] tcf_classify
> 1.47% [kernel] [k] __x86_indirect_thunk_rax
> 1.42% [kernel] [k] __dev_queue_xmit
> 1.36% [kernel] [k] nft_do_chain
> 1.21% [kernel] [k] __rcu_read_lock
>
> perf top (25 n-m skip_sw rules - post patch):
> 5.12% [kernel] [k] __dev_queue_xmit
> 4.77% [kernel] [k] nft_do_chain
> 3.65% [kernel] [k] dev_gro_receive
> 3.41% [kernel] [k] check_preemption_disabled
> 3.14% [kernel] [k] mlx5e_skb_from_cqe_mpwrq_nonlinear
> 2.88% [kernel] [k] __netif_receive_skb_core.constprop.0
> 2.49% [kernel] [k] mlx5e_xmit
> 2.15% [kernel] [k] ip_forward
> 1.95% [kernel] [k] mlx5e_tc_restore_tunnel
> 1.92% [kernel] [k] vlan_gro_receive
>
> Test setup:
> DUT: Intel Xeon D-1518 (2.20GHz) w/ Nvidia/Mellanox ConnectX-6 Dx 2x100G
> Data rate measured on switch (Extreme X690), and DUT connected as
> a router on a stick, with pktgen and pktsink as VLANs.
> Pktgen-dpdk rate was in the range 36.6-37.7 Mpps (64B packets) across all tests.
> Full test data at https://files.fiberby.net/ast/2024/tc_skip_sw/v2_tests/
>
> Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Simon Horman <horms@kernel.org>