public inbox for linux-kernel@vger.kernel.org
From: Jiri Pirko <jiri@resnulli.us>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Asbjørn Sloth Tønnesen" <ast@fiberby.net>,
	"Cong Wang" <xiyou.wangcong@gmail.com>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	llu@fiberby.dk, "Vlad Buslov" <vladbu@nvidia.com>,
	"Marcelo Ricardo Leitner" <mleitner@redhat.com>
Subject: Re: [PATCH net-next 3/3] net: sched: make skip_sw actually skip software
Date: Fri, 16 Feb 2024 13:57:39 +0100
Message-ID: <Zc9bw8eHa5z_xh6Y@nanopsycho>
In-Reply-To: <CAM0EoMmyGwA9Q=RibR+Fc41_dPZyhBRWiBEejSbPsS9NhaUFVQ@mail.gmail.com>

Thu, Feb 15, 2024 at 06:49:05PM CET, jhs@mojatatu.com wrote:
>On Thu, Feb 15, 2024 at 11:06 AM Asbjørn Sloth Tønnesen <ast@fiberby.net> wrote:
>>
>> TC filters come in 3 variants:
>> - no flag (no opinion, process wherever possible)
>> - skip_hw (do not process filter by hardware)
>> - skip_sw (do not process filter by software)
>>
>> However skip_sw is implemented so that the skip_sw
>> flag can first be checked, after it has been matched.
>>
>> IMHO it's common when using skip_sw, to use it on all rules.
>>
>> So if all filters in a block is skip_sw filters, then
>> we can bail early, we can thus avoid having to match
>> the filters, just to check for the skip_sw flag.
>>
>>  +----------------------------+--------+--------+--------+
>>  | Test description           | Pre    | Post   | Rel.   |
>>  |                            | kpps   | kpps   | chg.   |
>>  +----------------------------+--------+--------+--------+
>>  | basic forwarding + notrack | 1264.9 | 1277.7 |  1.01x |
>>  | switch to eswitch mode     | 1067.1 | 1071.0 |  1.00x |
>>  | add ingress qdisc          | 1056.0 | 1059.1 |  1.00x |
>>  +----------------------------+--------+--------+--------+
>>  | 1 non-matching rule        |  927.9 | 1057.1 |  1.14x |
>>  | 10 non-matching rules      |  495.8 | 1055.6 |  2.13x |
>>  | 25 non-matching rules      |  280.6 | 1053.5 |  3.75x |
>>  | 50 non-matching rules      |  162.0 | 1055.7 |  6.52x |
>>  | 100 non-matching rules     |   87.7 | 1019.0 | 11.62x |
>>  +----------------------------+--------+--------+--------+
>>
>> perf top (100 n-m skip_sw rules - pre patch):
>>   25.57%  [kernel]  [k] __skb_flow_dissect
>>   20.77%  [kernel]  [k] rhashtable_jhash2
>>   14.26%  [kernel]  [k] fl_classify
>>   13.28%  [kernel]  [k] fl_mask_lookup
>>    6.38%  [kernel]  [k] memset_orig
>>    3.22%  [kernel]  [k] tcf_classify
>>
>> perf top (100 n-m skip_sw rules - post patch):
>>    4.28%  [kernel]  [k] __dev_queue_xmit
>>    3.80%  [kernel]  [k] check_preemption_disabled
>>    3.68%  [kernel]  [k] nft_do_chain
>>    3.08%  [kernel]  [k] __netif_receive_skb_core.constprop.0
>>    2.59%  [kernel]  [k] mlx5e_xmit
>>    2.48%  [kernel]  [k] mlx5e_skb_from_cqe_mpwrq_nonlinear
>>
>
>The concept makes sense - but i am wondering when you have a mix of
>skip_sw and skip_hw if it makes more sense to just avoid looking up
>skip_sw at all in the s/w datapath? Potentially by separating the
>hashes for skip_sw/hw. I know it's a deeper surgery - but would be

Yeah, there could be 2 hashes: skip_sw and the rest.
The rest is the only one that needs to be looked up in the kernel datapath;
skip_sw is just for the control path.
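
A rough userspace sketch of that split, just to make the idea concrete.
Everything here is hypothetical (the demo_ names, the DEMO_SKIP_SW flag, and
the fixed-size arrays standing in for real hash tables); it is not the
cls_flower code, only an illustration that the per-packet lookup never sees
skip_sw entries while the control path can still find them:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SKIP_SW 0x1u	/* hypothetical flag bit for this sketch */
#define DEMO_MAX     16		/* arbitrary capacity for this sketch */

struct demo_filter {
	uint32_t key;
	uint32_t flags;
};

/* A plain array stands in for a hash table to keep the sketch short. */
struct demo_table {
	struct demo_filter slots[DEMO_MAX];
	size_t n;
};

struct demo_block {
	struct demo_table sw;		/* "rest": consulted per packet */
	struct demo_table hw_only;	/* skip_sw: control path only */
};

/* Control path: pick the table once, at insert time. */
static void demo_insert(struct demo_block *b, uint32_t key, uint32_t flags)
{
	struct demo_table *t = (flags & DEMO_SKIP_SW) ? &b->hw_only : &b->sw;

	assert(t->n < DEMO_MAX);
	t->slots[t->n].key = key;
	t->slots[t->n].flags = flags;
	t->n++;
}

static const struct demo_filter *demo_table_find(const struct demo_table *t,
						 uint32_t key)
{
	for (size_t i = 0; i < t->n; i++)
		if (t->slots[i].key == key)
			return &t->slots[i];
	return NULL;
}

/* Datapath: only the "rest" table is searched. */
static const struct demo_filter *demo_classify(const struct demo_block *b,
					       uint32_t key)
{
	return demo_table_find(&b->sw, key);
}

/* Control path (get/dump/delete): both tables are visible. */
static const struct demo_filter *demo_get(const struct demo_block *b,
					  uint32_t key)
{
	const struct demo_filter *f = demo_table_find(&b->sw, key);

	return f ? f : demo_table_find(&b->hw_only, key);
}
```

With this layout the software path pays nothing for skip_sw rules even in a
mixed block, at the cost of touching every control-path operation.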

But is it worth the effort? I mean, until now, nobody seemed to care. If
this patchset solves the problem for this use case, I think it is enough.

In that case, I'm fine with this patch:

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
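
For reference, the check in this patch boils down to comparing two per-block
counters. A minimal userspace sketch, assuming C11 atomics in place of the
kernel's atomic_t; the cnt_ names are made up for illustration, and in the
series itself the counters are maintained by the earlier patches:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct cnt_block {
	atomic_int filtercnt;	/* all filters attached to the block */
	atomic_int skipswcnt;	/* the subset installed with skip_sw */
};

/* Control path: bump the counters when a filter is added. */
static void cnt_filter_added(struct cnt_block *b, bool skip_sw)
{
	atomic_fetch_add(&b->filtercnt, 1);
	if (skip_sw)
		atomic_fetch_add(&b->skipswcnt, 1);
}

/* Control path: undo the accounting when a filter is removed. */
static void cnt_filter_removed(struct cnt_block *b, bool skip_sw)
{
	atomic_fetch_sub(&b->filtercnt, 1);
	if (skip_sw)
		atomic_fetch_sub(&b->skipswcnt, 1);
}

/* Datapath: true when no filter in the block can match in software. */
static bool cnt_block_has_skip_sw_only(struct cnt_block *b)
{
	return b && atomic_load(&b->filtercnt) == atomic_load(&b->skipswcnt);
}
```

In this sketch an empty block (0 == 0) also counts as skip_sw only, and the
two loads are not taken as a single atomic snapshot.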



>more general purpose....unless i am missing something
>
>> Test setup:
>>  DUT: Intel Xeon D-1518 (2.20GHz) w/ Nvidia/Mellanox ConnectX-6 Dx 2x100G
>>  Data rate measured on switch (Extreme X690), and DUT connected as
>>  a router on a stick, with pktgen and pktsink as VLANs.
>>  Pktgen was in range 12.79 - 12.95 Mpps across all tests.
>>
>
>Hrm. Those are "tiny" numbers (25G @64B is about 3x that). What are
>the packet sizes?
>Perhaps the traffic generator is a limitation here?
>Also feels like you are doing exact matches? A sample flower rule
>would have helped.
>
>cheers,
>jamal
>> Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
>> ---
>>  include/net/pkt_cls.h | 5 +++++
>>  net/core/dev.c        | 3 +++
>>  2 files changed, 8 insertions(+)
>>
>> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
>> index a4ee43f493bb..a065da4df7ff 100644
>> --- a/include/net/pkt_cls.h
>> +++ b/include/net/pkt_cls.h
>> @@ -74,6 +74,11 @@ static inline bool tcf_block_non_null_shared(struct tcf_block *block)
>>         return block && block->index;
>>  }
>>
>> +static inline bool tcf_block_has_skip_sw_only(struct tcf_block *block)
>> +{
>> +       return block && atomic_read(&block->filtercnt) == atomic_read(&block->skipswcnt);
>> +}
>> +
>>  static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
>>  {
>>         WARN_ON(tcf_block_shared(block));
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index d8dd293a7a27..7cd014e5066e 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -3910,6 +3910,9 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>>         if (!miniq)
>>                 return ret;
>>
>> +       if (tcf_block_has_skip_sw_only(miniq->block))
>> +               return ret;
>> +
>>         tc_skb_cb(skb)->mru = 0;
>>         tc_skb_cb(skb)->post_ct = false;
>>         tcf_set_drop_reason(skb, *drop_reason);
>> --
>> 2.43.0
>>
