From: Tariq Toukan <tariqt@nvidia.com>
To: "David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Eric Dumazet <edumazet@google.com>,
"Andrew Lunn" <andrew+netdev@lunn.ch>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
Tariq Toukan <tariqt@nvidia.com>, <netdev@vger.kernel.org>,
<linux-rdma@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
Moshe Shemesh <moshe@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
Vlad Dogaru <vdogaru@nvidia.com>,
Yevgeny Kliteynik <kliteyn@nvidia.com>,
Gal Pressman <gal@nvidia.com>
Subject: [PATCH net-next 00/10] net/mlx5: HWS, Complex Matchers and rehash mechanism fixes
Date: Sun, 11 May 2025 22:38:00 +0300
Message-ID: <1746992290-568936-1-git-send-email-tariqt@nvidia.com>
Hi,
This series by Yevgeny adds Hardware Steering support for Complex
Matchers/Rules (patches 1-5), and rehash mechanism fixes (patches 6-10).
See detailed descriptions by Yevgeny below [1][2].
Regards,
Tariq
[1]
Motivation:
----------
A matcher can match a certain set of match parameters. However,
the number and size of match params for a single matcher are
limited — all the parameters must fit within a single definer.
A common example of this limitation is IPv6 address matching, where
matching both source and destination IPs requires more bits than a
single definer can support.
SW Steering addresses this limitation by chaining multiple Steering
Table Entries (STEs) within the same matcher, where each STE matches
on a subset of the parameters.
In HW Steering, such chaining is not possible — the matcher's STEs
are managed in a hash table, and a single definer is used to calculate
the hash index for STEs.
Overview:
--------
To address this limitation in HW Steering, we introduce
*Complex Matchers*, which consist of two chained matchers. This allows
matching on twice as many parameters. Complex Matchers are filled with
*Complex Rules* — rules that are split into two parts and inserted into
their respective matchers.
The first half of the Complex Matcher is a regular matcher and points
to the second half, which is an *Isolated Matcher*. An Isolated Matcher
has its own isolated table and is accessible only by traffic coming
from the first half of the Complex Matcher.
This splitting of matchers/rules into multiple parts is transparent to
users. It is hidden behind the BWC HWS API. It becomes visible only
when dumping steering debug information, where the Complex Matcher
appears as two separate matchers: one in the user-created table and
another in its isolated table.
Implementation Details:
----------------------
All user actions are performed on the second part of the rules only.
The first part handles matching and applies two actions: modify header
(set metadata, see details below) and go-to-table (directing traffic
to the isolated table containing the isolated matcher).
Rule updates (updating rule actions) are applied to the second part
of the rule since user-provided actions are not executed in the first
matcher.
We use the REG_C_6 metadata register to set and match on a unique
per-rule tag (see details below).
Splitting rules into two parts introduces new challenges:
1. Invalid Combinations
Consider two rules with different matching values:
- Rule 1: A+B
- Rule 2: C+D
Let's split the rules into two parts as follows:
    |-----Complex Matcher-------|
    |                           |
    | 1st matcher   2nd matcher |
    |    |---|         |---|    |
    |    | A |         | B |    |
    |    |---|  -----> |---|    |
    |    | C |         | D |    |
    |    |---|         |---|    |
    |                           |
    |---------------------------|
Splitting these rules results in invalid combinations, A+D and C+B.
Any packet that matches on A is forwarded to the 2nd matcher, where it
tries to match on B (which is legal, and is what the user asked for),
but it also tries to match on D (which is not what the user asked
for). To resolve this, we assign a unique tag to each rule on the
first matcher and match on these tags on the second matcher:
    |----------|       |---------|
    |    A     |       | B, TagA |
    | action:  |       |         |
    | set TagA |       |         |
    |----------|  -->  |---------|
    |    C     |       | D, TagB |
    | action:  |       |         |
    | set TagB |       |         |
    |----------|       |---------|
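The tagging scheme above can be illustrated with a small userspace
sketch. This is not the driver code: the structures, the table
contents and the complex_match() helper are hypothetical and only
model the two-stage lookup. The first stage matches one value and
records a tag (standing in for the REG_C_6 write), and the second
stage matches the other value together with that tag. The use_tags
flag exists only to show what goes wrong without tags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical first-matcher entry: matches one value, sets a tag. */
struct first_entry {
	char match;	/* e.g. 'A' or 'C' */
	uint32_t tag;	/* value the "set metadata" action would write */
};

/* Hypothetical second-matcher entry: matches a value AND a tag. */
struct second_entry {
	char match;	/* e.g. 'B' or 'D' */
	uint32_t tag;	/* tag expected from the first matcher */
};

static const struct first_entry first_matcher[] = {
	{ 'A', 1 },	/* Rule 1, first part: set TagA */
	{ 'C', 2 },	/* Rule 2, first part: set TagB */
};

static const struct second_entry second_matcher[] = {
	{ 'B', 1 },	/* Rule 1, second part: B + TagA */
	{ 'D', 2 },	/* Rule 2, second part: D + TagB */
};

/*
 * Return true if a packet carrying values (p1, p2) hits a complete rule.
 * With use_tags == false the second stage ignores the tag, which models
 * the naive split that allows the invalid combinations.
 */
static bool complex_match(char p1, char p2, bool use_tags)
{
	uint32_t tag = 0;
	bool hit = false;
	size_t i;

	/* First stage: match p1 and remember the tag it sets. */
	for (i = 0; i < ARRAY_SIZE(first_matcher); i++) {
		if (first_matcher[i].match == p1) {
			tag = first_matcher[i].tag;
			hit = true;
			break;
		}
	}
	if (!hit)
		return false;

	/* Second stage: match p2 and, when enabled, the tag as well. */
	for (i = 0; i < ARRAY_SIZE(second_matcher); i++) {
		if (second_matcher[i].match != p2)
			continue;
		if (!use_tags || second_matcher[i].tag == tag)
			return true;
	}
	return false;
}
```

In this model, complex_match('A', 'D', false) hits even though no rule
A+D was ever inserted, while complex_match('A', 'D', true) misses and
complex_match('A', 'B', true) still hits.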
2. Duplicated Entries
Consider two rules with overlapping values:
- Rule 1: A+B
- Rule 2: A+D
Let's split the rules into two parts as follows:
    |---|       |---|
    | A |       | B |
    |---|  -->  |---|
    |   |       | D |
    |---|       |---|
This leads to duplicated entries on the first matcher, which HWS
doesn't allow: a subsequent deletion of either rule would delete
the only entry in the first matcher, leaving the remaining rule
broken. To address this, we use a reference count for entries in the
first matcher and delete STEs only when their refcount reaches zero.
Both challenges are resolved by having a per-matcher data structure
(implemented with rhashtable) that manages refcounts for the first part
of the rules and holds unique tags (managed via IDA) for these rules to
set and to match on the second matcher.
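As a rough illustration of this bookkeeping, here is a minimal
userspace sketch. It is an assumption-laden model, not the code in the
series: a small fixed array with linear search stands in for the
rhashtable, the slot index plus one stands in for an IDA-allocated
tag, and a single integer key stands in for the first-part match
values. The names first_part_get() and first_part_put() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FIRST_PARTS 8	/* hypothetical capacity for this sketch */

struct first_part {
	bool used;
	uint32_t key;		/* stands in for the first-part match values */
	uint32_t tag;		/* unique among live entries, never 0 */
	unsigned int refcount;	/* number of rules sharing this first part */
};

struct first_part_table {
	struct first_part slots[MAX_FIRST_PARTS];
};

/*
 * Take a reference on the first part identified by key.
 * Returns its tag, or 0 if the table is full. *need_insert is set to
 * true only when the entry is new, i.e. when the caller would have to
 * write a new entry into the first matcher.
 */
static uint32_t first_part_get(struct first_part_table *t, uint32_t key,
			       bool *need_insert)
{
	struct first_part *free_slot = NULL;
	size_t i;

	*need_insert = false;
	for (i = 0; i < MAX_FIRST_PARTS; i++) {
		struct first_part *s = &t->slots[i];

		if (s->used && s->key == key) {
			s->refcount++;		/* shared first part */
			return s->tag;
		}
		if (!s->used && !free_slot)
			free_slot = s;
	}
	if (!free_slot)
		return 0;

	free_slot->used = true;
	free_slot->key = key;
	free_slot->tag = (uint32_t)(free_slot - t->slots) + 1;
	free_slot->refcount = 1;
	*need_insert = true;
	return free_slot->tag;
}

/*
 * Drop a reference. Returns true when the last reference is gone and
 * the caller would have to delete the entry from the first matcher.
 */
static bool first_part_put(struct first_part_table *t, uint32_t key)
{
	size_t i;

	for (i = 0; i < MAX_FIRST_PARTS; i++) {
		struct first_part *s = &t->slots[i];

		if (!s->used || s->key != key)
			continue;
		assert(s->refcount > 0);
		if (--s->refcount > 0)
			return false;
		s->used = false;	/* tag becomes reusable */
		return true;
	}
	return false;	/* unknown key: nothing to delete */
}
```

With rules A+B and A+D, the second first_part_get() for A returns the
same tag without requesting a new insertion, and only the second
first_part_put() for A reports that the first-matcher entry can be
deleted.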
Limitations:
-----------
This implementation uses metadata register REG_C_6, so the register
must not be used anywhere along a flow that might require a Complex
Matcher.
The number and size of match parameters remain limited — now
constrained by what can be represented by two definers instead of one.
This architectural limitation arises from the structure of Complex
Matchers. If future requirements demand more parameters, Complex
Matchers can be extended beyond two matchers.
Additionally, there is an implementation limit of 32 match parameters
per matcher (disregarding parameter size). This limit can be lifted
if needed.
Patches:
-------
- Patches 1-3: small additions/refactoring in preparation for
Complex Matcher: expose mlx5hws_table_ft_set_next_ft() in a header,
add a definer function to convert a field name enum to a string,
and expose the polling function mlx5hws_bwc_queue_poll() in a header.
- Patch 4: in preparation for Complex Matcher, this patch adds
support for Isolated Matcher.
- Patch 5: the main patch - Complex Matchers implementation.
[2]
Patch 6: fix the case where rule insertion failed, but rehash
could not be initiated because the number of rules in the table
was below the rehash threshold.
Patch 7: fix the case where many rules inserted in parallel
would require rehash, due to the way rules were counted.
Patch 8: fix the case where rules required action template
extension in parallel, leading to unneeded extensions with the
same templates.
Patch 9: refactor and simplify the rehash loop.
Patch 10: dump error completion details, which helps a lot
in understanding what went wrong, especially during rehash.
Yevgeny Kliteynik (10):
net/mlx5: HWS, expose function mlx5hws_table_ft_set_next_ft in header
net/mlx5: HWS, add definer function to get field name str
net/mlx5: HWS, expose polling function in header file
net/mlx5: HWS, introduce isolated matchers
net/mlx5: HWS, support complex matchers
net/mlx5: HWS, force rehash when rule insertion failed
net/mlx5: HWS, fix counting of rules in the matcher
net/mlx5: HWS, fix redundant extension of action templates
net/mlx5: HWS, rework rehash loop
net/mlx5: HWS, dump bad completion details
.../mellanox/mlx5/core/steering/hws/bwc.c | 330 ++--
.../mellanox/mlx5/core/steering/hws/bwc.h | 11 +
.../mlx5/core/steering/hws/bwc_complex.c | 1348 ++++++++++++++++-
.../mlx5/core/steering/hws/bwc_complex.h | 21 +
.../mellanox/mlx5/core/steering/hws/definer.c | 212 +++
.../mellanox/mlx5/core/steering/hws/definer.h | 2 +
.../mellanox/mlx5/core/steering/hws/matcher.c | 278 +++-
.../mellanox/mlx5/core/steering/hws/matcher.h | 9 +
.../mellanox/mlx5/core/steering/hws/mlx5hws.h | 2 +
.../mellanox/mlx5/core/steering/hws/send.c | 122 +-
.../mellanox/mlx5/core/steering/hws/send.h | 1 +
.../mellanox/mlx5/core/steering/hws/table.c | 16 +-
.../mellanox/mlx5/core/steering/hws/table.h | 5 +
13 files changed, 2178 insertions(+), 179 deletions(-)
base-commit: 0b28182c73a3d013bcabbb890dc1070a8388f55a
--
2.31.1