From: Simon Horman <horms@kernel.org>
To: Petr Machata <petrm@nvidia.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, Amit Cohen <amcohen@nvidia.com>,
Ido Schimmel <idosch@nvidia.com>, Jiri Pirko <jiri@resnulli.us>,
Alexander Zubkov <green@qrator.net>,
mlxsw@nvidia.com
Subject: Re: [PATCH net 5/6] mlxsw: spectrum_acl_erp: Fix object nesting warning
Date: Sat, 8 Jun 2024 10:01:50 +0100 [thread overview]
Message-ID: <20240608090150.GR27689@kernel.org> (raw)
In-Reply-To: <c0c27909a09b9a47e03beb643b83784f75c7952c.1717684365.git.petrm@nvidia.com>
On Thu, Jun 06, 2024 at 04:49:42PM +0200, Petr Machata wrote:
> From: Ido Schimmel <idosch@nvidia.com>
>
> ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM
> (A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can
> contain more ACLs (i.e., tc filters), but the number of masks in each
> region (i.e., tc chain) is limited.
>
> In order to mitigate the effects of the above limitation, the device
> allows filters to share a single mask if their masks only differ in up
> to 8 consecutive bits. For example, dst_ip/25 can be represented using
> dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the
> number of masks being used (and therefore does not support mask
> aggregation), but can contain a limited number of filters.
>
> The driver uses the "objagg" library to perform the mask aggregation by
> passing it objects that consist of the filter's mask and whether the
> filter is to be inserted into the A-TCAM or the C-TCAM since filters in
> different TCAMs cannot share a mask.
>
> The set of created objects is dependent on the insertion order of the
> filters and is not necessarily optimal. Therefore, the driver will
> periodically ask the library to compute a more optimal set ("hints") by
> looking at all the existing objects.
>
> When the library asks the driver whether two objects can be aggregated
> the driver only compares the provided masks and ignores the A-TCAM /
> C-TCAM indication. This is the right thing to do since the goal is to
> move as many filters as possible to the A-TCAM. The driver also forbids
> two identical masks from being aggregated since this can only happen if
> one was intentionally put in the C-TCAM to avoid a conflict in the
> A-TCAM.
>
> The above can result in the following set of hints:
>
> H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta
> H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta
>
> After getting the hints from the library the driver will start migrating
> filters from one region to another while consulting the computed hints
> and instructing the device to perform a lookup in both regions during
> the transition.
>
> Assuming a filter with mask X is being migrated into the A-TCAM in the
> new region, the hints lookup will return H1. Since H2 is the parent of
> H1, the library will try to find the object associated with it and
> create it if necessary in which case another hints lookup (recursive)
> will be performed. This hints lookup for {mask Y, A-TCAM} will either
> return H2 or H3 since the driver passes the library an object comparison
> function that ignores the A-TCAM / C-TCAM indication.
>
> This can eventually lead to nested objects which are not supported by
> the library [1].
>
> Fix by removing the object comparison function from both the driver and
> the library as the driver was the only user. That way the lookup will
> only return exact matches.
>
> I do not have a reliable reproducer that can reproduce the issue in a
> timely manner, but before the fix the issue would reproduce in several
> minutes and with the fix it does not reproduce in over an hour.
>
> Note that the current usefulness of the hints is limited because they
> include the C-TCAM indication and represent aggregation that cannot
> actually happen. This will be addressed in net-next.
>
> [1]
> WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0
> Modules linked in:
> CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42
> Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
> Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
> RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0
> [...]
> Call Trace:
> <TASK>
> __objagg_obj_get+0x2bb/0x580
> objagg_obj_get+0xe/0x80
> mlxsw_sp_acl_erp_mask_get+0xb5/0xf0
> mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0
> mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
> mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
> mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
> process_one_work+0x151/0x370
>
> Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> Reviewed-by: Amit Cohen <amcohen@nvidia.com>
> Tested-by: Alexander Zubkov <green@qrator.net>
> Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
next prev parent reply other threads:[~2024-06-08 9:01 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-06 14:49 [PATCH net 0/6] mlxsw: ACL fixes Petr Machata
2024-06-06 14:49 ` [PATCH net 1/6] lib: objagg: Fix spelling Petr Machata
2024-06-08 9:00 ` Simon Horman
2024-06-06 14:49 ` [PATCH net 2/6] lib: test_objagg: " Petr Machata
2024-06-08 9:00 ` Simon Horman
2024-06-06 14:49 ` [PATCH net 3/6] mlxsw: spectrum_acl_atcam: Fix wrong comment Petr Machata
2024-06-08 9:01 ` Simon Horman
2024-06-06 14:49 ` [PATCH net 4/6] lib: objagg: Fix general protection fault Petr Machata
2024-06-08 9:01 ` Simon Horman
2024-06-06 14:49 ` [PATCH net 5/6] mlxsw: spectrum_acl_erp: Fix object nesting warning Petr Machata
2024-06-08 9:01 ` Simon Horman [this message]
2024-06-06 14:49 ` [PATCH net 6/6] mlxsw: spectrum_acl: Fix ACL scale regression and firmware errors Petr Machata
2024-06-08 9:01 ` Simon Horman
2024-06-10 10:20 ` [PATCH net 0/6] mlxsw: ACL fixes patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240608090150.GR27689@kernel.org \
--to=horms@kernel.org \
--cc=amcohen@nvidia.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=green@qrator.net \
--cc=idosch@nvidia.com \
--cc=jiri@resnulli.us \
--cc=kuba@kernel.org \
--cc=mlxsw@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=petrm@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).