From: Simon Horman <horms@kernel.org>
To: Petr Machata <petrm@nvidia.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, Amit Cohen <amcohen@nvidia.com>,
Ido Schimmel <idosch@nvidia.com>, Jiri Pirko <jiri@resnulli.us>,
Alexander Zubkov <green@qrator.net>,
mlxsw@nvidia.com
Subject: Re: [PATCH net 5/6] mlxsw: spectrum_acl_erp: Fix object nesting warning
Date: Sat, 8 Jun 2024 10:01:50 +0100 [thread overview]
Message-ID: <20240608090150.GR27689@kernel.org> (raw)
In-Reply-To: <c0c27909a09b9a47e03beb643b83784f75c7952c.1717684365.git.petrm@nvidia.com>
On Thu, Jun 06, 2024 at 04:49:42PM +0200, Petr Machata wrote:
> From: Ido Schimmel <idosch@nvidia.com>
>
> ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM
> (A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can
> contain more ACLs (i.e., tc filters), but the number of masks in each
> region (i.e., tc chain) is limited.
>
> In order to mitigate the effects of the above limitation, the device
> allows filters to share a single mask if their masks only differ in up
> to 8 consecutive bits. For example, dst_ip/25 can be represented using
> dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the
> number of masks being used (and therefore does not support mask
> aggregation), but can contain a limited number of filters.
>
> The driver uses the "objagg" library to perform the mask aggregation by
> passing it objects that consist of the filter's mask and whether the
> filter is to be inserted into the A-TCAM or the C-TCAM since filters in
> different TCAMs cannot share a mask.
>
> The set of created objects is dependent on the insertion order of the
> filters and is not necessarily optimal. Therefore, the driver will
> periodically ask the library to compute a more optimal set ("hints") by
> looking at all the existing objects.
>
> When the library asks the driver whether two objects can be aggregated
> the driver only compares the provided masks and ignores the A-TCAM /
> C-TCAM indication. This is the right thing to do since the goal is to
> move as many filters as possible to the A-TCAM. The driver also forbids
> two identical masks from being aggregated since this can only happen if
> one was intentionally put in the C-TCAM to avoid a conflict in the
> A-TCAM.
>
> The above can result in the following set of hints:
>
> H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta
> H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta
>
> After getting the hints from the library the driver will start migrating
> filters from one region to another while consulting the computed hints
> and instructing the device to perform a lookup in both regions during
> the transition.
>
> Assuming a filter with mask X is being migrated into the A-TCAM in the
> new region, the hints lookup will return H1. Since H2 is the parent of
> H1, the library will try to find the object associated with it and
> create it if necessary in which case another hints lookup (recursive)
> will be performed. This hints lookup for {mask Y, A-TCAM} will either
> return H2 or H3 since the driver passes the library an object comparison
> function that ignores the A-TCAM / C-TCAM indication.
>
> This can eventually lead to nested objects which are not supported by
> the library [1].
>
> Fix by removing the object comparison function from both the driver and
> the library as the driver was the only user. That way the lookup will
> only return exact matches.
>
> I do not have a reliable reproducer that can reproduce the issue in a
> timely manner, but before the fix the issue would reproduce in several
> minutes and with the fix it does not reproduce in over an hour.
>
> Note that the current usefulness of the hints is limited because they
> include the C-TCAM indication and represent aggregation that cannot
> actually happen. This will be addressed in net-next.
>
> [1]
> WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0
> Modules linked in:
> CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42
> Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
> Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
> RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0
> [...]
> Call Trace:
> <TASK>
> __objagg_obj_get+0x2bb/0x580
> objagg_obj_get+0xe/0x80
> mlxsw_sp_acl_erp_mask_get+0xb5/0xf0
> mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0
> mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
> mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
> mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
> process_one_work+0x151/0x370
>
> Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> Reviewed-by: Amit Cohen <amcohen@nvidia.com>
> Tested-by: Alexander Zubkov <green@qrator.net>
> Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
next prev parent reply other threads:[~2024-06-08 9:01 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-06 14:49 [PATCH net 0/6] mlxsw: ACL fixes Petr Machata
2024-06-06 14:49 ` [PATCH net 1/6] lib: objagg: Fix spelling Petr Machata
2024-06-08 9:00 ` Simon Horman
2024-06-06 14:49 ` [PATCH net 2/6] lib: test_objagg: " Petr Machata
2024-06-08 9:00 ` Simon Horman
2024-06-06 14:49 ` [PATCH net 3/6] mlxsw: spectrum_acl_atcam: Fix wrong comment Petr Machata
2024-06-08 9:01 ` Simon Horman
2024-06-06 14:49 ` [PATCH net 4/6] lib: objagg: Fix general protection fault Petr Machata
2024-06-08 9:01 ` Simon Horman
2024-06-06 14:49 ` [PATCH net 5/6] mlxsw: spectrum_acl_erp: Fix object nesting warning Petr Machata
2024-06-08 9:01 ` Simon Horman [this message]
2024-06-06 14:49 ` [PATCH net 6/6] mlxsw: spectrum_acl: Fix ACL scale regression and firmware errors Petr Machata
2024-06-08 9:01 ` Simon Horman
2024-06-10 10:20 ` [PATCH net 0/6] mlxsw: ACL fixes patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240608090150.GR27689@kernel.org \
--to=horms@kernel.org \
--cc=amcohen@nvidia.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=green@qrator.net \
--cc=idosch@nvidia.com \
--cc=jiri@resnulli.us \
--cc=kuba@kernel.org \
--cc=mlxsw@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=petrm@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.