From: Simon Horman <horms@kernel.org>
To: Petr Machata <petrm@nvidia.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	netdev@vger.kernel.org, Amit Cohen <amcohen@nvidia.com>,
	Ido Schimmel <idosch@nvidia.com>, Jiri Pirko <jiri@resnulli.us>,
	Alexander Zubkov <green@qrator.net>,
	mlxsw@nvidia.com
Subject: Re: [PATCH net 5/6] mlxsw: spectrum_acl_erp: Fix object nesting warning
Date: Sat, 8 Jun 2024 10:01:50 +0100
Message-ID: <20240608090150.GR27689@kernel.org>
In-Reply-To: <c0c27909a09b9a47e03beb643b83784f75c7952c.1717684365.git.petrm@nvidia.com>

On Thu, Jun 06, 2024 at 04:49:42PM +0200, Petr Machata wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM
> (A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can
> contain more ACLs (i.e., tc filters), but the number of masks in each
> region (i.e., tc chain) is limited.
> 
> In order to mitigate the effects of the above limitation, the device
> allows filters to share a single mask if their masks only differ in up
> to 8 consecutive bits. For example, dst_ip/25 can be represented using
> dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the
> number of masks being used (and therefore does not support mask
> aggregation), but can contain a limited number of filters.
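The sharing rule described above can be illustrated with a small model. This is only a sketch, not the driver's code: it assumes 32-bit masks, assumes the parent mask must be a subset of the child mask, and the names `prefix_mask`, `delta_span` and `MAX_DELTA_BITS` are hypothetical.

```python
MAX_DELTA_BITS = 8  # "up to 8 consecutive bits", per the description above


def prefix_mask(prefix_len: int, width: int = 32) -> int:
    """Return the mask for an IPv4-style prefix, e.g. /24 -> 0xFFFFFF00."""
    return ((1 << prefix_len) - 1) << (width - prefix_len)


def delta_span(parent_mask: int, child_mask: int):
    """Return (lowest_bit, length) of the run of bits that child_mask has
    and parent_mask lacks, or None if the masks cannot share (assumed rule).
    """
    if parent_mask & ~child_mask:
        return None  # parent has bits the child lacks: not a pure delta
    diff = child_mask & ~parent_mask
    if diff == 0:
        return None  # identical masks, no delta to encode
    lowest = (diff & -diff).bit_length() - 1  # index of lowest differing bit
    highest = diff.bit_length() - 1           # index of highest differing bit
    length = highest - lowest + 1
    if length > MAX_DELTA_BITS:
        return None  # differing bits do not fit in one 8-bit window
    return lowest, length


print(delta_span(prefix_mask(24), prefix_mask(25)))  # (7, 1)
print(delta_span(prefix_mask(16), prefix_mask(24)))  # (8, 8)
print(delta_span(prefix_mask(16), prefix_mask(25)))  # None
```

In this model dst_ip/25 can ride on dst_ip/24 with a 1-bit delta, while dst_ip/25 on top of dst_ip/16 would need 9 bits and is rejected.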
> 
> The driver uses the "objagg" library to perform the mask aggregation by
> passing it objects that consist of the filter's mask and whether the
> filter is to be inserted into the A-TCAM or the C-TCAM since filters in
> different TCAMs cannot share a mask.
> 
> The set of created objects is dependent on the insertion order of the
> filters and is not necessarily optimal. Therefore, the driver will
> periodically ask the library to compute a better set ("hints") by
> looking at all the existing objects.
> 
> When the library asks the driver whether two objects can be aggregated,
> the driver only compares the provided masks and ignores the A-TCAM /
> C-TCAM indication. This is the right thing to do since the goal is to
> move as many filters as possible to the A-TCAM. The driver also forbids
> two identical masks from being aggregated since this can only happen if
> one was intentionally put in the C-TCAM to avoid a conflict in the
> A-TCAM.
> 
> The above can result in the following set of hints:
> 
> H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta
> H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta
> 
> After getting the hints from the library, the driver will start migrating
> filters from one region to another while consulting the computed hints
> and instructing the device to perform a lookup in both regions during
> the transition.
> 
> Assuming a filter with mask X is being migrated into the A-TCAM in the
> new region, the hints lookup will return H1. Since H2 is the parent of
> H1, the library will try to find the object associated with it and
> create it if necessary, in which case another hints lookup (recursive)
> will be performed. This hints lookup for {mask Y, A-TCAM} will either
> return H2 or H3 since the driver passes the library an object comparison
> function that ignores the A-TCAM / C-TCAM indication.
> 
> This can eventually lead to nested objects which are not supported by
> the library [1].
> 
> Fix by removing the object comparison function from both the driver and
> the library as the driver was the only user. That way the lookup will
> only return exact matches.
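The difference between the two lookup behaviours can be shown with a toy model of the H1..H4 hints from the example above. This is a sketch under stated assumptions, not the objagg implementation: the `Hint` type, the list ordering that makes the loose lookup pick H3, and the helper names are all hypothetical.

```python
from typing import Callable, List, NamedTuple, Optional


class Hint(NamedTuple):
    name: str
    mask: str
    tcam: str               # "A" for A-TCAM, "C" for C-TCAM
    parent: Optional[str]   # name of the parent hint, if any


# H3 is listed first so that the loose lookup happens to find it before H2.
HINTS = [
    Hint("H3", "Y", "C", "H4"),
    Hint("H1", "X", "A", "H2"),
    Hint("H2", "Y", "A", None),
    Hint("H4", "Z", "A", None),
]
BY_NAME = {h.name: h for h in HINTS}


def lookup_loose(mask: str, tcam: str) -> Optional[Hint]:
    """Compare masks only, ignoring the A-TCAM / C-TCAM indication."""
    return next((h for h in HINTS if h.mask == mask), None)


def lookup_exact(mask: str, tcam: str) -> Optional[Hint]:
    """Return a hint only when both mask and TCAM indication match."""
    return next((h for h in HINTS if (h.mask, h.tcam) == (mask, tcam)), None)


def resolve(mask: str, tcam: str,
            lookup: Callable[[str, str], Optional[Hint]]) -> List[str]:
    """Follow parent links, re-looking up each parent by its key."""
    visited = []
    hint = lookup(mask, tcam)
    while hint is not None:
        visited.append(hint.name)
        if hint.parent is None:
            break
        parent = BY_NAME[hint.parent]
        hint = lookup(parent.mask, parent.tcam)
    return visited


print(resolve("X", "A", lookup_loose))  # ['H1', 'H3', 'H4']
print(resolve("X", "A", lookup_exact))  # ['H1', 'H2']
```

With the loose lookup, the parent of X resolves to a hint that itself has a parent, giving a two-level chain (nesting). With exact matching, the chain stops at H2 after one level.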
> 
> I do not have a reliable reproducer that can reproduce the issue in a
> timely manner, but before the fix the issue would reproduce in several
> minutes and with the fix it does not reproduce in over an hour.
> 
> Note that the current usefulness of the hints is limited because they
> include the C-TCAM indication and represent aggregation that cannot
> actually happen. This will be addressed in net-next.
> 
> [1]
> WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0
> Modules linked in:
> CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42
> Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
> Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
> RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0
> [...]
> Call Trace:
>  <TASK>
>  __objagg_obj_get+0x2bb/0x580
>  objagg_obj_get+0xe/0x80
>  mlxsw_sp_acl_erp_mask_get+0xb5/0xf0
>  mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0
>  mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
>  mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
>  mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
>  process_one_work+0x151/0x370
> 
> Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> Reviewed-by: Amit Cohen <amcohen@nvidia.com>
> Tested-by: Alexander Zubkov <green@qrator.net>
> Signed-off-by: Petr Machata <petrm@nvidia.com>

Reviewed-by: Simon Horman <horms@kernel.org>


Thread overview: 14+ messages
2024-06-06 14:49 [PATCH net 0/6] mlxsw: ACL fixes Petr Machata
2024-06-06 14:49 ` [PATCH net 1/6] lib: objagg: Fix spelling Petr Machata
2024-06-08  9:00   ` Simon Horman
2024-06-06 14:49 ` [PATCH net 2/6] lib: test_objagg: " Petr Machata
2024-06-08  9:00   ` Simon Horman
2024-06-06 14:49 ` [PATCH net 3/6] mlxsw: spectrum_acl_atcam: Fix wrong comment Petr Machata
2024-06-08  9:01   ` Simon Horman
2024-06-06 14:49 ` [PATCH net 4/6] lib: objagg: Fix general protection fault Petr Machata
2024-06-08  9:01   ` Simon Horman
2024-06-06 14:49 ` [PATCH net 5/6] mlxsw: spectrum_acl_erp: Fix object nesting warning Petr Machata
2024-06-08  9:01   ` Simon Horman [this message]
2024-06-06 14:49 ` [PATCH net 6/6] mlxsw: spectrum_acl: Fix ACL scale regression and firmware errors Petr Machata
2024-06-08  9:01   ` Simon Horman
2024-06-10 10:20 ` [PATCH net 0/6] mlxsw: ACL fixes patchwork-bot+netdevbpf
