netdev.vger.kernel.org archive mirror
* [PATCH net-next 00/10] net/mlx5: hw counters refactor and misc
@ 2024-08-15  5:46 Tariq Toukan
  2024-08-15  5:46 ` [PATCH net-next 01/10] net/mlx5: hw counters: Make fc_stats & fc_pool private Tariq Toukan
                   ` (9 more replies)
  0 siblings, 10 replies; 18+ messages in thread
From: Tariq Toukan @ 2024-08-15  5:46 UTC
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, Saeed Mahameed, Gal Pressman, Leon Romanovsky,
	Tariq Toukan

This patchset contains multiple enhancements from the team to the mlx5
core and Eth drivers.

In the first 6 patches, Cosmin refactors hw counters and solves a perf
scaling issue.  See the description below [1].

Followed by two patches by Shay for the core driver.

Patch 9 by Dragos adds an RX SW counter to cover no-split events in
header/data split mode.

Patch 10 by Rahul makes the RQ cleanup order in mlx5e_free_rq the
reverse of the allocation order in mlx5e_alloc_rq.


Series generated against:
commit a9c60712d71f ("Merge branch 'uapi-net-sched-cxgb4-fix-wflex-array-member-not-at-end-warning'")

Regards,
Tariq

[1]
HW counters are central to mlx5 driver operations. They are hardware
objects created and used alongside most steering operations, and queried
from a variety of places. Most counters are queried in bulk from a
periodic task in fs_counters.c.

Counter performance is important and as such, a variety of improvements
have been made over the years. Currently, counters are allocated from
pools, which are bulk allocated to amortize the cost of firmware
commands. Counters are managed through an IDR, a doubly linked list and
two atomic singly linked lists. Adding/removing counters is a complex
dance between user contexts requesting it and the mlx5_fc_stats_work
task which does most of the work.
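
The amortization idea behind bulk-allocated pools can be illustrated
with a small userspace sketch. It is not the driver code: the struct,
the function name and the bulk size of 128 are hypothetical, and the
"firmware command" is only a counter increment.

```c
#include <assert.h>
#include <stddef.h>

#define BULK_LEN 128 /* assumed bulk size, chosen for illustration only */

/* Hypothetical pool state; not the mlx5 data structure. */
struct fc_bulk_pool {
	unsigned int fw_cmds;      /* simulated firmware commands issued */
	unsigned int free_in_bulk; /* unused counters left in current bulk */
	unsigned int next_id;      /* next counter id to hand out */
};

/*
 * Hand out one counter id. A simulated firmware command is issued only
 * when the current bulk is exhausted, so its cost is shared by up to
 * BULK_LEN counters.
 */
static unsigned int pool_acquire(struct fc_bulk_pool *pool)
{
	assert(pool != NULL);

	if (pool->free_in_bulk == 0) {
		pool->fw_cmds++;               /* one command per bulk */
		pool->free_in_bulk = BULK_LEN;
	}
	pool->free_in_bulk--;
	return pool->next_id++;
}
```

With these assumed numbers, acquiring 1000 counters from a
zero-initialized pool issues 8 simulated commands instead of 1000.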

Under high load (e.g. from connection tracking flow insertion/deletion),
the counter code becomes a bottleneck, as seen on flame graphs. Whenever
a counter is deleted, it gets added to a list and the wq task is
scheduled to run immediately to actually delete it. This is done via
mod_delayed_work which uses an internal spinlock. In some tests, waiting
for this spinlock took up to 66% of all samples.

This series refactors the counter code to use a more straightforward
approach, avoiding the mod_delayed_work problem and making the code
easier to understand. For that:

- patch #1 moves counters data structs to a more appropriate place.
- patch #2 simplifies the bulk query allocation scheme by using kvmalloc.
- patch #3 replaces the IDR+3 lists with an xarray. This is the main
  patch of the series, solving the spinlock congestion issue.
- patch #4 removes an unnecessary cacheline alignment causing a lot of
  memory to be wasted.
- patches #5 and #6 are small cleanups enabled by the refactoring.
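
To see why an unnecessary cacheline alignment (the subject of patch #4)
can waste memory, consider the userspace sketch below. The struct
layouts, field names and the 64-byte line size are assumptions made for
illustration; they are not the driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64 /* assumed cache line size */

/* Hypothetical cached counter values. */
struct fc_cache {
	uint64_t packets;
	uint64_t bytes;
	uint64_t lastuse;
};

/* Each cache copy is forced onto its own cache line. */
struct fc_aligned {
	uint32_t id;
	struct fc_cache lastcache __attribute__((aligned(CACHELINE)));
	struct fc_cache cache __attribute__((aligned(CACHELINE)));
};

/* Same fields with natural alignment only. */
struct fc_natural {
	uint32_t id;
	struct fc_cache lastcache;
	struct fc_cache cache;
};

/* Padding bytes added per counter by the forced alignment. */
static size_t wasted_bytes_per_counter(void)
{
	return sizeof(struct fc_aligned) - sizeof(struct fc_natural);
}
```

In this sketch the aligned variant occupies three cache lines per
counter while the naturally aligned one fits within a single line, so
the overhead grows linearly with the number of counters.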

Cosmin Ratiu (6):
  net/mlx5: hw counters: Make fc_stats & fc_pool private
  net/mlx5: hw counters: Use kvmalloc for bulk query buffer
  net/mlx5: hw counters: Replace IDR+lists with xarray
  net/mlx5: hw counters: Drop unneeded cacheline alignment
  net/mlx5: hw counters: Don't maintain a counter count
  net/mlx5: hw counters: Remove mlx5_fc_create_ex

Dragos Tatulea (1):
  net/mlx5e: SHAMPO, Add no-split ethtool counters for header/data split

Rahul Rameshbabu (1):
  net/mlx5e: Match cleanup order in mlx5e_free_rq in reverse of
    mlx5e_alloc_rq

Shay Drory (2):
  net/mlx5: Allow users to configure affinity for SFs
  net/mlx5: Add NOT_READY command return status

 .../ethernet/mellanox/mlx5/counters.rst       |  16 +
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |   7 +-
 .../ethernet/mellanox/mlx5/core/en/tc_ct.c    |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  25 +-
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |   3 +
 .../ethernet/mellanox/mlx5/core/en_stats.c    |   6 +
 .../ethernet/mellanox/mlx5/core/en_stats.h    |   4 +
 drivers/net/ethernet/mellanox/mlx5/core/eq.c  |   2 +-
 .../ethernet/mellanox/mlx5/core/fs_counters.c | 387 +++++++-----------
 include/linux/mlx5/device.h                   |   1 +
 include/linux/mlx5/driver.h                   |  33 +-
 include/linux/mlx5/fs.h                       |   3 -
 12 files changed, 197 insertions(+), 292 deletions(-)

-- 
2.44.0



end of thread, other threads:[~2024-08-30  2:20 UTC | newest]

Thread overview: 18+ messages
2024-08-15  5:46 [PATCH net-next 00/10] net/mlx5: hw counters refactor and misc Tariq Toukan
2024-08-15  5:46 ` [PATCH net-next 01/10] net/mlx5: hw counters: Make fc_stats & fc_pool private Tariq Toukan
2024-08-15  5:46 ` [PATCH net-next 02/10] net/mlx5: hw counters: Use kvmalloc for bulk query buffer Tariq Toukan
2024-08-15  5:46 ` [PATCH net-next 03/10] net/mlx5: hw counters: Replace IDR+lists with xarray Tariq Toukan
2024-08-15 13:44   ` Simon Horman
2024-08-27 11:14     ` Cosmin Ratiu
2024-08-27 15:01       ` Simon Horman
2024-08-27 15:20         ` Simon Horman
2024-08-29 23:20           ` Saeed Mahameed
2024-08-30  2:20             ` Jakub Kicinski
2024-08-27 15:27     ` Dan Carpenter
2024-08-15  5:46 ` [PATCH net-next 04/10] net/mlx5: hw counters: Drop unneeded cacheline alignment Tariq Toukan
2024-08-15  5:46 ` [PATCH net-next 05/10] net/mlx5: hw counters: Don't maintain a counter count Tariq Toukan
2024-08-15  5:46 ` [PATCH net-next 06/10] net/mlx5: hw counters: Remove mlx5_fc_create_ex Tariq Toukan
2024-08-15  5:46 ` [PATCH net-next 07/10] net/mlx5: Allow users to configure affinity for SFs Tariq Toukan
2024-08-15  5:46 ` [PATCH net-next 08/10] net/mlx5: Add NOT_READY command return status Tariq Toukan
2024-08-15  5:46 ` [PATCH net-next 09/10] net/mlx5e: SHAMPO, Add no-split ethtool counters for header/data split Tariq Toukan
2024-08-15  5:46 ` [PATCH net-next 10/10] net/mlx5e: Match cleanup order in mlx5e_free_rq in reverse of mlx5e_alloc_rq Tariq Toukan
