* [pull request][net-next V2 00/15] mlx5 updates 2023-10-10
@ 2023-10-12 19:27 Saeed Mahameed
2023-10-12 19:27 ` [net-next V2 01/15] net/mlx5: Parallelize vhca event handling Saeed Mahameed
` (15 more replies)
0 siblings, 16 replies; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan
From: Saeed Mahameed <saeedm@nvidia.com>
v1->v2:
- patch #2, remove leftover mutex_init() to fix the build break
This provides misc updates to the mlx5 driver.
For more information, please see the tag log below.
Please pull and let me know if there is any problem.
Thanks,
Saeed.
The following changes since commit 2f0968a030f2a5dd4897a0151c8395bf5babe5b0:
net: gso_test: fix build with gcc-12 and earlier (2023-10-12 15:34:08 +0200)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2023-10-10
for you to fetch changes up to 8c18eaea290b1f4d9fc784703aaec3b3cdb7eab9:
net/mlx5e: Allow IPsec soft/hard limits in bytes (2023-10-12 12:19:17 -0700)
----------------------------------------------------------------
mlx5-updates-2023-10-10
1) Adham Faris, Increase max supported channels number to 256
2) Leon Romanovsky, Allow IPsec soft/hard limits in bytes
3) Shay Drory, Replace global mlx5_intf_lock with
HCA devcom component lock
4) Wei Zhang, Optimize SF creation flow
During SF creation, the HCA state is changed step by step
from INVALID to IN_USE. Accordingly, FW sends a vhca event
to the driver to asynchronously inform it about each state
change.
Each vhca event is critical because all related SW/FW
operations are triggered by it.
Currently there is only a single mlx5 general event handler,
which handles not only vhca events but many other events as
well. This incurs a huge bottleneck because all events are
forced to be handled in a serial manner.
Moreover, all SFs share the same table_lock, which inevitably
makes them impact each other when they are created in parallel.
This series solves the issue by:
1. Introducing a dedicated vhca event handler to eliminate
the mutual impact with other mlx5 events.
2. Employing max-FW-threads work queues in the vhca event
handler to fully utilize FW capability.
3. Redesigning the SF active work logic to completely remove
table_lock.
With the above optimizations, SF creation time is reduced by
25%, i.e. from 80s to 60s when creating 100 SFs.
Patches summary:
Patch 1 - implement a dedicated vhca event handler backed by
max-FW-cmd-threads work queues.
Patch 2 - remove table_lock by redesigning the SF active
work logic.
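The dispatch rule the series relies on can be sketched in a few lines of userspace C (the macro names follow the patches; the helper name is illustrative, not driver code):

```c
#include <assert.h>

/* Illustrative sketch: vhca events are spread over a fixed pool of
 * ordered (single-threaded) workqueues, keyed by the SF's function_id,
 * so events for one SF stay serialized while different SFs proceed in
 * parallel. */
#define MLX5_NUM_FW_CMD_THREADS 8
#define MLX5_DEV_MAX_WQS MLX5_NUM_FW_CMD_THREADS

static int vhca_event_wq_idx(unsigned short function_id)
{
	return function_id % MLX5_DEV_MAX_WQS;
}
```

Because each workqueue is single-threaded, this keeps per-SF ordering without any cross-SF serialization.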
----------------------------------------------------------------
Adham Faris (5):
net/mlx5e: Refactor rx_res_init() and rx_res_free() APIs
net/mlx5e: Refactor mlx5e_rss_set_rxfh() and mlx5e_rss_get_rxfh()
net/mlx5e: Refactor mlx5e_rss_init() and mlx5e_rss_free() API's
net/mlx5e: Preparations for supporting larger number of channels
net/mlx5e: Increase max supported channels number to 256
Jinjie Ruan (1):
net/mlx5: Use PTR_ERR_OR_ZERO() to simplify code
Leon Romanovsky (1):
net/mlx5e: Allow IPsec soft/hard limits in bytes
Lukas Bulwahn (1):
net/mlx5: fix config name in Kconfig parameter documentation
Shay Drory (3):
net/mlx5: Avoid false positive lockdep warning by adding lock_class_key
net/mlx5: Refactor LAG peer device lookout bus logic to mlx5 devcom
net/mlx5: Replace global mlx5_intf_lock with HCA devcom component lock
Wei Zhang (2):
net/mlx5: Parallelize vhca event handling
net/mlx5: Redesign SF active work to remove table_lock
Yu Liao (1):
net/mlx5e: Use PTR_ERR_OR_ZERO() to simplify code
Yue Haibing (1):
net/mlx5: Remove unused declaration
.../ethernet/mellanox/mlx5/kconfig.rst | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/dev.c | 105 ++-------------
drivers/net/ethernet/mellanox/mlx5/core/en.h | 5 +-
drivers/net/ethernet/mellanox/mlx5/core/en/fs.h | 1 -
drivers/net/ethernet/mellanox/mlx5/core/en/rqt.c | 32 +++--
drivers/net/ethernet/mellanox/mlx5/core/en/rqt.h | 9 +-
drivers/net/ethernet/mellanox/mlx5/core/en/rss.c | 144 ++++++++++++++++-----
drivers/net/ethernet/mellanox/mlx5/core/en/rss.h | 20 ++-
.../net/ethernet/mellanox/mlx5/core/en/rx_res.c | 105 ++++++++-------
.../net/ethernet/mellanox/mlx5/core/en/rx_res.h | 12 +-
.../ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 23 ++--
.../mellanox/mlx5/core/en_accel/ipsec_fs.c | 24 ++--
.../mellanox/mlx5/core/en_accel/ipsec_rxtx.h | 1 -
.../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c | 8 +-
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 32 ++---
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 27 ++--
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 16 ++-
drivers/net/ethernet/mellanox/mlx5/core/events.c | 5 -
.../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 24 ++--
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 47 +++++--
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h | 1 +
.../net/ethernet/mellanox/mlx5/core/lag/mpesw.c | 9 +-
.../net/ethernet/mellanox/mlx5/core/lag/port_sel.c | 10 +-
.../net/ethernet/mellanox/mlx5/core/lib/devcom.c | 25 ++++
.../net/ethernet/mellanox/mlx5/core/lib/devcom.h | 5 +
drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h | 1 -
drivers/net/ethernet/mellanox/mlx5/core/main.c | 25 ++++
.../net/ethernet/mellanox/mlx5/core/mlx5_core.h | 14 +-
.../net/ethernet/mellanox/mlx5/core/sf/dev/dev.c | 86 +++++++-----
.../ethernet/mellanox/mlx5/core/sf/vhca_event.c | 69 +++++++++-
.../ethernet/mellanox/mlx5/core/sf/vhca_event.h | 3 +
.../mellanox/mlx5/core/steering/dr_types.h | 4 -
include/linux/mlx5/driver.h | 16 +--
34 files changed, 544 insertions(+), 368 deletions(-)
^ permalink raw reply [flat|nested] 32+ messages in thread
* [net-next V2 01/15] net/mlx5: Parallelize vhca event handling
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:13 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 02/15] net/mlx5: Redesign SF active work to remove table_lock Saeed Mahameed
` (14 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Wei Zhang, Moshe Shemesh,
Shay Drory
From: Wei Zhang <weizhang@nvidia.com>
At present, the mlx5 driver has a general
purpose event handler which handles not only
vhca events but many other events as well.
This incurs a huge bottleneck because the
event handler is implemented by a
single-threaded workqueue and all events are
forced to be handled in a serial manner, even
when an application tries to create multiple
SFs simultaneously.
Introduce a dedicated vhca event handler which
manages parallel SF creation.
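The init path in this patch creates one single-threaded workqueue per slot and, on failure, unwinds the already-created slots in reverse. A minimal userspace analogue of that unwind pattern (names and the injectable failure index are illustrative, not driver code):

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_WQS 8

/* Create NUM_WQS resources; on failure at slot i, free slots [0, i) in
 * reverse order, mirroring the "for (--i; i >= 0; i--)" unwind used in
 * the patch. fail_at injects a failure for demonstration; -1 disables. */
static int create_all(void *res[NUM_WQS], int fail_at)
{
	int i;

	for (i = 0; i < NUM_WQS; i++) {
		res[i] = (i == fail_at) ? NULL : malloc(16);
		if (!res[i])
			goto err_unwind;
	}
	return 0;

err_unwind:
	for (--i; i >= 0; i--) {
		free(res[i]);
		res[i] = NULL;
	}
	return -1;
}
```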
Signed-off-by: Wei Zhang <weizhang@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/events.c | 5 --
.../ethernet/mellanox/mlx5/core/mlx5_core.h | 3 +-
.../mellanox/mlx5/core/sf/vhca_event.c | 57 ++++++++++++++++++-
include/linux/mlx5/driver.h | 1 +
4 files changed, 57 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/events.c b/drivers/net/ethernet/mellanox/mlx5/core/events.c
index 3ec892d51f57..d91ea53eb394 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/events.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/events.c
@@ -441,8 +441,3 @@ int mlx5_blocking_notifier_call_chain(struct mlx5_core_dev *dev, unsigned int ev
return blocking_notifier_call_chain(&events->sw_nh, event, data);
}
-
-void mlx5_events_work_enqueue(struct mlx5_core_dev *dev, struct work_struct *work)
-{
- queue_work(dev->priv.events->wq, work);
-}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 124352459c23..94f809f52f27 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -143,6 +143,8 @@ enum mlx5_semaphore_space_address {
#define MLX5_DEFAULT_PROF 2
#define MLX5_SF_PROF 3
+#define MLX5_NUM_FW_CMD_THREADS 8
+#define MLX5_DEV_MAX_WQS MLX5_NUM_FW_CMD_THREADS
static inline int mlx5_flexible_inlen(struct mlx5_core_dev *dev, size_t fixed,
size_t item_size, size_t num_items,
@@ -331,7 +333,6 @@ int mlx5_vport_set_other_func_cap(struct mlx5_core_dev *dev, const void *hca_cap
#define mlx5_vport_get_other_func_general_cap(dev, vport, out) \
mlx5_vport_get_other_func_cap(dev, vport, out, MLX5_CAP_GENERAL)
-void mlx5_events_work_enqueue(struct mlx5_core_dev *dev, struct work_struct *work);
static inline u32 mlx5_sriov_get_vf_total_msix(struct pci_dev *pdev)
{
struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
index d908fba968f0..c6fd729de8b2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
@@ -21,6 +21,15 @@ struct mlx5_vhca_event_work {
struct mlx5_vhca_state_event event;
};
+struct mlx5_vhca_event_handler {
+ struct workqueue_struct *wq;
+};
+
+struct mlx5_vhca_events {
+ struct mlx5_core_dev *dev;
+ struct mlx5_vhca_event_handler handler[MLX5_DEV_MAX_WQS];
+};
+
int mlx5_cmd_query_vhca_state(struct mlx5_core_dev *dev, u16 function_id, u32 *out, u32 outlen)
{
u32 in[MLX5_ST_SZ_DW(query_vhca_state_in)] = {};
@@ -99,6 +108,12 @@ static void mlx5_vhca_state_work_handler(struct work_struct *_work)
kfree(work);
}
+static void
+mlx5_vhca_events_work_enqueue(struct mlx5_core_dev *dev, int idx, struct work_struct *work)
+{
+ queue_work(dev->priv.vhca_events->handler[idx].wq, work);
+}
+
static int
mlx5_vhca_state_change_notifier(struct notifier_block *nb, unsigned long type, void *data)
{
@@ -106,6 +121,7 @@ mlx5_vhca_state_change_notifier(struct notifier_block *nb, unsigned long type, v
mlx5_nb_cof(nb, struct mlx5_vhca_state_notifier, nb);
struct mlx5_vhca_event_work *work;
struct mlx5_eqe *eqe = data;
+ int wq_idx;
work = kzalloc(sizeof(*work), GFP_ATOMIC);
if (!work)
@@ -113,7 +129,8 @@ mlx5_vhca_state_change_notifier(struct notifier_block *nb, unsigned long type, v
INIT_WORK(&work->work, &mlx5_vhca_state_work_handler);
work->notifier = notifier;
work->event.function_id = be16_to_cpu(eqe->data.vhca_state.function_id);
- mlx5_events_work_enqueue(notifier->dev, &work->work);
+ wq_idx = work->event.function_id % MLX5_DEV_MAX_WQS;
+ mlx5_vhca_events_work_enqueue(notifier->dev, wq_idx, &work->work);
return NOTIFY_OK;
}
@@ -132,28 +149,62 @@ void mlx5_vhca_state_cap_handle(struct mlx5_core_dev *dev, void *set_hca_cap)
int mlx5_vhca_event_init(struct mlx5_core_dev *dev)
{
struct mlx5_vhca_state_notifier *notifier;
+ char wq_name[MLX5_CMD_WQ_MAX_NAME];
+ struct mlx5_vhca_events *events;
+ int err, i;
if (!mlx5_vhca_event_supported(dev))
return 0;
- notifier = kzalloc(sizeof(*notifier), GFP_KERNEL);
- if (!notifier)
+ events = kzalloc(sizeof(*events), GFP_KERNEL);
+ if (!events)
return -ENOMEM;
+ events->dev = dev;
+ dev->priv.vhca_events = events;
+ for (i = 0; i < MLX5_DEV_MAX_WQS; i++) {
+ snprintf(wq_name, MLX5_CMD_WQ_MAX_NAME, "mlx5_vhca_event%d", i);
+ events->handler[i].wq = create_singlethread_workqueue(wq_name);
+ if (!events->handler[i].wq) {
+ err = -ENOMEM;
+ goto err_create_wq;
+ }
+ }
+
+ notifier = kzalloc(sizeof(*notifier), GFP_KERNEL);
+ if (!notifier) {
+ err = -ENOMEM;
+ goto err_notifier;
+ }
+
dev->priv.vhca_state_notifier = notifier;
notifier->dev = dev;
BLOCKING_INIT_NOTIFIER_HEAD(¬ifier->n_head);
MLX5_NB_INIT(¬ifier->nb, mlx5_vhca_state_change_notifier, VHCA_STATE_CHANGE);
return 0;
+
+err_notifier:
+err_create_wq:
+ for (--i; i >= 0; i--)
+ destroy_workqueue(events->handler[i].wq);
+ kfree(events);
+ return err;
}
void mlx5_vhca_event_cleanup(struct mlx5_core_dev *dev)
{
+ struct mlx5_vhca_events *vhca_events;
+ int i;
+
if (!mlx5_vhca_event_supported(dev))
return;
kfree(dev->priv.vhca_state_notifier);
dev->priv.vhca_state_notifier = NULL;
+ vhca_events = dev->priv.vhca_events;
+ for (i = 0; i < MLX5_DEV_MAX_WQS; i++)
+ destroy_workqueue(vhca_events->handler[i].wq);
+ kvfree(vhca_events);
}
void mlx5_vhca_event_start(struct mlx5_core_dev *dev)
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 92434814c855..50025fe90026 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -615,6 +615,7 @@ struct mlx5_priv {
int adev_idx;
int sw_vhca_id;
struct mlx5_events *events;
+ struct mlx5_vhca_events *vhca_events;
struct mlx5_flow_steering *steering;
struct mlx5_mpfs *mpfs;
--
2.41.0
* [net-next V2 02/15] net/mlx5: Redesign SF active work to remove table_lock
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
2023-10-12 19:27 ` [net-next V2 01/15] net/mlx5: Parallelize vhca event handling Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:18 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 03/15] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key Saeed Mahameed
` (13 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Wei Zhang, Shay Drory
From: Wei Zhang <weizhang@nvidia.com>
active_work is a work item that iterates over
all possible SF devices whose SF port
representors are located on a different
function and, in case an SF is in the active
state, probes it.
Currently, the active_work in active_wq is
synced with mlx5_vhca_events_work via table_lock,
and this lock causes a performance bottleneck.
To remove table_lock, redesign the active_wq
logic so that it now pushes one active_work per
SF to the mlx5_vhca_events workqueues. Since the
latter workqueues are ordered, active_work and
mlx5_vhca_events_work with the same index will
be pushed into the same workqueue, which
completely eliminates the need for a lock.
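A toy userspace model of why the lock becomes unnecessary (names are illustrative; enqueue() appends to a per-queue log that stands in for an ordered workqueue):

```c
#include <assert.h>

#define NUM_WQS 8
#define MAX_LOG 64

/* Each "workqueue" is an append-only log. Both the vhca event work and
 * the SF active work for a given function_id compute the same index, so
 * they land on the same ordered queue and run in submission order; no
 * table_lock is needed to serialize them. */
struct wq_log {
	unsigned short fid[MAX_LOG];
	int n;
};

static struct wq_log wqs[NUM_WQS];

static int enqueue(unsigned short function_id)
{
	int idx = function_id % NUM_WQS;	/* same rule for both work types */

	wqs[idx].fid[wqs[idx].n++] = function_id;
	return idx;
}
```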
Signed-off-by: Wei Zhang <weizhang@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../ethernet/mellanox/mlx5/core/sf/dev/dev.c | 86 ++++++++++++-------
.../mellanox/mlx5/core/sf/vhca_event.c | 16 +++-
.../mellanox/mlx5/core/sf/vhca_event.h | 3 +
3 files changed, 74 insertions(+), 31 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
index 0f9b280514b8..c93492b67788 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
@@ -17,13 +17,19 @@ struct mlx5_sf_dev_table {
phys_addr_t base_address;
u64 sf_bar_length;
struct notifier_block nb;
- struct mutex table_lock; /* Serializes sf life cycle and vhca state change handler */
struct workqueue_struct *active_wq;
struct work_struct work;
u8 stop_active_wq:1;
struct mlx5_core_dev *dev;
};
+struct mlx5_sf_dev_active_work_ctx {
+ struct work_struct work;
+ struct mlx5_vhca_state_event event;
+ struct mlx5_sf_dev_table *table;
+ int sf_index;
+};
+
static bool mlx5_sf_dev_supported(const struct mlx5_core_dev *dev)
{
return MLX5_CAP_GEN(dev, sf) && mlx5_vhca_event_supported(dev);
@@ -165,7 +171,6 @@ mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_
return 0;
sf_index = event->function_id - base_id;
- mutex_lock(&table->table_lock);
sf_dev = xa_load(&table->devices, sf_index);
switch (event->new_vhca_state) {
case MLX5_VHCA_STATE_INVALID:
@@ -189,7 +194,6 @@ mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_
default:
break;
}
- mutex_unlock(&table->table_lock);
return 0;
}
@@ -214,15 +218,44 @@ static int mlx5_sf_dev_vhca_arm_all(struct mlx5_sf_dev_table *table)
return 0;
}
-static void mlx5_sf_dev_add_active_work(struct work_struct *work)
+static void mlx5_sf_dev_add_active_work(struct work_struct *_work)
{
- struct mlx5_sf_dev_table *table = container_of(work, struct mlx5_sf_dev_table, work);
+ struct mlx5_sf_dev_active_work_ctx *work_ctx;
+
+ work_ctx = container_of(_work, struct mlx5_sf_dev_active_work_ctx, work);
+ if (work_ctx->table->stop_active_wq)
+ goto out;
+ /* Don't probe a device which is already probed */
+ if (!xa_load(&work_ctx->table->devices, work_ctx->sf_index))
+ mlx5_sf_dev_add(work_ctx->table->dev, work_ctx->sf_index,
+ work_ctx->event.function_id, work_ctx->event.sw_function_id);
+ /* There is a race where SF got inactive after the query
+ * above. e.g.: the query returns that the state of the
+ * SF is active, and after that the eswitch manager set it to
+ * inactive.
+ * This case cannot be managed in SW, since the probing of the
+ * SF is on one system, and the inactivation is on a different
+ * system.
+ * If the inactivation is done after the SF performs init_hca(),
+ * the SF will be fully probed and then removed. If it was
+ * done before init_hca(), the SF probe will fail.
+ */
+out:
+ kfree(work_ctx);
+}
+
+/* In case SFs are generated externally, probe active SFs */
+static void mlx5_sf_dev_queue_active_works(struct work_struct *_work)
+{
+ struct mlx5_sf_dev_table *table = container_of(_work, struct mlx5_sf_dev_table, work);
u32 out[MLX5_ST_SZ_DW(query_vhca_state_out)] = {};
+ struct mlx5_sf_dev_active_work_ctx *work_ctx;
struct mlx5_core_dev *dev = table->dev;
u16 max_functions;
u16 function_id;
u16 sw_func_id;
int err = 0;
+ int wq_idx;
u8 state;
int i;
@@ -242,27 +275,22 @@ static void mlx5_sf_dev_add_active_work(struct work_struct *work)
continue;
sw_func_id = MLX5_GET(query_vhca_state_out, out, vhca_state_context.sw_function_id);
- mutex_lock(&table->table_lock);
- /* Don't probe device which is already probe */
- if (!xa_load(&table->devices, i))
- mlx5_sf_dev_add(dev, i, function_id, sw_func_id);
- /* There is a race where SF got inactive after the query
- * above. e.g.: the query returns that the state of the
- * SF is active, and after that the eswitch manager set it to
- * inactive.
- * This case cannot be managed in SW, since the probing of the
- * SF is on one system, and the inactivation is on a different
- * system.
- * If the inactive is done after the SF perform init_hca(),
- * the SF will fully probe and then removed. If it was
- * done before init_hca(), the SF probe will fail.
- */
- mutex_unlock(&table->table_lock);
+ work_ctx = kzalloc(sizeof(*work_ctx), GFP_KERNEL);
+ if (!work_ctx)
+ return;
+
+ INIT_WORK(&work_ctx->work, &mlx5_sf_dev_add_active_work);
+ work_ctx->event.function_id = function_id;
+ work_ctx->event.sw_function_id = sw_func_id;
+ work_ctx->table = table;
+ work_ctx->sf_index = i;
+ wq_idx = work_ctx->event.function_id % MLX5_DEV_MAX_WQS;
+ mlx5_vhca_events_work_enqueue(dev, wq_idx, &work_ctx->work);
}
}
/* In case SFs are generated externally, probe active SFs */
-static int mlx5_sf_dev_queue_active_work(struct mlx5_sf_dev_table *table)
+static int mlx5_sf_dev_create_active_works(struct mlx5_sf_dev_table *table)
{
if (MLX5_CAP_GEN(table->dev, eswitch_manager))
return 0; /* the table is local */
@@ -273,12 +301,12 @@ static int mlx5_sf_dev_queue_active_work(struct mlx5_sf_dev_table *table)
table->active_wq = create_singlethread_workqueue("mlx5_active_sf");
if (!table->active_wq)
return -ENOMEM;
- INIT_WORK(&table->work, &mlx5_sf_dev_add_active_work);
+ INIT_WORK(&table->work, &mlx5_sf_dev_queue_active_works);
queue_work(table->active_wq, &table->work);
return 0;
}
-static void mlx5_sf_dev_destroy_active_work(struct mlx5_sf_dev_table *table)
+static void mlx5_sf_dev_destroy_active_works(struct mlx5_sf_dev_table *table)
{
if (table->active_wq) {
table->stop_active_wq = true;
@@ -305,14 +333,13 @@ void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev)
table->sf_bar_length = 1 << (MLX5_CAP_GEN(dev, log_min_sf_size) + 12);
table->base_address = pci_resource_start(dev->pdev, 2);
xa_init(&table->devices);
- mutex_init(&table->table_lock);
dev->priv.sf_dev_table = table;
err = mlx5_vhca_event_notifier_register(dev, &table->nb);
if (err)
goto vhca_err;
- err = mlx5_sf_dev_queue_active_work(table);
+ err = mlx5_sf_dev_create_active_works(table);
if (err)
goto add_active_err;
@@ -322,9 +349,10 @@ void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev)
return;
arm_err:
- mlx5_sf_dev_destroy_active_work(table);
+ mlx5_sf_dev_destroy_active_works(table);
add_active_err:
mlx5_vhca_event_notifier_unregister(dev, &table->nb);
+ mlx5_vhca_event_work_queues_flush(dev);
vhca_err:
kfree(table);
dev->priv.sf_dev_table = NULL;
@@ -350,9 +378,9 @@ void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev)
if (!table)
return;
- mlx5_sf_dev_destroy_active_work(table);
+ mlx5_sf_dev_destroy_active_works(table);
mlx5_vhca_event_notifier_unregister(dev, &table->nb);
- mutex_destroy(&table->table_lock);
+ mlx5_vhca_event_work_queues_flush(dev);
/* Now that event handler is not running, it is safe to destroy
* the sf device without race.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
index c6fd729de8b2..cda01ba441ae 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
@@ -108,8 +108,7 @@ static void mlx5_vhca_state_work_handler(struct work_struct *_work)
kfree(work);
}
-static void
-mlx5_vhca_events_work_enqueue(struct mlx5_core_dev *dev, int idx, struct work_struct *work)
+void mlx5_vhca_events_work_enqueue(struct mlx5_core_dev *dev, int idx, struct work_struct *work)
{
queue_work(dev->priv.vhca_events->handler[idx].wq, work);
}
@@ -191,6 +190,19 @@ int mlx5_vhca_event_init(struct mlx5_core_dev *dev)
return err;
}
+void mlx5_vhca_event_work_queues_flush(struct mlx5_core_dev *dev)
+{
+ struct mlx5_vhca_events *vhca_events;
+ int i;
+
+ if (!mlx5_vhca_event_supported(dev))
+ return;
+
+ vhca_events = dev->priv.vhca_events;
+ for (i = 0; i < MLX5_DEV_MAX_WQS; i++)
+ flush_workqueue(vhca_events->handler[i].wq);
+}
+
void mlx5_vhca_event_cleanup(struct mlx5_core_dev *dev)
{
struct mlx5_vhca_events *vhca_events;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h
index 013cdfe90616..1725ba64f8af 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h
@@ -28,6 +28,9 @@ int mlx5_modify_vhca_sw_id(struct mlx5_core_dev *dev, u16 function_id, u32 sw_fn
int mlx5_vhca_event_arm(struct mlx5_core_dev *dev, u16 function_id);
int mlx5_cmd_query_vhca_state(struct mlx5_core_dev *dev, u16 function_id,
u32 *out, u32 outlen);
+void mlx5_vhca_events_work_enqueue(struct mlx5_core_dev *dev, int idx, struct work_struct *work);
+void mlx5_vhca_event_work_queues_flush(struct mlx5_core_dev *dev);
+
#else
static inline void mlx5_vhca_state_cap_handle(struct mlx5_core_dev *dev, void *set_hca_cap)
--
2.41.0
* [net-next V2 03/15] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
2023-10-12 19:27 ` [net-next V2 01/15] net/mlx5: Parallelize vhca event handling Saeed Mahameed
2023-10-12 19:27 ` [net-next V2 02/15] net/mlx5: Redesign SF active work to remove table_lock Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:23 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 04/15] net/mlx5: Refactor LAG peer device lookout bus logic to mlx5 devcom Saeed Mahameed
` (12 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
A downstream patch will add a devcom component which will be locked in
many places. This can lead to a false positive "possible circular
locking dependency" warning from lockdep on flows which lock more than
one mlx5 devcom component, such as probing an ETH aux device.
Hence, add a lock_class_key per mlx5 device.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
index 00e67910e3ee..89ac3209277e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
@@ -31,6 +31,7 @@ struct mlx5_devcom_comp {
struct kref ref;
bool ready;
struct rw_semaphore sem;
+ struct lock_class_key lock_key;
};
struct mlx5_devcom_comp_dev {
@@ -119,6 +120,8 @@ mlx5_devcom_comp_alloc(u64 id, u64 key, mlx5_devcom_event_handler_t handler)
comp->key = key;
comp->handler = handler;
init_rwsem(&comp->sem);
+ lockdep_register_key(&comp->lock_key);
+ lockdep_set_class(&comp->sem, &comp->lock_key);
kref_init(&comp->ref);
INIT_LIST_HEAD(&comp->comp_dev_list_head);
@@ -133,6 +136,7 @@ mlx5_devcom_comp_release(struct kref *ref)
mutex_lock(&comp_list_lock);
list_del(&comp->comp_list);
mutex_unlock(&comp_list_lock);
+ lockdep_unregister_key(&comp->lock_key);
kfree(comp);
}
--
2.41.0
* [net-next V2 04/15] net/mlx5: Refactor LAG peer device lookout bus logic to mlx5 devcom
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (2 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 03/15] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:26 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 05/15] net/mlx5: Replace global mlx5_intf_lock with HCA devcom component lock Saeed Mahameed
` (11 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
The LAG peer device lookout bus logic required the usage of a global
lock, mlx5_intf_mutex.
As part of the effort to remove this global lock, refactor the LAG
peer device lookout to use the mlx5 devcom layer.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/dev.c | 68 -------------------
.../net/ethernet/mellanox/mlx5/core/lag/lag.c | 12 ++--
.../ethernet/mellanox/mlx5/core/lib/devcom.c | 14 ++++
.../ethernet/mellanox/mlx5/core/lib/devcom.h | 4 ++
.../net/ethernet/mellanox/mlx5/core/main.c | 25 +++++++
.../ethernet/mellanox/mlx5/core/mlx5_core.h | 1 -
include/linux/mlx5/driver.h | 1 +
7 files changed, 52 insertions(+), 73 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
index 1fc03480c2ff..6e3a8c22881f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
@@ -566,74 +566,6 @@ bool mlx5_same_hw_devs(struct mlx5_core_dev *dev, struct mlx5_core_dev *peer_dev
return (fsystem_guid && psystem_guid && fsystem_guid == psystem_guid);
}
-static u32 mlx5_gen_pci_id(const struct mlx5_core_dev *dev)
-{
- return (u32)((pci_domain_nr(dev->pdev->bus) << 16) |
- (dev->pdev->bus->number << 8) |
- PCI_SLOT(dev->pdev->devfn));
-}
-
-static int _next_phys_dev(struct mlx5_core_dev *mdev,
- const struct mlx5_core_dev *curr)
-{
- if (!mlx5_core_is_pf(mdev))
- return 0;
-
- if (mdev == curr)
- return 0;
-
- if (!mlx5_same_hw_devs(mdev, (struct mlx5_core_dev *)curr) &&
- mlx5_gen_pci_id(mdev) != mlx5_gen_pci_id(curr))
- return 0;
-
- return 1;
-}
-
-static void *pci_get_other_drvdata(struct device *this, struct device *other)
-{
- if (this->driver != other->driver)
- return NULL;
-
- return pci_get_drvdata(to_pci_dev(other));
-}
-
-static int next_phys_dev_lag(struct device *dev, const void *data)
-{
- struct mlx5_core_dev *mdev, *this = (struct mlx5_core_dev *)data;
-
- mdev = pci_get_other_drvdata(this->device, dev);
- if (!mdev)
- return 0;
-
- if (!mlx5_lag_is_supported(mdev))
- return 0;
-
- return _next_phys_dev(mdev, data);
-}
-
-static struct mlx5_core_dev *mlx5_get_next_dev(struct mlx5_core_dev *dev,
- int (*match)(struct device *dev, const void *data))
-{
- struct device *next;
-
- if (!mlx5_core_is_pf(dev))
- return NULL;
-
- next = bus_find_device(&pci_bus_type, NULL, dev, match);
- if (!next)
- return NULL;
-
- put_device(next);
- return pci_get_drvdata(to_pci_dev(next));
-}
-
-/* Must be called with intf_mutex held */
-struct mlx5_core_dev *mlx5_get_next_phys_dev_lag(struct mlx5_core_dev *dev)
-{
- lockdep_assert_held(&mlx5_intf_mutex);
- return mlx5_get_next_dev(dev, &next_phys_dev_lag);
-}
-
void mlx5_dev_list_lock(void)
{
mutex_lock(&mlx5_intf_mutex);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index af3fac090b82..f0b57f97739f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -1212,13 +1212,14 @@ static void mlx5_ldev_remove_mdev(struct mlx5_lag *ldev,
dev->priv.lag = NULL;
}
-/* Must be called with intf_mutex held */
+/* Must be called with HCA devcom component lock held */
static int __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev)
{
+ struct mlx5_devcom_comp_dev *pos = NULL;
struct mlx5_lag *ldev = NULL;
struct mlx5_core_dev *tmp_dev;
- tmp_dev = mlx5_get_next_phys_dev_lag(dev);
+ tmp_dev = mlx5_devcom_get_next_peer_data(dev->priv.hca_devcom_comp, &pos);
if (tmp_dev)
ldev = mlx5_lag_dev(tmp_dev);
@@ -1275,10 +1276,13 @@ void mlx5_lag_add_mdev(struct mlx5_core_dev *dev)
if (!mlx5_lag_is_supported(dev))
return;
+ if (IS_ERR_OR_NULL(dev->priv.hca_devcom_comp))
+ return;
+
recheck:
- mlx5_dev_list_lock();
+ mlx5_devcom_comp_lock(dev->priv.hca_devcom_comp);
err = __mlx5_lag_dev_add_mdev(dev);
- mlx5_dev_list_unlock();
+ mlx5_devcom_comp_unlock(dev->priv.hca_devcom_comp);
if (err) {
msleep(100);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
index 89ac3209277e..f4d5c300ddd6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
@@ -387,3 +387,17 @@ void *mlx5_devcom_get_next_peer_data_rcu(struct mlx5_devcom_comp_dev *devcom,
*pos = tmp;
return data;
}
+
+void mlx5_devcom_comp_lock(struct mlx5_devcom_comp_dev *devcom)
+{
+ if (IS_ERR_OR_NULL(devcom))
+ return;
+ down_write(&devcom->comp->sem);
+}
+
+void mlx5_devcom_comp_unlock(struct mlx5_devcom_comp_dev *devcom)
+{
+ if (IS_ERR_OR_NULL(devcom))
+ return;
+ up_write(&devcom->comp->sem);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
index 8389ac0af708..ed249bafd07c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
@@ -8,6 +8,7 @@
enum mlx5_devcom_component {
MLX5_DEVCOM_ESW_OFFLOADS,
+ MLX5_DEVCOM_HCA_PORTS,
MLX5_DEVCOM_NUM_COMPONENTS,
};
@@ -51,4 +52,7 @@ void *mlx5_devcom_get_next_peer_data_rcu(struct mlx5_devcom_comp_dev *devcom,
data; \
data = mlx5_devcom_get_next_peer_data_rcu(devcom, &pos))
+void mlx5_devcom_comp_lock(struct mlx5_devcom_comp_dev *devcom);
+void mlx5_devcom_comp_unlock(struct mlx5_devcom_comp_dev *devcom);
+
#endif /* __LIB_MLX5_DEVCOM_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index d17c9c31b165..6bb3c4f45292 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -73,6 +73,7 @@
#include "sf/sf.h"
#include "mlx5_irq.h"
#include "hwmon.h"
+#include "lag/lag.h"
MODULE_AUTHOR("Eli Cohen <eli@mellanox.com>");
MODULE_DESCRIPTION("Mellanox 5th generation network adapters (ConnectX series) core driver");
@@ -946,6 +947,27 @@ static void mlx5_pci_close(struct mlx5_core_dev *dev)
mlx5_pci_disable_device(dev);
}
+static void mlx5_register_hca_devcom_comp(struct mlx5_core_dev *dev)
+{
+ /* This component is used to sync adding core_dev to lag_dev and to sync
+ * changes of mlx5_adev_devices between LAG layer and other layers.
+ */
+ if (!mlx5_lag_is_supported(dev))
+ return;
+
+ dev->priv.hca_devcom_comp =
+ mlx5_devcom_register_component(dev->priv.devc, MLX5_DEVCOM_HCA_PORTS,
+ mlx5_query_nic_system_image_guid(dev),
+ NULL, dev);
+ if (IS_ERR_OR_NULL(dev->priv.hca_devcom_comp))
+ mlx5_core_err(dev, "Failed to register devcom HCA component\n");
+}
+
+static void mlx5_unregister_hca_devcom_comp(struct mlx5_core_dev *dev)
+{
+ mlx5_devcom_unregister_component(dev->priv.hca_devcom_comp);
+}
+
static int mlx5_init_once(struct mlx5_core_dev *dev)
{
int err;
@@ -954,6 +976,7 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
if (IS_ERR(dev->priv.devc))
mlx5_core_warn(dev, "failed to register devcom device %ld\n",
PTR_ERR(dev->priv.devc));
+ mlx5_register_hca_devcom_comp(dev);
err = mlx5_query_board_id(dev);
if (err) {
@@ -1088,6 +1111,7 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
err_irq_cleanup:
mlx5_irq_table_cleanup(dev);
err_devcom:
+ mlx5_unregister_hca_devcom_comp(dev);
mlx5_devcom_unregister_device(dev->priv.devc);
return err;
@@ -1117,6 +1141,7 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev)
mlx5_events_cleanup(dev);
mlx5_eq_table_cleanup(dev);
mlx5_irq_table_cleanup(dev);
+ mlx5_unregister_hca_devcom_comp(dev);
mlx5_devcom_unregister_device(dev->priv.devc);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 94f809f52f27..bbc96aa6cbcc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -250,7 +250,6 @@ int mlx5_register_device(struct mlx5_core_dev *dev);
void mlx5_unregister_device(struct mlx5_core_dev *dev);
void mlx5_dev_set_lightweight(struct mlx5_core_dev *dev);
bool mlx5_dev_is_lightweight(struct mlx5_core_dev *dev);
-struct mlx5_core_dev *mlx5_get_next_phys_dev_lag(struct mlx5_core_dev *dev);
void mlx5_dev_list_lock(void);
void mlx5_dev_list_unlock(void);
int mlx5_dev_list_trylock(void);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 50025fe90026..37c2101e4f63 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -624,6 +624,7 @@ struct mlx5_priv {
struct mlx5_lag *lag;
u32 flags;
struct mlx5_devcom_dev *devc;
+ struct mlx5_devcom_comp_dev *hca_devcom_comp;
struct mlx5_fw_reset *fw_reset;
struct mlx5_core_roce roce;
struct mlx5_fc_stats fc_stats;
--
2.41.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [net-next V2 05/15] net/mlx5: Replace global mlx5_intf_lock with HCA devcom component lock
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (3 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 04/15] net/mlx5: Refactor LAG peer device lookout bus logic to mlx5 devcom Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:28 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 06/15] net/mlx5: Remove unused declaration Saeed Mahameed
` (10 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch
From: Shay Drory <shayd@nvidia.com>
mlx5_intf_lock is used to sync LAG changes with aux device changes on
the slave mlx5 core devs, which means that every time a mlx5 core dev
adds/removes aux devices, mlx5 takes this global lock, even if LAG
functionality isn't supported over the core dev.
This causes a bottleneck when probing VFs/SFs in parallel.
Hence, replace mlx5_intf_lock with the HCA devcom component lock, or no
lock if LAG functionality isn't supported.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/dev.c | 37 +++++--------------
.../net/ethernet/mellanox/mlx5/core/lag/lag.c | 35 +++++++++++++++---
.../net/ethernet/mellanox/mlx5/core/lag/lag.h | 1 +
.../ethernet/mellanox/mlx5/core/lag/mpesw.c | 9 ++++-
.../ethernet/mellanox/mlx5/core/lib/devcom.c | 7 ++++
.../ethernet/mellanox/mlx5/core/lib/devcom.h | 1 +
.../ethernet/mellanox/mlx5/core/mlx5_core.h | 8 ++--
7 files changed, 59 insertions(+), 39 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
index 6e3a8c22881f..cf0477f53dc4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
@@ -38,8 +38,6 @@
#include "devlink.h"
#include "lag/lag.h"
-/* intf dev list mutex */
-static DEFINE_MUTEX(mlx5_intf_mutex);
static DEFINE_IDA(mlx5_adev_ida);
static bool is_eth_rep_supported(struct mlx5_core_dev *dev)
@@ -337,9 +335,9 @@ static void del_adev(struct auxiliary_device *adev)
void mlx5_dev_set_lightweight(struct mlx5_core_dev *dev)
{
- mutex_lock(&mlx5_intf_mutex);
+ mlx5_devcom_comp_lock(dev->priv.hca_devcom_comp);
dev->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV;
- mutex_unlock(&mlx5_intf_mutex);
+ mlx5_devcom_comp_unlock(dev->priv.hca_devcom_comp);
}
bool mlx5_dev_is_lightweight(struct mlx5_core_dev *dev)
@@ -355,7 +353,7 @@ int mlx5_attach_device(struct mlx5_core_dev *dev)
int ret = 0, i;
devl_assert_locked(priv_to_devlink(dev));
- mutex_lock(&mlx5_intf_mutex);
+ mlx5_devcom_comp_lock(dev->priv.hca_devcom_comp);
priv->flags &= ~MLX5_PRIV_FLAGS_DETACH;
for (i = 0; i < ARRAY_SIZE(mlx5_adev_devices); i++) {
if (!priv->adev[i]) {
@@ -400,7 +398,7 @@ int mlx5_attach_device(struct mlx5_core_dev *dev)
break;
}
}
- mutex_unlock(&mlx5_intf_mutex);
+ mlx5_devcom_comp_unlock(dev->priv.hca_devcom_comp);
return ret;
}
@@ -413,7 +411,7 @@ void mlx5_detach_device(struct mlx5_core_dev *dev, bool suspend)
int i;
devl_assert_locked(priv_to_devlink(dev));
- mutex_lock(&mlx5_intf_mutex);
+ mlx5_devcom_comp_lock(dev->priv.hca_devcom_comp);
for (i = ARRAY_SIZE(mlx5_adev_devices) - 1; i >= 0; i--) {
if (!priv->adev[i])
continue;
@@ -443,7 +441,7 @@ void mlx5_detach_device(struct mlx5_core_dev *dev, bool suspend)
priv->adev[i] = NULL;
}
priv->flags |= MLX5_PRIV_FLAGS_DETACH;
- mutex_unlock(&mlx5_intf_mutex);
+ mlx5_devcom_comp_unlock(dev->priv.hca_devcom_comp);
}
int mlx5_register_device(struct mlx5_core_dev *dev)
@@ -451,10 +449,10 @@ int mlx5_register_device(struct mlx5_core_dev *dev)
int ret;
devl_assert_locked(priv_to_devlink(dev));
- mutex_lock(&mlx5_intf_mutex);
+ mlx5_devcom_comp_lock(dev->priv.hca_devcom_comp);
dev->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV;
ret = mlx5_rescan_drivers_locked(dev);
- mutex_unlock(&mlx5_intf_mutex);
+ mlx5_devcom_comp_unlock(dev->priv.hca_devcom_comp);
if (ret)
mlx5_unregister_device(dev);
@@ -464,10 +462,10 @@ int mlx5_register_device(struct mlx5_core_dev *dev)
void mlx5_unregister_device(struct mlx5_core_dev *dev)
{
devl_assert_locked(priv_to_devlink(dev));
- mutex_lock(&mlx5_intf_mutex);
+ mlx5_devcom_comp_lock(dev->priv.hca_devcom_comp);
dev->priv.flags = MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV;
mlx5_rescan_drivers_locked(dev);
- mutex_unlock(&mlx5_intf_mutex);
+ mlx5_devcom_comp_unlock(dev->priv.hca_devcom_comp);
}
static int add_drivers(struct mlx5_core_dev *dev)
@@ -545,7 +543,6 @@ int mlx5_rescan_drivers_locked(struct mlx5_core_dev *dev)
{
struct mlx5_priv *priv = &dev->priv;
- lockdep_assert_held(&mlx5_intf_mutex);
if (priv->flags & MLX5_PRIV_FLAGS_DETACH)
return 0;
@@ -565,17 +562,3 @@ bool mlx5_same_hw_devs(struct mlx5_core_dev *dev, struct mlx5_core_dev *peer_dev
return (fsystem_guid && psystem_guid && fsystem_guid == psystem_guid);
}
-
-void mlx5_dev_list_lock(void)
-{
- mutex_lock(&mlx5_intf_mutex);
-}
-void mlx5_dev_list_unlock(void)
-{
- mutex_unlock(&mlx5_intf_mutex);
-}
-
-int mlx5_dev_list_trylock(void)
-{
- return mutex_trylock(&mlx5_intf_mutex);
-}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index f0b57f97739f..d14459e5c04f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -943,6 +943,26 @@ static void mlx5_do_bond(struct mlx5_lag *ldev)
}
}
+/* The last mdev to unregister will destroy the workqueue before removing the
+ * devcom component, and as all the mdevs use the same devcom component we are
+ * guaranteed that the devcom is valid while the calling work is running.
+ */
+struct mlx5_devcom_comp_dev *mlx5_lag_get_devcom_comp(struct mlx5_lag *ldev)
+{
+ struct mlx5_devcom_comp_dev *devcom = NULL;
+ int i;
+
+ mutex_lock(&ldev->lock);
+ for (i = 0; i < ldev->ports; i++) {
+ if (ldev->pf[i].dev) {
+ devcom = ldev->pf[i].dev->priv.hca_devcom_comp;
+ break;
+ }
+ }
+ mutex_unlock(&ldev->lock);
+ return devcom;
+}
+
static void mlx5_queue_bond_work(struct mlx5_lag *ldev, unsigned long delay)
{
queue_delayed_work(ldev->wq, &ldev->bond_work, delay);
@@ -953,9 +973,14 @@ static void mlx5_do_bond_work(struct work_struct *work)
struct delayed_work *delayed_work = to_delayed_work(work);
struct mlx5_lag *ldev = container_of(delayed_work, struct mlx5_lag,
bond_work);
+ struct mlx5_devcom_comp_dev *devcom;
int status;
- status = mlx5_dev_list_trylock();
+ devcom = mlx5_lag_get_devcom_comp(ldev);
+ if (!devcom)
+ return;
+
+ status = mlx5_devcom_comp_trylock(devcom);
if (!status) {
mlx5_queue_bond_work(ldev, HZ);
return;
@@ -964,14 +989,14 @@ static void mlx5_do_bond_work(struct work_struct *work)
mutex_lock(&ldev->lock);
if (ldev->mode_changes_in_progress) {
mutex_unlock(&ldev->lock);
- mlx5_dev_list_unlock();
+ mlx5_devcom_comp_unlock(devcom);
mlx5_queue_bond_work(ldev, HZ);
return;
}
mlx5_do_bond(ldev);
mutex_unlock(&ldev->lock);
- mlx5_dev_list_unlock();
+ mlx5_devcom_comp_unlock(devcom);
}
static int mlx5_handle_changeupper_event(struct mlx5_lag *ldev,
@@ -1435,7 +1460,7 @@ void mlx5_lag_disable_change(struct mlx5_core_dev *dev)
if (!ldev)
return;
- mlx5_dev_list_lock();
+ mlx5_devcom_comp_lock(dev->priv.hca_devcom_comp);
mutex_lock(&ldev->lock);
ldev->mode_changes_in_progress++;
@@ -1443,7 +1468,7 @@ void mlx5_lag_disable_change(struct mlx5_core_dev *dev)
mlx5_disable_lag(ldev);
mutex_unlock(&ldev->lock);
- mlx5_dev_list_unlock();
+ mlx5_devcom_comp_unlock(dev->priv.hca_devcom_comp);
}
void mlx5_lag_enable_change(struct mlx5_core_dev *dev)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
index 481e92f39fe6..50fcb1eee574 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
@@ -112,6 +112,7 @@ void mlx5_disable_lag(struct mlx5_lag *ldev);
void mlx5_lag_remove_devices(struct mlx5_lag *ldev);
int mlx5_deactivate_lag(struct mlx5_lag *ldev);
void mlx5_lag_add_devices(struct mlx5_lag *ldev);
+struct mlx5_devcom_comp_dev *mlx5_lag_get_devcom_comp(struct mlx5_lag *ldev);
static inline bool mlx5_lag_is_supported(struct mlx5_core_dev *dev)
{
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c
index 0857eebf4f07..82889f30506e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/mpesw.c
@@ -129,9 +129,14 @@ static void disable_mpesw(struct mlx5_lag *ldev)
static void mlx5_mpesw_work(struct work_struct *work)
{
struct mlx5_mpesw_work_st *mpesww = container_of(work, struct mlx5_mpesw_work_st, work);
+ struct mlx5_devcom_comp_dev *devcom;
struct mlx5_lag *ldev = mpesww->lag;
- mlx5_dev_list_lock();
+ devcom = mlx5_lag_get_devcom_comp(ldev);
+ if (!devcom)
+ return;
+
+ mlx5_devcom_comp_lock(devcom);
mutex_lock(&ldev->lock);
if (ldev->mode_changes_in_progress) {
mpesww->result = -EAGAIN;
@@ -144,7 +149,7 @@ static void mlx5_mpesw_work(struct work_struct *work)
disable_mpesw(ldev);
unlock:
mutex_unlock(&ldev->lock);
- mlx5_dev_list_unlock();
+ mlx5_devcom_comp_unlock(devcom);
complete(&mpesww->comp);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
index f4d5c300ddd6..e8e50563e956 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
@@ -401,3 +401,10 @@ void mlx5_devcom_comp_unlock(struct mlx5_devcom_comp_dev *devcom)
return;
up_write(&devcom->comp->sem);
}
+
+int mlx5_devcom_comp_trylock(struct mlx5_devcom_comp_dev *devcom)
+{
+ if (IS_ERR_OR_NULL(devcom))
+ return 0;
+ return down_write_trylock(&devcom->comp->sem);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
index ed249bafd07c..1549edd19aa8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
@@ -54,5 +54,6 @@ void *mlx5_devcom_get_next_peer_data_rcu(struct mlx5_devcom_comp_dev *devcom,
void mlx5_devcom_comp_lock(struct mlx5_devcom_comp_dev *devcom);
void mlx5_devcom_comp_unlock(struct mlx5_devcom_comp_dev *devcom);
+int mlx5_devcom_comp_trylock(struct mlx5_devcom_comp_dev *devcom);
#endif /* __LIB_MLX5_DEVCOM_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index bbc96aa6cbcc..0b60b02518c1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -41,6 +41,7 @@
#include <linux/mlx5/cq.h>
#include <linux/mlx5/fs.h>
#include <linux/mlx5/driver.h>
+#include "lib/devcom.h"
extern uint mlx5_core_debug_mask;
@@ -250,9 +251,6 @@ int mlx5_register_device(struct mlx5_core_dev *dev);
void mlx5_unregister_device(struct mlx5_core_dev *dev);
void mlx5_dev_set_lightweight(struct mlx5_core_dev *dev);
bool mlx5_dev_is_lightweight(struct mlx5_core_dev *dev);
-void mlx5_dev_list_lock(void);
-void mlx5_dev_list_unlock(void);
-int mlx5_dev_list_trylock(void);
void mlx5_fw_reporters_create(struct mlx5_core_dev *dev);
int mlx5_query_mtpps(struct mlx5_core_dev *dev, u32 *mtpps, u32 mtpps_size);
@@ -291,9 +289,9 @@ static inline int mlx5_rescan_drivers(struct mlx5_core_dev *dev)
{
int ret;
- mlx5_dev_list_lock();
+ mlx5_devcom_comp_lock(dev->priv.hca_devcom_comp);
ret = mlx5_rescan_drivers_locked(dev);
- mlx5_dev_list_unlock();
+ mlx5_devcom_comp_unlock(dev->priv.hca_devcom_comp);
return ret;
}
--
2.41.0
* [net-next V2 06/15] net/mlx5: Remove unused declaration
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (4 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 05/15] net/mlx5: Replace global mlx5_intf_lock with HCA devcom component lock Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:29 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 07/15] net/mlx5: fix config name in Kconfig parameter documentation Saeed Mahameed
` (9 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Yue Haibing,
Leon Romanovsky, Simon Horman
From: Yue Haibing <yuehaibing@huawei.com>
Commit 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path")
declared mlx5e_ipsec_inverse_table_init() but never implemented it.
Commit f52f2faee581 ("net/mlx5e: Introduce flow steering API")
declared mlx5e_fs_set_tc() but never implemented it.
Commit f2f3df550139 ("net/mlx5: EQ, Privatize eq_table and friends")
declared mlx5_eq_comp_cpumask() but never implemented it.
Commit cac1eb2cf2e3 ("net/mlx5: Lag, properly lock eswitch if needed")
removed mlx5_lag_update() but not its declaration.
Commit 35ba005d820b ("net/mlx5: DR, Set flex parser for TNL_MPLS dynamically")
removed mlx5dr_ste_build_tnl_mpls() but not its declaration.
Commit e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
declared but never implemented mlx5_alloc_cmd_mailbox_chain() and mlx5_free_cmd_mailbox_chain().
Commit 0cf53c124756 ("net/mlx5: FWPage, Use async events chain")
removed mlx5_core_req_pages_handler() but not its declaration.
Commit 938fe83c8dcb ("net/mlx5_core: New device capabilities handling")
removed mlx5_query_odp_caps() but not its declaration.
Commit f6a8a19bb11b ("RDMA/netdev: Hoist alloc_netdev_mqs out of the driver")
removed mlx5_rdma_netdev_alloc() but not its declaration.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/fs.h | 1 -
.../mellanox/mlx5/core/en_accel/ipsec_rxtx.h | 1 -
drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h | 1 -
.../net/ethernet/mellanox/mlx5/core/mlx5_core.h | 2 --
.../mellanox/mlx5/core/steering/dr_types.h | 4 ----
include/linux/mlx5/driver.h | 14 --------------
6 files changed, 23 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h
index e5a44b0b9616..4d6225e0eec7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h
@@ -150,7 +150,6 @@ struct mlx5e_flow_steering *mlx5e_fs_init(const struct mlx5e_profile *profile,
struct dentry *dfs_root);
void mlx5e_fs_cleanup(struct mlx5e_flow_steering *fs);
struct mlx5e_vlan_table *mlx5e_fs_get_vlan(struct mlx5e_flow_steering *fs);
-void mlx5e_fs_set_tc(struct mlx5e_flow_steering *fs, struct mlx5e_tc_table *tc);
struct mlx5e_tc_table *mlx5e_fs_get_tc(struct mlx5e_flow_steering *fs);
struct mlx5e_l2_table *mlx5e_fs_get_l2(struct mlx5e_flow_steering *fs);
struct mlx5_flow_namespace *mlx5e_fs_get_ns(struct mlx5e_flow_steering *fs, bool egress);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.h
index 9ee014a8ad24..2ed99772f168 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.h
@@ -54,7 +54,6 @@ struct mlx5e_accel_tx_ipsec_state {
#ifdef CONFIG_MLX5_EN_IPSEC
-void mlx5e_ipsec_inverse_table_init(void);
void mlx5e_ipsec_set_iv_esn(struct sk_buff *skb, struct xfrm_state *x,
struct xfrm_offload *xo);
void mlx5e_ipsec_set_iv(struct sk_buff *skb, struct xfrm_state *x,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
index 69a75459775d..4b7f7131c560 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
@@ -85,7 +85,6 @@ void mlx5_eq_del_cq(struct mlx5_eq *eq, struct mlx5_core_cq *cq);
struct mlx5_eq_comp *mlx5_eqn2comp_eq(struct mlx5_core_dev *dev, int eqn);
struct mlx5_eq *mlx5_get_async_eq(struct mlx5_core_dev *dev);
void mlx5_cq_tasklet_cb(struct tasklet_struct *t);
-struct cpumask *mlx5_eq_comp_cpumask(struct mlx5_core_dev *dev, int ix);
u32 mlx5_eq_poll_irq_disabled(struct mlx5_eq_comp *eq);
void mlx5_cmd_eq_recover(struct mlx5_core_dev *dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 0b60b02518c1..4e083fd3cc8d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -295,8 +295,6 @@ static inline int mlx5_rescan_drivers(struct mlx5_core_dev *dev)
return ret;
}
-void mlx5_lag_update(struct mlx5_core_dev *dev);
-
enum {
MLX5_NIC_IFC_FULL = 0,
MLX5_NIC_IFC_DISABLED = 1,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index 55dc7383477c..81eff6c410ce 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -436,10 +436,6 @@ void mlx5dr_ste_build_mpls(struct mlx5dr_ste_ctx *ste_ctx,
struct mlx5dr_ste_build *sb,
struct mlx5dr_match_param *mask,
bool inner, bool rx);
-void mlx5dr_ste_build_tnl_mpls(struct mlx5dr_ste_ctx *ste_ctx,
- struct mlx5dr_ste_build *sb,
- struct mlx5dr_match_param *mask,
- bool inner, bool rx);
void mlx5dr_ste_build_tnl_mpls_over_gre(struct mlx5dr_ste_ctx *ste_ctx,
struct mlx5dr_ste_build *sb,
struct mlx5dr_match_param *mask,
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 37c2101e4f63..217c7e3ad76c 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1041,10 +1041,6 @@ void mlx5_trigger_health_work(struct mlx5_core_dev *dev);
int mlx5_frag_buf_alloc_node(struct mlx5_core_dev *dev, int size,
struct mlx5_frag_buf *buf, int node);
void mlx5_frag_buf_free(struct mlx5_core_dev *dev, struct mlx5_frag_buf *buf);
-struct mlx5_cmd_mailbox *mlx5_alloc_cmd_mailbox_chain(struct mlx5_core_dev *dev,
- gfp_t flags, int npages);
-void mlx5_free_cmd_mailbox_chain(struct mlx5_core_dev *dev,
- struct mlx5_cmd_mailbox *head);
int mlx5_core_create_mkey(struct mlx5_core_dev *dev, u32 *mkey, u32 *in,
int inlen);
int mlx5_core_destroy_mkey(struct mlx5_core_dev *dev, u32 mkey);
@@ -1058,8 +1054,6 @@ void mlx5_pagealloc_start(struct mlx5_core_dev *dev);
void mlx5_pagealloc_stop(struct mlx5_core_dev *dev);
void mlx5_pages_debugfs_init(struct mlx5_core_dev *dev);
void mlx5_pages_debugfs_cleanup(struct mlx5_core_dev *dev);
-void mlx5_core_req_pages_handler(struct mlx5_core_dev *dev, u16 func_id,
- s32 npages, bool ec_function);
int mlx5_satisfy_startup_pages(struct mlx5_core_dev *dev, int boot);
int mlx5_reclaim_startup_pages(struct mlx5_core_dev *dev);
void mlx5_register_debugfs(void);
@@ -1099,8 +1093,6 @@ int mlx5_core_create_psv(struct mlx5_core_dev *dev, u32 pdn,
int mlx5_core_destroy_psv(struct mlx5_core_dev *dev, int psv_num);
__be32 mlx5_core_get_terminate_scatter_list_mkey(struct mlx5_core_dev *dev);
void mlx5_core_put_rsc(struct mlx5_core_rsc_common *common);
-int mlx5_query_odp_caps(struct mlx5_core_dev *dev,
- struct mlx5_odp_caps *odp_caps);
int mlx5_init_rl_table(struct mlx5_core_dev *dev);
void mlx5_cleanup_rl_table(struct mlx5_core_dev *dev);
@@ -1202,12 +1194,6 @@ int mlx5_sriov_blocking_notifier_register(struct mlx5_core_dev *mdev,
void mlx5_sriov_blocking_notifier_unregister(struct mlx5_core_dev *mdev,
int vf_id,
struct notifier_block *nb);
-#ifdef CONFIG_MLX5_CORE_IPOIB
-struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
- struct ib_device *ibdev,
- const char *name,
- void (*setup)(struct net_device *));
-#endif /* CONFIG_MLX5_CORE_IPOIB */
int mlx5_rdma_rn_get_params(struct mlx5_core_dev *mdev,
struct ib_device *device,
struct rdma_netdev_alloc_params *params);
--
2.41.0
* [net-next V2 07/15] net/mlx5: fix config name in Kconfig parameter documentation
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (5 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 06/15] net/mlx5: Remove unused declaration Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 19:27 ` [net-next V2 08/15] net/mlx5: Use PTR_ERR_OR_ZERO() to simplify code Saeed Mahameed
` (8 subsequent siblings)
15 siblings, 0 replies; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Lukas Bulwahn
From: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Commit a12ba19269d7 ("net/mlx5: Update Kconfig parameter documentation")
adds documentation on Kconfig options for the mlx5 driver. It refers to the
config MLX5_EN_MACSEC for MACSec offloading, but the config is actually
called MLX5_MACSEC.
Fix the reference to the right config name in the documentation.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../device_drivers/ethernet/mellanox/mlx5/kconfig.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst
index 0a42c3395ffa..20d3b7e87049 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst
@@ -67,7 +67,7 @@ Enabling the driver and kconfig options
| Enables :ref:`IPSec XFRM cryptography-offload acceleration <xfrm_device>`.
-**CONFIG_MLX5_EN_MACSEC=(y/n)**
+**CONFIG_MLX5_MACSEC=(y/n)**
| Build support for MACsec cryptography-offload acceleration in the NIC.
--
2.41.0
* [net-next V2 08/15] net/mlx5: Use PTR_ERR_OR_ZERO() to simplify code
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (6 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 07/15] net/mlx5: fix config name in Kconfig parameter documentation Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 19:27 ` [net-next V2 09/15] net/mlx5e: " Saeed Mahameed
` (7 subsequent siblings)
15 siblings, 0 replies; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Jinjie Ruan, Jacob Keller
From: Jinjie Ruan <ruanjinjie@huawei.com>
Return PTR_ERR_OR_ZERO() instead of an open-coded PTR_ERR()/return 0
sequence, to simplify the code.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
index 7d9bbb494d95..101b3bb90863 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
@@ -507,10 +507,7 @@ static int mlx5_lag_create_ttc_table(struct mlx5_lag *ldev)
mlx5_lag_set_outer_ttc_params(ldev, &ttc_params);
port_sel->outer.ttc = mlx5_create_ttc_table(dev, &ttc_params);
- if (IS_ERR(port_sel->outer.ttc))
- return PTR_ERR(port_sel->outer.ttc);
-
- return 0;
+ return PTR_ERR_OR_ZERO(port_sel->outer.ttc);
}
static int mlx5_lag_create_inner_ttc_table(struct mlx5_lag *ldev)
@@ -521,10 +518,7 @@ static int mlx5_lag_create_inner_ttc_table(struct mlx5_lag *ldev)
mlx5_lag_set_inner_ttc_params(ldev, &ttc_params);
port_sel->inner.ttc = mlx5_create_inner_ttc_table(dev, &ttc_params);
- if (IS_ERR(port_sel->inner.ttc))
- return PTR_ERR(port_sel->inner.ttc);
-
- return 0;
+ return PTR_ERR_OR_ZERO(port_sel->inner.ttc);
}
int mlx5_lag_port_sel_create(struct mlx5_lag *ldev,
--
2.41.0
* [net-next V2 09/15] net/mlx5e: Use PTR_ERR_OR_ZERO() to simplify code
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (7 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 08/15] net/mlx5: Use PTR_ERR_OR_ZERO() to simplify code Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:30 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 10/15] net/mlx5e: Refactor rx_res_init() and rx_res_free() APIs Saeed Mahameed
` (6 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Yu Liao, Leon Romanovsky
From: Yu Liao <liaoyu15@huawei.com>
Use the standard error pointer macro to shorten and simplify the code.
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
index 934b0d5ce1b3..777d311d44ef 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
@@ -1283,9 +1283,7 @@ static int mlx5e_create_inner_ttc_table(struct mlx5e_flow_steering *fs,
mlx5e_set_inner_ttc_params(fs, rx_res, &ttc_params);
fs->inner_ttc = mlx5_create_inner_ttc_table(fs->mdev,
&ttc_params);
- if (IS_ERR(fs->inner_ttc))
- return PTR_ERR(fs->inner_ttc);
- return 0;
+ return PTR_ERR_OR_ZERO(fs->inner_ttc);
}
int mlx5e_create_ttc_table(struct mlx5e_flow_steering *fs,
@@ -1295,9 +1293,7 @@ int mlx5e_create_ttc_table(struct mlx5e_flow_steering *fs,
mlx5e_set_ttc_params(fs, rx_res, &ttc_params, true);
fs->ttc = mlx5_create_ttc_table(fs->mdev, &ttc_params);
- if (IS_ERR(fs->ttc))
- return PTR_ERR(fs->ttc);
- return 0;
+ return PTR_ERR_OR_ZERO(fs->ttc);
}
int mlx5e_create_flow_steering(struct mlx5e_flow_steering *fs,
--
2.41.0
* [net-next V2 10/15] net/mlx5e: Refactor rx_res_init() and rx_res_free() APIs
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (8 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 09/15] net/mlx5e: " Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:31 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 11/15] net/mlx5e: Refactor mlx5e_rss_set_rxfh() and mlx5e_rss_get_rxfh() Saeed Mahameed
` (5 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris
From: Adham Faris <afaris@nvidia.com>
Refactor the mlx5e_rx_res_init() and mlx5e_rx_res_free() APIs into
mlx5e_rx_res_create() and mlx5e_rx_res_destroy(), which wrap the
now-static mlx5e_rx_res_alloc() and mlx5e_rx_res_free() helpers
respectively.
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../ethernet/mellanox/mlx5/core/en/rx_res.c | 36 +++++++++++--------
.../ethernet/mellanox/mlx5/core/en/rx_res.h | 11 +++---
.../net/ethernet/mellanox/mlx5/core/en_main.c | 24 ++++++-------
.../net/ethernet/mellanox/mlx5/core/en_rep.c | 27 ++++++--------
.../ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 24 +++++--------
5 files changed, 56 insertions(+), 66 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
index 56e6b8c7501f..abbd08b6d1a9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
@@ -284,7 +284,12 @@ struct mlx5e_rss *mlx5e_rx_res_rss_get(struct mlx5e_rx_res *res, u32 rss_idx)
/* End of API rx_res_rss_* */
-struct mlx5e_rx_res *mlx5e_rx_res_alloc(void)
+static void mlx5e_rx_res_free(struct mlx5e_rx_res *res)
+{
+ kvfree(res);
+}
+
+static struct mlx5e_rx_res *mlx5e_rx_res_alloc(void)
{
return kvzalloc(sizeof(struct mlx5e_rx_res), GFP_KERNEL);
}
@@ -404,13 +409,19 @@ static void mlx5e_rx_res_ptp_destroy(struct mlx5e_rx_res *res)
mlx5e_rqt_destroy(&res->ptp.rqt);
}
-int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
- enum mlx5e_rx_res_features features, unsigned int max_nch,
- u32 drop_rqn, const struct mlx5e_packet_merge_param *init_pkt_merge_param,
- unsigned int init_nch)
+struct mlx5e_rx_res *
+mlx5e_rx_res_create(struct mlx5_core_dev *mdev, enum mlx5e_rx_res_features features,
+ unsigned int max_nch, u32 drop_rqn,
+ const struct mlx5e_packet_merge_param *init_pkt_merge_param,
+ unsigned int init_nch)
{
+ struct mlx5e_rx_res *res;
int err;
+ res = mlx5e_rx_res_alloc();
+ if (!res)
+ return ERR_PTR(-ENOMEM);
+
res->mdev = mdev;
res->features = features;
res->max_nch = max_nch;
@@ -421,7 +432,7 @@ int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
err = mlx5e_rx_res_rss_init_def(res, init_nch);
if (err)
- goto err_out;
+ goto err_rx_res_free;
err = mlx5e_rx_res_channels_init(res);
if (err)
@@ -431,14 +442,15 @@ int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
if (err)
goto err_channels_destroy;
- return 0;
+ return res;
err_channels_destroy:
mlx5e_rx_res_channels_destroy(res);
err_rss_destroy:
__mlx5e_rx_res_rss_destroy(res, 0);
-err_out:
- return err;
+err_rx_res_free:
+ mlx5e_rx_res_free(res);
+ return ERR_PTR(err);
}
void mlx5e_rx_res_destroy(struct mlx5e_rx_res *res)
@@ -446,11 +458,7 @@ void mlx5e_rx_res_destroy(struct mlx5e_rx_res *res)
mlx5e_rx_res_ptp_destroy(res);
mlx5e_rx_res_channels_destroy(res);
mlx5e_rx_res_rss_destroy_all(res);
-}
-
-void mlx5e_rx_res_free(struct mlx5e_rx_res *res)
-{
- kvfree(res);
+ mlx5e_rx_res_free(res);
}
u32 mlx5e_rx_res_get_tirn_direct(struct mlx5e_rx_res *res, unsigned int ix)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
index 580fe8bc3cd2..4fa8366db31f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
@@ -21,13 +21,12 @@ enum mlx5e_rx_res_features {
};
/* Setup */
-struct mlx5e_rx_res *mlx5e_rx_res_alloc(void);
-int mlx5e_rx_res_init(struct mlx5e_rx_res *res, struct mlx5_core_dev *mdev,
- enum mlx5e_rx_res_features features, unsigned int max_nch,
- u32 drop_rqn, const struct mlx5e_packet_merge_param *init_pkt_merge_param,
- unsigned int init_nch);
+struct mlx5e_rx_res *
+mlx5e_rx_res_create(struct mlx5_core_dev *mdev, enum mlx5e_rx_res_features features,
+ unsigned int max_nch, u32 drop_rqn,
+ const struct mlx5e_packet_merge_param *init_pkt_merge_param,
+ unsigned int init_nch);
void mlx5e_rx_res_destroy(struct mlx5e_rx_res *res);
-void mlx5e_rx_res_free(struct mlx5e_rx_res *res);
/* TIRN getters for flow steering */
u32 mlx5e_rx_res_get_tirn_direct(struct mlx5e_rx_res *res, unsigned int ix);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index a2ae791538ed..4230990b74d2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -5325,10 +5325,6 @@ static int mlx5e_init_nic_rx(struct mlx5e_priv *priv)
enum mlx5e_rx_res_features features;
int err;
- priv->rx_res = mlx5e_rx_res_alloc();
- if (!priv->rx_res)
- return -ENOMEM;
-
mlx5e_create_q_counters(priv);
err = mlx5e_open_drop_rq(priv, &priv->drop_rq);
@@ -5340,12 +5336,16 @@ static int mlx5e_init_nic_rx(struct mlx5e_priv *priv)
features = MLX5E_RX_RES_FEATURE_PTP;
if (mlx5_tunnel_inner_ft_supported(mdev))
features |= MLX5E_RX_RES_FEATURE_INNER_FT;
- err = mlx5e_rx_res_init(priv->rx_res, priv->mdev, features,
- priv->max_nch, priv->drop_rq.rqn,
- &priv->channels.params.packet_merge,
- priv->channels.params.num_channels);
- if (err)
+
+ priv->rx_res = mlx5e_rx_res_create(priv->mdev, features, priv->max_nch, priv->drop_rq.rqn,
+ &priv->channels.params.packet_merge,
+ priv->channels.params.num_channels);
+ if (IS_ERR(priv->rx_res)) {
+ err = PTR_ERR(priv->rx_res);
+ priv->rx_res = NULL;
+ mlx5_core_err(mdev, "create rx resources failed, %d\n", err);
goto err_close_drop_rq;
+ }
err = mlx5e_create_flow_steering(priv->fs, priv->rx_res, priv->profile,
priv->netdev);
@@ -5375,12 +5375,11 @@ static int mlx5e_init_nic_rx(struct mlx5e_priv *priv)
priv->profile);
err_destroy_rx_res:
mlx5e_rx_res_destroy(priv->rx_res);
+ priv->rx_res = NULL;
err_close_drop_rq:
mlx5e_close_drop_rq(&priv->drop_rq);
err_destroy_q_counters:
mlx5e_destroy_q_counters(priv);
- mlx5e_rx_res_free(priv->rx_res);
- priv->rx_res = NULL;
return err;
}
@@ -5391,10 +5390,9 @@ static void mlx5e_cleanup_nic_rx(struct mlx5e_priv *priv)
mlx5e_destroy_flow_steering(priv->fs, !!(priv->netdev->hw_features & NETIF_F_NTUPLE),
priv->profile);
mlx5e_rx_res_destroy(priv->rx_res);
+ priv->rx_res = NULL;
mlx5e_close_drop_rq(&priv->drop_rq);
mlx5e_destroy_q_counters(priv);
- mlx5e_rx_res_free(priv->rx_res);
- priv->rx_res = NULL;
}
static void mlx5e_set_mqprio_rl(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 2fdb8895aecd..82dc27e31d9b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -998,26 +998,22 @@ static int mlx5e_init_rep_rx(struct mlx5e_priv *priv)
struct mlx5_core_dev *mdev = priv->mdev;
int err;
- priv->rx_res = mlx5e_rx_res_alloc();
- if (!priv->rx_res) {
- err = -ENOMEM;
- goto err_free_fs;
- }
-
mlx5e_fs_init_l2_addr(priv->fs, priv->netdev);
err = mlx5e_open_drop_rq(priv, &priv->drop_rq);
if (err) {
mlx5_core_err(mdev, "open drop rq failed, %d\n", err);
- goto err_rx_res_free;
+ goto err_free_fs;
}
- err = mlx5e_rx_res_init(priv->rx_res, priv->mdev, 0,
- priv->max_nch, priv->drop_rq.rqn,
- &priv->channels.params.packet_merge,
- priv->channels.params.num_channels);
- if (err)
+ priv->rx_res = mlx5e_rx_res_create(priv->mdev, 0, priv->max_nch, priv->drop_rq.rqn,
+ &priv->channels.params.packet_merge,
+ priv->channels.params.num_channels);
+ if (IS_ERR(priv->rx_res)) {
+ err = PTR_ERR(priv->rx_res);
+ mlx5_core_err(mdev, "Create rx resources failed, err=%d\n", err);
goto err_close_drop_rq;
+ }
err = mlx5e_create_rep_ttc_table(priv);
if (err)
@@ -1041,11 +1037,9 @@ static int mlx5e_init_rep_rx(struct mlx5e_priv *priv)
mlx5_destroy_ttc_table(mlx5e_fs_get_ttc(priv->fs, false));
err_destroy_rx_res:
mlx5e_rx_res_destroy(priv->rx_res);
+ priv->rx_res = ERR_PTR(-EINVAL);
err_close_drop_rq:
mlx5e_close_drop_rq(&priv->drop_rq);
-err_rx_res_free:
- mlx5e_rx_res_free(priv->rx_res);
- priv->rx_res = NULL;
err_free_fs:
mlx5e_fs_cleanup(priv->fs);
priv->fs = NULL;
@@ -1059,9 +1053,8 @@ static void mlx5e_cleanup_rep_rx(struct mlx5e_priv *priv)
mlx5e_destroy_rep_root_ft(priv);
mlx5_destroy_ttc_table(mlx5e_fs_get_ttc(priv->fs, false));
mlx5e_rx_res_destroy(priv->rx_res);
+ priv->rx_res = ERR_PTR(-EINVAL);
mlx5e_close_drop_rq(&priv->drop_rq);
- mlx5e_rx_res_free(priv->rx_res);
- priv->rx_res = NULL;
}
static void mlx5e_rep_mpesw_work(struct work_struct *work)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index baa7ef812313..2bf77a5251b4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -418,12 +418,6 @@ static int mlx5i_init_rx(struct mlx5e_priv *priv)
return -ENOMEM;
}
- priv->rx_res = mlx5e_rx_res_alloc();
- if (!priv->rx_res) {
- err = -ENOMEM;
- goto err_free_fs;
- }
-
mlx5e_create_q_counters(priv);
err = mlx5e_open_drop_rq(priv, &priv->drop_rq);
@@ -432,12 +426,13 @@ static int mlx5i_init_rx(struct mlx5e_priv *priv)
goto err_destroy_q_counters;
}
- err = mlx5e_rx_res_init(priv->rx_res, priv->mdev, 0,
- priv->max_nch, priv->drop_rq.rqn,
- &priv->channels.params.packet_merge,
- priv->channels.params.num_channels);
- if (err)
+ priv->rx_res = mlx5e_rx_res_create(priv->mdev, 0, priv->max_nch, priv->drop_rq.rqn,
+ &priv->channels.params.packet_merge,
+ priv->channels.params.num_channels);
+ if (IS_ERR(priv->rx_res)) {
+ err = PTR_ERR(priv->rx_res);
goto err_close_drop_rq;
+ }
err = mlx5i_create_flow_steering(priv);
if (err)
@@ -447,13 +442,11 @@ static int mlx5i_init_rx(struct mlx5e_priv *priv)
err_destroy_rx_res:
mlx5e_rx_res_destroy(priv->rx_res);
+ priv->rx_res = ERR_PTR(-EINVAL);
err_close_drop_rq:
mlx5e_close_drop_rq(&priv->drop_rq);
err_destroy_q_counters:
mlx5e_destroy_q_counters(priv);
- mlx5e_rx_res_free(priv->rx_res);
- priv->rx_res = NULL;
-err_free_fs:
mlx5e_fs_cleanup(priv->fs);
return err;
}
@@ -462,10 +455,9 @@ static void mlx5i_cleanup_rx(struct mlx5e_priv *priv)
{
mlx5i_destroy_flow_steering(priv);
mlx5e_rx_res_destroy(priv->rx_res);
+ priv->rx_res = ERR_PTR(-EINVAL);
mlx5e_close_drop_rq(&priv->drop_rq);
mlx5e_destroy_q_counters(priv);
- mlx5e_rx_res_free(priv->rx_res);
- priv->rx_res = NULL;
mlx5e_fs_cleanup(priv->fs);
}
--
2.41.0
* [net-next V2 11/15] net/mlx5e: Refactor mlx5e_rss_set_rxfh() and mlx5e_rss_get_rxfh()
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (9 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 10/15] net/mlx5e: Refactor rx_res_init() and rx_res_free() APIs Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:32 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 12/15] net/mlx5e: Refactor mlx5e_rss_init() and mlx5e_rss_free() API's Saeed Mahameed
` (4 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris
From: Adham Faris <afaris@nvidia.com>
Initialize the indirection table array with memcpy rather than a for
loop.
This change is made for two reasons:
1) To be consistent with the indirection table array copy in
mlx5e_rss_set_rxfh().
2) In general, memcpy is preferable to an element-by-element for loop
for copying arrays.
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
| 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
index 7f93426b88b3..fd52541e5508 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
@@ -470,11 +470,9 @@ int mlx5e_rss_packet_merge_set_param(struct mlx5e_rss *rss,
int mlx5e_rss_get_rxfh(struct mlx5e_rss *rss, u32 *indir, u8 *key, u8 *hfunc)
{
- unsigned int i;
-
if (indir)
- for (i = 0; i < MLX5E_INDIR_RQT_SIZE; i++)
- indir[i] = rss->indir.table[i];
+ memcpy(indir, rss->indir.table,
+ MLX5E_INDIR_RQT_SIZE * sizeof(*rss->indir.table));
if (key)
memcpy(key, rss->hash.toeplitz_hash_key,
@@ -523,12 +521,10 @@ int mlx5e_rss_set_rxfh(struct mlx5e_rss *rss, const u32 *indir,
}
if (indir) {
- unsigned int i;
-
changed_indir = true;
- for (i = 0; i < MLX5E_INDIR_RQT_SIZE; i++)
- rss->indir.table[i] = indir[i];
+ memcpy(rss->indir.table, indir,
+ MLX5E_INDIR_RQT_SIZE * sizeof(*rss->indir.table));
}
if (changed_indir && rss->enabled) {
--
2.41.0
* [net-next V2 12/15] net/mlx5e: Refactor mlx5e_rss_init() and mlx5e_rss_free() API's
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (10 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 11/15] net/mlx5e: Refactor mlx5e_rss_set_rxfh() and mlx5e_rss_get_rxfh() Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:33 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 13/15] net/mlx5e: Preparations for supporting larger number of channels Saeed Mahameed
` (3 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris
From: Adham Faris <afaris@nvidia.com>
Introduce the code refactoring below:
1) Introduce a single API for creating and destroying the RSS object:
mlx5e_rss_init() and mlx5e_rss_cleanup() respectively.
2) mlx5e_rss_init() constructs and initializes the RSS object
depending on a new function parameter of type
enum mlx5e_rss_init_type. Callers (like rx_res.c) no longer need to
allocate the RSS object via mlx5e_rss_alloc() and then immediately
initialize it via mlx5e_rss_init_no_tirs() or mlx5e_rss_init(); this
is now done by a single call to mlx5e_rss_init(). Hence,
mlx5e_rss_alloc() and mlx5e_rss_init_no_tirs() have been removed from
the rss.h file and became static functions.
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
| 44 ++++++++++++-------
| 15 ++++---
.../ethernet/mellanox/mlx5/core/en/rx_res.c | 35 ++++-----------
3 files changed, 44 insertions(+), 50 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
index fd52541e5508..654b5c45e4a5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
@@ -81,12 +81,12 @@ struct mlx5e_rss {
refcount_t refcnt;
};
-struct mlx5e_rss *mlx5e_rss_alloc(void)
+static struct mlx5e_rss *mlx5e_rss_alloc(void)
{
return kvzalloc(sizeof(struct mlx5e_rss), GFP_KERNEL);
}
-void mlx5e_rss_free(struct mlx5e_rss *rss)
+static void mlx5e_rss_free(struct mlx5e_rss *rss)
{
kvfree(rss);
}
@@ -282,28 +282,35 @@ static int mlx5e_rss_update_tirs(struct mlx5e_rss *rss)
return retval;
}
-int mlx5e_rss_init_no_tirs(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
- bool inner_ft_support, u32 drop_rqn)
+static int mlx5e_rss_init_no_tirs(struct mlx5e_rss *rss)
{
- rss->mdev = mdev;
- rss->inner_ft_support = inner_ft_support;
- rss->drop_rqn = drop_rqn;
-
mlx5e_rss_params_init(rss);
refcount_set(&rss->refcnt, 1);
- return mlx5e_rqt_init_direct(&rss->rqt, mdev, true, drop_rqn);
+ return mlx5e_rqt_init_direct(&rss->rqt, rss->mdev, true, rss->drop_rqn);
}
-int mlx5e_rss_init(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
- bool inner_ft_support, u32 drop_rqn,
- const struct mlx5e_packet_merge_param *init_pkt_merge_param)
+struct mlx5e_rss *mlx5e_rss_init(struct mlx5_core_dev *mdev, bool inner_ft_support, u32 drop_rqn,
+ const struct mlx5e_packet_merge_param *init_pkt_merge_param,
+ enum mlx5e_rss_init_type type)
{
+ struct mlx5e_rss *rss;
int err;
- err = mlx5e_rss_init_no_tirs(rss, mdev, inner_ft_support, drop_rqn);
+ rss = mlx5e_rss_alloc();
+ if (!rss)
+ return ERR_PTR(-ENOMEM);
+
+ rss->mdev = mdev;
+ rss->inner_ft_support = inner_ft_support;
+ rss->drop_rqn = drop_rqn;
+
+ err = mlx5e_rss_init_no_tirs(rss);
if (err)
- goto err_out;
+ goto err_free_rss;
+
+ if (type == MLX5E_RSS_INIT_NO_TIRS)
+ goto out;
err = mlx5e_rss_create_tirs(rss, init_pkt_merge_param, false);
if (err)
@@ -315,14 +322,16 @@ int mlx5e_rss_init(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
goto err_destroy_tirs;
}
- return 0;
+out:
+ return rss;
err_destroy_tirs:
mlx5e_rss_destroy_tirs(rss, false);
err_destroy_rqt:
mlx5e_rqt_destroy(&rss->rqt);
-err_out:
- return err;
+err_free_rss:
+ mlx5e_rss_free(rss);
+ return ERR_PTR(err);
}
int mlx5e_rss_cleanup(struct mlx5e_rss *rss)
@@ -336,6 +345,7 @@ int mlx5e_rss_cleanup(struct mlx5e_rss *rss)
mlx5e_rss_destroy_tirs(rss, true);
mlx5e_rqt_destroy(&rss->rqt);
+ mlx5e_rss_free(rss);
return 0;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
index c6b216416344..8b2290727641 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
@@ -8,18 +8,19 @@
#include "tir.h"
#include "fs.h"
+enum mlx5e_rss_init_type {
+ MLX5E_RSS_INIT_NO_TIRS = 0,
+ MLX5E_RSS_INIT_TIRS
+};
+
struct mlx5e_rss_params_traffic_type
mlx5e_rss_get_default_tt_config(enum mlx5_traffic_types tt);
struct mlx5e_rss;
-struct mlx5e_rss *mlx5e_rss_alloc(void);
-void mlx5e_rss_free(struct mlx5e_rss *rss);
-int mlx5e_rss_init(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
- bool inner_ft_support, u32 drop_rqn,
- const struct mlx5e_packet_merge_param *init_pkt_merge_param);
-int mlx5e_rss_init_no_tirs(struct mlx5e_rss *rss, struct mlx5_core_dev *mdev,
- bool inner_ft_support, u32 drop_rqn);
+struct mlx5e_rss *mlx5e_rss_init(struct mlx5_core_dev *mdev, bool inner_ft_support, u32 drop_rqn,
+ const struct mlx5e_packet_merge_param *init_pkt_merge_param,
+ enum mlx5e_rss_init_type type);
int mlx5e_rss_cleanup(struct mlx5e_rss *rss);
void mlx5e_rss_refcnt_inc(struct mlx5e_rss *rss);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
index abbd08b6d1a9..81b13338bfdf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
@@ -39,36 +39,27 @@ static int mlx5e_rx_res_rss_init_def(struct mlx5e_rx_res *res,
{
bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
struct mlx5e_rss *rss;
- int err;
if (WARN_ON(res->rss[0]))
return -EINVAL;
- rss = mlx5e_rss_alloc();
- if (!rss)
- return -ENOMEM;
-
- err = mlx5e_rss_init(rss, res->mdev, inner_ft_support, res->drop_rqn,
- &res->pkt_merge_param);
- if (err)
- goto err_rss_free;
+ rss = mlx5e_rss_init(res->mdev, inner_ft_support, res->drop_rqn,
+ &res->pkt_merge_param, MLX5E_RSS_INIT_TIRS);
+ if (IS_ERR(rss))
+ return PTR_ERR(rss);
mlx5e_rss_set_indir_uniform(rss, init_nch);
res->rss[0] = rss;
return 0;
-
-err_rss_free:
- mlx5e_rss_free(rss);
- return err;
}
int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res, u32 *rss_idx, unsigned int init_nch)
{
bool inner_ft_support = res->features & MLX5E_RX_RES_FEATURE_INNER_FT;
struct mlx5e_rss *rss;
- int err, i;
+ int i;
for (i = 1; i < MLX5E_MAX_NUM_RSS; i++)
if (!res->rss[i])
@@ -77,13 +68,10 @@ int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res, u32 *rss_idx, unsigned int i
if (i == MLX5E_MAX_NUM_RSS)
return -ENOSPC;
- rss = mlx5e_rss_alloc();
- if (!rss)
- return -ENOMEM;
-
- err = mlx5e_rss_init_no_tirs(rss, res->mdev, inner_ft_support, res->drop_rqn);
- if (err)
- goto err_rss_free;
+ rss = mlx5e_rss_init(res->mdev, inner_ft_support, res->drop_rqn,
+ &res->pkt_merge_param, MLX5E_RSS_INIT_NO_TIRS);
+ if (IS_ERR(rss))
+ return PTR_ERR(rss);
mlx5e_rss_set_indir_uniform(rss, init_nch);
if (res->rss_active)
@@ -93,10 +81,6 @@ int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res, u32 *rss_idx, unsigned int i
*rss_idx = i;
return 0;
-
-err_rss_free:
- mlx5e_rss_free(rss);
- return err;
}
static int __mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res, u32 rss_idx)
@@ -108,7 +92,6 @@ static int __mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res, u32 rss_idx)
if (err)
return err;
- mlx5e_rss_free(rss);
res->rss[rss_idx] = NULL;
return 0;
--
2.41.0
* [net-next V2 13/15] net/mlx5e: Preparations for supporting larger number of channels
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (11 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 12/15] net/mlx5e: Refactor mlx5e_rss_init() and mlx5e_rss_free() API's Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:35 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 14/15] net/mlx5e: Increase max supported channels number to 256 Saeed Mahameed
` (2 subsequent siblings)
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris
From: Adham Faris <afaris@nvidia.com>
The number of CPUs in data center servers keeps growing over time.
Currently, our driver limits the number of channels to 128.
The maximum number of channels is enforced and bounded by a hardcoded
define (en.h/MLX5E_MAX_NUM_CHANNELS), even though the device and the
machine (CPU count) could allow more.
Refactor the current implementation to support more channels. The
maximum supported number of channels will be increased in a follow-up
patch.
Introduce the RQT size calculation/allocation scheme below:
1) Preserve the current RQT size of 256 for channel counts up to 128
(the old limit).
2) For larger channel counts, calculate the RQT size by multiplying
the number of channels by 2 and rounding the result up to the nearest
power of 2. If the calculated RQT size exceeds the maximum size
supported by the NIC, fall back to that maximum RQT size
(1 << log_max_rqt_size).
Since the RQT size is no longer static, allocate and free the
indirection table SW shadow dynamically.
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 5 +-
.../net/ethernet/mellanox/mlx5/core/en/rqt.c | 32 ++++--
.../net/ethernet/mellanox/mlx5/core/en/rqt.h | 9 +-
| 108 +++++++++++++++---
| 7 +-
.../ethernet/mellanox/mlx5/core/en/rx_res.c | 42 +++++--
.../ethernet/mellanox/mlx5/core/en/rx_res.h | 1 +
.../ethernet/mellanox/mlx5/core/en_ethtool.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/en_main.c | 8 +-
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 16 +--
10 files changed, 178 insertions(+), 52 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 86f2690c5e01..2130cba548b5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -141,7 +141,7 @@ struct page_pool;
#define MLX5E_PARAMS_DEFAULT_MIN_RX_WQES_MPW 0x2
#define MLX5E_MIN_NUM_CHANNELS 0x1
-#define MLX5E_MAX_NUM_CHANNELS (MLX5E_INDIR_RQT_SIZE / 2)
+#define MLX5E_MAX_NUM_CHANNELS 128
#define MLX5E_TX_CQ_POLL_BUDGET 128
#define MLX5E_TX_XSK_POLL_BUDGET 64
#define MLX5E_SQ_RECOVER_MIN_INTERVAL 500 /* msecs */
@@ -193,7 +193,8 @@ static inline int mlx5e_get_max_num_channels(struct mlx5_core_dev *mdev)
{
return is_kdump_kernel() ?
MLX5E_MIN_NUM_CHANNELS :
- min_t(int, mlx5_comp_vectors_max(mdev), MLX5E_MAX_NUM_CHANNELS);
+ min3(mlx5_comp_vectors_max(mdev), (u32)MLX5E_MAX_NUM_CHANNELS,
+ (u32)(1 << MLX5_CAP_GEN(mdev, log_max_rqt_size)));
}
/* The maximum WQE size can be retrieved by max_wqe_sz_sq in
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rqt.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rqt.c
index b915fb29dd2c..7b8ff7a71003 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rqt.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rqt.c
@@ -9,7 +9,7 @@ void mlx5e_rss_params_indir_init_uniform(struct mlx5e_rss_params_indir *indir,
{
unsigned int i;
- for (i = 0; i < MLX5E_INDIR_RQT_SIZE; i++)
+ for (i = 0; i < indir->actual_table_size; i++)
indir->table[i] = i % num_channels;
}
@@ -45,9 +45,9 @@ static int mlx5e_rqt_init(struct mlx5e_rqt *rqt, struct mlx5_core_dev *mdev,
}
int mlx5e_rqt_init_direct(struct mlx5e_rqt *rqt, struct mlx5_core_dev *mdev,
- bool indir_enabled, u32 init_rqn)
+ bool indir_enabled, u32 init_rqn, u32 indir_table_size)
{
- u16 max_size = indir_enabled ? MLX5E_INDIR_RQT_SIZE : 1;
+ u16 max_size = indir_enabled ? indir_table_size : 1;
return mlx5e_rqt_init(rqt, mdev, max_size, &init_rqn, 1);
}
@@ -68,11 +68,11 @@ static int mlx5e_calc_indir_rqns(u32 *rss_rqns, u32 *rqns, unsigned int num_rqns
{
unsigned int i;
- for (i = 0; i < MLX5E_INDIR_RQT_SIZE; i++) {
+ for (i = 0; i < indir->actual_table_size; i++) {
unsigned int ix = i;
if (hfunc == ETH_RSS_HASH_XOR)
- ix = mlx5e_bits_invert(ix, ilog2(MLX5E_INDIR_RQT_SIZE));
+ ix = mlx5e_bits_invert(ix, ilog2(indir->actual_table_size));
ix = indir->table[ix];
@@ -94,7 +94,7 @@ int mlx5e_rqt_init_indir(struct mlx5e_rqt *rqt, struct mlx5_core_dev *mdev,
u32 *rss_rqns;
int err;
- rss_rqns = kvmalloc_array(MLX5E_INDIR_RQT_SIZE, sizeof(*rss_rqns), GFP_KERNEL);
+ rss_rqns = kvmalloc_array(indir->actual_table_size, sizeof(*rss_rqns), GFP_KERNEL);
if (!rss_rqns)
return -ENOMEM;
@@ -102,13 +102,25 @@ int mlx5e_rqt_init_indir(struct mlx5e_rqt *rqt, struct mlx5_core_dev *mdev,
if (err)
goto out;
- err = mlx5e_rqt_init(rqt, mdev, MLX5E_INDIR_RQT_SIZE, rss_rqns, MLX5E_INDIR_RQT_SIZE);
+ err = mlx5e_rqt_init(rqt, mdev, indir->max_table_size, rss_rqns,
+ indir->actual_table_size);
out:
kvfree(rss_rqns);
return err;
}
+#define MLX5E_UNIFORM_SPREAD_RQT_FACTOR 2
+
+u32 mlx5e_rqt_size(struct mlx5_core_dev *mdev, unsigned int num_channels)
+{
+ u32 rqt_size = max_t(u32, MLX5E_INDIR_MIN_RQT_SIZE,
+ roundup_pow_of_two(num_channels * MLX5E_UNIFORM_SPREAD_RQT_FACTOR));
+ u32 max_cap_rqt_size = 1 << MLX5_CAP_GEN(mdev, log_max_rqt_size);
+
+ return min_t(u32, rqt_size, max_cap_rqt_size);
+}
+
void mlx5e_rqt_destroy(struct mlx5e_rqt *rqt)
{
mlx5_core_destroy_rqt(rqt->mdev, rqt->rqtn);
@@ -151,10 +163,10 @@ int mlx5e_rqt_redirect_indir(struct mlx5e_rqt *rqt, u32 *rqns, unsigned int num_
u32 *rss_rqns;
int err;
- if (WARN_ON(rqt->size != MLX5E_INDIR_RQT_SIZE))
+ if (WARN_ON(rqt->size != indir->max_table_size))
return -EINVAL;
- rss_rqns = kvmalloc_array(MLX5E_INDIR_RQT_SIZE, sizeof(*rss_rqns), GFP_KERNEL);
+ rss_rqns = kvmalloc_array(indir->actual_table_size, sizeof(*rss_rqns), GFP_KERNEL);
if (!rss_rqns)
return -ENOMEM;
@@ -162,7 +174,7 @@ int mlx5e_rqt_redirect_indir(struct mlx5e_rqt *rqt, u32 *rqns, unsigned int num_
if (err)
goto out;
- err = mlx5e_rqt_redirect(rqt, rss_rqns, MLX5E_INDIR_RQT_SIZE);
+ err = mlx5e_rqt_redirect(rqt, rss_rqns, indir->actual_table_size);
out:
kvfree(rss_rqns);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rqt.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rqt.h
index 60c985a12f24..77fba3ebd18d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rqt.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rqt.h
@@ -6,12 +6,14 @@
#include <linux/kernel.h>
-#define MLX5E_INDIR_RQT_SIZE (1 << 8)
+#define MLX5E_INDIR_MIN_RQT_SIZE (BIT(8))
struct mlx5_core_dev;
struct mlx5e_rss_params_indir {
- u32 table[MLX5E_INDIR_RQT_SIZE];
+ u32 *table;
+ u32 actual_table_size;
+ u32 max_table_size;
};
void mlx5e_rss_params_indir_init_uniform(struct mlx5e_rss_params_indir *indir,
@@ -24,7 +26,7 @@ struct mlx5e_rqt {
};
int mlx5e_rqt_init_direct(struct mlx5e_rqt *rqt, struct mlx5_core_dev *mdev,
- bool indir_enabled, u32 init_rqn);
+ bool indir_enabled, u32 init_rqn, u32 indir_table_size);
int mlx5e_rqt_init_indir(struct mlx5e_rqt *rqt, struct mlx5_core_dev *mdev,
u32 *rqns, unsigned int num_rqns,
u8 hfunc, struct mlx5e_rss_params_indir *indir);
@@ -35,6 +37,7 @@ static inline u32 mlx5e_rqt_get_rqtn(struct mlx5e_rqt *rqt)
return rqt->rqtn;
}
+u32 mlx5e_rqt_size(struct mlx5_core_dev *mdev, unsigned int num_channels);
int mlx5e_rqt_redirect_direct(struct mlx5e_rqt *rqt, u32 rqn);
int mlx5e_rqt_redirect_indir(struct mlx5e_rqt *rqt, u32 *rqns, unsigned int num_rqns,
u8 hfunc, struct mlx5e_rss_params_indir *indir);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
index 654b5c45e4a5..c1545a2e8d6d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.c
@@ -81,14 +81,75 @@ struct mlx5e_rss {
refcount_t refcnt;
};
-static struct mlx5e_rss *mlx5e_rss_alloc(void)
+void mlx5e_rss_params_indir_modify_actual_size(struct mlx5e_rss *rss, u32 num_channels)
{
- return kvzalloc(sizeof(struct mlx5e_rss), GFP_KERNEL);
+ rss->indir.actual_table_size = mlx5e_rqt_size(rss->mdev, num_channels);
}
-static void mlx5e_rss_free(struct mlx5e_rss *rss)
+int mlx5e_rss_params_indir_init(struct mlx5e_rss_params_indir *indir, struct mlx5_core_dev *mdev,
+ u32 actual_table_size, u32 max_table_size)
{
+ indir->table = kvmalloc_array(max_table_size, sizeof(*indir->table), GFP_KERNEL);
+ if (!indir->table)
+ return -ENOMEM;
+
+ indir->max_table_size = max_table_size;
+ indir->actual_table_size = actual_table_size;
+
+ return 0;
+}
+
+void mlx5e_rss_params_indir_cleanup(struct mlx5e_rss_params_indir *indir)
+{
+ kvfree(indir->table);
+}
+
+static int mlx5e_rss_copy(struct mlx5e_rss *to, const struct mlx5e_rss *from)
+{
+ u32 *dst_indir_table;
+
+ if (to->indir.actual_table_size != from->indir.actual_table_size ||
+ to->indir.max_table_size != from->indir.max_table_size) {
+ mlx5e_rss_warn(to->mdev,
+ "Failed to copy RSS due to size mismatch, src (actual %u, max %u) != dst (actual %u, max %u)\n",
+ from->indir.actual_table_size, from->indir.max_table_size,
+ to->indir.actual_table_size, to->indir.max_table_size);
+ return -EINVAL;
+ }
+
+ dst_indir_table = to->indir.table;
+ *to = *from;
+ to->indir.table = dst_indir_table;
+ memcpy(to->indir.table, from->indir.table,
+ from->indir.actual_table_size * sizeof(*from->indir.table));
+ return 0;
+}
+
+static struct mlx5e_rss *mlx5e_rss_init_copy(const struct mlx5e_rss *from)
+{
+ struct mlx5e_rss *rss;
+ int err;
+
+ rss = kvzalloc(sizeof(*rss), GFP_KERNEL);
+ if (!rss)
+ return ERR_PTR(-ENOMEM);
+
+ err = mlx5e_rss_params_indir_init(&rss->indir, from->mdev, from->indir.actual_table_size,
+ from->indir.max_table_size);
+ if (err)
+ goto err_free_rss;
+
+ err = mlx5e_rss_copy(rss, from);
+ if (err)
+ goto err_free_indir;
+
+ return rss;
+
+err_free_indir:
+ mlx5e_rss_params_indir_cleanup(&rss->indir);
+err_free_rss:
kvfree(rss);
+ return ERR_PTR(err);
}
static void mlx5e_rss_params_init(struct mlx5e_rss *rss)
@@ -287,27 +348,35 @@ static int mlx5e_rss_init_no_tirs(struct mlx5e_rss *rss)
mlx5e_rss_params_init(rss);
refcount_set(&rss->refcnt, 1);
- return mlx5e_rqt_init_direct(&rss->rqt, rss->mdev, true, rss->drop_rqn);
+ return mlx5e_rqt_init_direct(&rss->rqt, rss->mdev, true,
+ rss->drop_rqn, rss->indir.max_table_size);
}
struct mlx5e_rss *mlx5e_rss_init(struct mlx5_core_dev *mdev, bool inner_ft_support, u32 drop_rqn,
const struct mlx5e_packet_merge_param *init_pkt_merge_param,
- enum mlx5e_rss_init_type type)
+ enum mlx5e_rss_init_type type, unsigned int nch,
+ unsigned int max_nch)
{
struct mlx5e_rss *rss;
int err;
- rss = mlx5e_rss_alloc();
+ rss = kvzalloc(sizeof(*rss), GFP_KERNEL);
if (!rss)
return ERR_PTR(-ENOMEM);
+ err = mlx5e_rss_params_indir_init(&rss->indir, mdev,
+ mlx5e_rqt_size(mdev, nch),
+ mlx5e_rqt_size(mdev, max_nch));
+ if (err)
+ goto err_free_rss;
+
rss->mdev = mdev;
rss->inner_ft_support = inner_ft_support;
rss->drop_rqn = drop_rqn;
err = mlx5e_rss_init_no_tirs(rss);
if (err)
- goto err_free_rss;
+ goto err_free_indir;
if (type == MLX5E_RSS_INIT_NO_TIRS)
goto out;
@@ -329,8 +398,10 @@ struct mlx5e_rss *mlx5e_rss_init(struct mlx5_core_dev *mdev, bool inner_ft_suppo
mlx5e_rss_destroy_tirs(rss, false);
err_destroy_rqt:
mlx5e_rqt_destroy(&rss->rqt);
+err_free_indir:
+ mlx5e_rss_params_indir_cleanup(&rss->indir);
err_free_rss:
- mlx5e_rss_free(rss);
+ kvfree(rss);
return ERR_PTR(err);
}
@@ -345,7 +416,8 @@ int mlx5e_rss_cleanup(struct mlx5e_rss *rss)
mlx5e_rss_destroy_tirs(rss, true);
mlx5e_rqt_destroy(&rss->rqt);
- mlx5e_rss_free(rss);
+ mlx5e_rss_params_indir_cleanup(&rss->indir);
+ kvfree(rss);
return 0;
}
@@ -482,7 +554,7 @@ int mlx5e_rss_get_rxfh(struct mlx5e_rss *rss, u32 *indir, u8 *key, u8 *hfunc)
{
if (indir)
memcpy(indir, rss->indir.table,
- MLX5E_INDIR_RQT_SIZE * sizeof(*rss->indir.table));
+ rss->indir.actual_table_size * sizeof(*rss->indir.table));
if (key)
memcpy(key, rss->hash.toeplitz_hash_key,
@@ -503,11 +575,9 @@ int mlx5e_rss_set_rxfh(struct mlx5e_rss *rss, const u32 *indir,
struct mlx5e_rss *old_rss;
int err = 0;
- old_rss = mlx5e_rss_alloc();
- if (!old_rss)
- return -ENOMEM;
-
- *old_rss = *rss;
+ old_rss = mlx5e_rss_init_copy(rss);
+ if (IS_ERR(old_rss))
+ return PTR_ERR(old_rss);
if (hfunc && *hfunc != rss->hash.hfunc) {
switch (*hfunc) {
@@ -534,13 +604,13 @@ int mlx5e_rss_set_rxfh(struct mlx5e_rss *rss, const u32 *indir,
changed_indir = true;
memcpy(rss->indir.table, indir,
- MLX5E_INDIR_RQT_SIZE * sizeof(*rss->indir.table));
+ rss->indir.actual_table_size * sizeof(*rss->indir.table));
}
if (changed_indir && rss->enabled) {
err = mlx5e_rss_apply(rss, rqns, num_rqns);
if (err) {
- *rss = *old_rss;
+ mlx5e_rss_copy(rss, old_rss);
goto out;
}
}
@@ -549,7 +619,9 @@ int mlx5e_rss_set_rxfh(struct mlx5e_rss *rss, const u32 *indir,
mlx5e_rss_update_tirs(rss);
out:
- mlx5e_rss_free(old_rss);
+ mlx5e_rss_params_indir_cleanup(&old_rss->indir);
+ kvfree(old_rss);
+
return err;
}
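The mlx5e_rss_set_rxfh() hunk above follows a snapshot-and-rollback pattern: copy the current RSS state, apply the new indirection table, and restore the copy if applying fails. A minimal userspace sketch of that pattern (the struct and function names here are illustrative, not the driver's):

```c
#include <string.h>

/* Hypothetical minimal state, just to illustrate the pattern. */
struct cfg {
	unsigned int table[4];
};

/* Stand-in for mlx5e_rss_apply(); fails on demand so the rollback
 * path can be exercised. */
static int apply_cfg(struct cfg *c, int should_fail)
{
	(void)c;
	return should_fail ? -1 : 0;
}

static int set_cfg(struct cfg *c, const unsigned int *indir, int should_fail)
{
	struct cfg old = *c;		/* snapshot, like mlx5e_rss_init_copy() */

	memcpy(c->table, indir, sizeof(c->table));
	if (apply_cfg(c, should_fail)) {
		*c = old;		/* rollback, like mlx5e_rss_copy(rss, old_rss) */
		return -1;
	}
	return 0;
}
```

On failure the caller sees the old table untouched, which is exactly what the `mlx5e_rss_copy(rss, old_rss)` line in the hunk provides.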
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
index 8b2290727641..d1d0bc350e92 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rss.h
@@ -18,9 +18,14 @@ mlx5e_rss_get_default_tt_config(enum mlx5_traffic_types tt);
struct mlx5e_rss;
+int mlx5e_rss_params_indir_init(struct mlx5e_rss_params_indir *indir, struct mlx5_core_dev *mdev,
+ u32 actual_table_size, u32 max_table_size);
+void mlx5e_rss_params_indir_cleanup(struct mlx5e_rss_params_indir *indir);
+void mlx5e_rss_params_indir_modify_actual_size(struct mlx5e_rss *rss, u32 num_channels);
struct mlx5e_rss *mlx5e_rss_init(struct mlx5_core_dev *mdev, bool inner_ft_support, u32 drop_rqn,
const struct mlx5e_packet_merge_param *init_pkt_merge_param,
- enum mlx5e_rss_init_type type);
+ enum mlx5e_rss_init_type type, unsigned int nch,
+ unsigned int max_nch);
int mlx5e_rss_cleanup(struct mlx5e_rss *rss);
void mlx5e_rss_refcnt_inc(struct mlx5e_rss *rss);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
index 81b13338bfdf..b23e224e3763 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.c
@@ -18,7 +18,7 @@ struct mlx5e_rx_res {
struct mlx5e_rss *rss[MLX5E_MAX_NUM_RSS];
bool rss_active;
- u32 rss_rqns[MLX5E_INDIR_RQT_SIZE];
+ u32 *rss_rqns;
unsigned int rss_nch;
struct {
@@ -34,6 +34,16 @@ struct mlx5e_rx_res {
/* API for rx_res_rss_* */
+void mlx5e_rx_res_rss_update_num_channels(struct mlx5e_rx_res *res, u32 nch)
+{
+ int i;
+
+ for (i = 0; i < MLX5E_MAX_NUM_RSS; i++) {
+ if (res->rss[i])
+ mlx5e_rss_params_indir_modify_actual_size(res->rss[i], nch);
+ }
+}
+
static int mlx5e_rx_res_rss_init_def(struct mlx5e_rx_res *res,
unsigned int init_nch)
{
@@ -44,7 +54,7 @@ static int mlx5e_rx_res_rss_init_def(struct mlx5e_rx_res *res,
return -EINVAL;
rss = mlx5e_rss_init(res->mdev, inner_ft_support, res->drop_rqn,
- &res->pkt_merge_param, MLX5E_RSS_INIT_TIRS);
+ &res->pkt_merge_param, MLX5E_RSS_INIT_TIRS, init_nch, res->max_nch);
if (IS_ERR(rss))
return PTR_ERR(rss);
@@ -69,7 +79,8 @@ int mlx5e_rx_res_rss_init(struct mlx5e_rx_res *res, u32 *rss_idx, unsigned int i
return -ENOSPC;
rss = mlx5e_rss_init(res->mdev, inner_ft_support, res->drop_rqn,
- &res->pkt_merge_param, MLX5E_RSS_INIT_NO_TIRS);
+ &res->pkt_merge_param, MLX5E_RSS_INIT_NO_TIRS, init_nch,
+ res->max_nch);
if (IS_ERR(rss))
return PTR_ERR(rss);
@@ -269,12 +280,25 @@ struct mlx5e_rss *mlx5e_rx_res_rss_get(struct mlx5e_rx_res *res, u32 rss_idx)
static void mlx5e_rx_res_free(struct mlx5e_rx_res *res)
{
+ kvfree(res->rss_rqns);
kvfree(res);
}
-static struct mlx5e_rx_res *mlx5e_rx_res_alloc(void)
+static struct mlx5e_rx_res *mlx5e_rx_res_alloc(struct mlx5_core_dev *mdev, unsigned int max_nch)
{
- return kvzalloc(sizeof(struct mlx5e_rx_res), GFP_KERNEL);
+ struct mlx5e_rx_res *rx_res;
+
+ rx_res = kvzalloc(sizeof(*rx_res), GFP_KERNEL);
+ if (!rx_res)
+ return NULL;
+
+ rx_res->rss_rqns = kvcalloc(max_nch, sizeof(*rx_res->rss_rqns), GFP_KERNEL);
+ if (!rx_res->rss_rqns) {
+ kvfree(rx_res);
+ return NULL;
+ }
+
+ return rx_res;
}
static int mlx5e_rx_res_channels_init(struct mlx5e_rx_res *res)
@@ -296,7 +320,8 @@ static int mlx5e_rx_res_channels_init(struct mlx5e_rx_res *res)
for (ix = 0; ix < res->max_nch; ix++) {
err = mlx5e_rqt_init_direct(&res->channels[ix].direct_rqt,
- res->mdev, false, res->drop_rqn);
+ res->mdev, false, res->drop_rqn,
+ mlx5e_rqt_size(res->mdev, res->max_nch));
if (err) {
mlx5_core_warn(res->mdev, "Failed to create a direct RQT: err = %d, ix = %u\n",
err, ix);
@@ -350,7 +375,8 @@ static int mlx5e_rx_res_ptp_init(struct mlx5e_rx_res *res)
if (!builder)
return -ENOMEM;
- err = mlx5e_rqt_init_direct(&res->ptp.rqt, res->mdev, false, res->drop_rqn);
+ err = mlx5e_rqt_init_direct(&res->ptp.rqt, res->mdev, false, res->drop_rqn,
+ mlx5e_rqt_size(res->mdev, res->max_nch));
if (err)
goto out;
@@ -401,7 +427,7 @@ mlx5e_rx_res_create(struct mlx5_core_dev *mdev, enum mlx5e_rx_res_features featu
struct mlx5e_rx_res *res;
int err;
- res = mlx5e_rx_res_alloc();
+ res = mlx5e_rx_res_alloc(mdev, max_nch);
if (!res)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
index 4fa8366db31f..82aaba8a82b3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rx_res.h
@@ -59,6 +59,7 @@ int mlx5e_rx_res_rss_destroy(struct mlx5e_rx_res *res, u32 rss_idx);
int mlx5e_rx_res_rss_cnt(struct mlx5e_rx_res *res);
int mlx5e_rx_res_rss_index(struct mlx5e_rx_res *res, struct mlx5e_rss *rss);
struct mlx5e_rss *mlx5e_rx_res_rss_get(struct mlx5e_rx_res *res, u32 rss_idx);
+void mlx5e_rx_res_rss_update_num_channels(struct mlx5e_rx_res *res, u32 nch);
/* Workaround for hairpin */
struct mlx5e_rss_params_hash mlx5e_rx_res_get_current_hash(struct mlx5e_rx_res *res);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index dff02434ff45..215261a69255 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -1247,7 +1247,7 @@ static u32 mlx5e_get_rxfh_key_size(struct net_device *netdev)
u32 mlx5e_ethtool_get_rxfh_indir_size(struct mlx5e_priv *priv)
{
- return MLX5E_INDIR_RQT_SIZE;
+ return mlx5e_rqt_size(priv->mdev, priv->channels.params.num_channels);
}
static u32 mlx5e_get_rxfh_indir_size(struct net_device *netdev)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 4230990b74d2..bbb1d37a8401 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2885,8 +2885,12 @@ static int mlx5e_num_channels_changed(struct mlx5e_priv *priv)
mlx5e_set_default_xps_cpumasks(priv, &priv->channels.params);
/* This function may be called on attach, before priv->rx_res is created. */
- if (!netif_is_rxfh_configured(priv->netdev) && priv->rx_res)
- mlx5e_rx_res_rss_set_indir_uniform(priv->rx_res, count);
+ if (priv->rx_res) {
+ mlx5e_rx_res_rss_update_num_channels(priv->rx_res, count);
+
+ if (!netif_is_rxfh_configured(priv->netdev))
+ mlx5e_rx_res_rss_set_indir_uniform(priv->rx_res, count);
+ }
return 0;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index c24828b688ac..4e09ebcfff3b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -753,19 +753,21 @@ static int mlx5e_hairpin_create_indirect_rqt(struct mlx5e_hairpin *hp)
{
struct mlx5e_priv *priv = hp->func_priv;
struct mlx5_core_dev *mdev = priv->mdev;
- struct mlx5e_rss_params_indir *indir;
+ struct mlx5e_rss_params_indir indir;
int err;
- indir = kvmalloc(sizeof(*indir), GFP_KERNEL);
- if (!indir)
- return -ENOMEM;
+ err = mlx5e_rss_params_indir_init(&indir, mdev,
+ mlx5e_rqt_size(mdev, hp->num_channels),
+ mlx5e_rqt_size(mdev, priv->max_nch));
+ if (err)
+ return err;
- mlx5e_rss_params_indir_init_uniform(indir, hp->num_channels);
+ mlx5e_rss_params_indir_init_uniform(&indir, hp->num_channels);
err = mlx5e_rqt_init_indir(&hp->indir_rqt, mdev, hp->pair->rqn, hp->num_channels,
mlx5e_rx_res_get_current_hash(priv->rx_res).hfunc,
- indir);
+ &indir);
- kvfree(indir);
+ mlx5e_rss_params_indir_cleanup(&indir);
return err;
}
--
2.41.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [net-next V2 14/15] net/mlx5e: Increase max supported channels number to 256
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (12 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 13/15] net/mlx5e: Preparations for supporting larger number of channels Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:35 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 15/15] net/mlx5e: Allow IPsec soft/hard limits in bytes Saeed Mahameed
2023-10-14 1:10 ` [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Jakub Kicinski
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris
From: Adham Faris <afaris@nvidia.com>
Increase the max supported channels number to 256 (it is not extended
further due to testing limitations).
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 2130cba548b5..e789f0586c3d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -141,7 +141,7 @@ struct page_pool;
#define MLX5E_PARAMS_DEFAULT_MIN_RX_WQES_MPW 0x2
#define MLX5E_MIN_NUM_CHANNELS 0x1
-#define MLX5E_MAX_NUM_CHANNELS 128
+#define MLX5E_MAX_NUM_CHANNELS 256
#define MLX5E_TX_CQ_POLL_BUDGET 128
#define MLX5E_TX_XSK_POLL_BUDGET 64
#define MLX5E_SQ_RECOVER_MIN_INTERVAL 500 /* msecs */
--
2.41.0
* [net-next V2 15/15] net/mlx5e: Allow IPsec soft/hard limits in bytes
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (13 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 14/15] net/mlx5e: Increase max supported channels number to 256 Saeed Mahameed
@ 2023-10-12 19:27 ` Saeed Mahameed
2023-10-12 21:37 ` Jacob Keller
2023-10-14 1:10 ` [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Jakub Kicinski
15 siblings, 1 reply; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 19:27 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
Patrisious Haddad
From: Leon Romanovsky <leonro@nvidia.com>
The mlx5 code actually already has the needed support to allow users
to configure soft/hard limits in bytes. This is possible due to the
situation on the TX path, where CX7 devices are missing the hardware
implementation to send events to the software; see commit b2f7b01d36a9
("net/mlx5e: Simulate missing IPsec TX limits hardware functionality").
That software workaround is not limited to TX and works for bytes too.
So relax the validation logic to not block soft/hard limits in bytes.
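The relaxed check in the diff below accepts finite byte limits as long as the soft limit is below a finite hard limit and neither is zero. A standalone sketch of that logic (the function name is illustrative, not the driver's; XFRM_INF is the "no limit" sentinel from the XFRM uapi):

```c
#include <stdint.h>
#include <stdbool.h>

#define XFRM_INF UINT64_MAX	/* ~(__u64)0, as in include/uapi/linux/xfrm.h */

static bool byte_limits_valid(uint64_t soft, uint64_t hard)
{
	/* The XFRM stack doesn't prevent soft >= hard, so reject it here;
	 * a fully unlimited hard limit is still fine. */
	if (soft >= hard && hard != XFRM_INF)
		return false;
	/* Zero limits make no sense: the SA would expire immediately. */
	if (!soft || !hard)
		return false;
	return true;
}
```

Two infinite limits pass (nothing to enforce), mirroring the driver's behavior of not even allocating the delayed work in that case.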
Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../mellanox/mlx5/core/en_accel/ipsec.c | 23 +++++++++++-------
.../mellanox/mlx5/core/en_accel/ipsec_fs.c | 24 +++++++++++--------
2 files changed, 28 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
index 7d4ceb9b9c16..257c41870f78 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
@@ -56,7 +56,7 @@ static struct mlx5e_ipsec_pol_entry *to_ipsec_pol_entry(struct xfrm_policy *x)
return (struct mlx5e_ipsec_pol_entry *)x->xdo.offload_handle;
}
-static void mlx5e_ipsec_handle_tx_limit(struct work_struct *_work)
+static void mlx5e_ipsec_handle_sw_limits(struct work_struct *_work)
{
struct mlx5e_ipsec_dwork *dwork =
container_of(_work, struct mlx5e_ipsec_dwork, dwork.work);
@@ -486,9 +486,15 @@ static int mlx5e_xfrm_validate_state(struct mlx5_core_dev *mdev,
return -EINVAL;
}
- if (x->lft.hard_byte_limit != XFRM_INF ||
- x->lft.soft_byte_limit != XFRM_INF) {
- NL_SET_ERR_MSG_MOD(extack, "Device doesn't support limits in bytes");
+ if (x->lft.soft_byte_limit >= x->lft.hard_byte_limit &&
+ x->lft.hard_byte_limit != XFRM_INF) {
+ /* XFRM stack doesn't prevent such configuration :(. */
+ NL_SET_ERR_MSG_MOD(extack, "Hard byte limit must be greater than soft one");
+ return -EINVAL;
+ }
+
+ if (!x->lft.soft_byte_limit || !x->lft.hard_byte_limit) {
+ NL_SET_ERR_MSG_MOD(extack, "Soft/hard byte limits can't be 0");
return -EINVAL;
}
@@ -624,11 +630,10 @@ static int mlx5e_ipsec_create_dwork(struct mlx5e_ipsec_sa_entry *sa_entry)
if (x->xso.type != XFRM_DEV_OFFLOAD_PACKET)
return 0;
- if (x->xso.dir != XFRM_DEV_OFFLOAD_OUT)
- return 0;
-
if (x->lft.soft_packet_limit == XFRM_INF &&
- x->lft.hard_packet_limit == XFRM_INF)
+ x->lft.hard_packet_limit == XFRM_INF &&
+ x->lft.soft_byte_limit == XFRM_INF &&
+ x->lft.hard_byte_limit == XFRM_INF)
return 0;
dwork = kzalloc(sizeof(*dwork), GFP_KERNEL);
@@ -636,7 +641,7 @@ static int mlx5e_ipsec_create_dwork(struct mlx5e_ipsec_sa_entry *sa_entry)
return -ENOMEM;
dwork->sa_entry = sa_entry;
- INIT_DELAYED_WORK(&dwork->dwork, mlx5e_ipsec_handle_tx_limit);
+ INIT_DELAYED_WORK(&dwork->dwork, mlx5e_ipsec_handle_sw_limits);
sa_entry->dwork = dwork;
return 0;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
index 7dba4221993f..eda1cb528deb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
@@ -1249,15 +1249,17 @@ static int rx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
setup_fte_no_frags(spec);
setup_fte_upper_proto_match(spec, &attrs->upspec);
- if (rx != ipsec->rx_esw)
- err = setup_modify_header(ipsec, attrs->type,
- sa_entry->ipsec_obj_id | BIT(31),
- XFRM_DEV_OFFLOAD_IN, &flow_act);
- else
- err = mlx5_esw_ipsec_rx_setup_modify_header(sa_entry, &flow_act);
+ if (!attrs->drop) {
+ if (rx != ipsec->rx_esw)
+ err = setup_modify_header(ipsec, attrs->type,
+ sa_entry->ipsec_obj_id | BIT(31),
+ XFRM_DEV_OFFLOAD_IN, &flow_act);
+ else
+ err = mlx5_esw_ipsec_rx_setup_modify_header(sa_entry, &flow_act);
- if (err)
- goto err_mod_header;
+ if (err)
+ goto err_mod_header;
+ }
switch (attrs->type) {
case XFRM_DEV_OFFLOAD_PACKET:
@@ -1307,7 +1309,8 @@ static int rx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
if (flow_act.pkt_reformat)
mlx5_packet_reformat_dealloc(mdev, flow_act.pkt_reformat);
err_pkt_reformat:
- mlx5_modify_header_dealloc(mdev, flow_act.modify_hdr);
+ if (flow_act.modify_hdr)
+ mlx5_modify_header_dealloc(mdev, flow_act.modify_hdr);
err_mod_header:
kvfree(spec);
err_alloc:
@@ -1805,7 +1808,8 @@ void mlx5e_accel_ipsec_fs_del_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
return;
}
- mlx5_modify_header_dealloc(mdev, ipsec_rule->modify_hdr);
+ if (ipsec_rule->modify_hdr)
+ mlx5_modify_header_dealloc(mdev, ipsec_rule->modify_hdr);
mlx5_esw_ipsec_rx_id_mapping_remove(sa_entry);
rx_ft_put(sa_entry->ipsec, sa_entry->attrs.family, sa_entry->attrs.type);
}
--
2.41.0
* Re: [net-next V2 01/15] net/mlx5: Parallelize vhca event handling
2023-10-12 19:27 ` [net-next V2 01/15] net/mlx5: Parallelize vhca event handling Saeed Mahameed
@ 2023-10-12 21:13 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:13 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Wei Zhang, Moshe Shemesh,
Shay Drory
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Wei Zhang <weizhang@nvidia.com>
>
> At present, mlx5 driver have a general purpose
> event handler which not only handles vhca event
> but also many other events. This incurs a huge
> bottleneck because the event handler is
> implemented by single threaded workqueue and all
> events are forced to be handled in serial manner
> even though application tries to create multiple
> SFs simultaneously.
>
> Introduce a dedicated vhca event handler which
> manages SFs parallel creation.
>
> Signed-off-by: Wei Zhang <weizhang@nvidia.com>
> Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
> Reviewed-by: Shay Drory <shayd@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
> .../net/ethernet/mellanox/mlx5/core/events.c | 5 --
> .../ethernet/mellanox/mlx5/core/mlx5_core.h | 3 +-
> .../mellanox/mlx5/core/sf/vhca_event.c | 57 ++++++++++++++++++-
> include/linux/mlx5/driver.h | 1 +
> 4 files changed, 57 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/events.c b/drivers/net/ethernet/mellanox/mlx5/core/events.c
> index 3ec892d51f57..d91ea53eb394 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/events.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/events.c
> @@ -441,8 +441,3 @@ int mlx5_blocking_notifier_call_chain(struct mlx5_core_dev *dev, unsigned int ev
>
> return blocking_notifier_call_chain(&events->sw_nh, event, data);
> }
> -
> -void mlx5_events_work_enqueue(struct mlx5_core_dev *dev, struct work_struct *work)
> -{
> - queue_work(dev->priv.events->wq, work);
> -}
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
> index 124352459c23..94f809f52f27 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
> @@ -143,6 +143,8 @@ enum mlx5_semaphore_space_address {
>
> #define MLX5_DEFAULT_PROF 2
> #define MLX5_SF_PROF 3
> +#define MLX5_NUM_FW_CMD_THREADS 8
> +#define MLX5_DEV_MAX_WQS MLX5_NUM_FW_CMD_THREADS
>
> static inline int mlx5_flexible_inlen(struct mlx5_core_dev *dev, size_t fixed,
> size_t item_size, size_t num_items,
> @@ -331,7 +333,6 @@ int mlx5_vport_set_other_func_cap(struct mlx5_core_dev *dev, const void *hca_cap
> #define mlx5_vport_get_other_func_general_cap(dev, vport, out) \
> mlx5_vport_get_other_func_cap(dev, vport, out, MLX5_CAP_GENERAL)
>
> -void mlx5_events_work_enqueue(struct mlx5_core_dev *dev, struct work_struct *work);
> static inline u32 mlx5_sriov_get_vf_total_msix(struct pci_dev *pdev)
> {
> struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
> index d908fba968f0..c6fd729de8b2 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
> @@ -21,6 +21,15 @@ struct mlx5_vhca_event_work {
> struct mlx5_vhca_state_event event;
> };
>
> +struct mlx5_vhca_event_handler {
> + struct workqueue_struct *wq;
> +};
> +
> +struct mlx5_vhca_events {
> + struct mlx5_core_dev *dev;
> + struct mlx5_vhca_event_handler handler[MLX5_DEV_MAX_WQS];
> +};
> +
> int mlx5_cmd_query_vhca_state(struct mlx5_core_dev *dev, u16 function_id, u32 *out, u32 outlen)
> {
> u32 in[MLX5_ST_SZ_DW(query_vhca_state_in)] = {};
> @@ -99,6 +108,12 @@ static void mlx5_vhca_state_work_handler(struct work_struct *_work)
> kfree(work);
> }
>
> +static void
> +mlx5_vhca_events_work_enqueue(struct mlx5_core_dev *dev, int idx, struct work_struct *work)
> +{
> + queue_work(dev->priv.vhca_events->handler[idx].wq, work);
> +}
I guess you need separate single-threaded work queues because each
sequence of vhca work items on a given index needs to be serialized, but
you don't need to serialize between different vhca indices? Makes sense
enough.
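The dispatch rule being described boils down to hashing the function id onto one of the per-index single-threaded workqueues, as patch 02 in this series does with `wq_idx = work_ctx->event.function_id % MLX5_DEV_MAX_WQS`. A tiny sketch of that rule (the function name is illustrative):

```c
/* Events for the same function_id always hash to the same
 * single-threaded workqueue, so they stay ordered relative to each
 * other, while events for different function_ids may land on different
 * queues and run in parallel. */
#define MLX5_DEV_MAX_WQS 8	/* MLX5_NUM_FW_CMD_THREADS in the patch */

static int vhca_wq_index(unsigned int function_id)
{
	return function_id % MLX5_DEV_MAX_WQS;
}
```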
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [net-next V2 02/15] net/mlx5: Redesign SF active work to remove table_lock
2023-10-12 19:27 ` [net-next V2 02/15] net/mlx5: Redesign SF active work to remove table_lock Saeed Mahameed
@ 2023-10-12 21:18 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:18 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Wei Zhang, Shay Drory
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Wei Zhang <weizhang@nvidia.com>
>
> active_work is a work that iterates over all
> possible SF devices which their SF port
> representors are located on different function,
> and in case SF is in active state, probes it.
> Currently, the active_work in active_wq is
> synced with mlx5_vhca_events_work via table_lock
> and this lock causing a bottleneck in performance.
>
> To remove table_lock, redesign active_wq logic
> so that it now pushes active_work per SF to
> mlx5_vhca_events_workqueues. Since the latter
> workqueues are ordered, active_work and
> mlx5_vhca_events_work with same index will be
> pushed into same workqueue, thus it completely
> eliminates the need for a lock.
>
Ahh, here's the reason you need single-thread work queues.
Minor nit with using _work instead of keeping the work variable, but not a
big deal:
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Wei Zhang <weizhang@nvidia.com>
> Signed-off-by: Shay Drory <shayd@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
> .../ethernet/mellanox/mlx5/core/sf/dev/dev.c | 86 ++++++++++++-------
> .../mellanox/mlx5/core/sf/vhca_event.c | 16 +++-
> .../mellanox/mlx5/core/sf/vhca_event.h | 3 +
> 3 files changed, 74 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
> index 0f9b280514b8..c93492b67788 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
> @@ -17,13 +17,19 @@ struct mlx5_sf_dev_table {
> phys_addr_t base_address;
> u64 sf_bar_length;
> struct notifier_block nb;
> - struct mutex table_lock; /* Serializes sf life cycle and vhca state change handler */
> struct workqueue_struct *active_wq;
> struct work_struct work;
> u8 stop_active_wq:1;
> struct mlx5_core_dev *dev;
> };
>
> +struct mlx5_sf_dev_active_work_ctx {
> + struct work_struct work;
> + struct mlx5_vhca_state_event event;
> + struct mlx5_sf_dev_table *table;
> + int sf_index;
> +};
> +
> static bool mlx5_sf_dev_supported(const struct mlx5_core_dev *dev)
> {
> return MLX5_CAP_GEN(dev, sf) && mlx5_vhca_event_supported(dev);
> @@ -165,7 +171,6 @@ mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_
> return 0;
>
> sf_index = event->function_id - base_id;
> - mutex_lock(&table->table_lock);
> sf_dev = xa_load(&table->devices, sf_index);
> switch (event->new_vhca_state) {
> case MLX5_VHCA_STATE_INVALID:
> @@ -189,7 +194,6 @@ mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_
> default:
> break;
> }
> - mutex_unlock(&table->table_lock);
> return 0;
> }
>
> @@ -214,15 +218,44 @@ static int mlx5_sf_dev_vhca_arm_all(struct mlx5_sf_dev_table *table)
> return 0;
> }
>
> -static void mlx5_sf_dev_add_active_work(struct work_struct *work)
> +static void mlx5_sf_dev_add_active_work(struct work_struct *_work)
What's with changing this to _work? Not sure it helps readability at all.
You don't use work in the function elsewhere except as part of
container_of, which was fine before. Either way, not a huge deal to me.
> {
> - struct mlx5_sf_dev_table *table = container_of(work, struct mlx5_sf_dev_table, work);
> + struct mlx5_sf_dev_active_work_ctx *work_ctx;
> +
> + work_ctx = container_of(_work, struct mlx5_sf_dev_active_work_ctx, work);
> + if (work_ctx->table->stop_active_wq)
> + goto out;
> + /* Don't probe device which is already probe */
> + if (!xa_load(&work_ctx->table->devices, work_ctx->sf_index))
> + mlx5_sf_dev_add(work_ctx->table->dev, work_ctx->sf_index,
> + work_ctx->event.function_id, work_ctx->event.sw_function_id);
> + /* There is a race where SF got inactive after the query
> + * above. e.g.: the query returns that the state of the
> + * SF is active, and after that the eswitch manager set it to
> + * inactive.
> + * This case cannot be managed in SW, since the probing of the
> + * SF is on one system, and the inactivation is on a different
> + * system.
> + * If the inactive is done after the SF perform init_hca(),
> + * the SF will fully probe and then removed. If it was
> + * done before init_hca(), the SF probe will fail.
> + */
This is a scary looking comment. :D
The result seems ok though: if you deactivate the SF before it starts
probing, it fails to probe and is thus never activated? Ok.
> +out:
> + kfree(work_ctx);
> +}
> +
> +/* In case SFs are generated externally, probe active SFs */
> +static void mlx5_sf_dev_queue_active_works(struct work_struct *_work)
> +{
> + struct mlx5_sf_dev_table *table = container_of(_work, struct mlx5_sf_dev_table, work);
> u32 out[MLX5_ST_SZ_DW(query_vhca_state_out)] = {};
> + struct mlx5_sf_dev_active_work_ctx *work_ctx;
> struct mlx5_core_dev *dev = table->dev;
> u16 max_functions;
> u16 function_id;
> u16 sw_func_id;
> int err = 0;
> + int wq_idx;
> u8 state;
> int i;
>
> @@ -242,27 +275,22 @@ static void mlx5_sf_dev_add_active_work(struct work_struct *work)
> continue;
>
> sw_func_id = MLX5_GET(query_vhca_state_out, out, vhca_state_context.sw_function_id);
> - mutex_lock(&table->table_lock);
> - /* Don't probe device which is already probe */
> - if (!xa_load(&table->devices, i))
> - mlx5_sf_dev_add(dev, i, function_id, sw_func_id);
> - /* There is a race where SF got inactive after the query
> - * above. e.g.: the query returns that the state of the
> - * SF is active, and after that the eswitch manager set it to
> - * inactive.
> - * This case cannot be managed in SW, since the probing of the
> - * SF is on one system, and the inactivation is on a different
> - * system.
> - * If the inactive is done after the SF perform init_hca(),
> - * the SF will fully probe and then removed. If it was
> - * done before init_hca(), the SF probe will fail.
> - */
Ahh, a pre-existing comment ok!
> - mutex_unlock(&table->table_lock);
> + work_ctx = kzalloc(sizeof(*work_ctx), GFP_KERNEL);
> + if (!work_ctx)
> + return;
> +
> + INIT_WORK(&work_ctx->work, &mlx5_sf_dev_add_active_work);
> + work_ctx->event.function_id = function_id;
> + work_ctx->event.sw_function_id = sw_func_id;
> + work_ctx->table = table;
> + work_ctx->sf_index = i;
> + wq_idx = work_ctx->event.function_id % MLX5_DEV_MAX_WQS;
> + mlx5_vhca_events_work_enqueue(dev, wq_idx, &work_ctx->work);
> }
> }
>
> /* In case SFs are generated externally, probe active SFs */
> -static int mlx5_sf_dev_queue_active_work(struct mlx5_sf_dev_table *table)
> +static int mlx5_sf_dev_create_active_works(struct mlx5_sf_dev_table *table)
> {
> if (MLX5_CAP_GEN(table->dev, eswitch_manager))
> return 0; /* the table is local */
> @@ -273,12 +301,12 @@ static int mlx5_sf_dev_queue_active_work(struct mlx5_sf_dev_table *table)
> table->active_wq = create_singlethread_workqueue("mlx5_active_sf");
> if (!table->active_wq)
> return -ENOMEM;
> - INIT_WORK(&table->work, &mlx5_sf_dev_add_active_work);
> + INIT_WORK(&table->work, &mlx5_sf_dev_queue_active_works);
> queue_work(table->active_wq, &table->work);
> return 0;
> }
>
> -static void mlx5_sf_dev_destroy_active_work(struct mlx5_sf_dev_table *table)
> +static void mlx5_sf_dev_destroy_active_works(struct mlx5_sf_dev_table *table)
> {
> if (table->active_wq) {
> table->stop_active_wq = true;
> @@ -305,14 +333,13 @@ void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev)
> table->sf_bar_length = 1 << (MLX5_CAP_GEN(dev, log_min_sf_size) + 12);
> table->base_address = pci_resource_start(dev->pdev, 2);
> xa_init(&table->devices);
> - mutex_init(&table->table_lock);
> dev->priv.sf_dev_table = table;
>
> err = mlx5_vhca_event_notifier_register(dev, &table->nb);
> if (err)
> goto vhca_err;
>
> - err = mlx5_sf_dev_queue_active_work(table);
> + err = mlx5_sf_dev_create_active_works(table);
> if (err)
> goto add_active_err;
>
> @@ -322,9 +349,10 @@ void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev)
> return;
>
> arm_err:
> - mlx5_sf_dev_destroy_active_work(table);
> + mlx5_sf_dev_destroy_active_works(table);
> add_active_err:
> mlx5_vhca_event_notifier_unregister(dev, &table->nb);
> + mlx5_vhca_event_work_queues_flush(dev);
> vhca_err:
> kfree(table);
> dev->priv.sf_dev_table = NULL;
> @@ -350,9 +378,9 @@ void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev)
> if (!table)
> return;
>
> - mlx5_sf_dev_destroy_active_work(table);
> + mlx5_sf_dev_destroy_active_works(table);
> mlx5_vhca_event_notifier_unregister(dev, &table->nb);
> - mutex_destroy(&table->table_lock);
> + mlx5_vhca_event_work_queues_flush(dev);
>
> /* Now that event handler is not running, it is safe to destroy
> * the sf device without race.
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
> index c6fd729de8b2..cda01ba441ae 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
> @@ -108,8 +108,7 @@ static void mlx5_vhca_state_work_handler(struct work_struct *_work)
> kfree(work);
> }
>
> -static void
> -mlx5_vhca_events_work_enqueue(struct mlx5_core_dev *dev, int idx, struct work_struct *work)
> +void mlx5_vhca_events_work_enqueue(struct mlx5_core_dev *dev, int idx, struct work_struct *work)
> {
> queue_work(dev->priv.vhca_events->handler[idx].wq, work);
> }
> @@ -191,6 +190,19 @@ int mlx5_vhca_event_init(struct mlx5_core_dev *dev)
> return err;
> }
>
> +void mlx5_vhca_event_work_queues_flush(struct mlx5_core_dev *dev)
> +{
> + struct mlx5_vhca_events *vhca_events;
> + int i;
> +
> + if (!mlx5_vhca_event_supported(dev))
> + return;
> +
> + vhca_events = dev->priv.vhca_events;
> + for (i = 0; i < MLX5_DEV_MAX_WQS; i++)
> + flush_workqueue(vhca_events->handler[i].wq);
> +}
> +
> void mlx5_vhca_event_cleanup(struct mlx5_core_dev *dev)
> {
> struct mlx5_vhca_events *vhca_events;
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h
> index 013cdfe90616..1725ba64f8af 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h
> @@ -28,6 +28,9 @@ int mlx5_modify_vhca_sw_id(struct mlx5_core_dev *dev, u16 function_id, u32 sw_fn
> int mlx5_vhca_event_arm(struct mlx5_core_dev *dev, u16 function_id);
> int mlx5_cmd_query_vhca_state(struct mlx5_core_dev *dev, u16 function_id,
> u32 *out, u32 outlen);
> +void mlx5_vhca_events_work_enqueue(struct mlx5_core_dev *dev, int idx, struct work_struct *work);
> +void mlx5_vhca_event_work_queues_flush(struct mlx5_core_dev *dev);
> +
> #else
>
> static inline void mlx5_vhca_state_cap_handle(struct mlx5_core_dev *dev, void *set_hca_cap)
* Re: [net-next V2 03/15] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key
2023-10-12 19:27 ` [net-next V2 03/15] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key Saeed Mahameed
@ 2023-10-12 21:23 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:23 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Shay Drory <shayd@nvidia.com>
>
> Downstream patch will add devcom component which will be locked in
> many places. This can lead to a false positive "possible circular
> locking dependency" warning by lockdep, on flows which lock more than
> one mlx5 devcom component, such as probing ETH aux device.
> Hence, add a lock_class_key per mlx5 device.
>
Right, because the default init_rwsem() uses a static lock class key per
call site, which would then be shared by every component. Makes sense.
I wondered if there was an init function that also handles registering
and initializing the key, but that doesn't appear to be something that
already exists, and there are plenty of examples doing it this way.
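The per-instance key pattern being discussed, distilled into a short kernel-code sketch (not standalone-runnable; the structure and function names are hypothetical, only the lockdep calls mirror the patch):

```c
/* init_rwsem() on its own gives every semaphore initialized at this call
 * site the same static lockdep class, so lockdep cannot tell two devcom
 * components apart. Registering a dynamic key per object gives each
 * instance its own class: */
struct my_comp {
	struct rw_semaphore sem;
	struct lock_class_key lock_key;	/* one key per instance */
};

static void my_comp_init(struct my_comp *c)
{
	init_rwsem(&c->sem);
	lockdep_register_key(&c->lock_key);
	lockdep_set_class(&c->sem, &c->lock_key);
}

static void my_comp_destroy(struct my_comp *c)
{
	lockdep_unregister_key(&c->lock_key);	/* must precede freeing */
}
```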
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Shay Drory <shayd@nvidia.com>
> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
> index 00e67910e3ee..89ac3209277e 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
> @@ -31,6 +31,7 @@ struct mlx5_devcom_comp {
> struct kref ref;
> bool ready;
> struct rw_semaphore sem;
> + struct lock_class_key lock_key;
> };
>
> struct mlx5_devcom_comp_dev {
> @@ -119,6 +120,8 @@ mlx5_devcom_comp_alloc(u64 id, u64 key, mlx5_devcom_event_handler_t handler)
> comp->key = key;
> comp->handler = handler;
> init_rwsem(&comp->sem);
> + lockdep_register_key(&comp->lock_key);
> + lockdep_set_class(&comp->sem, &comp->lock_key);
> kref_init(&comp->ref);
> INIT_LIST_HEAD(&comp->comp_dev_list_head);
>
> @@ -133,6 +136,7 @@ mlx5_devcom_comp_release(struct kref *ref)
> mutex_lock(&comp_list_lock);
> list_del(&comp->comp_list);
> mutex_unlock(&comp_list_lock);
> + lockdep_unregister_key(&comp->lock_key);
> kfree(comp);
> }
>
* Re: [net-next V2 04/15] net/mlx5: Refactor LAG peer device lookout bus logic to mlx5 devcom
2023-10-12 19:27 ` [net-next V2 04/15] net/mlx5: Refactor LAG peer device lookout bus logic to mlx5 devcom Saeed Mahameed
@ 2023-10-12 21:26 ` Jacob Keller
2023-10-12 21:46 ` Saeed Mahameed
0 siblings, 1 reply; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:26 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Shay Drory <shayd@nvidia.com>
>
> LAG peer device lookout bus logic required the usage of global lock,
> mlx5_intf_mutex.
> As part of the effort to remove this global lock, refactor LAG peer
> device lookout to use mlx5 devcom layer.
>
> Signed-off-by: Shay Drory <shayd@nvidia.com>
> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/dev.c | 68 -------------------
> .../net/ethernet/mellanox/mlx5/core/lag/lag.c | 12 ++--
> .../ethernet/mellanox/mlx5/core/lib/devcom.c | 14 ++++
> .../ethernet/mellanox/mlx5/core/lib/devcom.h | 4 ++
> .../net/ethernet/mellanox/mlx5/core/main.c | 25 +++++++
> .../ethernet/mellanox/mlx5/core/mlx5_core.h | 1 -
> include/linux/mlx5/driver.h | 1 +
> 7 files changed, 52 insertions(+), 73 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
> index 1fc03480c2ff..6e3a8c22881f 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
> @@ -566,74 +566,6 @@ bool mlx5_same_hw_devs(struct mlx5_core_dev *dev, struct mlx5_core_dev *peer_dev
> return (fsystem_guid && psystem_guid && fsystem_guid == psystem_guid);
> }
>
> -static u32 mlx5_gen_pci_id(const struct mlx5_core_dev *dev)
> -{
> - return (u32)((pci_domain_nr(dev->pdev->bus) << 16) |
> - (dev->pdev->bus->number << 8) |
> - PCI_SLOT(dev->pdev->devfn));
> -}
> -
> -static int _next_phys_dev(struct mlx5_core_dev *mdev,
> - const struct mlx5_core_dev *curr)
> -{
> - if (!mlx5_core_is_pf(mdev))
> - return 0;
> -
> - if (mdev == curr)
> - return 0;
> -
> - if (!mlx5_same_hw_devs(mdev, (struct mlx5_core_dev *)curr) &&
> - mlx5_gen_pci_id(mdev) != mlx5_gen_pci_id(curr))
> - return 0;
> -
> - return 1;
> -}
> -
> -static void *pci_get_other_drvdata(struct device *this, struct device *other)
> -{
> - if (this->driver != other->driver)
> - return NULL;
> -
> - return pci_get_drvdata(to_pci_dev(other));
> -}
> -
> -static int next_phys_dev_lag(struct device *dev, const void *data)
> -{
> - struct mlx5_core_dev *mdev, *this = (struct mlx5_core_dev *)data;
> -
> - mdev = pci_get_other_drvdata(this->device, dev);
> - if (!mdev)
> - return 0;
> -
> - if (!mlx5_lag_is_supported(mdev))
> - return 0;
> -
> - return _next_phys_dev(mdev, data);
> -}
> -
> -static struct mlx5_core_dev *mlx5_get_next_dev(struct mlx5_core_dev *dev,
> - int (*match)(struct device *dev, const void *data))
> -{
> - struct device *next;
> -
> - if (!mlx5_core_is_pf(dev))
> - return NULL;
> -
> - next = bus_find_device(&pci_bus_type, NULL, dev, match);
> - if (!next)
> - return NULL;
> -
> - put_device(next);
> - return pci_get_drvdata(to_pci_dev(next));
> -}
> -
> -/* Must be called with intf_mutex held */
> -struct mlx5_core_dev *mlx5_get_next_phys_dev_lag(struct mlx5_core_dev *dev)
> -{
> - lockdep_assert_held(&mlx5_intf_mutex);
The old flow had a lockdep_assert_held().
> - return mlx5_get_next_dev(dev, &next_phys_dev_lag);
> -}
> -
> void mlx5_dev_list_lock(void)
> {
> mutex_lock(&mlx5_intf_mutex);
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> index af3fac090b82..f0b57f97739f 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> @@ -1212,13 +1212,14 @@ static void mlx5_ldev_remove_mdev(struct mlx5_lag *ldev,
> dev->priv.lag = NULL;
> }
>
> -/* Must be called with intf_mutex held */
> +/* Must be called with HCA devcom component lock held */
> static int __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev)
> {
> + struct mlx5_devcom_comp_dev *pos = NULL;
> struct mlx5_lag *ldev = NULL;
> struct mlx5_core_dev *tmp_dev;
>
> - tmp_dev = mlx5_get_next_phys_dev_lag(dev);
> + tmp_dev = mlx5_devcom_get_next_peer_data(dev->priv.hca_devcom_comp, &pos);
> if (tmp_dev)
> ldev = mlx5_lag_dev(tmp_dev);
>
But you didn't bother to add one here? Does
mlx5_devcom_get_next_peer_data already do that?
Not a big deal either way to me.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [net-next V2 05/15] net/mlx5: Replace global mlx5_intf_lock with HCA devcom component lock
2023-10-12 19:27 ` [net-next V2 05/15] net/mlx5: Replace global mlx5_intf_lock with HCA devcom component lock Saeed Mahameed
@ 2023-10-12 21:28 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:28 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Shay Drory <shayd@nvidia.com>
>
> mlx5_intf_lock is used to sync between LAG changes and changes to the
> aux devices of its slave mlx5 core devs, which means that every time an
> mlx5 core dev adds/removes aux devices, mlx5 takes this global lock,
> even if LAG functionality isn't supported over the core dev.
> This causes a bottleneck when probing VFs/SFs in parallel.
>
> Hence, replace mlx5_intf_lock with HCA devcom component lock, or no
> lock if LAG functionality isn't supported.
>
> Signed-off-by: Shay Drory <shayd@nvidia.com>
> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [net-next V2 06/15] net/mlx5: Remove unused declaration
2023-10-12 19:27 ` [net-next V2 06/15] net/mlx5: Remove unused declaration Saeed Mahameed
@ 2023-10-12 21:29 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:29 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Yue Haibing,
Leon Romanovsky, Simon Horman
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Yue Haibing <yuehaibing@huawei.com>
>
> Commit 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path")
> declared mlx5e_ipsec_inverse_table_init() but never implemented it.
> Commit f52f2faee581 ("net/mlx5e: Introduce flow steering API")
> declared mlx5e_fs_set_tc() but never implemented it.
> Commit f2f3df550139 ("net/mlx5: EQ, Privatize eq_table and friends")
> declared mlx5_eq_comp_cpumask() but never implemented it.
> Commit cac1eb2cf2e3 ("net/mlx5: Lag, properly lock eswitch if needed")
> removed mlx5_lag_update() but not its declaration.
> Commit 35ba005d820b ("net/mlx5: DR, Set flex parser for TNL_MPLS dynamically")
> removed mlx5dr_ste_build_tnl_mpls() but not its declaration.
>
> Commit e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
> declared but never implemented mlx5_alloc_cmd_mailbox_chain() and mlx5_free_cmd_mailbox_chain().
> Commit 0cf53c124756 ("net/mlx5: FWPage, Use async events chain")
> removed mlx5_core_req_pages_handler() but not its declaration.
> Commit 938fe83c8dcb ("net/mlx5_core: New device capabilities handling")
> removed mlx5_query_odp_caps() but not its declaration.
> Commit f6a8a19bb11b ("RDMA/netdev: Hoist alloc_netdev_mqs out of the driver")
> removed mlx5_rdma_netdev_alloc() but not its declaration.
>
> Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> Reviewed-by: Simon Horman <horms@kernel.org>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [net-next V2 09/15] net/mlx5e: Use PTR_ERR_OR_ZERO() to simplify code
2023-10-12 19:27 ` [net-next V2 09/15] net/mlx5e: " Saeed Mahameed
@ 2023-10-12 21:30 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:30 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Yu Liao, Leon Romanovsky
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Yu Liao <liaoyu15@huawei.com>
>
> Use the standard error pointer macro to shorten the code and simplify.
>
> Signed-off-by: Yu Liao <liaoyu15@huawei.com>
> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [net-next V2 10/15] net/mlx5e: Refactor rx_res_init() and rx_res_free() APIs
2023-10-12 19:27 ` [net-next V2 10/15] net/mlx5e: Refactor rx_res_init() and rx_res_free() APIs Saeed Mahameed
@ 2023-10-12 21:31 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:31 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Adham Faris <afaris@nvidia.com>
>
> Refactor mlx5e_rx_res_init() and mlx5e_rx_res_free() by wrapping
> mlx5e_rx_res_alloc() and mlx5e_rx_res_destroy() API's respectively.
>
> Signed-off-by: Adham Faris <afaris@nvidia.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [net-next V2 11/15] net/mlx5e: Refactor mlx5e_rss_set_rxfh() and mlx5e_rss_get_rxfh()
2023-10-12 19:27 ` [net-next V2 11/15] net/mlx5e: Refactor mlx5e_rss_set_rxfh() and mlx5e_rss_get_rxfh() Saeed Mahameed
@ 2023-10-12 21:32 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:32 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Adham Faris <afaris@nvidia.com>
>
> Initialize indirect table array with memcpy rather than for loop.
> This change was made for two reasons:
> 1) To be consistent with the indirect table array init in
> mlx5e_rss_set_rxfh().
> 2) In general, prefer memcpy over a for loop for initializing an
> array.
>
> Signed-off-by: Adham Faris <afaris@nvidia.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [net-next V2 12/15] net/mlx5e: Refactor mlx5e_rss_init() and mlx5e_rss_free() API's
2023-10-12 19:27 ` [net-next V2 12/15] net/mlx5e: Refactor mlx5e_rss_init() and mlx5e_rss_free() API's Saeed Mahameed
@ 2023-10-12 21:33 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:33 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Adham Faris <afaris@nvidia.com>
>
> Introduce code refactoring below:
> 1) Introduce single API for creating and destroying rss object,
> mlx5e_rss_create() and mlx5e_rss_destroy() respectively.
> 2) mlx5e_rss_create() constructs and initializes the RSS object depending
> on a new function parameter, enum mlx5e_rss_create_type. Callers (like
> rx_res.c) will no longer need to allocate RSS object via
> mlx5e_rss_alloc() and initialize it immediately via
> mlx5e_rss_init_no_tirs() or mlx5e_rss_init(), this will be done by
> a single call to mlx5e_rss_create(). Hence, mlx5e_rss_alloc() and
> mlx5e_rss_init_no_tirs() have been removed from rss.h file and became
> static functions.
>
> Signed-off-by: Adham Faris <afaris@nvidia.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [net-next V2 13/15] net/mlx5e: Preparations for supporting larger number of channels
2023-10-12 19:27 ` [net-next V2 13/15] net/mlx5e: Preparations for supporting larger number of channels Saeed Mahameed
@ 2023-10-12 21:35 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:35 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Adham Faris <afaris@nvidia.com>
>
> The number of CPUs in data center servers keeps growing over time.
> Currently, our driver limits the number of channels to 128.
>
> Maximum channels number is enforced and bounded by hardcoded
> defines (en.h/MLX5E_MAX_NUM_CHANNELS) even though the device and machine
> (CPUs num) can allow more.
>
> Refactor current implementation in order to handle further channels.
>
> The maximum supported channels number will be increased in the followup
> patch.
>
> Introduce RQT size calculation/allocation scheme below:
> 1) Preserve current RQT size of 256 for channels number up to 128 (the
> old limit).
> 2) For greater channels number, RQT size is calculated by multiplying
> the channels number by 2 and rounding up the result to the nearest
> power of 2. If the calculated RQT size exceeds the maximum supported
> size by the NIC, fallback to this maximum RQT size
> (1 << log_max_rqt_size).
>
> Since RQT size is no more static, allocate and free the indirection
> table SW shadow dynamically.
>
> Signed-off-by: Adham Faris <afaris@nvidia.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [net-next V2 14/15] net/mlx5e: Increase max supported channels number to 256
2023-10-12 19:27 ` [net-next V2 14/15] net/mlx5e: Increase max supported channels number to 256 Saeed Mahameed
@ 2023-10-12 21:35 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:35 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Adham Faris <afaris@nvidia.com>
>
> Increase the maximum number of supported channels to 256 (it is not
> extended further due to testing limitations).
>
> Signed-off-by: Adham Faris <afaris@nvidia.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [net-next V2 15/15] net/mlx5e: Allow IPsec soft/hard limits in bytes
2023-10-12 19:27 ` [net-next V2 15/15] net/mlx5e: Allow IPsec soft/hard limits in bytes Saeed Mahameed
@ 2023-10-12 21:37 ` Jacob Keller
0 siblings, 0 replies; 32+ messages in thread
From: Jacob Keller @ 2023-10-12 21:37 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Leon Romanovsky,
Patrisious Haddad
On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> Actually the mlx5 code already has the needed support to allow users
> to configure soft/hard limits in bytes. It is possible due to the
> situation with TX path, where CX7 devices are missing hardware
> implementation to send events to the software, see commit b2f7b01d36a9
> ("net/mlx5e: Simulate missing IPsec TX limits hardware functionality").
>
> That software workaround is not limited to TX and works for bytes too.
> So relax the validation logic to not block soft/hard limits in bytes.
>
> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> .../mellanox/mlx5/core/en_accel/ipsec.c | 23 +++++++++++-------
> .../mellanox/mlx5/core/en_accel/ipsec_fs.c | 24 +++++++++++--------
> 2 files changed, 28 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
> index 7d4ceb9b9c16..257c41870f78 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
> @@ -56,7 +56,7 @@ static struct mlx5e_ipsec_pol_entry *to_ipsec_pol_entry(struct xfrm_policy *x)
> return (struct mlx5e_ipsec_pol_entry *)x->xdo.offload_handle;
> }
>
> -static void mlx5e_ipsec_handle_tx_limit(struct work_struct *_work)
> +static void mlx5e_ipsec_handle_sw_limits(struct work_struct *_work)
> {
> struct mlx5e_ipsec_dwork *dwork =
> container_of(_work, struct mlx5e_ipsec_dwork, dwork.work);
> @@ -486,9 +486,15 @@ static int mlx5e_xfrm_validate_state(struct mlx5_core_dev *mdev,
> return -EINVAL;
> }
>
> - if (x->lft.hard_byte_limit != XFRM_INF ||
> - x->lft.soft_byte_limit != XFRM_INF) {
> - NL_SET_ERR_MSG_MOD(extack, "Device doesn't support limits in bytes");
> + if (x->lft.soft_byte_limit >= x->lft.hard_byte_limit &&
> + x->lft.hard_byte_limit != XFRM_INF) {
> + /* XFRM stack doesn't prevent such configuration :(. */
> + NL_SET_ERR_MSG_MOD(extack, "Hard byte limit must be greater than soft one");
> + return -EINVAL;
> + }
> +
Seems like we should fix that? :D
> + if (!x->lft.soft_byte_limit || !x->lft.hard_byte_limit) {
> + NL_SET_ERR_MSG_MOD(extack, "Soft/hard byte limits can't be 0");
> return -EINVAL;
> }
>
> @@ -624,11 +630,10 @@ static int mlx5e_ipsec_create_dwork(struct mlx5e_ipsec_sa_entry *sa_entry)
> if (x->xso.type != XFRM_DEV_OFFLOAD_PACKET)
> return 0;
>
> - if (x->xso.dir != XFRM_DEV_OFFLOAD_OUT)
> - return 0;
> -
> if (x->lft.soft_packet_limit == XFRM_INF &&
> - x->lft.hard_packet_limit == XFRM_INF)
> + x->lft.hard_packet_limit == XFRM_INF &&
> + x->lft.soft_byte_limit == XFRM_INF &&
> + x->lft.hard_byte_limit == XFRM_INF)
> return 0;
>
> dwork = kzalloc(sizeof(*dwork), GFP_KERNEL);
> @@ -636,7 +641,7 @@ static int mlx5e_ipsec_create_dwork(struct mlx5e_ipsec_sa_entry *sa_entry)
> return -ENOMEM;
>
> dwork->sa_entry = sa_entry;
> - INIT_DELAYED_WORK(&dwork->dwork, mlx5e_ipsec_handle_tx_limit);
> + INIT_DELAYED_WORK(&dwork->dwork, mlx5e_ipsec_handle_sw_limits);
> sa_entry->dwork = dwork;
> return 0;
> }
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
> index 7dba4221993f..eda1cb528deb 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
> @@ -1249,15 +1249,17 @@ static int rx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
> setup_fte_no_frags(spec);
> setup_fte_upper_proto_match(spec, &attrs->upspec);
>
> - if (rx != ipsec->rx_esw)
> - err = setup_modify_header(ipsec, attrs->type,
> - sa_entry->ipsec_obj_id | BIT(31),
> - XFRM_DEV_OFFLOAD_IN, &flow_act);
> - else
> - err = mlx5_esw_ipsec_rx_setup_modify_header(sa_entry, &flow_act);
> + if (!attrs->drop) {
> + if (rx != ipsec->rx_esw)
> + err = setup_modify_header(ipsec, attrs->type,
> + sa_entry->ipsec_obj_id | BIT(31),
> + XFRM_DEV_OFFLOAD_IN, &flow_act);
> + else
> + err = mlx5_esw_ipsec_rx_setup_modify_header(sa_entry, &flow_act);
>
> - if (err)
> - goto err_mod_header;
> + if (err)
> + goto err_mod_header;
> + }
>
> switch (attrs->type) {
> case XFRM_DEV_OFFLOAD_PACKET:
> @@ -1307,7 +1309,8 @@ static int rx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
> if (flow_act.pkt_reformat)
> mlx5_packet_reformat_dealloc(mdev, flow_act.pkt_reformat);
> err_pkt_reformat:
> - mlx5_modify_header_dealloc(mdev, flow_act.modify_hdr);
> + if (flow_act.modify_hdr)
> + mlx5_modify_header_dealloc(mdev, flow_act.modify_hdr);
> err_mod_header:
> kvfree(spec);
> err_alloc:
> @@ -1805,7 +1808,8 @@ void mlx5e_accel_ipsec_fs_del_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
> return;
> }
>
> - mlx5_modify_header_dealloc(mdev, ipsec_rule->modify_hdr);
> + if (ipsec_rule->modify_hdr)
> + mlx5_modify_header_dealloc(mdev, ipsec_rule->modify_hdr);
> mlx5_esw_ipsec_rx_id_mapping_remove(sa_entry);
> rx_ft_put(sa_entry->ipsec, sa_entry->attrs.family, sa_entry->attrs.type);
> }
* Re: [net-next V2 04/15] net/mlx5: Refactor LAG peer device lookout bus logic to mlx5 devcom
2023-10-12 21:26 ` Jacob Keller
@ 2023-10-12 21:46 ` Saeed Mahameed
0 siblings, 0 replies; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-12 21:46 UTC (permalink / raw)
To: Jacob Keller
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch
On 12 Oct 14:26, Jacob Keller wrote:
>
>
>On 10/12/2023 12:27 PM, Saeed Mahameed wrote:
>> From: Shay Drory <shayd@nvidia.com>
>>
>> LAG peer device lookout bus logic required the usage of global lock,
>> mlx5_intf_mutex.
>> As part of the effort to remove this global lock, refactor LAG peer
>> device lookout to use mlx5 devcom layer.
>>
>> Signed-off-by: Shay Drory <shayd@nvidia.com>
>> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
>> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
>> [... full diff snipped; it is quoted in full in Jacob's reply earlier in the thread ...]
>> -/* Must be called with intf_mutex held */
>> -struct mlx5_core_dev *mlx5_get_next_phys_dev_lag(struct mlx5_core_dev *dev)
>> -{
>> - lockdep_assert_held(&mlx5_intf_mutex);
>
>The old flow had a lockdep_assert_held
>
>> - return mlx5_get_next_dev(dev, &next_phys_dev_lag);
>> -}
>> -
>> void mlx5_dev_list_lock(void)
>> {
>> mutex_lock(&mlx5_intf_mutex);
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
>> index af3fac090b82..f0b57f97739f 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
>> @@ -1212,13 +1212,14 @@ static void mlx5_ldev_remove_mdev(struct mlx5_lag *ldev,
>> dev->priv.lag = NULL;
>> }
>>
>> -/* Must be called with intf_mutex held */
>> +/* Must be called with HCA devcom component lock held */
>> static int __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev)
>> {
>> + struct mlx5_devcom_comp_dev *pos = NULL;
>> struct mlx5_lag *ldev = NULL;
>> struct mlx5_core_dev *tmp_dev;
>>
>> - tmp_dev = mlx5_get_next_phys_dev_lag(dev);
>> + tmp_dev = mlx5_devcom_get_next_peer_data(dev->priv.hca_devcom_comp, &pos);
>> if (tmp_dev)
>> ldev = mlx5_lag_dev(tmp_dev);
>>
>
>But you didn't bother to add one here? Does
>mlx5_devcom_get_next_peer_data already do that?
>
The global mutex isn't required anymore after this refactoring,
and devcom has its own locking and sync mechanism.
>Not a big deal either way to me.
>
>Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
* Re: [pull request][net-next V2 00/15] mlx5 updates 2023-10-10
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
` (14 preceding siblings ...)
2023-10-12 19:27 ` [net-next V2 15/15] net/mlx5e: Allow IPsec soft/hard limits in bytes Saeed Mahameed
@ 2023-10-14 1:10 ` Jakub Kicinski
2023-10-14 17:19 ` Saeed Mahameed
15 siblings, 1 reply; 32+ messages in thread
From: Jakub Kicinski @ 2023-10-14 1:10 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Paolo Abeni, Eric Dumazet, Saeed Mahameed,
netdev, Tariq Toukan
On Thu, 12 Oct 2023 12:27:35 -0700 Saeed Mahameed wrote:
> v1->v2:
> - patch #2, remove leftover mutex_init() to fix the build break
>
> This provides misc updates to mlx5 driver.
> For more information please see tag log below.
>
> Please pull and let me know if there is any problem.
There are missing sign-offs on the PR and patches don't apply
because I pulled Leon's ipsec branch :(
* Re: [pull request][net-next V2 00/15] mlx5 updates 2023-10-10
2023-10-14 1:10 ` [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Jakub Kicinski
@ 2023-10-14 17:19 ` Saeed Mahameed
0 siblings, 0 replies; 32+ messages in thread
From: Saeed Mahameed @ 2023-10-14 17:19 UTC (permalink / raw)
To: Jakub Kicinski
Cc: David S. Miller, Paolo Abeni, Eric Dumazet, Saeed Mahameed,
netdev, Tariq Toukan
On 13 Oct 18:10, Jakub Kicinski wrote:
>On Thu, 12 Oct 2023 12:27:35 -0700 Saeed Mahameed wrote:
>> v1->v2:
>> - patch #2, remove leftover mutex_init() to fix the build break
>>
>> This provides misc updates to mlx5 driver.
>> For more information please see tag log below.
>>
>> Please pull and let me know if there is any problem.
>
>There are missing sign-offs on the PR and patches don't apply
>because I pulled Leon's ipsec branch :(
>
Thanks ! just sent v3.
Thread overview: 32+ messages
2023-10-12 19:27 [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Saeed Mahameed
2023-10-12 19:27 ` [net-next V2 01/15] net/mlx5: Parallelize vhca event handling Saeed Mahameed
2023-10-12 21:13 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 02/15] net/mlx5: Redesign SF active work to remove table_lock Saeed Mahameed
2023-10-12 21:18 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 03/15] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key Saeed Mahameed
2023-10-12 21:23 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 04/15] net/mlx5: Refactor LAG peer device lookout bus logic to mlx5 devcom Saeed Mahameed
2023-10-12 21:26 ` Jacob Keller
2023-10-12 21:46 ` Saeed Mahameed
2023-10-12 19:27 ` [net-next V2 05/15] net/mlx5: Replace global mlx5_intf_lock with HCA devcom component lock Saeed Mahameed
2023-10-12 21:28 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 06/15] net/mlx5: Remove unused declaration Saeed Mahameed
2023-10-12 21:29 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 07/15] net/mlx5: fix config name in Kconfig parameter documentation Saeed Mahameed
2023-10-12 19:27 ` [net-next V2 08/15] net/mlx5: Use PTR_ERR_OR_ZERO() to simplify code Saeed Mahameed
2023-10-12 19:27 ` [net-next V2 09/15] net/mlx5e: " Saeed Mahameed
2023-10-12 21:30 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 10/15] net/mlx5e: Refactor rx_res_init() and rx_res_free() APIs Saeed Mahameed
2023-10-12 21:31 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 11/15] net/mlx5e: Refactor mlx5e_rss_set_rxfh() and mlx5e_rss_get_rxfh() Saeed Mahameed
2023-10-12 21:32 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 12/15] net/mlx5e: Refactor mlx5e_rss_init() and mlx5e_rss_free() API's Saeed Mahameed
2023-10-12 21:33 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 13/15] net/mlx5e: Preparations for supporting larger number of channels Saeed Mahameed
2023-10-12 21:35 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 14/15] net/mlx5e: Increase max supported channels number to 256 Saeed Mahameed
2023-10-12 21:35 ` Jacob Keller
2023-10-12 19:27 ` [net-next V2 15/15] net/mlx5e: Allow IPsec soft/hard limits in bytes Saeed Mahameed
2023-10-12 21:37 ` Jacob Keller
2023-10-14 1:10 ` [pull request][net-next V2 00/15] mlx5 updates 2023-10-10 Jakub Kicinski
2023-10-14 17:19 ` Saeed Mahameed