* [PATCH rdma-next 0/5] Add support and infrastructure for RDMA TRANSPORT
From: Leon Romanovsky @ 2025-02-26 13:01 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrew Lunn, Chiara Meiohas, Eric Dumazet, Jakub Kicinski,
linux-rdma, Mark Bloch, Moshe Shemesh, netdev, Paolo Abeni,
Patrisious Haddad, Saeed Mahameed, Tariq Toukan
Hi,
This is a preparation series targeted for mlx5-next, which will be used
later in RDMA.

This series adds the RDMA transport steering logic, which allows the
vport group manager to catch control packets from VFs and forward them
to the control SW to help with congestion control.

In addition, RDMA will provide a new set of APIs to better control the
exposed FW capabilities. This series is needed to make sure the mlx5
command interface always lets privileged commands proceed.
Thanks
Chiara Meiohas (3):
net/mlx5: Add RDMA_CTRL HW capabilities
net/mlx5: Allow the throttle mechanism to be more dynamic
net/mlx5: Limit non-privileged commands
Patrisious Haddad (2):
net/mlx5: Query ADV_RDMA capabilities
net/mlx5: fs, add RDMA TRANSPORT steering domain support
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 120 ++++++++++--
.../mellanox/mlx5/core/esw/acl/helper.c | 2 +-
.../mellanox/mlx5/core/eswitch_offloads.c | 6 +-
.../net/ethernet/mellanox/mlx5/core/fs_cmd.c | 2 +
.../net/ethernet/mellanox/mlx5/core/fs_core.c | 178 ++++++++++++++++--
.../net/ethernet/mellanox/mlx5/core/fs_core.h | 12 +-
drivers/net/ethernet/mellanox/mlx5/core/fw.c | 7 +
.../net/ethernet/mellanox/mlx5/core/main.c | 1 +
include/linux/mlx5/device.h | 11 ++
include/linux/mlx5/driver.h | 6 +
include/linux/mlx5/fs.h | 10 +-
include/linux/mlx5/mlx5_ifc.h | 52 ++++-
12 files changed, 368 insertions(+), 39 deletions(-)
--
2.48.1
* [PATCH mlx5-next 1/5] net/mlx5: Add RDMA_CTRL HW capabilities
From: Leon Romanovsky @ 2025-02-26 13:01 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Chiara Meiohas, linux-rdma, Moshe Shemesh, netdev, Saeed Mahameed,
Tariq Toukan
From: Chiara Meiohas <cmeiohas@nvidia.com>
Add RDMA_CTRL UCTX capabilities and add the RDMA_CTRL general object
type in hca_cap_2.
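
For illustration, a driver could probe the new bits roughly as follows.
This is only a sketch (the wrapper function is hypothetical); the
capability macros and bit names are the ones defined here and in the
existing mlx5 headers:

static bool mlx5_rdma_ctrl_supported(struct mlx5_core_dev *dev)
{
        /* UCTX capability bits added above */
        if (!(MLX5_CAP_GEN(dev, uctx_cap) & MLX5_UCTX_CAP_RDMA_CTRL))
                return false;

        /* The RDMA_CTRL general object bit (type 0x53) lives in bits
         * 127..64 of hca_cap_2, exposed via the new
         * general_obj_types_127_64 field.
         */
        return MLX5_CAP_GEN_2_64(dev, general_obj_types_127_64) &
               MLX5_HCA_CAP_2_GENERAL_OBJECT_TYPES_RDMA_CTRL;
}
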
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
include/linux/mlx5/mlx5_ifc.h | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index cc2875e843f7..3b3d88ffcacc 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1570,6 +1570,8 @@ enum {
enum {
MLX5_UCTX_CAP_RAW_TX = 1UL << 0,
MLX5_UCTX_CAP_INTERNAL_DEV_RES = 1UL << 1,
+ MLX5_UCTX_CAP_RDMA_CTRL = 1UL << 3,
+ MLX5_UCTX_CAP_RDMA_CTRL_OTHER_VHCA = 1UL << 4,
};
#define MLX5_FC_BULK_SIZE_FACTOR 128
@@ -2140,7 +2142,8 @@ struct mlx5_ifc_cmd_hca_cap_2_bits {
u8 log_min_mkey_entity_size[0x5];
u8 reserved_at_1b0[0x10];
- u8 reserved_at_1c0[0x60];
+ u8 general_obj_types_127_64[0x40];
+ u8 reserved_at_200[0x20];
u8 reserved_at_220[0x1];
u8 sw_vhca_id_valid[0x1];
@@ -12494,6 +12497,10 @@ enum {
MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_FLOW_METER_ASO = BIT_ULL(0x24),
};
+enum {
+ MLX5_HCA_CAP_2_GENERAL_OBJECT_TYPES_RDMA_CTRL = BIT_ULL(0x13),
+};
+
enum {
MLX5_GENERAL_OBJECT_TYPES_ENCRYPTION_KEY = 0xc,
MLX5_GENERAL_OBJECT_TYPES_IPSEC = 0x13,
@@ -12501,6 +12508,7 @@ enum {
MLX5_GENERAL_OBJECT_TYPES_FLOW_METER_ASO = 0x24,
MLX5_GENERAL_OBJECT_TYPES_MACSEC = 0x27,
MLX5_GENERAL_OBJECT_TYPES_INT_KEK = 0x47,
+ MLX5_GENERAL_OBJECT_TYPES_RDMA_CTRL = 0x53,
MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS = 0xff15,
};
--
2.48.1
* [PATCH mlx5-next 2/5] net/mlx5: Allow the throttle mechanism to be more dynamic
From: Leon Romanovsky @ 2025-02-26 13:01 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Chiara Meiohas, Andrew Lunn, Eric Dumazet, Jakub Kicinski,
linux-rdma, Moshe Shemesh, netdev, Paolo Abeni, Saeed Mahameed,
Tariq Toukan
From: Chiara Meiohas <cmeiohas@nvidia.com>
Previously, throttle commands were identified and limited based on
opcode. These commands were limited to half the command slots using a
semaphore, and callback commands checked the opcode to determine
semaphore release.
To allow exceptions, introduce a variable that indicates when the
throttle semaphore is held. This allows scenarios in which throttle
commands are not limited, and the callback functions use this variable
to determine whether the throttle semaphore needs to be released.

This patch contains no functional changes. It is a preparation for the
next patch.
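
The locking pattern is easy to demonstrate outside the driver. Below is
a minimal standalone userspace sketch of the same idea, with POSIX
semaphores standing in for the command-slot semaphore (all names here
are illustrative, not part of the patch):

#include <semaphore.h>
#include <stdbool.h>

static sem_t throttle_sem;

struct async_work {
        bool throttle_locked;   /* set only by the path that took the lock */
};

static int submit_cmd(struct async_work *work, bool throttle_op)
{
        work->throttle_locked = false;
        if (throttle_op) {
                if (sem_trywait(&throttle_sem))
                        return -1;      /* -EBUSY in the driver */
                work->throttle_locked = true;
        }
        /* ... queue the command; the completion runs later ... */
        return 0;
}

static void completion_handler(struct async_work *work)
{
        /* Release based on the recorded state instead of re-checking
         * the opcode.
         */
        if (work->throttle_locked)
                sem_post(&throttle_sem);
}

int main(void)
{
        struct async_work work;

        sem_init(&throttle_sem, 0, 2);  /* e.g. half of four command slots */
        if (!submit_cmd(&work, true))
                completion_handler(&work);
        sem_destroy(&throttle_sem);
        return 0;
}
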
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 43 +++++++++++++------
include/linux/mlx5/driver.h | 1 +
2 files changed, 32 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index e733b81e18a2..19c0c15c7e08 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1881,7 +1881,7 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
{
struct mlx5_cmd_msg *inb, *outb;
u16 opcode = in_to_opcode(in);
- bool throttle_op;
+ bool throttle_locked = false;
int pages_queue;
gfp_t gfp;
u8 token;
@@ -1890,12 +1890,12 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
if (mlx5_cmd_is_down(dev) || !opcode_allowed(&dev->cmd, opcode))
return -ENXIO;
- throttle_op = mlx5_cmd_is_throttle_opcode(opcode);
- if (throttle_op) {
- if (callback) {
- if (down_trylock(&dev->cmd.vars.throttle_sem))
- return -EBUSY;
- } else {
+ if (!callback) {
+ /* The semaphore is already held for callback commands. It was
+ * acquired in mlx5_cmd_exec_cb()
+ */
+ if (mlx5_cmd_is_throttle_opcode(opcode)) {
+ throttle_locked = true;
down(&dev->cmd.vars.throttle_sem);
}
}
@@ -1941,7 +1941,7 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
out_in:
free_msg(dev, inb);
out_up:
- if (throttle_op)
+ if (throttle_locked)
up(&dev->cmd.vars.throttle_sem);
return err;
}
@@ -2104,17 +2104,17 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work)
struct mlx5_async_work *work = _work;
struct mlx5_async_ctx *ctx;
struct mlx5_core_dev *dev;
- u16 opcode;
+ bool throttle_locked;
ctx = work->ctx;
dev = ctx->dev;
- opcode = work->opcode;
+ throttle_locked = work->throttle_locked;
status = cmd_status_err(dev, status, work->opcode, work->op_mod, work->out);
work->user_callback(status, work);
/* Can't access "work" from this point on. It could have been freed in
* the callback.
*/
- if (mlx5_cmd_is_throttle_opcode(opcode))
+ if (throttle_locked)
up(&dev->cmd.vars.throttle_sem);
if (atomic_dec_and_test(&ctx->num_inflight))
complete(&ctx->inflight_done);
@@ -2131,11 +2131,30 @@ int mlx5_cmd_exec_cb(struct mlx5_async_ctx *ctx, void *in, int in_size,
work->opcode = in_to_opcode(in);
work->op_mod = MLX5_GET(mbox_in, in, op_mod);
work->out = out;
+ work->throttle_locked = false;
if (WARN_ON(!atomic_inc_not_zero(&ctx->num_inflight)))
return -EIO;
+
+ if (mlx5_cmd_is_throttle_opcode(in_to_opcode(in))) {
+ if (down_trylock(&ctx->dev->cmd.vars.throttle_sem)) {
+ ret = -EBUSY;
+ goto dec_num_inflight;
+ }
+ work->throttle_locked = true;
+ }
+
ret = cmd_exec(ctx->dev, in, in_size, out, out_size,
mlx5_cmd_exec_cb_handler, work, false);
- if (ret && atomic_dec_and_test(&ctx->num_inflight))
+ if (ret)
+ goto sem_up;
+
+ return 0;
+
+sem_up:
+ if (work->throttle_locked)
+ up(&ctx->dev->cmd.vars.throttle_sem);
+dec_num_inflight:
+ if (atomic_dec_and_test(&ctx->num_inflight))
complete(&ctx->inflight_done);
return ret;
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index af86097641b0..876d6b03a87a 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -989,6 +989,7 @@ struct mlx5_async_work {
mlx5_async_cbk_t user_callback;
u16 opcode; /* cmd opcode */
u16 op_mod; /* cmd op_mod */
+ u8 throttle_locked:1;
void *out; /* pointer to the cmd output buffer */
};
--
2.48.1
* [PATCH mlx5-next 3/5] net/mlx5: Limit non-privileged commands
From: Leon Romanovsky @ 2025-02-26 13:01 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Chiara Meiohas, Andrew Lunn, Eric Dumazet, Jakub Kicinski,
linux-rdma, Moshe Shemesh, netdev, Paolo Abeni, Saeed Mahameed,
Tariq Toukan
From: Chiara Meiohas <cmeiohas@nvidia.com>
Limit commands from non-privileged UIDs to half of the available
command slots when privileged UIDs are present. Throttle commands
issued by privileged UIDs are not limited.

Use an xarray to store the privileged UIDs, and add insert and remove
functions for privileged UID management.

Non-user commands (with uid 0) are not limited.
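
To show how the new exports are meant to be used, here is a
hypothetical caller (only mlx5_cmd_add_privileged_uid() and
mlx5_cmd_remove_privileged_uid() come from this patch; the surrounding
function and the origin of the uid are illustrative):

static int hypothetical_priv_uid_user(struct mlx5_core_dev *dev, u16 uid)
{
        int err;

        /* From here on, commands carrying this uid bypass the
         * unprivileged-command semaphore.
         */
        err = mlx5_cmd_add_privileged_uid(dev, uid);
        if (err)
                return err;

        /* ... issue privileged commands with this uid ... */

        mlx5_cmd_remove_privileged_uid(dev, uid);
        return 0;
}
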
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 85 +++++++++++++++++--
include/linux/mlx5/driver.h | 5 ++
2 files changed, 82 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 19c0c15c7e08..e53dbdc0a7a1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -94,6 +94,11 @@ static u16 in_to_opcode(void *in)
return MLX5_GET(mbox_in, in, opcode);
}
+static u16 in_to_uid(void *in)
+{
+ return MLX5_GET(mbox_in, in, uid);
+}
+
/* Returns true for opcodes that might be triggered very frequently and throttle
* the command interface. Limit their command slots usage.
*/
@@ -823,7 +828,7 @@ static void cmd_status_print(struct mlx5_core_dev *dev, void *in, void *out)
opcode = in_to_opcode(in);
op_mod = MLX5_GET(mbox_in, in, op_mod);
- uid = MLX5_GET(mbox_in, in, uid);
+ uid = in_to_uid(in);
status = MLX5_GET(mbox_out, out, status);
if (!uid && opcode != MLX5_CMD_OP_DESTROY_MKEY &&
@@ -1871,6 +1876,17 @@ static int is_manage_pages(void *in)
return in_to_opcode(in) == MLX5_CMD_OP_MANAGE_PAGES;
}
+static bool mlx5_has_privileged_uid(struct mlx5_core_dev *dev)
+{
+ return !xa_empty(&dev->cmd.vars.privileged_uids);
+}
+
+static bool mlx5_cmd_is_privileged_uid(struct mlx5_core_dev *dev,
+ u16 uid)
+{
+ return !!xa_load(&dev->cmd.vars.privileged_uids, uid);
+}
+
/* Notes:
* 1. Callback functions may not sleep
* 2. Page queue commands do not support asynchrous completion
@@ -1882,6 +1898,8 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
struct mlx5_cmd_msg *inb, *outb;
u16 opcode = in_to_opcode(in);
bool throttle_locked = false;
+ bool unpriv_locked = false;
+ u16 uid = in_to_uid(in);
int pages_queue;
gfp_t gfp;
u8 token;
@@ -1894,7 +1912,12 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
/* The semaphore is already held for callback commands. It was
* acquired in mlx5_cmd_exec_cb()
*/
- if (mlx5_cmd_is_throttle_opcode(opcode)) {
+ if (uid && mlx5_has_privileged_uid(dev)) {
+ if (!mlx5_cmd_is_privileged_uid(dev, uid)) {
+ unpriv_locked = true;
+ down(&dev->cmd.vars.unprivileged_sem);
+ }
+ } else if (mlx5_cmd_is_throttle_opcode(opcode)) {
throttle_locked = true;
down(&dev->cmd.vars.throttle_sem);
}
@@ -1943,6 +1966,9 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
out_up:
if (throttle_locked)
up(&dev->cmd.vars.throttle_sem);
+ if (unpriv_locked)
+ up(&dev->cmd.vars.unprivileged_sem);
+
return err;
}
@@ -2105,10 +2131,12 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work)
struct mlx5_async_ctx *ctx;
struct mlx5_core_dev *dev;
bool throttle_locked;
+ bool unpriv_locked;
ctx = work->ctx;
dev = ctx->dev;
throttle_locked = work->throttle_locked;
+ unpriv_locked = work->unpriv_locked;
status = cmd_status_err(dev, status, work->opcode, work->op_mod, work->out);
work->user_callback(status, work);
/* Can't access "work" from this point on. It could have been freed in
@@ -2116,6 +2144,8 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work)
*/
if (throttle_locked)
up(&dev->cmd.vars.throttle_sem);
+ if (unpriv_locked)
+ up(&dev->cmd.vars.unprivileged_sem);
if (atomic_dec_and_test(&ctx->num_inflight))
complete(&ctx->inflight_done);
}
@@ -2124,6 +2154,8 @@ int mlx5_cmd_exec_cb(struct mlx5_async_ctx *ctx, void *in, int in_size,
void *out, int out_size, mlx5_async_cbk_t callback,
struct mlx5_async_work *work)
{
+ struct mlx5_core_dev *dev = ctx->dev;
+ u16 uid;
int ret;
work->ctx = ctx;
@@ -2132,18 +2164,29 @@ int mlx5_cmd_exec_cb(struct mlx5_async_ctx *ctx, void *in, int in_size,
work->op_mod = MLX5_GET(mbox_in, in, op_mod);
work->out = out;
work->throttle_locked = false;
+ work->unpriv_locked = false;
+ uid = in_to_uid(in);
+
if (WARN_ON(!atomic_inc_not_zero(&ctx->num_inflight)))
return -EIO;
- if (mlx5_cmd_is_throttle_opcode(in_to_opcode(in))) {
- if (down_trylock(&ctx->dev->cmd.vars.throttle_sem)) {
+ if (uid && mlx5_has_privileged_uid(dev)) {
+ if (!mlx5_cmd_is_privileged_uid(dev, uid)) {
+ if (down_trylock(&dev->cmd.vars.unprivileged_sem)) {
+ ret = -EBUSY;
+ goto dec_num_inflight;
+ }
+ work->unpriv_locked = true;
+ }
+ } else if (mlx5_cmd_is_throttle_opcode(in_to_opcode(in))) {
+ if (down_trylock(&dev->cmd.vars.throttle_sem)) {
ret = -EBUSY;
goto dec_num_inflight;
}
work->throttle_locked = true;
}
- ret = cmd_exec(ctx->dev, in, in_size, out, out_size,
+ ret = cmd_exec(dev, in, in_size, out, out_size,
mlx5_cmd_exec_cb_handler, work, false);
if (ret)
goto sem_up;
@@ -2152,7 +2195,9 @@ int mlx5_cmd_exec_cb(struct mlx5_async_ctx *ctx, void *in, int in_size,
sem_up:
if (work->throttle_locked)
- up(&ctx->dev->cmd.vars.throttle_sem);
+ up(&dev->cmd.vars.throttle_sem);
+ if (work->unpriv_locked)
+ up(&dev->cmd.vars.unprivileged_sem);
dec_num_inflight:
if (atomic_dec_and_test(&ctx->num_inflight))
complete(&ctx->inflight_done);
@@ -2390,10 +2435,16 @@ int mlx5_cmd_enable(struct mlx5_core_dev *dev)
sema_init(&cmd->vars.sem, cmd->vars.max_reg_cmds);
sema_init(&cmd->vars.pages_sem, 1);
sema_init(&cmd->vars.throttle_sem, DIV_ROUND_UP(cmd->vars.max_reg_cmds, 2));
+ sema_init(&cmd->vars.unprivileged_sem,
+ DIV_ROUND_UP(cmd->vars.max_reg_cmds, 2));
+
+ xa_init(&cmd->vars.privileged_uids);
cmd->pool = dma_pool_create("mlx5_cmd", mlx5_core_dma_dev(dev), size, align, 0);
- if (!cmd->pool)
- return -ENOMEM;
+ if (!cmd->pool) {
+ err = -ENOMEM;
+ goto err_destroy_xa;
+ }
err = alloc_cmd_page(dev, cmd);
if (err)
@@ -2427,6 +2478,8 @@ int mlx5_cmd_enable(struct mlx5_core_dev *dev)
free_cmd_page(dev, cmd);
err_free_pool:
dma_pool_destroy(cmd->pool);
+err_destroy_xa:
+ xa_destroy(&dev->cmd.vars.privileged_uids);
return err;
}
@@ -2439,6 +2492,7 @@ void mlx5_cmd_disable(struct mlx5_core_dev *dev)
destroy_msg_cache(dev);
free_cmd_page(dev, cmd);
dma_pool_destroy(cmd->pool);
+ xa_destroy(&dev->cmd.vars.privileged_uids);
}
void mlx5_cmd_set_state(struct mlx5_core_dev *dev,
@@ -2446,3 +2500,18 @@ void mlx5_cmd_set_state(struct mlx5_core_dev *dev,
{
dev->cmd.state = cmdif_state;
}
+
+int mlx5_cmd_add_privileged_uid(struct mlx5_core_dev *dev, u16 uid)
+{
+ return xa_insert(&dev->cmd.vars.privileged_uids, uid,
+ xa_mk_value(uid), GFP_KERNEL);
+}
+EXPORT_SYMBOL(mlx5_cmd_add_privileged_uid);
+
+void mlx5_cmd_remove_privileged_uid(struct mlx5_core_dev *dev, u16 uid)
+{
+ void *data = xa_erase(&dev->cmd.vars.privileged_uids, uid);
+
+ WARN(!data, "Privileged UID %u does not exist\n", uid);
+}
+EXPORT_SYMBOL(mlx5_cmd_remove_privileged_uid);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 876d6b03a87a..4f593a61220d 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -305,6 +305,8 @@ struct mlx5_cmd {
struct semaphore sem;
struct semaphore pages_sem;
struct semaphore throttle_sem;
+ struct semaphore unprivileged_sem;
+ struct xarray privileged_uids;
} vars;
enum mlx5_cmdif_state state;
void *cmd_alloc_buf;
@@ -990,6 +992,7 @@ struct mlx5_async_work {
u16 opcode; /* cmd opcode */
u16 op_mod; /* cmd op_mod */
u8 throttle_locked:1;
+ u8 unpriv_locked:1;
void *out; /* pointer to the cmd output buffer */
};
@@ -1020,6 +1023,8 @@ int mlx5_cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
int mlx5_cmd_exec_polling(struct mlx5_core_dev *dev, void *in, int in_size,
void *out, int out_size);
bool mlx5_cmd_is_down(struct mlx5_core_dev *dev);
+int mlx5_cmd_add_privileged_uid(struct mlx5_core_dev *dev, u16 uid);
+void mlx5_cmd_remove_privileged_uid(struct mlx5_core_dev *dev, u16 uid);
void mlx5_core_uplink_netdev_set(struct mlx5_core_dev *mdev, struct net_device *netdev);
void mlx5_core_uplink_netdev_event_replay(struct mlx5_core_dev *mdev);
--
2.48.1
* [PATCH mlx5-next 4/5] net/mlx5: Query ADV_RDMA capabilities
From: Leon Romanovsky @ 2025-02-26 13:01 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Patrisious Haddad, Andrew Lunn, Eric Dumazet, Jakub Kicinski,
linux-rdma, Mark Bloch, netdev, Paolo Abeni, Saeed Mahameed,
Tariq Toukan
From: Patrisious Haddad <phaddad@nvidia.com>
Query the ADV_RDMA capabilities, which provide information about
advanced RDMA-related features.
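
Once cached, the capabilities are reachable through the new
MLX5_CAP_ADV_RDMA() accessor. A small hypothetical check (the wrapper
function is illustrative; the adv_rdma bit and the
rdma_transport_manager field are added by this patch):

static bool mlx5_is_rdma_transport_manager(struct mlx5_core_dev *dev)
{
        /* adv_rdma gates whether the ADV_RDMA capability page is valid */
        if (!MLX5_CAP_GEN(dev, adv_rdma))
                return false;

        return MLX5_CAP_ADV_RDMA(dev, rdma_transport_manager);
}
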
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/fw.c | 7 ++++
.../net/ethernet/mellanox/mlx5/core/main.c | 1 +
include/linux/mlx5/device.h | 5 +++
include/linux/mlx5/mlx5_ifc.h | 42 ++++++++++++++++++-
4 files changed, 54 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
index b253d1673398..57476487e31f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
@@ -287,6 +287,13 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev)
return err;
}
+ if (MLX5_CAP_GEN(dev, adv_rdma)) {
+ err = mlx5_core_get_caps_mode(dev, MLX5_CAP_ADV_RDMA,
+ HCA_CAP_OPMOD_GET_CUR);
+ if (err)
+ return err;
+ }
+
return 0;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index ec956c4bcebd..af0b677393d0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1795,6 +1795,7 @@ static const int types[] = {
MLX5_CAP_ADV_VIRTUALIZATION,
MLX5_CAP_CRYPTO,
MLX5_CAP_SHAMPO,
+ MLX5_CAP_ADV_RDMA,
};
static void mlx5_hca_caps_free(struct mlx5_core_dev *dev)
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index fd37f4e54d76..0ae6d69c5221 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -1251,6 +1251,7 @@ enum mlx5_cap_type {
MLX5_CAP_GENERAL_2 = 0x20,
MLX5_CAP_PORT_SELECTION = 0x25,
MLX5_CAP_ADV_VIRTUALIZATION = 0x26,
+ MLX5_CAP_ADV_RDMA = 0x28,
/* NUM OF CAP Types */
MLX5_CAP_NUM
};
@@ -1384,6 +1385,10 @@ enum mlx5_qcam_feature_groups {
MLX5_GET(adv_virtualization_cap, \
mdev->caps.hca[MLX5_CAP_ADV_VIRTUALIZATION]->cur, cap)
+#define MLX5_CAP_ADV_RDMA(mdev, cap) \
+ MLX5_GET(adv_rdma_cap, \
+ mdev->caps.hca[MLX5_CAP_ADV_RDMA]->cur, cap)
+
#define MLX5_CAP_FLOWTABLE_PORT_SELECTION(mdev, cap) \
MLX5_CAP_PORT_SELECTION(mdev, flow_table_properties_port_selection.cap)
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 3b3d88ffcacc..fea8af42f954 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1993,7 +1993,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 max_geneve_tlv_options[0x8];
u8 reserved_at_568[0x3];
u8 max_geneve_tlv_option_data_len[0x5];
- u8 reserved_at_570[0x9];
+ u8 reserved_at_570[0x1];
+ u8 adv_rdma[0x1];
+ u8 reserved_at_572[0x7];
u8 adv_virtualization[0x1];
u8 reserved_at_57a[0x6];
@@ -13076,6 +13078,44 @@ struct mlx5_ifc_load_vhca_state_out_bits {
u8 reserved_at_40[0x40];
};
+struct mlx5_ifc_adv_rdma_cap_bits {
+ u8 rdma_transport_manager[0x1];
+ u8 rdma_transport_manager_other_eswitch[0x1];
+ u8 reserved_at_2[0x1e];
+
+ u8 rcx_type[0x8];
+ u8 reserved_at_28[0x2];
+ u8 ps_entry_log_max_value[0x6];
+ u8 reserved_at_30[0x6];
+ u8 qp_max_ps_num_entry[0xa];
+
+ u8 mp_max_num_queues[0x8];
+ u8 ps_user_context_max_log_size[0x8];
+ u8 message_based_qp_and_striding_wq[0x8];
+ u8 reserved_at_58[0x8];
+
+ u8 max_receive_send_message_size_stride[0x10];
+ u8 reserved_at_70[0x10];
+
+ u8 max_receive_send_message_size_byte[0x20];
+
+ u8 reserved_at_a0[0x160];
+
+ struct mlx5_ifc_flow_table_prop_layout_bits rdma_transport_rx_flow_table_properties;
+
+ struct mlx5_ifc_flow_table_prop_layout_bits rdma_transport_tx_flow_table_properties;
+
+ struct mlx5_ifc_flow_table_fields_supported_2_bits rdma_transport_rx_ft_field_support_2;
+
+ struct mlx5_ifc_flow_table_fields_supported_2_bits rdma_transport_tx_ft_field_support_2;
+
+ struct mlx5_ifc_flow_table_fields_supported_2_bits rdma_transport_rx_ft_field_bitmask_support_2;
+
+ struct mlx5_ifc_flow_table_fields_supported_2_bits rdma_transport_tx_ft_field_bitmask_support_2;
+
+ u8 reserved_at_800[0x3800];
+};
+
struct mlx5_ifc_adv_virtualization_cap_bits {
u8 reserved_at_0[0x3];
u8 pg_track_log_max_num[0x5];
--
2.48.1
* [PATCH mlx5-next 5/5] net/mlx5: fs, add RDMA TRANSPORT steering domain support
From: Leon Romanovsky @ 2025-02-26 13:01 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Patrisious Haddad, Andrew Lunn, Eric Dumazet, Jakub Kicinski,
linux-rdma, Mark Bloch, netdev, Paolo Abeni, Saeed Mahameed,
Tariq Toukan
From: Patrisious Haddad <phaddad@nvidia.com>
Add the RX and TX RDMA_TRANSPORT flow table namespaces, and the ability
to create flow tables in those namespaces. The RDMA_TRANSPORT RX and TX
namespaces are per-vport.

Packets traverse RDMA_TRANSPORT_RX after RDMA_RX and RDMA_TRANSPORT_TX
before RDMA_TX, ensuring proper control and management.

RDMA_TRANSPORT domains are managed by the vport group manager.
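
To sketch the intended use, a hypothetical caller could create a
per-vport table like this (the namespace type, the ft_attr vport field
and mlx5_get_flow_vport_namespace() are added by this patch;
mlx5_create_vport_flow_table() is the existing API, and the wrapper
itself is illustrative):

static struct mlx5_flow_table *
hypothetical_transport_rx_ft(struct mlx5_core_dev *dev, u16 vport_idx)
{
        struct mlx5_flow_table_attr ft_attr = {
                .prio = MLX5_RDMA_TRANSPORT_BYPASS_PRIO,
                .max_fte = 1,
                .vport = vport_idx,     /* new attribute from this patch */
        };
        struct mlx5_flow_namespace *ns;

        ns = mlx5_get_flow_vport_namespace(dev,
                                           MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX,
                                           vport_idx);
        if (!ns)
                return ERR_PTR(-EOPNOTSUPP);

        return mlx5_create_vport_flow_table(ns, &ft_attr, vport_idx);
}
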
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
.../mellanox/mlx5/core/esw/acl/helper.c | 2 +-
.../mellanox/mlx5/core/eswitch_offloads.c | 6 +-
.../net/ethernet/mellanox/mlx5/core/fs_cmd.c | 2 +
.../net/ethernet/mellanox/mlx5/core/fs_core.c | 178 ++++++++++++++++--
.../net/ethernet/mellanox/mlx5/core/fs_core.h | 12 +-
include/linux/mlx5/device.h | 6 +
include/linux/mlx5/fs.h | 10 +-
7 files changed, 195 insertions(+), 21 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/helper.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/helper.c
index d599e50af346..3ce455c2535c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/helper.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/helper.c
@@ -27,7 +27,7 @@ esw_acl_table_create(struct mlx5_eswitch *esw, struct mlx5_vport *vport, int ns,
esw_debug(dev, "Create vport[%d] %s ACL table\n", vport_num,
ns == MLX5_FLOW_NAMESPACE_ESW_INGRESS ? "ingress" : "egress");
- root_ns = mlx5_get_flow_vport_acl_namespace(dev, ns, vport->index);
+ root_ns = mlx5_get_flow_vport_namespace(dev, ns, vport->index);
if (!root_ns) {
esw_warn(dev, "Failed to get E-Switch root namespace for vport (%d)\n",
vport_num);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 20cc01ceee8a..1ee5791573d8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2828,9 +2828,9 @@ static int esw_set_master_egress_rule(struct mlx5_core_dev *master,
if (IS_ERR(vport))
return PTR_ERR(vport);
- egress_ns = mlx5_get_flow_vport_acl_namespace(master,
- MLX5_FLOW_NAMESPACE_ESW_EGRESS,
- vport->index);
+ egress_ns = mlx5_get_flow_vport_namespace(master,
+ MLX5_FLOW_NAMESPACE_ESW_EGRESS,
+ vport->index);
if (!egress_ns)
return -EINVAL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
index ae20c061e0fb..a47c29571f64 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
@@ -1142,6 +1142,8 @@ const struct mlx5_flow_cmds *mlx5_fs_cmd_get_default(enum fs_flow_table_type typ
case FS_FT_RDMA_RX:
case FS_FT_RDMA_TX:
case FS_FT_PORT_SEL:
+ case FS_FT_RDMA_TRANSPORT_RX:
+ case FS_FT_RDMA_TRANSPORT_TX:
return mlx5_fs_cmd_get_fw_cmds();
default:
return mlx5_fs_cmd_get_stub_cmds();
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 22dc23d991d2..6163bc98d94a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -1456,7 +1456,7 @@ mlx5_create_auto_grouped_flow_table(struct mlx5_flow_namespace *ns,
struct mlx5_flow_table *ft;
int autogroups_max_fte;
- ft = mlx5_create_flow_table(ns, ft_attr);
+ ft = mlx5_create_vport_flow_table(ns, ft_attr, ft_attr->vport);
if (IS_ERR(ft))
return ft;
@@ -2764,9 +2764,9 @@ struct mlx5_flow_namespace *mlx5_get_flow_namespace(struct mlx5_core_dev *dev,
}
EXPORT_SYMBOL(mlx5_get_flow_namespace);
-struct mlx5_flow_namespace *mlx5_get_flow_vport_acl_namespace(struct mlx5_core_dev *dev,
- enum mlx5_flow_namespace_type type,
- int vport)
+struct mlx5_flow_namespace *
+mlx5_get_flow_vport_namespace(struct mlx5_core_dev *dev,
+ enum mlx5_flow_namespace_type type, int vport_idx)
{
struct mlx5_flow_steering *steering = dev->priv.steering;
@@ -2775,25 +2775,43 @@ struct mlx5_flow_namespace *mlx5_get_flow_vport_acl_namespace(struct mlx5_core_d
switch (type) {
case MLX5_FLOW_NAMESPACE_ESW_EGRESS:
- if (vport >= steering->esw_egress_acl_vports)
+ if (vport_idx >= steering->esw_egress_acl_vports)
return NULL;
if (steering->esw_egress_root_ns &&
- steering->esw_egress_root_ns[vport])
- return &steering->esw_egress_root_ns[vport]->ns;
+ steering->esw_egress_root_ns[vport_idx])
+ return &steering->esw_egress_root_ns[vport_idx]->ns;
else
return NULL;
case MLX5_FLOW_NAMESPACE_ESW_INGRESS:
- if (vport >= steering->esw_ingress_acl_vports)
+ if (vport_idx >= steering->esw_ingress_acl_vports)
return NULL;
if (steering->esw_ingress_root_ns &&
- steering->esw_ingress_root_ns[vport])
- return &steering->esw_ingress_root_ns[vport]->ns;
+ steering->esw_ingress_root_ns[vport_idx])
+ return &steering->esw_ingress_root_ns[vport_idx]->ns;
+ else
+ return NULL;
+ case MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX:
+ if (vport_idx >= steering->rdma_transport_rx_vports)
+ return NULL;
+ if (steering->rdma_transport_rx_root_ns &&
+ steering->rdma_transport_rx_root_ns[vport_idx])
+ return &steering->rdma_transport_rx_root_ns[vport_idx]->ns;
+ else
+ return NULL;
+ case MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX:
+ if (vport_idx >= steering->rdma_transport_tx_vports)
+ return NULL;
+
+ if (steering->rdma_transport_tx_root_ns &&
+ steering->rdma_transport_tx_root_ns[vport_idx])
+ return &steering->rdma_transport_tx_root_ns[vport_idx]->ns;
else
return NULL;
default:
return NULL;
}
}
+EXPORT_SYMBOL(mlx5_get_flow_vport_namespace);
static struct fs_prio *_fs_create_prio(struct mlx5_flow_namespace *ns,
unsigned int prio,
@@ -3199,6 +3217,127 @@ static int init_rdma_tx_root_ns(struct mlx5_flow_steering *steering)
return err;
}
+static int
+init_rdma_transport_rx_root_ns_one(struct mlx5_flow_steering *steering,
+ int vport_idx)
+{
+ struct fs_prio *prio;
+
+ steering->rdma_transport_rx_root_ns[vport_idx] =
+ create_root_ns(steering, FS_FT_RDMA_TRANSPORT_RX);
+ if (!steering->rdma_transport_rx_root_ns[vport_idx])
+ return -ENOMEM;
+
+ /* create 1 prio*/
+ prio = fs_create_prio(&steering->rdma_transport_rx_root_ns[vport_idx]->ns,
+ MLX5_RDMA_TRANSPORT_BYPASS_PRIO, 1);
+ return PTR_ERR_OR_ZERO(prio);
+}
+
+static int
+init_rdma_transport_tx_root_ns_one(struct mlx5_flow_steering *steering,
+ int vport_idx)
+{
+ struct fs_prio *prio;
+
+ steering->rdma_transport_tx_root_ns[vport_idx] =
+ create_root_ns(steering, FS_FT_RDMA_TRANSPORT_TX);
+ if (!steering->rdma_transport_tx_root_ns[vport_idx])
+ return -ENOMEM;
+
+ /* create 1 prio*/
+ prio = fs_create_prio(&steering->rdma_transport_tx_root_ns[vport_idx]->ns,
+ MLX5_RDMA_TRANSPORT_BYPASS_PRIO, 1);
+ return PTR_ERR_OR_ZERO(prio);
+}
+
+static int init_rdma_transport_rx_root_ns(struct mlx5_flow_steering *steering)
+{
+ struct mlx5_core_dev *dev = steering->dev;
+ int total_vports;
+ int err;
+ int i;
+
+ /* In case eswitch not supported and working in legacy mode */
+ total_vports = mlx5_eswitch_get_total_vports(dev) ?: 1;
+
+ steering->rdma_transport_rx_root_ns =
+ kcalloc(total_vports,
+ sizeof(*steering->rdma_transport_rx_root_ns),
+ GFP_KERNEL);
+ if (!steering->rdma_transport_rx_root_ns)
+ return -ENOMEM;
+
+ for (i = 0; i < total_vports; i++) {
+ err = init_rdma_transport_rx_root_ns_one(steering, i);
+ if (err)
+ goto cleanup_root_ns;
+ }
+ steering->rdma_transport_rx_vports = total_vports;
+ return 0;
+
+cleanup_root_ns:
+ while (i--)
+ cleanup_root_ns(steering->rdma_transport_rx_root_ns[i]);
+ kfree(steering->rdma_transport_rx_root_ns);
+ steering->rdma_transport_rx_root_ns = NULL;
+ return err;
+}
+
+static int init_rdma_transport_tx_root_ns(struct mlx5_flow_steering *steering)
+{
+ struct mlx5_core_dev *dev = steering->dev;
+ int total_vports;
+ int err;
+ int i;
+
+ /* In case eswitch not supported and working in legacy mode */
+ total_vports = mlx5_eswitch_get_total_vports(dev) ?: 1;
+
+ steering->rdma_transport_tx_root_ns =
+ kcalloc(total_vports,
+ sizeof(*steering->rdma_transport_tx_root_ns),
+ GFP_KERNEL);
+ if (!steering->rdma_transport_tx_root_ns)
+ return -ENOMEM;
+
+ for (i = 0; i < total_vports; i++) {
+ err = init_rdma_transport_tx_root_ns_one(steering, i);
+ if (err)
+ goto cleanup_root_ns;
+ }
+ steering->rdma_transport_tx_vports = total_vports;
+ return 0;
+
+cleanup_root_ns:
+ while (i--)
+ cleanup_root_ns(steering->rdma_transport_tx_root_ns[i]);
+ kfree(steering->rdma_transport_tx_root_ns);
+ steering->rdma_transport_tx_root_ns = NULL;
+ return err;
+}
+
+static void cleanup_rdma_transport_roots_ns(struct mlx5_flow_steering *steering)
+{
+ int i;
+
+ if (steering->rdma_transport_rx_root_ns) {
+ for (i = 0; i < steering->rdma_transport_rx_vports; i++)
+ cleanup_root_ns(steering->rdma_transport_rx_root_ns[i]);
+
+ kfree(steering->rdma_transport_rx_root_ns);
+ steering->rdma_transport_rx_root_ns = NULL;
+ }
+
+ if (steering->rdma_transport_tx_root_ns) {
+ for (i = 0; i < steering->rdma_transport_tx_vports; i++)
+ cleanup_root_ns(steering->rdma_transport_tx_root_ns[i]);
+
+ kfree(steering->rdma_transport_tx_root_ns);
+ steering->rdma_transport_tx_root_ns = NULL;
+ }
+}
+
/* FT and tc chains are stored in the same array so we can re-use the
* mlx5_get_fdb_sub_ns() and tc api for FT chains.
* When creating a new ns for each chain store it in the first available slot.
@@ -3631,6 +3770,7 @@ void mlx5_fs_core_cleanup(struct mlx5_core_dev *dev)
cleanup_root_ns(steering->rdma_rx_root_ns);
cleanup_root_ns(steering->rdma_tx_root_ns);
cleanup_root_ns(steering->egress_root_ns);
+ cleanup_rdma_transport_roots_ns(steering);
devl_params_unregister(priv_to_devlink(dev), mlx5_fs_params,
ARRAY_SIZE(mlx5_fs_params));
@@ -3700,6 +3840,18 @@ int mlx5_fs_core_init(struct mlx5_core_dev *dev)
goto err;
}
+ if (MLX5_CAP_FLOWTABLE_RDMA_TRANSPORT_RX(dev, ft_support)) {
+ err = init_rdma_transport_rx_root_ns(steering);
+ if (err)
+ goto err;
+ }
+
+ if (MLX5_CAP_FLOWTABLE_RDMA_TRANSPORT_TX(dev, ft_support)) {
+ err = init_rdma_transport_tx_root_ns(steering);
+ if (err)
+ goto err;
+ }
+
return 0;
err:
@@ -3850,8 +4002,10 @@ mlx5_get_root_namespace(struct mlx5_core_dev *dev, enum mlx5_flow_namespace_type
struct mlx5_flow_namespace *ns;
if (ns_type == MLX5_FLOW_NAMESPACE_ESW_EGRESS ||
- ns_type == MLX5_FLOW_NAMESPACE_ESW_INGRESS)
- ns = mlx5_get_flow_vport_acl_namespace(dev, ns_type, 0);
+ ns_type == MLX5_FLOW_NAMESPACE_ESW_INGRESS ||
+ ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX ||
+ ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX)
+ ns = mlx5_get_flow_vport_namespace(dev, ns_type, 0);
else
ns = mlx5_get_flow_namespace(dev, ns_type);
if (!ns)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
index 20837e526679..dbe747e93841 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
@@ -115,7 +115,9 @@ enum fs_flow_table_type {
FS_FT_PORT_SEL = 0X9,
FS_FT_FDB_RX = 0xa,
FS_FT_FDB_TX = 0xb,
- FS_FT_MAX_TYPE = FS_FT_FDB_TX,
+ FS_FT_RDMA_TRANSPORT_RX = 0xd,
+ FS_FT_RDMA_TRANSPORT_TX = 0xe,
+ FS_FT_MAX_TYPE = FS_FT_RDMA_TRANSPORT_TX,
};
enum fs_flow_table_op_mod {
@@ -158,6 +160,10 @@ struct mlx5_flow_steering {
struct mlx5_flow_root_namespace *port_sel_root_ns;
int esw_egress_acl_vports;
int esw_ingress_acl_vports;
+ struct mlx5_flow_root_namespace **rdma_transport_rx_root_ns;
+ struct mlx5_flow_root_namespace **rdma_transport_tx_root_ns;
+ int rdma_transport_rx_vports;
+ int rdma_transport_tx_vports;
};
struct fs_node {
@@ -434,7 +440,9 @@ struct mlx5_flow_root_namespace *find_root(struct fs_node *node);
(type == FS_FT_PORT_SEL) ? MLX5_CAP_FLOWTABLE_PORT_SELECTION(mdev, cap) : \
(type == FS_FT_FDB_RX) ? MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, cap) : \
(type == FS_FT_FDB_TX) ? MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, cap) : \
- (BUILD_BUG_ON_ZERO(FS_FT_FDB_TX != FS_FT_MAX_TYPE))\
+ (type == FS_FT_RDMA_TRANSPORT_RX) ? MLX5_CAP_FLOWTABLE_RDMA_TRANSPORT_RX(mdev, cap) : \
+ (type == FS_FT_RDMA_TRANSPORT_TX) ? MLX5_CAP_FLOWTABLE_RDMA_TRANSPORT_TX(mdev, cap) : \
+ (BUILD_BUG_ON_ZERO(FS_FT_RDMA_TRANSPORT_TX != FS_FT_MAX_TYPE))\
)
#endif
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 0ae6d69c5221..8fe56d0362c6 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -1346,6 +1346,12 @@ enum mlx5_qcam_feature_groups {
#define MLX5_CAP_FLOWTABLE_RDMA_TX(mdev, cap) \
MLX5_CAP_FLOWTABLE(mdev, flow_table_properties_nic_transmit_rdma.cap)
+#define MLX5_CAP_FLOWTABLE_RDMA_TRANSPORT_RX(mdev, cap) \
+ MLX5_CAP_ADV_RDMA(mdev, rdma_transport_rx_flow_table_properties.cap)
+
+#define MLX5_CAP_FLOWTABLE_RDMA_TRANSPORT_TX(mdev, cap) \
+ MLX5_CAP_ADV_RDMA(mdev, rdma_transport_tx_flow_table_properties.cap)
+
#define MLX5_CAP_ESW_FLOWTABLE(mdev, cap) \
MLX5_GET(flow_table_eswitch_cap, \
mdev->caps.hca[MLX5_CAP_ESWITCH_FLOW_TABLE]->cur, cap)
diff --git a/include/linux/mlx5/fs.h b/include/linux/mlx5/fs.h
index 01cb72d68c23..fd62b2b1611d 100644
--- a/include/linux/mlx5/fs.h
+++ b/include/linux/mlx5/fs.h
@@ -40,6 +40,7 @@
#define MLX5_SET_CFG(p, f, v) MLX5_SET(create_flow_group_in, p, f, v)
+#define MLX5_RDMA_TRANSPORT_BYPASS_PRIO 0
#define MLX5_FS_MAX_POOL_SIZE BIT(30)
enum mlx5_flow_destination_type {
@@ -110,6 +111,8 @@ enum mlx5_flow_namespace_type {
MLX5_FLOW_NAMESPACE_RDMA_TX_IPSEC,
MLX5_FLOW_NAMESPACE_RDMA_RX_MACSEC,
MLX5_FLOW_NAMESPACE_RDMA_TX_MACSEC,
+ MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX,
+ MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX,
};
enum {
@@ -194,9 +197,9 @@ struct mlx5_flow_namespace *
mlx5_get_flow_namespace(struct mlx5_core_dev *dev,
enum mlx5_flow_namespace_type type);
struct mlx5_flow_namespace *
-mlx5_get_flow_vport_acl_namespace(struct mlx5_core_dev *dev,
- enum mlx5_flow_namespace_type type,
- int vport);
+mlx5_get_flow_vport_namespace(struct mlx5_core_dev *dev,
+ enum mlx5_flow_namespace_type type,
+ int vport_idx);
struct mlx5_flow_table_attr {
int prio;
@@ -204,6 +207,7 @@ struct mlx5_flow_table_attr {
u32 level;
u32 flags;
u16 uid;
+ u16 vport;
struct mlx5_flow_table *next_ft;
struct {
--
2.48.1
* Re: [PATCH rdma-next 0/5] Add support and infrastructure for RDMA TRANSPORT
From: Leon Romanovsky @ 2025-03-08 18:23 UTC (permalink / raw)
To: Jason Gunthorpe, Leon Romanovsky
Cc: Andrew Lunn, Chiara Meiohas, Eric Dumazet, Jakub Kicinski,
linux-rdma, Mark Bloch, Moshe Shemesh, netdev, Paolo Abeni,
Patrisious Haddad, Saeed Mahameed, Tariq Toukan
On Wed, 26 Feb 2025 15:01:04 +0200, Leon Romanovsky wrote:
> This is a preparation series targeted for mlx5-next, which will be used
> later in RDMA.
>
> This series adds the RDMA transport steering logic, which allows the
> vport group manager to catch control packets from VFs and forward them
> to the control SW to help with congestion control.
>
> [...]
Applied, thanks!
[1/5] net/mlx5: Add RDMA_CTRL HW capabilities
https://git.kernel.org/rdma/rdma/c/f6f425f3d251c0
[2/5] net/mlx5: Allow the throttle mechanism to be more dynamic
https://git.kernel.org/rdma/rdma/c/0a34fad1bed45f
[3/5] net/mlx5: Limit non-privileged commands
https://git.kernel.org/rdma/rdma/c/f9deed0980fe29
[4/5] net/mlx5: Query ADV_RDMA capabilities
https://git.kernel.org/rdma/rdma/c/ab7d228c7e0d0e
[5/5] net/mlx5: fs, add RDMA TRANSPORT steering domain support
https://git.kernel.org/rdma/rdma/c/15b103df80b250
Best regards,
--
Leon Romanovsky <leon@kernel.org>