From: Tariq Toukan <tariqt@nvidia.com>
To: "David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Eric Dumazet <edumazet@google.com>,
"Andrew Lunn" <andrew+netdev@lunn.ch>,
Jiri Pirko <jiri@nvidia.com>
Cc: Cosmin Ratiu <cratiu@nvidia.com>,
Carolina Jubran <cjubran@nvidia.com>,
Gal Pressman <gal@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
Donald Hunter <donald.hunter@gmail.com>,
Jiri Pirko <jiri@resnulli.us>, Jonathan Corbet <corbet@lwn.net>,
Saeed Mahameed <saeedm@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
Tariq Toukan <tariqt@nvidia.com>, <netdev@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <linux-doc@vger.kernel.org>,
<linux-rdma@vger.kernel.org>
Subject: [PATCH net-next 08/10] net/mlx5: qos: Support cross-esw tx scheduling
Date: Thu, 13 Feb 2025 20:01:32 +0200 [thread overview]
Message-ID: <20250213180134.323929-9-tariqt@nvidia.com> (raw)
In-Reply-To: <20250213180134.323929-1-tariqt@nvidia.com>

From: Cosmin Ratiu <cratiu@nvidia.com>

Up to now, rate groups could only contain vports from the same E-Switch.
This patch relaxes that restriction if the device supports it
(HCA_CAP.esw_cross_esw_sched == true) and the right conditions are met:
- Link Aggregation (LAG) is enabled.
- The E-Switches use the same qos domain.
This also enables the use of the previously added shared esw qos
domains.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/esw/qos.c | 59 +++++++++++++++----
1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c
index 6a469f214187..e6dcfe348a7e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c
@@ -147,7 +147,9 @@ struct mlx5_esw_sched_node {
enum sched_node_type type;
/* The eswitch this node belongs to. */
struct mlx5_eswitch *esw;
- /* The children nodes of this node, empty list for leaf nodes. */
+ /* The children nodes of this node, empty list for leaf nodes.
+ * Can be from multiple E-Switches.
+ */
struct list_head children;
/* Valid only if this node is associated with a vport. */
struct mlx5_vport *vport;
@@ -398,6 +400,7 @@ static int esw_qos_vport_create_sched_element(struct mlx5_esw_sched_node *vport_
{
u32 sched_ctx[MLX5_ST_SZ_DW(scheduling_context)] = {};
struct mlx5_core_dev *dev = vport_node->esw->dev;
+ struct mlx5_vport *vport = vport_node->vport;
void *attr;
if (!mlx5_qos_element_type_supported(dev,
@@ -408,7 +411,13 @@ static int esw_qos_vport_create_sched_element(struct mlx5_esw_sched_node *vport_
MLX5_SET(scheduling_context, sched_ctx, element_type,
SCHEDULING_CONTEXT_ELEMENT_TYPE_VPORT);
attr = MLX5_ADDR_OF(scheduling_context, sched_ctx, element_attributes);
- MLX5_SET(vport_element, attr, vport_number, vport_node->vport->vport);
+ MLX5_SET(vport_element, attr, vport_number, vport->vport);
+ if (vport->dev != dev) {
+ /* The port is assigned to a node on another eswitch. */
+ MLX5_SET(vport_element, attr, eswitch_owner_vhca_id_valid, true);
+ MLX5_SET(vport_element, attr, eswitch_owner_vhca_id,
+ MLX5_CAP_GEN(vport->dev, vhca_id));
+ }
MLX5_SET(scheduling_context, sched_ctx, parent_element_id, vport_node->parent->ix);
MLX5_SET(scheduling_context, sched_ctx, max_average_bw, vport_node->max_rate);
@@ -887,10 +896,16 @@ static int esw_qos_devlink_rate_to_mbps(struct mlx5_core_dev *mdev, const char *
int mlx5_esw_qos_init(struct mlx5_eswitch *esw)
{
- if (esw->qos.domain)
- return 0; /* Nothing to change. */
+ bool use_shared_domain = esw->mode == MLX5_ESWITCH_OFFLOADS &&
+ MLX5_CAP_QOS(esw->dev, esw_cross_esw_sched);
+
+ if (esw->qos.domain) {
+ if (esw->qos.domain->shared == use_shared_domain)
+ return 0; /* Nothing to change. */
+ esw_qos_domain_release(esw);
+ }
- return esw_qos_domain_init(esw, false);
+ return esw_qos_domain_init(esw, use_shared_domain);
}
void mlx5_esw_qos_cleanup(struct mlx5_eswitch *esw)
@@ -1021,16 +1036,40 @@ int mlx5_esw_devlink_rate_node_del(struct devlink_rate *rate_node, void *priv,
return 0;
}
-int mlx5_esw_qos_vport_update_parent(struct mlx5_vport *vport, struct mlx5_esw_sched_node *parent,
- struct netlink_ext_ack *extack)
+static int mlx5_esw_validate_cross_esw_scheduling(struct mlx5_eswitch *esw,
+ struct mlx5_esw_sched_node *parent,
+ struct netlink_ext_ack *extack)
{
- struct mlx5_eswitch *esw = vport->dev->priv.eswitch;
- int err = 0;
+ if (!parent || esw == parent->esw)
+ return 0;
- if (parent && parent->esw != esw) {
+ if (!MLX5_CAP_QOS(esw->dev, esw_cross_esw_sched)) {
NL_SET_ERR_MSG_MOD(extack, "Cross E-Switch scheduling is not supported");
return -EOPNOTSUPP;
}
+ if (esw->qos.domain != parent->esw->qos.domain) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Cannot add vport to a parent belonging to a different qos domain");
+ return -EOPNOTSUPP;
+ }
+ if (!mlx5_lag_is_active(esw->dev)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Cross E-Switch scheduling requires LAG to be activated");
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
+int mlx5_esw_qos_vport_update_parent(struct mlx5_vport *vport, struct mlx5_esw_sched_node *parent,
+ struct netlink_ext_ack *extack)
+{
+ struct mlx5_eswitch *esw = vport->dev->priv.eswitch;
+ int err;
+
+ err = mlx5_esw_validate_cross_esw_scheduling(esw, parent, extack);
+ if (err)
+ return err;
esw_qos_lock(esw);
if (!vport->qos.sched_node && parent)
--
2.45.0