* [pull request][net-next 00/14] mlx5 updates 2023-05-31
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan
From: Saeed Mahameed <saeedm@nvidia.com>
This series is part 1 of a 2-part series that adds 4-port LAG support
in mlx5.
For more information please see the tag log below.
Please pull and let me know if there is any problem.
Thanks,
Saeed.
The following changes since commit 60cbd38bb0ad9e4395fba9c6994f258f1d6cad51:
Merge branch 'xstats-for-tc-taprio' (2023-05-31 10:00:30 +0100)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2023-05-31
for you to fetch changes up to 053540c77f58313451481890336d63ab49d7e1cb:
net/mlx5: Devcom, extend mlx5_devcom_send_event to work with more than two devices (2023-05-31 22:15:40 -0700)
----------------------------------------------------------------
mlx5-updates-2023-05-31
net/mlx5: Support 4 ports VF LAG, part 1/2
This series continues Mark Bloch's series[1] "Support 4 ports HCAs LAG
mode" and adds support for 4-port VF LAG (single FDB E-Switch).
This series of patches focuses on refactoring the sections of the code
that assume VF LAG supports only two ports; for instance, that each
device can have only one peer.
Patches 1-5:
- Refactor ETH handling of TC rules of eswitches with peers.
Patch 6:
- Refactor the peer miss group table.
Patches 7-9:
- Refactor single FDB E-Switch creation.
Patch 10:
- Refactor the DR layer.
Patches 11-14:
- Refactor the devcom layer.
The next series will refactor the LAG layer and enable 4-port VF LAG.
This series specifically allows HCAs with 4 ports to create a VF LAG
only over all 4 ports; creating a VF LAG over 2 or 3 ports of a 4-port
HCA is not supported.
Currently, the Merged E-Switch feature only supports HCAs with 2 ports.
However, upcoming patches will introduce support for HCAs with 4 ports.
To activate VF LAG, a user can execute:
devlink dev eswitch set pci/0000:08:00.0 mode switchdev
devlink dev eswitch set pci/0000:08:00.1 mode switchdev
devlink dev eswitch set pci/0000:08:00.2 mode switchdev
devlink dev eswitch set pci/0000:08:00.3 mode switchdev
ip link add name bond0 type bond
ip link set dev bond0 type bond mode 802.3ad
ip link set dev eth2 master bond0
ip link set dev eth3 master bond0
ip link set dev eth4 master bond0
ip link set dev eth5 master bond0
where eth2, eth3, eth4 and eth5 are the net interfaces of
pci/0000:08:00.0, pci/0000:08:00.1, pci/0000:08:00.2 and
pci/0000:08:00.3, respectively.
A user can verify the LAG state and type via debugfs:
/sys/kernel/debug/mlx5/0000\:08\:00.0/lag/state
/sys/kernel/debug/mlx5/0000\:08\:00.0/lag/type
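For example (illustrative output; the exact strings printed depend on
the kernel version):
    $ cat /sys/kernel/debug/mlx5/0000\:08\:00.0/lag/state
    active
    $ cat /sys/kernel/debug/mlx5/0000\:08\:00.0/lag/type
    switchdev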
[1]
https://lore.kernel.org/netdev/20220510055743.118828-1-saeedm@nvidia.com/
----------------------------------------------------------------
Mark Bloch (3):
net/mlx5e: en_tc, Extend peer flows to a list
net/mlx5e: rep, store send to vport rules per peer
net/mlx5e: en_tc, re-factor query route port
Shay Drory (11):
net/mlx5e: tc, Refactor peer add/del flow
net/mlx5e: Handle offloads flows per peer
net/mlx5: E-switch, enlarge peer miss group table
net/mlx5: E-switch, refactor FDB miss rule add/remove
net/mlx5: E-switch, Handle multiple master egress rules
net/mlx5: E-switch, generalize shared FDB creation
net/mlx5: DR, handle more than one peer domain
net/mlx5: Devcom, Rename paired to ready
net/mlx5: E-switch, mark devcom as not ready when all eswitches are unpaired
net/mlx5: Devcom, introduce devcom_for_each_peer_entry
net/mlx5: Devcom, extend mlx5_devcom_send_event to work with more than two devices
.../net/ethernet/mellanox/mlx5/core/en/tc_priv.h | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 137 ++++++++++++---
drivers/net/ethernet/mellanox/mlx5/core/en_rep.h | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 183 +++++++++++++--------
.../mellanox/mlx5/core/esw/acl/egress_ofld.c | 25 ++-
.../net/ethernet/mellanox/mlx5/core/esw/acl/ofld.h | 1 +
.../net/ethernet/mellanox/mlx5/core/esw/bridge.c | 30 +++-
.../ethernet/mellanox/mlx5/core/esw/bridge_mcast.c | 21 ++-
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 32 +++-
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 176 +++++++++++++-------
drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c | 3 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h | 3 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 5 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.h | 3 +-
drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 39 ++++-
.../net/ethernet/mellanox/mlx5/core/lib/devcom.c | 124 +++++++++-----
.../net/ethernet/mellanox/mlx5/core/lib/devcom.h | 35 ++--
.../mellanox/mlx5/core/steering/dr_action.c | 5 +-
.../mellanox/mlx5/core/steering/dr_domain.c | 13 +-
.../mellanox/mlx5/core/steering/dr_ste_v0.c | 9 +-
.../mellanox/mlx5/core/steering/dr_ste_v1.c | 9 +-
.../mellanox/mlx5/core/steering/dr_types.h | 2 +-
.../ethernet/mellanox/mlx5/core/steering/fs_dr.c | 5 +-
.../ethernet/mellanox/mlx5/core/steering/mlx5dr.h | 3 +-
24 files changed, 603 insertions(+), 271 deletions(-)
* [net-next 01/14] net/mlx5e: en_tc, Extend peer flows to a list
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Mark Bloch, Roi Dayan
From: Mark Bloch <mbloch@nvidia.com>
Currently, mlx5e_flow holds a pointer to a single peer_flow, in case
one was created; i.e. there is an assumption that an mlx5e_flow can
have only one peer.
In order to support more than one peer, refactor mlx5e_flow to hold a
list of peer flows.
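The core of the change is the standard kernel linked-list pattern; a
minimal sketch with hypothetical names (not the driver code itself):

    #include <linux/list.h>
    #include <linux/slab.h>

    struct peer_entry {
            struct list_head node;          /* links the entry into the owner's list */
            /* ... per-peer flow state ... */
    };

    struct owner_flow {
            struct list_head peer_flows;    /* replaces the old single pointer */
    };

    /* at allocation time */
    INIT_LIST_HEAD(&flow->peer_flows);

    /* adding a peer flow */
    list_add_tail(&entry->node, &flow->peer_flows);

    /* teardown: _safe variant because entries are removed while iterating */
    struct peer_entry *p, *tmp;
    list_for_each_entry_safe(p, tmp, &flow->peer_flows, node) {
            list_del(&p->node);
            kfree(p);
    }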
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../ethernet/mellanox/mlx5/core/en/tc_priv.h | 2 +-
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 43 ++++++++++++-------
2 files changed, 28 insertions(+), 17 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h
index ba2b1f24ff14..8a500a966f06 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h
@@ -94,13 +94,13 @@ struct mlx5e_tc_flow {
* destinations.
*/
struct encap_flow_item encaps[MLX5_MAX_FLOW_FWD_VPORTS];
- struct mlx5e_tc_flow *peer_flow;
struct mlx5e_hairpin_entry *hpe; /* attached hairpin instance */
struct list_head hairpin; /* flows sharing the same hairpin */
struct list_head peer; /* flows with peer flow */
struct list_head unready; /* flows not ready to be offloaded (e.g
* due to missing route)
*/
+ struct list_head peer_flows; /* flows on peer */
struct net_device *orig_dev; /* netdev adding flow first */
int tmp_entry_index;
struct list_head tmp_list; /* temporary flow list used by neigh update */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 1b0906cb57ef..de8b7a119a02 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -2074,6 +2074,8 @@ void mlx5e_put_flow_list(struct mlx5e_priv *priv, struct list_head *flow_list)
static void __mlx5e_tc_del_fdb_peer_flow(struct mlx5e_tc_flow *flow)
{
struct mlx5_eswitch *esw = flow->priv->mdev->priv.eswitch;
+ struct mlx5e_tc_flow *peer_flow;
+ struct mlx5e_tc_flow *tmp;
if (!flow_flag_test(flow, ESWITCH) ||
!flow_flag_test(flow, DUP))
@@ -2085,12 +2087,13 @@ static void __mlx5e_tc_del_fdb_peer_flow(struct mlx5e_tc_flow *flow)
flow_flag_clear(flow, DUP);
- if (refcount_dec_and_test(&flow->peer_flow->refcnt)) {
- mlx5e_tc_del_fdb_flow(flow->peer_flow->priv, flow->peer_flow);
- kfree(flow->peer_flow);
+ list_for_each_entry_safe(peer_flow, tmp, &flow->peer_flows, peer_flows) {
+ if (refcount_dec_and_test(&peer_flow->refcnt)) {
+ mlx5e_tc_del_fdb_flow(peer_flow->priv, peer_flow);
+ list_del(&peer_flow->peer_flows);
+ kfree(peer_flow);
+ }
}
-
- flow->peer_flow = NULL;
}
static void mlx5e_tc_del_fdb_peer_flow(struct mlx5e_tc_flow *flow)
@@ -4378,6 +4381,7 @@ mlx5e_alloc_flow(struct mlx5e_priv *priv, int attr_size,
INIT_LIST_HEAD(&flow->hairpin);
INIT_LIST_HEAD(&flow->l3_to_l2_reformat);
INIT_LIST_HEAD(&flow->attrs);
+ INIT_LIST_HEAD(&flow->peer_flows);
refcount_set(&flow->refcnt, 1);
init_completion(&flow->init_done);
init_completion(&flow->del_hw_done);
@@ -4526,7 +4530,7 @@ static int mlx5e_tc_add_fdb_peer_flow(struct flow_cls_offload *f,
goto out;
}
- flow->peer_flow = peer_flow;
+ list_add_tail(&peer_flow->peer_flows, &flow->peer_flows);
flow_flag_set(flow, DUP);
mutex_lock(&esw->offloads.peer_mutex);
list_add_tail(&flow->peer, &esw->offloads.peer_flows);
@@ -4824,19 +4828,26 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
if (!peer_esw)
goto out;
- if (flow_flag_test(flow, DUP) &&
- flow_flag_test(flow->peer_flow, OFFLOADED)) {
- u64 bytes2;
- u64 packets2;
- u64 lastuse2;
+ if (flow_flag_test(flow, DUP)) {
+ struct mlx5e_tc_flow *peer_flow;
- if (flow_flag_test(flow, USE_ACT_STATS)) {
- f->use_act_stats = true;
- } else {
- counter = mlx5e_tc_get_counter(flow->peer_flow);
+ list_for_each_entry(peer_flow, &flow->peer_flows, peer_flows) {
+ u64 packets2;
+ u64 lastuse2;
+ u64 bytes2;
+
+ if (!flow_flag_test(peer_flow, OFFLOADED))
+ continue;
+ if (flow_flag_test(flow, USE_ACT_STATS)) {
+ f->use_act_stats = true;
+ break;
+ }
+
+ counter = mlx5e_tc_get_counter(peer_flow);
if (!counter)
goto no_peer_counter;
- mlx5_fc_query_cached(counter, &bytes2, &packets2, &lastuse2);
+ mlx5_fc_query_cached(counter, &bytes2, &packets2,
+ &lastuse2);
bytes += bytes2;
packets += packets2;
--
2.40.1
* [net-next 02/14] net/mlx5e: tc, Refactor peer add/del flow
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch,
Roi Dayan
From: Shay Drory <shayd@nvidia.com>
Move the peer_eswitch query outside mlx5e_tc_add_fdb_peer_flow() so
that a downstream patch can call mlx5e_tc_add_fdb_peer_flow() with
multiple peers.
Move the query in the removal path as well, in order to keep symmetry.
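The shape of the change, sketched with one hypothetical helper name:
the caller now owns the devcom get/release pair, and the helper only
receives an already-acquired peer:

    peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
    if (!peer_esw)
            return -ENODEV;

    /* the helper no longer acquires/releases the peer itself */
    err = add_fdb_peer_flow(f, flow, flow_flags, peer_esw);   /* hypothetical */

    mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
    return err;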
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 65 ++++++++++---------
1 file changed, 34 insertions(+), 31 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index de8b7a119a02..6bf0fd3038bf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -2071,7 +2071,7 @@ void mlx5e_put_flow_list(struct mlx5e_priv *priv, struct list_head *flow_list)
mlx5e_flow_put(priv, flow);
}
-static void __mlx5e_tc_del_fdb_peer_flow(struct mlx5e_tc_flow *flow)
+static void mlx5e_tc_del_fdb_peer_flow(struct mlx5e_tc_flow *flow)
{
struct mlx5_eswitch *esw = flow->priv->mdev->priv.eswitch;
struct mlx5e_tc_flow *peer_flow;
@@ -2096,25 +2096,20 @@ static void __mlx5e_tc_del_fdb_peer_flow(struct mlx5e_tc_flow *flow)
}
}
-static void mlx5e_tc_del_fdb_peer_flow(struct mlx5e_tc_flow *flow)
-{
- struct mlx5_core_dev *dev = flow->priv->mdev;
- struct mlx5_devcom *devcom = dev->priv.devcom;
- struct mlx5_eswitch *peer_esw;
-
- peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
- if (!peer_esw)
- return;
-
- __mlx5e_tc_del_fdb_peer_flow(flow);
- mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
-}
-
static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
struct mlx5e_tc_flow *flow)
{
if (mlx5e_is_eswitch_flow(flow)) {
+ struct mlx5_devcom *devcom = flow->priv->mdev->priv.devcom;
+ struct mlx5_eswitch *peer_esw;
+
+ peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ if (!peer_esw) {
+ mlx5e_tc_del_fdb_flow(priv, flow);
+ return;
+ }
mlx5e_tc_del_fdb_peer_flow(flow);
+ mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
mlx5e_tc_del_fdb_flow(priv, flow);
} else {
mlx5e_tc_del_nic_flow(priv, flow);
@@ -4490,22 +4485,18 @@ __mlx5e_add_fdb_flow(struct mlx5e_priv *priv,
static int mlx5e_tc_add_fdb_peer_flow(struct flow_cls_offload *f,
struct mlx5e_tc_flow *flow,
- unsigned long flow_flags)
+ unsigned long flow_flags,
+ struct mlx5_eswitch *peer_esw)
{
struct mlx5e_priv *priv = flow->priv, *peer_priv;
- struct mlx5_eswitch *esw = priv->mdev->priv.eswitch, *peer_esw;
+ struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
struct mlx5_esw_flow_attr *attr = flow->attr->esw_attr;
- struct mlx5_devcom *devcom = priv->mdev->priv.devcom;
struct mlx5e_tc_flow_parse_attr *parse_attr;
struct mlx5e_rep_priv *peer_urpriv;
struct mlx5e_tc_flow *peer_flow;
struct mlx5_core_dev *in_mdev;
int err = 0;
- peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
- if (!peer_esw)
- return -ENODEV;
-
peer_urpriv = mlx5_eswitch_get_uplink_priv(peer_esw, REP_ETH);
peer_priv = netdev_priv(peer_urpriv->netdev);
@@ -4537,7 +4528,6 @@ static int mlx5e_tc_add_fdb_peer_flow(struct flow_cls_offload *f,
mutex_unlock(&esw->offloads.peer_mutex);
out:
- mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
return err;
}
@@ -4548,9 +4538,11 @@ mlx5e_add_fdb_flow(struct mlx5e_priv *priv,
struct net_device *filter_dev,
struct mlx5e_tc_flow **__flow)
{
+ struct mlx5_devcom *devcom = priv->mdev->priv.devcom;
struct mlx5e_rep_priv *rpriv = priv->ppriv;
struct mlx5_eswitch_rep *in_rep = rpriv->rep;
struct mlx5_core_dev *in_mdev = priv->mdev;
+ struct mlx5_eswitch *peer_esw;
struct mlx5e_tc_flow *flow;
int err;
@@ -4559,19 +4551,30 @@ mlx5e_add_fdb_flow(struct mlx5e_priv *priv,
if (IS_ERR(flow))
return PTR_ERR(flow);
- if (is_peer_flow_needed(flow)) {
- err = mlx5e_tc_add_fdb_peer_flow(f, flow, flow_flags);
- if (err) {
- mlx5e_tc_del_fdb_flow(priv, flow);
- goto out;
- }
+ if (!is_peer_flow_needed(flow)) {
+ *__flow = flow;
+ return 0;
}
+ peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ if (!peer_esw) {
+ err = -ENODEV;
+ goto clean_flow;
+ }
+
+ err = mlx5e_tc_add_fdb_peer_flow(f, flow, flow_flags, peer_esw);
+ if (err)
+ goto peer_clean;
+ mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+
*__flow = flow;
return 0;
-out:
+peer_clean:
+ mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+clean_flow:
+ mlx5e_tc_del_fdb_flow(priv, flow);
return err;
}
@@ -5376,7 +5379,7 @@ void mlx5e_tc_clean_fdb_peer_flows(struct mlx5_eswitch *esw)
struct mlx5e_tc_flow *flow, *tmp;
list_for_each_entry_safe(flow, tmp, &esw->offloads.peer_flows, peer)
- __mlx5e_tc_del_fdb_peer_flow(flow);
+ mlx5e_tc_del_fdb_peer_flow(flow);
}
void mlx5e_tc_reoffload_flows_work(struct work_struct *work)
--
2.40.1
* [net-next 03/14] net/mlx5e: rep, store send to vport rules per peer
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Mark Bloch, Shay Drory,
Roi Dayan
From: Mark Bloch <mbloch@nvidia.com>
Each representor, for each send queue, holds a send_to_vport rule for
the peer eswitch.
In order to support more than one peer, and to map between the peer
rules and the peer eswitches, refactor the representor to hold both the
peer rules and pointers to the peer eswitches.
This enables mlx5 to store send_to_vport rules per peer, where each
peer has a dedicated index via mlx5_get_dev_index().
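The per-peer bookkeeping is a plain xarray keyed by the peer's device
index; a minimal sketch of the lifecycle (hypothetical variable names):

    struct xarray sq_peer;

    xa_init(&sq_peer);

    /* store: one entry per peer, keyed by mlx5_get_dev_index(peer_dev) */
    err = xa_insert(&sq_peer, idx, entry, GFP_KERNEL); /* -EBUSY if slot taken */

    /* look up the entry of one specific peer */
    entry = xa_load(&sq_peer, idx);

    /* teardown: walk all peers, erase and free each entry */
    unsigned long i;
    xa_for_each(&sq_peer, i, entry) {
            xa_erase(&sq_peer, i);
            kfree(entry);
    }
    xa_destroy(&sq_peer);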
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/en_rep.c | 97 +++++++++++++++----
.../net/ethernet/mellanox/mlx5/core/en_rep.h | 7 +-
.../mellanox/mlx5/core/eswitch_offloads.c | 18 ++--
3 files changed, 96 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 1fc386eccaf8..13d69c5634ac 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -373,7 +373,9 @@ static void mlx5e_sqs2vport_stop(struct mlx5_eswitch *esw,
struct mlx5_eswitch_rep *rep)
{
struct mlx5e_rep_sq *rep_sq, *tmp;
+ struct mlx5e_rep_sq_peer *sq_peer;
struct mlx5e_rep_priv *rpriv;
+ unsigned long i;
if (esw->mode != MLX5_ESWITCH_OFFLOADS)
return;
@@ -381,8 +383,15 @@ static void mlx5e_sqs2vport_stop(struct mlx5_eswitch *esw,
rpriv = mlx5e_rep_to_rep_priv(rep);
list_for_each_entry_safe(rep_sq, tmp, &rpriv->vport_sqs_list, list) {
mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule);
- if (rep_sq->send_to_vport_rule_peer)
- mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule_peer);
+ xa_for_each(&rep_sq->sq_peer, i, sq_peer) {
+ if (sq_peer->rule)
+ mlx5_eswitch_del_send_to_vport_rule(sq_peer->rule);
+
+ xa_erase(&rep_sq->sq_peer, i);
+ kfree(sq_peer);
+ }
+
+ xa_destroy(&rep_sq->sq_peer);
list_del(&rep_sq->list);
kfree(rep_sq);
}
@@ -394,6 +403,7 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
{
struct mlx5_eswitch *peer_esw = NULL;
struct mlx5_flow_handle *flow_rule;
+ struct mlx5e_rep_sq_peer *sq_peer;
struct mlx5e_rep_priv *rpriv;
struct mlx5e_rep_sq *rep_sq;
int err;
@@ -413,6 +423,7 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
err = -ENOMEM;
goto out_err;
}
+ xa_init(&rep_sq->sq_peer);
/* Add re-inject rule to the PF/representor sqs */
flow_rule = mlx5_eswitch_add_send_to_vport_rule(esw, esw, rep,
@@ -426,15 +437,24 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
rep_sq->sqn = sqns_array[i];
if (peer_esw) {
+ int peer_rule_idx = mlx5_get_dev_index(peer_esw->dev);
+
+ sq_peer = kzalloc(sizeof(*sq_peer), GFP_KERNEL);
+ if (!sq_peer)
+ goto out_sq_peer_err;
+
flow_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw,
rep, sqns_array[i]);
if (IS_ERR(flow_rule)) {
err = PTR_ERR(flow_rule);
- mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule);
- kfree(rep_sq);
- goto out_err;
+ goto out_flow_rule_err;
}
- rep_sq->send_to_vport_rule_peer = flow_rule;
+
+ sq_peer->rule = flow_rule;
+ sq_peer->peer = peer_esw;
+ err = xa_insert(&rep_sq->sq_peer, peer_rule_idx, sq_peer, GFP_KERNEL);
+ if (err)
+ goto out_xa_err;
}
list_add(&rep_sq->list, &rpriv->vport_sqs_list);
@@ -445,6 +465,14 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
return 0;
+out_xa_err:
+ mlx5_eswitch_del_send_to_vport_rule(flow_rule);
+out_flow_rule_err:
+ kfree(sq_peer);
+out_sq_peer_err:
+ mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule);
+ xa_destroy(&rep_sq->sq_peer);
+ kfree(rep_sq);
out_err:
mlx5e_sqs2vport_stop(esw, rep);
@@ -1524,17 +1552,24 @@ static void *mlx5e_vport_rep_get_proto_dev(struct mlx5_eswitch_rep *rep)
return rpriv->netdev;
}
-static void mlx5e_vport_rep_event_unpair(struct mlx5_eswitch_rep *rep)
+static void mlx5e_vport_rep_event_unpair(struct mlx5_eswitch_rep *rep,
+ struct mlx5_eswitch *peer_esw)
{
+ int i = mlx5_get_dev_index(peer_esw->dev);
struct mlx5e_rep_priv *rpriv;
struct mlx5e_rep_sq *rep_sq;
+ WARN_ON_ONCE(!peer_esw);
rpriv = mlx5e_rep_to_rep_priv(rep);
list_for_each_entry(rep_sq, &rpriv->vport_sqs_list, list) {
- if (!rep_sq->send_to_vport_rule_peer)
+ struct mlx5e_rep_sq_peer *sq_peer = xa_load(&rep_sq->sq_peer, i);
+
+ if (!sq_peer || sq_peer->peer != peer_esw)
continue;
- mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule_peer);
- rep_sq->send_to_vport_rule_peer = NULL;
+
+ mlx5_eswitch_del_send_to_vport_rule(sq_peer->rule);
+ xa_erase(&rep_sq->sq_peer, i);
+ kfree(sq_peer);
}
}
@@ -1542,24 +1577,52 @@ static int mlx5e_vport_rep_event_pair(struct mlx5_eswitch *esw,
struct mlx5_eswitch_rep *rep,
struct mlx5_eswitch *peer_esw)
{
+ int i = mlx5_get_dev_index(peer_esw->dev);
struct mlx5_flow_handle *flow_rule;
+ struct mlx5e_rep_sq_peer *sq_peer;
struct mlx5e_rep_priv *rpriv;
struct mlx5e_rep_sq *rep_sq;
+ int err;
rpriv = mlx5e_rep_to_rep_priv(rep);
list_for_each_entry(rep_sq, &rpriv->vport_sqs_list, list) {
- if (rep_sq->send_to_vport_rule_peer)
+ sq_peer = xa_load(&rep_sq->sq_peer, i);
+
+ if (sq_peer && sq_peer->peer)
continue;
- flow_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw, rep, rep_sq->sqn);
- if (IS_ERR(flow_rule))
+
+ flow_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw, rep,
+ rep_sq->sqn);
+ if (IS_ERR(flow_rule)) {
+ err = PTR_ERR(flow_rule);
goto err_out;
- rep_sq->send_to_vport_rule_peer = flow_rule;
+ }
+
+ if (sq_peer) {
+ sq_peer->rule = flow_rule;
+ sq_peer->peer = peer_esw;
+ continue;
+ }
+ sq_peer = kzalloc(sizeof(*sq_peer), GFP_KERNEL);
+ if (!sq_peer) {
+ err = -ENOMEM;
+ goto err_sq_alloc;
+ }
+ err = xa_insert(&rep_sq->sq_peer, i, sq_peer, GFP_KERNEL);
+ if (err)
+ goto err_xa;
+ sq_peer->rule = flow_rule;
+ sq_peer->peer = peer_esw;
}
return 0;
+err_xa:
+ kfree(sq_peer);
+err_sq_alloc:
+ mlx5_eswitch_del_send_to_vport_rule(flow_rule);
err_out:
- mlx5e_vport_rep_event_unpair(rep);
- return PTR_ERR(flow_rule);
+ mlx5e_vport_rep_event_unpair(rep, peer_esw);
+ return err;
}
static int mlx5e_vport_rep_event(struct mlx5_eswitch *esw,
@@ -1572,7 +1635,7 @@ static int mlx5e_vport_rep_event(struct mlx5_eswitch *esw,
if (event == MLX5_SWITCHDEV_EVENT_PAIR)
err = mlx5e_vport_rep_event_pair(esw, rep, data);
else if (event == MLX5_SWITCHDEV_EVENT_UNPAIR)
- mlx5e_vport_rep_event_unpair(rep);
+ mlx5e_vport_rep_event_unpair(rep, data);
return err;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
index 80b7f5079a5a..70640fa1ad7b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
@@ -225,9 +225,14 @@ struct mlx5e_encap_entry {
struct rcu_head rcu;
};
+struct mlx5e_rep_sq_peer {
+ struct mlx5_flow_handle *rule;
+ void *peer;
+};
+
struct mlx5e_rep_sq {
struct mlx5_flow_handle *send_to_vport_rule;
- struct mlx5_flow_handle *send_to_vport_rule_peer;
+ struct xarray sq_peer;
u32 sqn;
struct list_head list;
};
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 1b2f5e273525..9526382f1573 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2673,7 +2673,8 @@ void mlx5_eswitch_offloads_destroy_single_fdb(struct mlx5_eswitch *master_esw,
#define ESW_OFFLOADS_DEVCOM_PAIR (0)
#define ESW_OFFLOADS_DEVCOM_UNPAIR (1)
-static void mlx5_esw_offloads_rep_event_unpair(struct mlx5_eswitch *esw)
+static void mlx5_esw_offloads_rep_event_unpair(struct mlx5_eswitch *esw,
+ struct mlx5_eswitch *peer_esw)
{
const struct mlx5_eswitch_rep_ops *ops;
struct mlx5_eswitch_rep *rep;
@@ -2686,17 +2687,18 @@ static void mlx5_esw_offloads_rep_event_unpair(struct mlx5_eswitch *esw)
ops = esw->offloads.rep_ops[rep_type];
if (atomic_read(&rep->rep_data[rep_type].state) == REP_LOADED &&
ops->event)
- ops->event(esw, rep, MLX5_SWITCHDEV_EVENT_UNPAIR, NULL);
+ ops->event(esw, rep, MLX5_SWITCHDEV_EVENT_UNPAIR, peer_esw);
}
}
}
-static void mlx5_esw_offloads_unpair(struct mlx5_eswitch *esw)
+static void mlx5_esw_offloads_unpair(struct mlx5_eswitch *esw,
+ struct mlx5_eswitch *peer_esw)
{
#if IS_ENABLED(CONFIG_MLX5_CLS_ACT)
mlx5e_tc_clean_fdb_peer_flows(esw);
#endif
- mlx5_esw_offloads_rep_event_unpair(esw);
+ mlx5_esw_offloads_rep_event_unpair(esw, peer_esw);
esw_del_fdb_peer_miss_rules(esw);
}
@@ -2728,7 +2730,7 @@ static int mlx5_esw_offloads_pair(struct mlx5_eswitch *esw,
return 0;
err_out:
- mlx5_esw_offloads_unpair(esw);
+ mlx5_esw_offloads_unpair(esw, peer_esw);
return err;
}
@@ -2802,8 +2804,8 @@ static int mlx5_esw_offloads_devcom_event(int event,
mlx5_devcom_set_paired(devcom, MLX5_DEVCOM_ESW_OFFLOADS, false);
esw->paired[mlx5_get_dev_index(peer_esw->dev)] = false;
peer_esw->paired[mlx5_get_dev_index(esw->dev)] = false;
- mlx5_esw_offloads_unpair(peer_esw);
- mlx5_esw_offloads_unpair(esw);
+ mlx5_esw_offloads_unpair(peer_esw, esw);
+ mlx5_esw_offloads_unpair(esw, peer_esw);
mlx5_esw_offloads_set_ns_peer(esw, peer_esw, false);
break;
}
@@ -2811,7 +2813,7 @@ static int mlx5_esw_offloads_devcom_event(int event,
return 0;
err_pair:
- mlx5_esw_offloads_unpair(esw);
+ mlx5_esw_offloads_unpair(esw, peer_esw);
err_peer:
mlx5_esw_offloads_set_ns_peer(esw, peer_esw, false);
err_out:
--
2.40.1
* [net-next 04/14] net/mlx5e: en_tc, re-factor query route port
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Mark Bloch, Roi Dayan
From: Mark Bloch <mbloch@nvidia.com>
Query the peer eswitch outside of the if scope.
This is preparation for querying the route port over multiple peers.
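The resulting control flow, roughly (a sketch following the diff below,
not the exact driver code): try the local eswitch first, and only fall
back to the peer, under RCU, when LAG is active:

    err = mlx5_eswitch_vhca_id_to_vport(esw, vhca_id, vport); /* local first */
    if (!err)
            return 0;
    if (!mlx5_lag_is_active(mdev))
            return err;                     /* no LAG, so no peer to try */

    rcu_read_lock();                        /* peer data is RCU-protected */
    peer_esw = mlx5_devcom_get_peer_data_rcu(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
    err = peer_esw ? mlx5_eswitch_vhca_id_to_vport(peer_esw, vhca_id, vport)
                   : -ENODEV;
    rcu_read_unlock();
    return err;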
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 32 ++++++++-----------
1 file changed, 13 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 6bf0fd3038bf..2135c0ef8560 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -1666,8 +1666,10 @@ int mlx5e_tc_query_route_vport(struct net_device *out_dev, struct net_device *ro
{
struct mlx5e_priv *out_priv, *route_priv;
struct mlx5_core_dev *route_mdev;
+ struct mlx5_devcom *devcom;
struct mlx5_eswitch *esw;
u16 vhca_id;
+ int err;
out_priv = netdev_priv(out_dev);
esw = out_priv->mdev->priv.eswitch;
@@ -1675,28 +1677,20 @@ int mlx5e_tc_query_route_vport(struct net_device *out_dev, struct net_device *ro
route_mdev = route_priv->mdev;
vhca_id = MLX5_CAP_GEN(route_mdev, vhca_id);
- if (mlx5_lag_is_active(out_priv->mdev)) {
- struct mlx5_devcom *devcom;
- int err;
-
- /* In lag case we may get devices from different eswitch instances.
- * If we failed to get vport num, it means, mostly, that we on the wrong
- * eswitch.
- */
- err = mlx5_eswitch_vhca_id_to_vport(esw, vhca_id, vport);
- if (err != -ENOENT)
- return err;
-
- rcu_read_lock();
- devcom = out_priv->mdev->priv.devcom;
- esw = mlx5_devcom_get_peer_data_rcu(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
- err = esw ? mlx5_eswitch_vhca_id_to_vport(esw, vhca_id, vport) : -ENODEV;
- rcu_read_unlock();
+ err = mlx5_eswitch_vhca_id_to_vport(esw, vhca_id, vport);
+ if (!err)
+ return err;
+ if (!mlx5_lag_is_active(out_priv->mdev))
return err;
- }
- return mlx5_eswitch_vhca_id_to_vport(esw, vhca_id, vport);
+ rcu_read_lock();
+ devcom = out_priv->mdev->priv.devcom;
+ esw = mlx5_devcom_get_peer_data_rcu(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ err = esw ? mlx5_eswitch_vhca_id_to_vport(esw, vhca_id, vport) : -ENODEV;
+ rcu_read_unlock();
+
+ return err;
}
static int
--
2.40.1
* [net-next 05/14] net/mlx5e: Handle offloads flows per peer
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch,
Roi Dayan
From: Shay Drory <shayd@nvidia.com>
Currently, the E-switch offloads table has a single list of all flows
that created a peer_flow over the peer eswitch.
In order to support more than one peer, extend the E-switch offloads
table peer_flows to an array of lists, where each peer has a dedicated
index via mlx5_get_dev_index(). Thereafter, extend the original flow to
hold an array of peer list entries as well.
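The data-structure change, in outline (following the diff below; the
per-peer delete body is elided):

    /* one flow list per possible peer index */
    struct list_head peer_flows[MLX5_MAX_PORTS];

    for (i = 0; i < MLX5_MAX_PORTS; i++)
            INIT_LIST_HEAD(&esw->offloads.peer_flows[i]);

    /* a duplicated flow is linked into the slot of the peer that owns it */
    i = mlx5_get_dev_index(peer_esw->dev);
    list_add_tail(&flow->peer[i], &esw->offloads.peer_flows[i]);

    /* cleanup visits every peer slot except the device's own index */
    for (i = 0; i < MLX5_MAX_PORTS; i++) {
            if (i == mlx5_get_dev_index(esw->dev))
                    continue;
            /* ... delete the flows linked at index i ... */
    }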
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../ethernet/mellanox/mlx5/core/en/tc_priv.h | 2 +-
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 37 +++++++++++++++----
.../net/ethernet/mellanox/mlx5/core/eswitch.h | 2 +-
.../mellanox/mlx5/core/eswitch_offloads.c | 4 +-
4 files changed, 34 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h
index 8a500a966f06..6cc23af66b5b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h
@@ -96,7 +96,7 @@ struct mlx5e_tc_flow {
struct encap_flow_item encaps[MLX5_MAX_FLOW_FWD_VPORTS];
struct mlx5e_hairpin_entry *hpe; /* attached hairpin instance */
struct list_head hairpin; /* flows sharing the same hairpin */
- struct list_head peer; /* flows with peer flow */
+ struct list_head peer[MLX5_MAX_PORTS]; /* flows with peer flow */
struct list_head unready; /* flows not ready to be offloaded (e.g
* due to missing route)
*/
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 2135c0ef8560..13071d184d88 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -2065,7 +2065,8 @@ void mlx5e_put_flow_list(struct mlx5e_priv *priv, struct list_head *flow_list)
mlx5e_flow_put(priv, flow);
}
-static void mlx5e_tc_del_fdb_peer_flow(struct mlx5e_tc_flow *flow)
+static void mlx5e_tc_del_fdb_peer_flow(struct mlx5e_tc_flow *flow,
+ int peer_index)
{
struct mlx5_eswitch *esw = flow->priv->mdev->priv.eswitch;
struct mlx5e_tc_flow *peer_flow;
@@ -2076,18 +2077,32 @@ static void mlx5e_tc_del_fdb_peer_flow(struct mlx5e_tc_flow *flow)
return;
mutex_lock(&esw->offloads.peer_mutex);
- list_del(&flow->peer);
+ list_del(&flow->peer[peer_index]);
mutex_unlock(&esw->offloads.peer_mutex);
- flow_flag_clear(flow, DUP);
-
list_for_each_entry_safe(peer_flow, tmp, &flow->peer_flows, peer_flows) {
+ if (peer_index != mlx5_get_dev_index(peer_flow->priv->mdev))
+ continue;
if (refcount_dec_and_test(&peer_flow->refcnt)) {
mlx5e_tc_del_fdb_flow(peer_flow->priv, peer_flow);
list_del(&peer_flow->peer_flows);
kfree(peer_flow);
}
}
+
+ if (list_empty(&flow->peer_flows))
+ flow_flag_clear(flow, DUP);
+}
+
+static void mlx5e_tc_del_fdb_peers_flow(struct mlx5e_tc_flow *flow)
+{
+ int i;
+
+ for (i = 0; i < MLX5_MAX_PORTS; i++) {
+ if (i == mlx5_get_dev_index(flow->priv->mdev))
+ continue;
+ mlx5e_tc_del_fdb_peer_flow(flow, i);
+ }
}
static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
@@ -2102,7 +2117,7 @@ static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
mlx5e_tc_del_fdb_flow(priv, flow);
return;
}
- mlx5e_tc_del_fdb_peer_flow(flow);
+ mlx5e_tc_del_fdb_peers_flow(flow);
mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
mlx5e_tc_del_fdb_flow(priv, flow);
} else {
@@ -4486,6 +4501,7 @@ static int mlx5e_tc_add_fdb_peer_flow(struct flow_cls_offload *f,
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
struct mlx5_esw_flow_attr *attr = flow->attr->esw_attr;
struct mlx5e_tc_flow_parse_attr *parse_attr;
+ int i = mlx5_get_dev_index(peer_esw->dev);
struct mlx5e_rep_priv *peer_urpriv;
struct mlx5e_tc_flow *peer_flow;
struct mlx5_core_dev *in_mdev;
@@ -4518,7 +4534,7 @@ static int mlx5e_tc_add_fdb_peer_flow(struct flow_cls_offload *f,
list_add_tail(&peer_flow->peer_flows, &flow->peer_flows);
flow_flag_set(flow, DUP);
mutex_lock(&esw->offloads.peer_mutex);
- list_add_tail(&flow->peer, &esw->offloads.peer_flows);
+ list_add_tail(&flow->peer[i], &esw->offloads.peer_flows[i]);
mutex_unlock(&esw->offloads.peer_mutex);
out:
@@ -5371,9 +5387,14 @@ int mlx5e_tc_num_filters(struct mlx5e_priv *priv, unsigned long flags)
void mlx5e_tc_clean_fdb_peer_flows(struct mlx5_eswitch *esw)
{
struct mlx5e_tc_flow *flow, *tmp;
+ int i;
- list_for_each_entry_safe(flow, tmp, &esw->offloads.peer_flows, peer)
- mlx5e_tc_del_fdb_peer_flow(flow);
+ for (i = 0; i < MLX5_MAX_PORTS; i++) {
+ if (i == mlx5_get_dev_index(esw->dev))
+ continue;
+ list_for_each_entry_safe(flow, tmp, &esw->offloads.peer_flows[i], peer[i])
+ mlx5e_tc_del_fdb_peers_flow(flow);
+ }
}
void mlx5e_tc_reoffload_flows_work(struct work_struct *work)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index f70124ad71cf..eadc39542e5e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -249,7 +249,7 @@ struct mlx5_esw_offload {
struct mlx5_flow_group *vport_rx_drop_group;
struct mlx5_flow_handle *vport_rx_drop_rule;
struct xarray vport_reps;
- struct list_head peer_flows;
+ struct list_head peer_flows[MLX5_MAX_PORTS];
struct mutex peer_mutex;
struct mutex encap_tbl_lock; /* protects encap_tbl */
DECLARE_HASHTABLE(encap_tbl, 8);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 9526382f1573..a767f3d52c76 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2825,8 +2825,10 @@ static int mlx5_esw_offloads_devcom_event(int event,
void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw)
{
struct mlx5_devcom *devcom = esw->dev->priv.devcom;
+ int i;
- INIT_LIST_HEAD(&esw->offloads.peer_flows);
+ for (i = 0; i < MLX5_MAX_PORTS; i++)
+ INIT_LIST_HEAD(&esw->offloads.peer_flows[i]);
mutex_init(&esw->offloads.peer_mutex);
if (!MLX5_CAP_ESW(esw->dev, merged_eswitch))
--
2.40.1
* [net-next 06/14] net/mlx5: E-switch, enlarge peer miss group table
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch,
Roi Dayan
From: Shay Drory <shayd@nvidia.com>
There is an implicit assumption that the peer miss group table needs
to handle only a single peer.
There is also an assumption that the total_vports of the master is
greater than or equal to the total_vports of each peer.
Change the code to support a peer miss group for more than one peer.
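The group is resized accordingly. With illustrative numbers: if
esw->total_vports is 256 and MLX5_MAX_PORTS is 4, the miss group grows
from 256 flow entries (a single peer) to
(256 - 1) * (4 - 1) + 1 = 766 entries, enough for up to three peers.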
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index a767f3d52c76..ca69ed487413 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -1573,6 +1573,7 @@ esw_create_peer_esw_miss_group(struct mlx5_eswitch *esw,
u32 *flow_group_in,
int *ix)
{
+ int max_peer_ports = (esw->total_vports - 1) * (MLX5_MAX_PORTS - 1);
int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
struct mlx5_flow_group *g;
void *match_criteria;
@@ -1599,8 +1600,8 @@ esw_create_peer_esw_miss_group(struct mlx5_eswitch *esw,
MLX5_SET(create_flow_group_in, flow_group_in, start_flow_index, *ix);
MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index,
- *ix + esw->total_vports - 1);
- *ix += esw->total_vports;
+ *ix + max_peer_ports);
+ *ix += max_peer_ports + 1;
g = mlx5_create_flow_group(fdb, flow_group_in);
if (IS_ERR(g)) {
@@ -1702,7 +1703,7 @@ static int esw_create_offloads_fdb_tables(struct mlx5_eswitch *esw)
* total vports of the peer (currently is also uses esw->total_vports).
*/
table_size = MLX5_MAX_PORTS * (esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ) +
- esw->total_vports * 2 + MLX5_ESW_MISS_FLOWS;
+ esw->total_vports * MLX5_MAX_PORTS + MLX5_ESW_MISS_FLOWS;
/* create the slow path fdb with encap set, so further table instances
* can be created at run time while VFs are probed if the FW allows that.
--
2.40.1
* [net-next 07/14] net/mlx5: E-switch, refactor FDB miss rule add/remove
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch,
Roi Dayan
From: Shay Drory <shayd@nvidia.com>
Currently, the E-switch FDB has a single set of peer miss rules.
In order to support more than one peer, refactor the E-switch FDB to
hold peer miss rules per peer, and change the code to add/remove the
rules of a specific peer.
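The bookkeeping change, in outline (per the diff below): the single
rules array becomes an array of arrays, indexed by the peer's device
index:

    /* one miss-rule array per possible peer */
    struct mlx5_flow_handle **peer_miss_rules[MLX5_MAX_PORTS];

    /* add: store the new rules under the peer's slot */
    esw->fdb_table.offloads.peer_miss_rules[mlx5_get_dev_index(peer_dev)] = flows;

    /* remove: fetch and delete the rules of that specific peer only */
    flows = esw->fdb_table.offloads.peer_miss_rules[mlx5_get_dev_index(peer_dev)];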
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 2 +-
.../net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 9 +++++----
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index eadc39542e5e..2a941e1cc686 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -218,7 +218,7 @@ struct mlx5_eswitch_fdb {
struct mlx5_flow_group *send_to_vport_grp;
struct mlx5_flow_group *send_to_vport_meta_grp;
struct mlx5_flow_group *peer_miss_grp;
- struct mlx5_flow_handle **peer_miss_rules;
+ struct mlx5_flow_handle **peer_miss_rules[MLX5_MAX_PORTS];
struct mlx5_flow_group *miss_grp;
struct mlx5_flow_handle **send_to_vport_meta_rules;
struct mlx5_flow_handle *miss_rule_uni;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index ca69ed487413..a7f352777d9e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -1132,7 +1132,7 @@ static int esw_add_fdb_peer_miss_rules(struct mlx5_eswitch *esw,
flows[vport->index] = flow;
}
- esw->fdb_table.offloads.peer_miss_rules = flows;
+ esw->fdb_table.offloads.peer_miss_rules[mlx5_get_dev_index(peer_dev)] = flows;
kvfree(spec);
return 0;
@@ -1160,13 +1160,14 @@ static int esw_add_fdb_peer_miss_rules(struct mlx5_eswitch *esw,
return err;
}
-static void esw_del_fdb_peer_miss_rules(struct mlx5_eswitch *esw)
+static void esw_del_fdb_peer_miss_rules(struct mlx5_eswitch *esw,
+ struct mlx5_core_dev *peer_dev)
{
struct mlx5_flow_handle **flows;
struct mlx5_vport *vport;
unsigned long i;
- flows = esw->fdb_table.offloads.peer_miss_rules;
+ flows = esw->fdb_table.offloads.peer_miss_rules[mlx5_get_dev_index(peer_dev)];
mlx5_esw_for_each_vf_vport(esw, i, vport, mlx5_core_max_vfs(esw->dev))
mlx5_del_flow_rules(flows[vport->index]);
@@ -2700,7 +2701,7 @@ static void mlx5_esw_offloads_unpair(struct mlx5_eswitch *esw,
mlx5e_tc_clean_fdb_peer_flows(esw);
#endif
mlx5_esw_offloads_rep_event_unpair(esw, peer_esw);
- esw_del_fdb_peer_miss_rules(esw);
+ esw_del_fdb_peer_miss_rules(esw, peer_esw->dev);
}
static int mlx5_esw_offloads_pair(struct mlx5_eswitch *esw,
--
2.40.1
* [net-next 08/14] net/mlx5: E-switch, Handle multiple master egress rules
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Roi Dayan
From: Shay Drory <shayd@nvidia.com>
Currently, whenever a shared FDB is created, the slave eswitch creates
a master egress rule towards the master eswitch.
In order to support more than two ports, which means there will be more
than one slave eswitch, turn bounce_rule, which holds the master egress
rule, into an xarray of rules.
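Unlike the earlier per-peer xarray, this one is keyed by the slave's
vhca_id; a sketch of the insert path and its error handling (the rule
creation itself is elided):

    u16 slave_index = MLX5_CAP_GEN(slave, vhca_id);   /* key = slave vhca_id */

    /* ... create flow_rule for this slave ... */

    err = xa_insert(&vport->egress.offloads.bounce_rules,
                    slave_index, flow_rule, GFP_KERNEL);
    if (err)
            mlx5_del_flow_rules(flow_rule);   /* insert failed: undo the rule */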
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../mellanox/mlx5/core/esw/acl/egress_ofld.c | 15 +--
.../net/ethernet/mellanox/mlx5/core/eswitch.h | 8 +-
.../mellanox/mlx5/core/eswitch_offloads.c | 92 +++++++++++++------
3 files changed, 79 insertions(+), 36 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c
index 2e504c7461c6..ae815a8392c6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c
@@ -15,13 +15,15 @@ static void esw_acl_egress_ofld_fwd2vport_destroy(struct mlx5_vport *vport)
vport->egress.offloads.fwd_rule = NULL;
}
-static void esw_acl_egress_ofld_bounce_rule_destroy(struct mlx5_vport *vport)
+static void esw_acl_egress_ofld_bounce_rules_destroy(struct mlx5_vport *vport)
{
- if (!vport->egress.offloads.bounce_rule)
- return;
+ struct mlx5_flow_handle *bounce_rule;
+ unsigned long i;
- mlx5_del_flow_rules(vport->egress.offloads.bounce_rule);
- vport->egress.offloads.bounce_rule = NULL;
+ xa_for_each(&vport->egress.offloads.bounce_rules, i, bounce_rule) {
+ mlx5_del_flow_rules(bounce_rule);
+ xa_erase(&vport->egress.offloads.bounce_rules, i);
+ }
}
static int esw_acl_egress_ofld_fwd2vport_create(struct mlx5_eswitch *esw,
@@ -96,7 +98,7 @@ static void esw_acl_egress_ofld_rules_destroy(struct mlx5_vport *vport)
{
esw_acl_egress_vlan_destroy(vport);
esw_acl_egress_ofld_fwd2vport_destroy(vport);
- esw_acl_egress_ofld_bounce_rule_destroy(vport);
+ esw_acl_egress_ofld_bounce_rules_destroy(vport);
}
static int esw_acl_egress_ofld_groups_create(struct mlx5_eswitch *esw,
@@ -194,6 +196,7 @@ int esw_acl_egress_ofld_setup(struct mlx5_eswitch *esw, struct mlx5_vport *vport
vport->egress.acl = NULL;
return err;
}
+ vport->egress.type = VPORT_EGRESS_ACL_TYPE_DEFAULT;
err = esw_acl_egress_ofld_groups_create(esw, vport);
if (err)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 2a941e1cc686..05ae1c3a6e68 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -123,8 +123,14 @@ struct vport_ingress {
} offloads;
};
+enum vport_egress_acl_type {
+ VPORT_EGRESS_ACL_TYPE_DEFAULT,
+ VPORT_EGRESS_ACL_TYPE_SHARED_FDB,
+};
+
struct vport_egress {
struct mlx5_flow_table *acl;
+ enum vport_egress_acl_type type;
struct mlx5_flow_handle *allowed_vlan;
struct mlx5_flow_group *vlan_grp;
union {
@@ -136,7 +142,7 @@ struct vport_egress {
struct {
struct mlx5_flow_group *fwd_grp;
struct mlx5_flow_handle *fwd_rule;
- struct mlx5_flow_handle *bounce_rule;
+ struct xarray bounce_rules;
struct mlx5_flow_group *bounce_grp;
} offloads;
};
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index a7f352777d9e..ce70320b89b3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2512,6 +2512,7 @@ static int __esw_set_master_egress_rule(struct mlx5_core_dev *master,
struct mlx5_vport *vport,
struct mlx5_flow_table *acl)
{
+ u16 slave_index = MLX5_CAP_GEN(slave, vhca_id);
struct mlx5_flow_handle *flow_rule = NULL;
struct mlx5_flow_destination dest = {};
struct mlx5_flow_act flow_act = {};
@@ -2527,8 +2528,7 @@ static int __esw_set_master_egress_rule(struct mlx5_core_dev *master,
misc = MLX5_ADDR_OF(fte_match_param, spec->match_value,
misc_parameters);
MLX5_SET(fte_match_set_misc, misc, source_port, MLX5_VPORT_UPLINK);
- MLX5_SET(fte_match_set_misc, misc, source_eswitch_owner_vhca_id,
- MLX5_CAP_GEN(slave, vhca_id));
+ MLX5_SET(fte_match_set_misc, misc, source_eswitch_owner_vhca_id, slave_index);
misc = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters);
MLX5_SET_TO_ONES(fte_match_set_misc, misc, source_port);
@@ -2543,44 +2543,35 @@ static int __esw_set_master_egress_rule(struct mlx5_core_dev *master,
flow_rule = mlx5_add_flow_rules(acl, spec, &flow_act,
&dest, 1);
- if (IS_ERR(flow_rule))
+ if (IS_ERR(flow_rule)) {
err = PTR_ERR(flow_rule);
- else
- vport->egress.offloads.bounce_rule = flow_rule;
+ } else {
+ err = xa_insert(&vport->egress.offloads.bounce_rules,
+ slave_index, flow_rule, GFP_KERNEL);
+ if (err)
+ mlx5_del_flow_rules(flow_rule);
+ }
kvfree(spec);
return err;
}
-static int esw_set_master_egress_rule(struct mlx5_core_dev *master,
- struct mlx5_core_dev *slave)
+static int esw_master_egress_create_resources(struct mlx5_flow_namespace *egress_ns,
+ struct mlx5_vport *vport)
{
int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
- struct mlx5_eswitch *esw = master->priv.eswitch;
struct mlx5_flow_table_attr ft_attr = {
- .max_fte = 1, .prio = 0, .level = 0,
+ .max_fte = MLX5_MAX_PORTS, .prio = 0, .level = 0,
.flags = MLX5_FLOW_TABLE_OTHER_VPORT,
};
- struct mlx5_flow_namespace *egress_ns;
struct mlx5_flow_table *acl;
struct mlx5_flow_group *g;
- struct mlx5_vport *vport;
void *match_criteria;
u32 *flow_group_in;
int err;
- vport = mlx5_eswitch_get_vport(esw, esw->manager_vport);
- if (IS_ERR(vport))
- return PTR_ERR(vport);
-
- egress_ns = mlx5_get_flow_vport_acl_namespace(master,
- MLX5_FLOW_NAMESPACE_ESW_EGRESS,
- vport->index);
- if (!egress_ns)
- return -EINVAL;
-
if (vport->egress.acl)
- return -EINVAL;
+ return 0;
flow_group_in = kvzalloc(inlen, GFP_KERNEL);
if (!flow_group_in)
@@ -2604,7 +2595,7 @@ static int esw_set_master_egress_rule(struct mlx5_core_dev *master,
MLX5_SET(create_flow_group_in, flow_group_in,
source_eswitch_owner_vhca_id_valid, 1);
MLX5_SET(create_flow_group_in, flow_group_in, start_flow_index, 0);
- MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index, 0);
+ MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index, MLX5_MAX_PORTS);
g = mlx5_create_flow_group(acl, flow_group_in);
if (IS_ERR(g)) {
@@ -2612,19 +2603,15 @@ static int esw_set_master_egress_rule(struct mlx5_core_dev *master,
goto err_group;
}
- err = __esw_set_master_egress_rule(master, slave, vport, acl);
- if (err)
- goto err_rule;
-
vport->egress.acl = acl;
vport->egress.offloads.bounce_grp = g;
+ vport->egress.type = VPORT_EGRESS_ACL_TYPE_SHARED_FDB;
+ xa_init_flags(&vport->egress.offloads.bounce_rules, XA_FLAGS_ALLOC);
kvfree(flow_group_in);
return 0;
-err_rule:
- mlx5_destroy_flow_group(g);
err_group:
mlx5_destroy_flow_table(acl);
out:
@@ -2632,6 +2619,52 @@ static int esw_set_master_egress_rule(struct mlx5_core_dev *master,
return err;
}
+static void esw_master_egress_destroy_resources(struct mlx5_vport *vport)
+{
+ mlx5_destroy_flow_group(vport->egress.offloads.bounce_grp);
+ mlx5_destroy_flow_table(vport->egress.acl);
+}
+
+static int esw_set_master_egress_rule(struct mlx5_core_dev *master,
+ struct mlx5_core_dev *slave)
+{
+ struct mlx5_eswitch *esw = master->priv.eswitch;
+ u16 slave_index = MLX5_CAP_GEN(slave, vhca_id);
+ struct mlx5_flow_namespace *egress_ns;
+ struct mlx5_vport *vport;
+ int err;
+
+ vport = mlx5_eswitch_get_vport(esw, esw->manager_vport);
+ if (IS_ERR(vport))
+ return PTR_ERR(vport);
+
+ egress_ns = mlx5_get_flow_vport_acl_namespace(master,
+ MLX5_FLOW_NAMESPACE_ESW_EGRESS,
+ vport->index);
+ if (!egress_ns)
+ return -EINVAL;
+
+ if (vport->egress.acl && vport->egress.type != VPORT_EGRESS_ACL_TYPE_SHARED_FDB)
+ return 0;
+
+ err = esw_master_egress_create_resources(egress_ns, vport);
+ if (err)
+ return err;
+
+ if (xa_load(&vport->egress.offloads.bounce_rules, slave_index))
+ return -EINVAL;
+
+ err = __esw_set_master_egress_rule(master, slave, vport, vport->egress.acl);
+ if (err)
+ goto err_rule;
+
+ return 0;
+
+err_rule:
+ esw_master_egress_destroy_resources(vport);
+ return err;
+}
+
static void esw_unset_master_egress_rule(struct mlx5_core_dev *dev)
{
struct mlx5_vport *vport;
@@ -2640,6 +2673,7 @@ static void esw_unset_master_egress_rule(struct mlx5_core_dev *dev)
dev->priv.eswitch->manager_vport);
esw_acl_egress_ofld_cleanup(vport);
+ xa_destroy(&vport->egress.offloads.bounce_rules);
}
int mlx5_eswitch_offloads_config_single_fdb(struct mlx5_eswitch *master_esw,
--
2.40.1
* [net-next 09/14] net/mlx5: E-switch, generalize shared FDB creation
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Roi Dayan
From: Shay Drory <shayd@nvidia.com>
Shared FDB creation is hard-coded for only two eswitches.
Generalize shared FDB creation so that any number of eswitches can
create a shared FDB.
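The API becomes a paired add-one/del-one call per slave, with the
master sizing its egress resources for max_slaves up front. A sketch of
how a caller such as the LAG layer might drive it (hypothetical loop
and names; the real call sites are in the lag.c hunks below):

    for (i = 0; i < num_ports; i++) {
            if (i == master_index)          /* skip the master itself */
                    continue;
            err = mlx5_eswitch_offloads_single_fdb_add_one(master_esw,
                                                           slave_esw[i],
                                                           num_ports - 1);
            if (err)
                    goto err_undo;          /* del_one the slaves added so far */
    }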
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../mellanox/mlx5/core/esw/acl/egress_ofld.c | 12 +++++++
.../mellanox/mlx5/core/esw/acl/ofld.h | 1 +
.../net/ethernet/mellanox/mlx5/core/eswitch.h | 12 +++----
.../mellanox/mlx5/core/eswitch_offloads.c | 32 +++++++++--------
.../net/ethernet/mellanox/mlx5/core/lag/lag.c | 35 +++++++++++++++----
5 files changed, 66 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c
index ae815a8392c6..24b1ca4e4ff8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c
@@ -15,6 +15,18 @@ static void esw_acl_egress_ofld_fwd2vport_destroy(struct mlx5_vport *vport)
vport->egress.offloads.fwd_rule = NULL;
}
+void esw_acl_egress_ofld_bounce_rule_destroy(struct mlx5_vport *vport, int rule_index)
+{
+ struct mlx5_flow_handle *bounce_rule =
+ xa_load(&vport->egress.offloads.bounce_rules, rule_index);
+
+ if (!bounce_rule)
+ return;
+
+ mlx5_del_flow_rules(bounce_rule);
+ xa_erase(&vport->egress.offloads.bounce_rules, rule_index);
+}
+
static void esw_acl_egress_ofld_bounce_rules_destroy(struct mlx5_vport *vport)
{
struct mlx5_flow_handle *bounce_rule;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/ofld.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/ofld.h
index c9f8469e9a47..536b04e83618 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/ofld.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/ofld.h
@@ -10,6 +10,7 @@
/* Eswitch acl egress external APIs */
int esw_acl_egress_ofld_setup(struct mlx5_eswitch *esw, struct mlx5_vport *vport);
void esw_acl_egress_ofld_cleanup(struct mlx5_vport *vport);
+void esw_acl_egress_ofld_bounce_rule_destroy(struct mlx5_vport *vport, int rule_index);
int mlx5_esw_acl_egress_vport_bond(struct mlx5_eswitch *esw, u16 active_vport_num,
u16 passive_vport_num);
int mlx5_esw_acl_egress_vport_unbond(struct mlx5_eswitch *esw, u16 vport_num);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 05ae1c3a6e68..9833d1a587cc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -754,9 +754,9 @@ void esw_vport_change_handle_locked(struct mlx5_vport *vport);
bool mlx5_esw_offloads_controller_valid(const struct mlx5_eswitch *esw, u32 controller);
-int mlx5_eswitch_offloads_config_single_fdb(struct mlx5_eswitch *master_esw,
- struct mlx5_eswitch *slave_esw);
-void mlx5_eswitch_offloads_destroy_single_fdb(struct mlx5_eswitch *master_esw,
+int mlx5_eswitch_offloads_single_fdb_add_one(struct mlx5_eswitch *master_esw,
+ struct mlx5_eswitch *slave_esw, int max_slaves);
+void mlx5_eswitch_offloads_single_fdb_del_one(struct mlx5_eswitch *master_esw,
struct mlx5_eswitch *slave_esw);
int mlx5_eswitch_reload_reps(struct mlx5_eswitch *esw);
@@ -808,14 +808,14 @@ mlx5_esw_vport_to_devlink_port_index(const struct mlx5_core_dev *dev,
}
static inline int
-mlx5_eswitch_offloads_config_single_fdb(struct mlx5_eswitch *master_esw,
- struct mlx5_eswitch *slave_esw)
+mlx5_eswitch_offloads_single_fdb_add_one(struct mlx5_eswitch *master_esw,
+ struct mlx5_eswitch *slave_esw, int max_slaves)
{
return 0;
}
static inline void
-mlx5_eswitch_offloads_destroy_single_fdb(struct mlx5_eswitch *master_esw,
+mlx5_eswitch_offloads_single_fdb_del_one(struct mlx5_eswitch *master_esw,
struct mlx5_eswitch *slave_esw) {}
static inline int
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index ce70320b89b3..98d75a33a624 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2557,11 +2557,11 @@ static int __esw_set_master_egress_rule(struct mlx5_core_dev *master,
}
static int esw_master_egress_create_resources(struct mlx5_flow_namespace *egress_ns,
- struct mlx5_vport *vport)
+ struct mlx5_vport *vport, size_t count)
{
int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
struct mlx5_flow_table_attr ft_attr = {
- .max_fte = MLX5_MAX_PORTS, .prio = 0, .level = 0,
+ .max_fte = count, .prio = 0, .level = 0,
.flags = MLX5_FLOW_TABLE_OTHER_VPORT,
};
struct mlx5_flow_table *acl;
@@ -2595,7 +2595,7 @@ static int esw_master_egress_create_resources(struct mlx5_flow_namespace *egress
MLX5_SET(create_flow_group_in, flow_group_in,
source_eswitch_owner_vhca_id_valid, 1);
MLX5_SET(create_flow_group_in, flow_group_in, start_flow_index, 0);
- MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index, MLX5_MAX_PORTS);
+ MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index, count);
g = mlx5_create_flow_group(acl, flow_group_in);
if (IS_ERR(g)) {
@@ -2626,7 +2626,7 @@ static void esw_master_egress_destroy_resources(struct mlx5_vport *vport)
}
static int esw_set_master_egress_rule(struct mlx5_core_dev *master,
- struct mlx5_core_dev *slave)
+ struct mlx5_core_dev *slave, size_t count)
{
struct mlx5_eswitch *esw = master->priv.eswitch;
u16 slave_index = MLX5_CAP_GEN(slave, vhca_id);
@@ -2647,7 +2647,7 @@ static int esw_set_master_egress_rule(struct mlx5_core_dev *master,
if (vport->egress.acl && vport->egress.type != VPORT_EGRESS_ACL_TYPE_SHARED_FDB)
return 0;
- err = esw_master_egress_create_resources(egress_ns, vport);
+ err = esw_master_egress_create_resources(egress_ns, vport, count);
if (err)
return err;
@@ -2665,19 +2665,24 @@ static int esw_set_master_egress_rule(struct mlx5_core_dev *master,
return err;
}
-static void esw_unset_master_egress_rule(struct mlx5_core_dev *dev)
+static void esw_unset_master_egress_rule(struct mlx5_core_dev *dev,
+ struct mlx5_core_dev *slave_dev)
{
struct mlx5_vport *vport;
vport = mlx5_eswitch_get_vport(dev->priv.eswitch,
dev->priv.eswitch->manager_vport);
- esw_acl_egress_ofld_cleanup(vport);
- xa_destroy(&vport->egress.offloads.bounce_rules);
+ esw_acl_egress_ofld_bounce_rule_destroy(vport, MLX5_CAP_GEN(slave_dev, vhca_id));
+
+ if (xa_empty(&vport->egress.offloads.bounce_rules)) {
+ esw_acl_egress_ofld_cleanup(vport);
+ xa_destroy(&vport->egress.offloads.bounce_rules);
+ }
}
-int mlx5_eswitch_offloads_config_single_fdb(struct mlx5_eswitch *master_esw,
- struct mlx5_eswitch *slave_esw)
+int mlx5_eswitch_offloads_single_fdb_add_one(struct mlx5_eswitch *master_esw,
+ struct mlx5_eswitch *slave_esw, int max_slaves)
{
int err;
@@ -2687,7 +2692,7 @@ int mlx5_eswitch_offloads_config_single_fdb(struct mlx5_eswitch *master_esw,
return err;
err = esw_set_master_egress_rule(master_esw->dev,
- slave_esw->dev);
+ slave_esw->dev, max_slaves);
if (err)
goto err_acl;
@@ -2695,15 +2700,14 @@ int mlx5_eswitch_offloads_config_single_fdb(struct mlx5_eswitch *master_esw,
err_acl:
esw_set_slave_root_fdb(NULL, slave_esw->dev);
-
return err;
}
-void mlx5_eswitch_offloads_destroy_single_fdb(struct mlx5_eswitch *master_esw,
+void mlx5_eswitch_offloads_single_fdb_del_one(struct mlx5_eswitch *master_esw,
struct mlx5_eswitch *slave_esw)
{
- esw_unset_master_egress_rule(master_esw->dev);
esw_set_slave_root_fdb(NULL, slave_esw->dev);
+ esw_unset_master_egress_rule(master_esw->dev, slave_esw->dev);
}
#define ESW_OFFLOADS_DEVCOM_PAIR (0)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index 5d331b940f4d..9bc2822881ca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -550,6 +550,29 @@ char *mlx5_get_str_port_sel_mode(enum mlx5_lag_mode mode, unsigned long flags)
}
}
+static int mlx5_lag_create_single_fdb(struct mlx5_lag *ldev)
+{
+ struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
+ struct mlx5_eswitch *master_esw = dev0->priv.eswitch;
+ int err;
+ int i;
+
+ for (i = MLX5_LAG_P1 + 1; i < ldev->ports; i++) {
+ struct mlx5_eswitch *slave_esw = ldev->pf[i].dev->priv.eswitch;
+
+ err = mlx5_eswitch_offloads_single_fdb_add_one(master_esw,
+ slave_esw, ldev->ports);
+ if (err)
+ goto err;
+ }
+ return 0;
+err:
+ for (; i > MLX5_LAG_P1; i--)
+ mlx5_eswitch_offloads_single_fdb_del_one(master_esw,
+ ldev->pf[i].dev->priv.eswitch);
+ return err;
+}
+
static int mlx5_create_lag(struct mlx5_lag *ldev,
struct lag_tracker *tracker,
enum mlx5_lag_mode mode,
@@ -557,7 +580,6 @@ static int mlx5_create_lag(struct mlx5_lag *ldev,
{
bool shared_fdb = test_bit(MLX5_LAG_MODE_FLAG_SHARED_FDB, &flags);
struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
- struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
u32 in[MLX5_ST_SZ_DW(destroy_lag_in)] = {};
int err;
@@ -575,8 +597,7 @@ static int mlx5_create_lag(struct mlx5_lag *ldev,
}
if (shared_fdb) {
- err = mlx5_eswitch_offloads_config_single_fdb(dev0->priv.eswitch,
- dev1->priv.eswitch);
+ err = mlx5_lag_create_single_fdb(ldev);
if (err)
mlx5_core_err(dev0, "Can't enable single FDB mode\n");
else
@@ -647,19 +668,21 @@ int mlx5_activate_lag(struct mlx5_lag *ldev,
int mlx5_deactivate_lag(struct mlx5_lag *ldev)
{
struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
- struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
+ struct mlx5_eswitch *master_esw = dev0->priv.eswitch;
u32 in[MLX5_ST_SZ_DW(destroy_lag_in)] = {};
bool roce_lag = __mlx5_lag_is_roce(ldev);
unsigned long flags = ldev->mode_flags;
int err;
+ int i;
ldev->mode = MLX5_LAG_MODE_NONE;
ldev->mode_flags = 0;
mlx5_lag_mp_reset(ldev);
if (test_bit(MLX5_LAG_MODE_FLAG_SHARED_FDB, &flags)) {
- mlx5_eswitch_offloads_destroy_single_fdb(dev0->priv.eswitch,
- dev1->priv.eswitch);
+ for (i = MLX5_LAG_P1 + 1; i < ldev->ports; i++)
+ mlx5_eswitch_offloads_single_fdb_del_one(master_esw,
+ ldev->pf[i].dev->priv.eswitch);
clear_bit(MLX5_LAG_MODE_FLAG_SHARED_FDB, &flags);
}
--
2.40.1
* [net-next 10/14] net/mlx5: DR, handle more than one peer domain
2023-06-01 6:01 [pull request][net-next 00/14] mlx5 updates 2023-05-31 Saeed Mahameed
` (8 preceding siblings ...)
2023-06-01 6:01 ` [net-next 09/14] net/mlx5: E-switch, generalize shared FDB creation Saeed Mahameed
@ 2023-06-01 6:01 ` Saeed Mahameed
2023-06-01 6:01 ` [net-next 11/14] net/mlx5: Devcom, Rename paired to ready Saeed Mahameed
` (3 subsequent siblings)
13 siblings, 0 replies; 18+ messages in thread
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory,
Yevgeny Kliteynik
From: Shay Drory <shayd@nvidia.com>
Currently, the DR domain code assumes that each domain can only have
a single peer.
In order to support VF LAG of more than two ports, expand the peer
domain pointer into an array of peers, and align the code accordingly.
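To illustrate the shape of the change, here is a simplified sketch
(the real structures carry many more fields; MLX5_MAX_PORTS is stubbed
below purely for illustration):

#define MLX5_MAX_PORTS 4	/* illustrative stub, not the kernel's define */

struct mlx5dr_domain;

/* before: a domain could track exactly one peer */
struct mlx5dr_domain_before {
	struct mlx5dr_domain *peer_dmn;
};

/* after: one slot per possible peer, indexed by the peer's eswitch
 * owner vhca_id, so dmn->peer_dmn[id] resolves the right peer directly
 */
struct mlx5dr_domain_after {
	struct mlx5dr_domain *peer_dmn[MLX5_MAX_PORTS];
};

Callers that previously tested dmn->peer_dmn now bound-check the index
first (id < MLX5_MAX_PORTS && dmn->peer_dmn[id]), as the dr_ste_v0/v1
hunks below do.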
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 12 +++++++-----
drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c | 3 ++-
drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h | 3 ++-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 5 +++--
drivers/net/ethernet/mellanox/mlx5/core/fs_core.h | 3 ++-
.../mellanox/mlx5/core/steering/dr_action.c | 5 +++--
.../mellanox/mlx5/core/steering/dr_domain.c | 13 +++++++------
.../mellanox/mlx5/core/steering/dr_ste_v0.c | 9 +++++----
.../mellanox/mlx5/core/steering/dr_ste_v1.c | 9 +++++----
.../ethernet/mellanox/mlx5/core/steering/dr_types.h | 2 +-
.../ethernet/mellanox/mlx5/core/steering/fs_dr.c | 5 +++--
.../ethernet/mellanox/mlx5/core/steering/mlx5dr.h | 3 ++-
12 files changed, 42 insertions(+), 30 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 98d75a33a624..761278e1af5c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2778,7 +2778,9 @@ static int mlx5_esw_offloads_set_ns_peer(struct mlx5_eswitch *esw,
struct mlx5_eswitch *peer_esw,
bool pair)
{
+ u8 peer_idx = mlx5_get_dev_index(peer_esw->dev);
struct mlx5_flow_root_namespace *peer_ns;
+ u8 idx = mlx5_get_dev_index(esw->dev);
struct mlx5_flow_root_namespace *ns;
int err;
@@ -2786,18 +2788,18 @@ static int mlx5_esw_offloads_set_ns_peer(struct mlx5_eswitch *esw,
ns = esw->dev->priv.steering->fdb_root_ns;
if (pair) {
- err = mlx5_flow_namespace_set_peer(ns, peer_ns);
+ err = mlx5_flow_namespace_set_peer(ns, peer_ns, peer_idx);
if (err)
return err;
- err = mlx5_flow_namespace_set_peer(peer_ns, ns);
+ err = mlx5_flow_namespace_set_peer(peer_ns, ns, idx);
if (err) {
- mlx5_flow_namespace_set_peer(ns, NULL);
+ mlx5_flow_namespace_set_peer(ns, NULL, peer_idx);
return err;
}
} else {
- mlx5_flow_namespace_set_peer(ns, NULL);
- mlx5_flow_namespace_set_peer(peer_ns, NULL);
+ mlx5_flow_namespace_set_peer(ns, NULL, peer_idx);
+ mlx5_flow_namespace_set_peer(peer_ns, NULL, idx);
}
return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
index 144e59480686..11374c3744c5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
@@ -139,7 +139,8 @@ static void mlx5_cmd_stub_modify_header_dealloc(struct mlx5_flow_root_namespace
}
static int mlx5_cmd_stub_set_peer(struct mlx5_flow_root_namespace *ns,
- struct mlx5_flow_root_namespace *peer_ns)
+ struct mlx5_flow_root_namespace *peer_ns,
+ u8 peer_idx)
{
return 0;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
index 8ef4254b9ea1..b6b9a5a20591 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
@@ -93,7 +93,8 @@ struct mlx5_flow_cmds {
struct mlx5_modify_hdr *modify_hdr);
int (*set_peer)(struct mlx5_flow_root_namespace *ns,
- struct mlx5_flow_root_namespace *peer_ns);
+ struct mlx5_flow_root_namespace *peer_ns,
+ u8 peer_idx);
int (*create_ns)(struct mlx5_flow_root_namespace *ns);
int (*destroy_ns)(struct mlx5_flow_root_namespace *ns);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 19da02c41616..4ef04aa28771 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -3620,7 +3620,8 @@ void mlx5_destroy_match_definer(struct mlx5_core_dev *dev,
}
int mlx5_flow_namespace_set_peer(struct mlx5_flow_root_namespace *ns,
- struct mlx5_flow_root_namespace *peer_ns)
+ struct mlx5_flow_root_namespace *peer_ns,
+ u8 peer_idx)
{
if (peer_ns && ns->mode != peer_ns->mode) {
mlx5_core_err(ns->dev,
@@ -3628,7 +3629,7 @@ int mlx5_flow_namespace_set_peer(struct mlx5_flow_root_namespace *ns,
return -EINVAL;
}
- return ns->cmds->set_peer(ns, peer_ns);
+ return ns->cmds->set_peer(ns, peer_ns, peer_idx);
}
/* This function should be called only at init stage of the namespace.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
index f137a0611b77..200ec946409c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
@@ -295,7 +295,8 @@ void mlx5_fc_update_sampling_interval(struct mlx5_core_dev *dev,
const struct mlx5_flow_cmds *mlx5_fs_cmd_get_fw_cmds(void);
int mlx5_flow_namespace_set_peer(struct mlx5_flow_root_namespace *ns,
- struct mlx5_flow_root_namespace *peer_ns);
+ struct mlx5_flow_root_namespace *peer_ns,
+ u8 peer_idx);
int mlx5_flow_namespace_set_mode(struct mlx5_flow_namespace *ns,
enum mlx5_flow_steering_mode mode);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
index 0eb9a8d7f282..4e9bc1897a88 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
@@ -2071,8 +2071,9 @@ mlx5dr_action_create_dest_vport(struct mlx5dr_domain *dmn,
struct mlx5dr_action *action;
u8 peer_vport;
- peer_vport = vhca_id_valid && (vhca_id != dmn->info.caps.gvmi);
- vport_dmn = peer_vport ? dmn->peer_dmn : dmn;
+ peer_vport = vhca_id_valid && mlx5_core_is_pf(dmn->mdev) &&
+ (vhca_id != dmn->info.caps.gvmi);
+ vport_dmn = peer_vport ? dmn->peer_dmn[vhca_id] : dmn;
if (!vport_dmn) {
mlx5dr_dbg(dmn, "No peer vport domain for given vhca_id\n");
return NULL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c
index 9a2dfe6ebe31..75dc85dc24ef 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c
@@ -555,17 +555,18 @@ int mlx5dr_domain_destroy(struct mlx5dr_domain *dmn)
}
void mlx5dr_domain_set_peer(struct mlx5dr_domain *dmn,
- struct mlx5dr_domain *peer_dmn)
+ struct mlx5dr_domain *peer_dmn,
+ u8 peer_idx)
{
mlx5dr_domain_lock(dmn);
- if (dmn->peer_dmn)
- refcount_dec(&dmn->peer_dmn->refcount);
+ if (dmn->peer_dmn[peer_idx])
+ refcount_dec(&dmn->peer_dmn[peer_idx]->refcount);
- dmn->peer_dmn = peer_dmn;
+ dmn->peer_dmn[peer_idx] = peer_dmn;
- if (dmn->peer_dmn)
- refcount_inc(&dmn->peer_dmn->refcount);
+ if (dmn->peer_dmn[peer_idx])
+ refcount_inc(&dmn->peer_dmn[peer_idx]->refcount);
mlx5dr_domain_unlock(dmn);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
index 2010d4ac6519..69d7a8f3c402 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
@@ -1647,6 +1647,7 @@ dr_ste_v0_build_src_gvmi_qpn_tag(struct mlx5dr_match_param *value,
u8 *tag)
{
struct mlx5dr_match_misc *misc = &value->misc;
+ int id = misc->source_eswitch_owner_vhca_id;
struct mlx5dr_cmd_vport_cap *vport_cap;
struct mlx5dr_domain *dmn = sb->dmn;
struct mlx5dr_domain *vport_dmn;
@@ -1657,11 +1658,11 @@ dr_ste_v0_build_src_gvmi_qpn_tag(struct mlx5dr_match_param *value,
if (sb->vhca_id_valid) {
/* Find port GVMI based on the eswitch_owner_vhca_id */
- if (misc->source_eswitch_owner_vhca_id == dmn->info.caps.gvmi)
+ if (id == dmn->info.caps.gvmi)
vport_dmn = dmn;
- else if (dmn->peer_dmn && (misc->source_eswitch_owner_vhca_id ==
- dmn->peer_dmn->info.caps.gvmi))
- vport_dmn = dmn->peer_dmn;
+ else if (id < MLX5_MAX_PORTS && dmn->peer_dmn[id] &&
+ (id == dmn->peer_dmn[id]->info.caps.gvmi))
+ vport_dmn = dmn->peer_dmn[id];
else
return -EINVAL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
index 4c0704ad166b..f4ef0b22b991 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
@@ -1979,6 +1979,7 @@ static int dr_ste_v1_build_src_gvmi_qpn_tag(struct mlx5dr_match_param *value,
u8 *tag)
{
struct mlx5dr_match_misc *misc = &value->misc;
+ int id = misc->source_eswitch_owner_vhca_id;
struct mlx5dr_cmd_vport_cap *vport_cap;
struct mlx5dr_domain *dmn = sb->dmn;
struct mlx5dr_domain *vport_dmn;
@@ -1988,11 +1989,11 @@ static int dr_ste_v1_build_src_gvmi_qpn_tag(struct mlx5dr_match_param *value,
if (sb->vhca_id_valid) {
/* Find port GVMI based on the eswitch_owner_vhca_id */
- if (misc->source_eswitch_owner_vhca_id == dmn->info.caps.gvmi)
+ if (id == dmn->info.caps.gvmi)
vport_dmn = dmn;
- else if (dmn->peer_dmn && (misc->source_eswitch_owner_vhca_id ==
- dmn->peer_dmn->info.caps.gvmi))
- vport_dmn = dmn->peer_dmn;
+ else if (id < MLX5_MAX_PORTS && dmn->peer_dmn[id] &&
+ (id == dmn->peer_dmn[id]->info.caps.gvmi))
+ vport_dmn = dmn->peer_dmn[id];
else
return -EINVAL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index 678a993ab053..1622dbbe6b97 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -935,7 +935,7 @@ struct mlx5dr_domain_info {
};
struct mlx5dr_domain {
- struct mlx5dr_domain *peer_dmn;
+ struct mlx5dr_domain *peer_dmn[MLX5_MAX_PORTS];
struct mlx5_core_dev *mdev;
u32 pdn;
struct mlx5_uars_page *uar;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
index 984653756779..c6fda1cbfcff 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
@@ -770,14 +770,15 @@ static int mlx5_cmd_dr_update_fte(struct mlx5_flow_root_namespace *ns,
}
static int mlx5_cmd_dr_set_peer(struct mlx5_flow_root_namespace *ns,
- struct mlx5_flow_root_namespace *peer_ns)
+ struct mlx5_flow_root_namespace *peer_ns,
+ u8 peer_idx)
{
struct mlx5dr_domain *peer_domain = NULL;
if (peer_ns)
peer_domain = peer_ns->fs_dr_domain.dr_domain;
mlx5dr_domain_set_peer(ns->fs_dr_domain.dr_domain,
- peer_domain);
+ peer_domain, peer_idx);
return 0;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
index 9afd268a2573..5ba88f2ecb3f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
@@ -48,7 +48,8 @@ int mlx5dr_domain_destroy(struct mlx5dr_domain *domain);
int mlx5dr_domain_sync(struct mlx5dr_domain *domain, u32 flags);
void mlx5dr_domain_set_peer(struct mlx5dr_domain *dmn,
- struct mlx5dr_domain *peer_dmn);
+ struct mlx5dr_domain *peer_dmn,
+ u8 peer_idx);
struct mlx5dr_table *
mlx5dr_table_create(struct mlx5dr_domain *domain, u32 level, u32 flags,
--
2.40.1
* [net-next 11/14] net/mlx5: Devcom, Rename paired to ready
2023-06-01 6:01 [pull request][net-next 00/14] mlx5 updates 2023-05-31 Saeed Mahameed
` (9 preceding siblings ...)
2023-06-01 6:01 ` [net-next 10/14] net/mlx5: DR, handle more than one peer domain Saeed Mahameed
@ 2023-06-01 6:01 ` Saeed Mahameed
2023-06-01 6:01 ` [net-next 12/14] net/mlx5: E-switch, mark devcom as not ready when all eswitches are unpaired Saeed Mahameed
` (2 subsequent siblings)
13 siblings, 0 replies; 18+ messages in thread
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch,
Roi Dayan
From: Shay Drory <shayd@nvidia.com>
In a downstream patch, devcom will provide support for more than two
devices. The term 'paired' is renamed to 'ready' to convey a more
accurate meaning.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/en_rep.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 4 ++--
.../mellanox/mlx5/core/eswitch_offloads.c | 4 ++--
.../net/ethernet/mellanox/mlx5/core/lag/lag.c | 4 ++--
.../ethernet/mellanox/mlx5/core/lib/devcom.c | 20 +++++++++----------
.../ethernet/mellanox/mlx5/core/lib/devcom.h | 10 +++++-----
6 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 13d69c5634ac..45e9e7b383dc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -413,7 +413,7 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
return 0;
rpriv = mlx5e_rep_to_rep_priv(rep);
- if (mlx5_devcom_is_paired(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS))
+ if (mlx5_devcom_comp_is_ready(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS))
peer_esw = mlx5_devcom_get_peer_data(esw->dev->priv.devcom,
MLX5_DEVCOM_ESW_OFFLOADS);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 13071d184d88..f4ab00d84691 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -4293,8 +4293,8 @@ static bool is_peer_flow_needed(struct mlx5e_tc_flow *flow)
flow_flag_test(flow, INGRESS);
bool act_is_encap = !!(attr->action &
MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT);
- bool esw_paired = mlx5_devcom_is_paired(esw_attr->in_mdev->priv.devcom,
- MLX5_DEVCOM_ESW_OFFLOADS);
+ bool esw_paired = mlx5_devcom_comp_is_ready(esw_attr->in_mdev->priv.devcom,
+ MLX5_DEVCOM_ESW_OFFLOADS);
if (!esw_paired)
return false;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 761278e1af5c..aeb15b10048e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2836,14 +2836,14 @@ static int mlx5_esw_offloads_devcom_event(int event,
esw->paired[mlx5_get_dev_index(peer_esw->dev)] = true;
peer_esw->paired[mlx5_get_dev_index(esw->dev)] = true;
- mlx5_devcom_set_paired(devcom, MLX5_DEVCOM_ESW_OFFLOADS, true);
+ mlx5_devcom_comp_set_ready(devcom, MLX5_DEVCOM_ESW_OFFLOADS, true);
break;
case ESW_OFFLOADS_DEVCOM_UNPAIR:
if (!esw->paired[mlx5_get_dev_index(peer_esw->dev)])
break;
- mlx5_devcom_set_paired(devcom, MLX5_DEVCOM_ESW_OFFLOADS, false);
+ mlx5_devcom_comp_set_ready(devcom, MLX5_DEVCOM_ESW_OFFLOADS, false);
esw->paired[mlx5_get_dev_index(peer_esw->dev)] = false;
peer_esw->paired[mlx5_get_dev_index(esw->dev)] = false;
mlx5_esw_offloads_unpair(peer_esw, esw);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index 9bc2822881ca..c820f7d266de 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -824,8 +824,8 @@ bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev)
is_mdev_switchdev_mode(dev1) &&
mlx5_eswitch_vport_match_metadata_enabled(dev0->priv.eswitch) &&
mlx5_eswitch_vport_match_metadata_enabled(dev1->priv.eswitch) &&
- mlx5_devcom_is_paired(dev0->priv.devcom,
- MLX5_DEVCOM_ESW_OFFLOADS) &&
+ mlx5_devcom_comp_is_ready(dev0->priv.devcom,
+ MLX5_DEVCOM_ESW_OFFLOADS) &&
MLX5_CAP_GEN(dev1, lag_native_fdb_selection) &&
MLX5_CAP_ESW(dev1, root_ft_on_other_esw) &&
MLX5_CAP_ESW(dev0, esw_shared_ingress_acl))
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
index b7d779d08d83..7446900a589e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
@@ -19,7 +19,7 @@ struct mlx5_devcom_component {
mlx5_devcom_event_handler_t handler;
struct rw_semaphore sem;
- bool paired;
+ bool ready;
};
struct mlx5_devcom_list {
@@ -218,25 +218,25 @@ int mlx5_devcom_send_event(struct mlx5_devcom *devcom,
return err;
}
-void mlx5_devcom_set_paired(struct mlx5_devcom *devcom,
- enum mlx5_devcom_components id,
- bool paired)
+void mlx5_devcom_comp_set_ready(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id,
+ bool ready)
{
struct mlx5_devcom_component *comp;
comp = &devcom->priv->components[id];
WARN_ON(!rwsem_is_locked(&comp->sem));
- WRITE_ONCE(comp->paired, paired);
+ WRITE_ONCE(comp->ready, ready);
}
-bool mlx5_devcom_is_paired(struct mlx5_devcom *devcom,
- enum mlx5_devcom_components id)
+bool mlx5_devcom_comp_is_ready(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id)
{
if (IS_ERR_OR_NULL(devcom))
return false;
- return READ_ONCE(devcom->priv->components[id].paired);
+ return READ_ONCE(devcom->priv->components[id].ready);
}
void *mlx5_devcom_get_peer_data(struct mlx5_devcom *devcom,
@@ -250,7 +250,7 @@ void *mlx5_devcom_get_peer_data(struct mlx5_devcom *devcom,
comp = &devcom->priv->components[id];
down_read(&comp->sem);
- if (!READ_ONCE(comp->paired)) {
+ if (!READ_ONCE(comp->ready)) {
up_read(&comp->sem);
return NULL;
}
@@ -278,7 +278,7 @@ void *mlx5_devcom_get_peer_data_rcu(struct mlx5_devcom *devcom, enum mlx5_devcom
/* This can change concurrently, however 'data' pointer will remain
* valid for the duration of RCU read section.
*/
- if (!READ_ONCE(comp->paired))
+ if (!READ_ONCE(comp->ready))
return NULL;
return rcu_dereference(comp->device[i].data);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
index 9a496f4722da..d465de8459b4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
@@ -33,11 +33,11 @@ int mlx5_devcom_send_event(struct mlx5_devcom *devcom,
int event,
void *event_data);
-void mlx5_devcom_set_paired(struct mlx5_devcom *devcom,
- enum mlx5_devcom_components id,
- bool paired);
-bool mlx5_devcom_is_paired(struct mlx5_devcom *devcom,
- enum mlx5_devcom_components id);
+void mlx5_devcom_comp_set_ready(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id,
+ bool ready);
+bool mlx5_devcom_comp_is_ready(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id);
void *mlx5_devcom_get_peer_data(struct mlx5_devcom *devcom,
enum mlx5_devcom_components id);
--
2.40.1
* [net-next 12/14] net/mlx5: E-switch, mark devcom as not ready when all eswitches are unpaired
2023-06-01 6:01 [pull request][net-next 00/14] mlx5 updates 2023-05-31 Saeed Mahameed
` (10 preceding siblings ...)
2023-06-01 6:01 ` [net-next 11/14] net/mlx5: Devcom, Rename paired to ready Saeed Mahameed
@ 2023-06-01 6:01 ` Saeed Mahameed
2023-06-01 6:01 ` [net-next 13/14] net/mlx5: Devcom, introduce devcom_for_each_peer_entry Saeed Mahameed
2023-06-01 6:01 ` [net-next 14/14] net/mlx5: Devcom, extend mlx5_devcom_send_event to work with more than two devices Saeed Mahameed
13 siblings, 0 replies; 18+ messages in thread
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch,
Roi Dayan
From: Shay Drory <shayd@nvidia.com>
Whenever an eswitch is unpaired from another, the driver marks devcom
as not ready. While this is correct when only two eswitches are
paired, in order to support pairing of more than two eswitches, the
driver needs to mark devcom as not ready only when all eswitches are
unpaired.
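A condensed sketch of the counting scheme (struct esw_stub and
set_ready() below are illustrative stand-ins for struct mlx5_eswitch
and mlx5_devcom_comp_set_ready() in the diff that follows):

#include <stdbool.h>

struct esw_stub {
	unsigned int num_peers;
};

static void set_ready(bool ready);	/* stand-in, see diff below */

static void pair(struct esw_stub *esw, struct esw_stub *peer)
{
	esw->num_peers++;
	peer->num_peers++;
	set_ready(true);	/* the first pair is enough to be ready */
}

static void unpair(struct esw_stub *esw, struct esw_stub *peer)
{
	esw->num_peers--;
	peer->num_peers--;
	/* not ready only once the last peer on both sides is gone */
	if (!esw->num_peers && !peer->num_peers)
		set_ready(false);
}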
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 1 +
.../net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 8 +++++++-
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 9833d1a587cc..d6e4ca436f39 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -343,6 +343,7 @@ struct mlx5_eswitch {
int mode;
u16 manager_vport;
u16 first_host_vport;
+ u8 num_peers;
struct mlx5_esw_functions esw_funcs;
struct {
u32 large_group_num;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index aeb15b10048e..09367a320741 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2836,6 +2836,8 @@ static int mlx5_esw_offloads_devcom_event(int event,
esw->paired[mlx5_get_dev_index(peer_esw->dev)] = true;
peer_esw->paired[mlx5_get_dev_index(esw->dev)] = true;
+ esw->num_peers++;
+ peer_esw->num_peers++;
mlx5_devcom_comp_set_ready(devcom, MLX5_DEVCOM_ESW_OFFLOADS, true);
break;
@@ -2843,7 +2845,10 @@ static int mlx5_esw_offloads_devcom_event(int event,
if (!esw->paired[mlx5_get_dev_index(peer_esw->dev)])
break;
- mlx5_devcom_comp_set_ready(devcom, MLX5_DEVCOM_ESW_OFFLOADS, false);
+ peer_esw->num_peers--;
+ esw->num_peers--;
+ if (!esw->num_peers && !peer_esw->num_peers)
+ mlx5_devcom_comp_set_ready(devcom, MLX5_DEVCOM_ESW_OFFLOADS, false);
esw->paired[mlx5_get_dev_index(peer_esw->dev)] = false;
peer_esw->paired[mlx5_get_dev_index(esw->dev)] = false;
mlx5_esw_offloads_unpair(peer_esw, esw);
@@ -2884,6 +2889,7 @@ void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw)
mlx5_esw_offloads_devcom_event,
esw);
+ esw->num_peers = 0;
mlx5_devcom_send_event(devcom,
MLX5_DEVCOM_ESW_OFFLOADS,
ESW_OFFLOADS_DEVCOM_PAIR, esw);
--
2.40.1
* [net-next 13/14] net/mlx5: Devcom, introduce devcom_for_each_peer_entry
2023-06-01 6:01 [pull request][net-next 00/14] mlx5 updates 2023-05-31 Saeed Mahameed
` (11 preceding siblings ...)
2023-06-01 6:01 ` [net-next 12/14] net/mlx5: E-switch, mark devcom as not ready when all eswitches are unpaired Saeed Mahameed
@ 2023-06-01 6:01 ` Saeed Mahameed
2023-06-01 6:01 ` [net-next 14/14] net/mlx5: Devcom, extend mlx5_devcom_send_event to work with more than two devices Saeed Mahameed
13 siblings, 0 replies; 18+ messages in thread
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch,
Vlad Buslov, Roi Dayan
From: Shay Drory <shayd@nvidia.com>
Introduce generic APIs which retrieve all peers.
These APIs replace mlx5_devcom_get/release_peer_data, which retrieve
only a single peer.
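A usage sketch of the new iteration API, modeled on the call sites in
the diff below (do_something() is a placeholder):

struct mlx5_eswitch *peer_esw;
int i;

/* takes the component lock and checks that the component is ready */
if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS))
	return -ENODEV;

mlx5_devcom_for_each_peer_entry(devcom, MLX5_DEVCOM_ESW_OFFLOADS,
				peer_esw, i)
	do_something(peer_esw);	/* runs once per registered peer */

mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);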
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/en_rep.c | 92 +++++++++++--------
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 44 +++++----
.../ethernet/mellanox/mlx5/core/esw/bridge.c | 30 ++++--
.../mellanox/mlx5/core/esw/bridge_mcast.c | 21 ++++-
.../net/ethernet/mellanox/mlx5/core/eswitch.h | 7 ++
.../ethernet/mellanox/mlx5/core/lib/devcom.c | 89 ++++++++++++------
.../ethernet/mellanox/mlx5/core/lib/devcom.h | 23 ++++-
7 files changed, 209 insertions(+), 97 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 45e9e7b383dc..0af47eaec893 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -397,25 +397,64 @@ static void mlx5e_sqs2vport_stop(struct mlx5_eswitch *esw,
}
}
+static int mlx5e_sqs2vport_add_peers_rules(struct mlx5_eswitch *esw, struct mlx5_eswitch_rep *rep,
+ struct mlx5_devcom *devcom,
+ struct mlx5e_rep_sq *rep_sq, int i)
+{
+ struct mlx5_eswitch *peer_esw = NULL;
+ struct mlx5_flow_handle *flow_rule;
+ int tmp;
+
+ mlx5_devcom_for_each_peer_entry(devcom, MLX5_DEVCOM_ESW_OFFLOADS,
+ peer_esw, tmp) {
+ int peer_rule_idx = mlx5_get_dev_index(peer_esw->dev);
+ struct mlx5e_rep_sq_peer *sq_peer;
+ int err;
+
+ sq_peer = kzalloc(sizeof(*sq_peer), GFP_KERNEL);
+ if (!sq_peer)
+ return -ENOMEM;
+
+ flow_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw,
+ rep, rep_sq->sqn);
+ if (IS_ERR(flow_rule)) {
+ kfree(sq_peer);
+ return PTR_ERR(flow_rule);
+ }
+
+ sq_peer->rule = flow_rule;
+ sq_peer->peer = peer_esw;
+ err = xa_insert(&rep_sq->sq_peer, peer_rule_idx, sq_peer, GFP_KERNEL);
+ if (err) {
+ kfree(sq_peer);
+ mlx5_eswitch_del_send_to_vport_rule(flow_rule);
+ return err;
+ }
+ }
+
+ return 0;
+}
+
static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
struct mlx5_eswitch_rep *rep,
u32 *sqns_array, int sqns_num)
{
- struct mlx5_eswitch *peer_esw = NULL;
struct mlx5_flow_handle *flow_rule;
- struct mlx5e_rep_sq_peer *sq_peer;
struct mlx5e_rep_priv *rpriv;
struct mlx5e_rep_sq *rep_sq;
+ struct mlx5_devcom *devcom;
+ bool devcom_locked = false;
int err;
int i;
if (esw->mode != MLX5_ESWITCH_OFFLOADS)
return 0;
+ devcom = esw->dev->priv.devcom;
rpriv = mlx5e_rep_to_rep_priv(rep);
- if (mlx5_devcom_comp_is_ready(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS))
- peer_esw = mlx5_devcom_get_peer_data(esw->dev->priv.devcom,
- MLX5_DEVCOM_ESW_OFFLOADS);
+ if (mlx5_devcom_comp_is_ready(devcom, MLX5_DEVCOM_ESW_OFFLOADS) &&
+ mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS))
+ devcom_locked = true;
for (i = 0; i < sqns_num; i++) {
rep_sq = kzalloc(sizeof(*rep_sq), GFP_KERNEL);
@@ -423,7 +462,6 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
err = -ENOMEM;
goto out_err;
}
- xa_init(&rep_sq->sq_peer);
/* Add re-inject rule to the PF/representor sqs */
flow_rule = mlx5_eswitch_add_send_to_vport_rule(esw, esw, rep,
@@ -436,48 +474,30 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
rep_sq->send_to_vport_rule = flow_rule;
rep_sq->sqn = sqns_array[i];
- if (peer_esw) {
- int peer_rule_idx = mlx5_get_dev_index(peer_esw->dev);
-
- sq_peer = kzalloc(sizeof(*sq_peer), GFP_KERNEL);
- if (!sq_peer)
- goto out_sq_peer_err;
-
- flow_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw,
- rep, sqns_array[i]);
- if (IS_ERR(flow_rule)) {
- err = PTR_ERR(flow_rule);
- goto out_flow_rule_err;
+ xa_init(&rep_sq->sq_peer);
+ if (devcom_locked) {
+ err = mlx5e_sqs2vport_add_peers_rules(esw, rep, devcom, rep_sq, i);
+ if (err) {
+ mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule);
+ xa_destroy(&rep_sq->sq_peer);
+ kfree(rep_sq);
+ goto out_err;
}
-
- sq_peer->rule = flow_rule;
- sq_peer->peer = peer_esw;
- err = xa_insert(&rep_sq->sq_peer, peer_rule_idx, sq_peer, GFP_KERNEL);
- if (err)
- goto out_xa_err;
}
list_add(&rep_sq->list, &rpriv->vport_sqs_list);
}
- if (peer_esw)
- mlx5_devcom_release_peer_data(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ if (devcom_locked)
+ mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
return 0;
-out_xa_err:
- mlx5_eswitch_del_send_to_vport_rule(flow_rule);
-out_flow_rule_err:
- kfree(sq_peer);
-out_sq_peer_err:
- mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule);
- xa_destroy(&rep_sq->sq_peer);
- kfree(rep_sq);
out_err:
mlx5e_sqs2vport_stop(esw, rep);
- if (peer_esw)
- mlx5_devcom_release_peer_data(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ if (devcom_locked)
+ mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
return err;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index f4ab00d84691..752b82553bc3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -1670,6 +1670,7 @@ int mlx5e_tc_query_route_vport(struct net_device *out_dev, struct net_device *ro
struct mlx5_eswitch *esw;
u16 vhca_id;
int err;
+ int i;
out_priv = netdev_priv(out_dev);
esw = out_priv->mdev->priv.eswitch;
@@ -1686,8 +1687,13 @@ int mlx5e_tc_query_route_vport(struct net_device *out_dev, struct net_device *ro
rcu_read_lock();
devcom = out_priv->mdev->priv.devcom;
- esw = mlx5_devcom_get_peer_data_rcu(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
- err = esw ? mlx5_eswitch_vhca_id_to_vport(esw, vhca_id, vport) : -ENODEV;
+ err = -ENODEV;
+ mlx5_devcom_for_each_peer_entry_rcu(devcom, MLX5_DEVCOM_ESW_OFFLOADS,
+ esw, i) {
+ err = mlx5_eswitch_vhca_id_to_vport(esw, vhca_id, vport);
+ if (!err)
+ break;
+ }
rcu_read_unlock();
return err;
@@ -2110,15 +2116,14 @@ static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
{
if (mlx5e_is_eswitch_flow(flow)) {
struct mlx5_devcom *devcom = flow->priv->mdev->priv.devcom;
- struct mlx5_eswitch *peer_esw;
- peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
- if (!peer_esw) {
+ if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS)) {
mlx5e_tc_del_fdb_flow(priv, flow);
return;
}
+
mlx5e_tc_del_fdb_peers_flow(flow);
- mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
mlx5e_tc_del_fdb_flow(priv, flow);
} else {
mlx5e_tc_del_nic_flow(priv, flow);
@@ -4555,6 +4560,7 @@ mlx5e_add_fdb_flow(struct mlx5e_priv *priv,
struct mlx5_eswitch *peer_esw;
struct mlx5e_tc_flow *flow;
int err;
+ int i;
flow = __mlx5e_add_fdb_flow(priv, f, flow_flags, filter_dev, in_rep,
in_mdev);
@@ -4566,23 +4572,27 @@ mlx5e_add_fdb_flow(struct mlx5e_priv *priv,
return 0;
}
- peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
- if (!peer_esw) {
+ if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS)) {
err = -ENODEV;
goto clean_flow;
}
- err = mlx5e_tc_add_fdb_peer_flow(f, flow, flow_flags, peer_esw);
- if (err)
- goto peer_clean;
- mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ mlx5_devcom_for_each_peer_entry(devcom,
+ MLX5_DEVCOM_ESW_OFFLOADS,
+ peer_esw, i) {
+ err = mlx5e_tc_add_fdb_peer_flow(f, flow, flow_flags, peer_esw);
+ if (err)
+ goto peer_clean;
+ }
- *__flow = flow;
+ mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ *__flow = flow;
return 0;
peer_clean:
- mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ mlx5e_tc_del_fdb_peers_flow(flow);
+ mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
clean_flow:
mlx5e_tc_del_fdb_flow(priv, flow);
return err;
@@ -4802,7 +4812,6 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
{
struct mlx5_devcom *devcom = priv->mdev->priv.devcom;
struct rhashtable *tc_ht = get_tc_ht(priv, flags);
- struct mlx5_eswitch *peer_esw;
struct mlx5e_tc_flow *flow;
struct mlx5_fc *counter;
u64 lastuse = 0;
@@ -4837,8 +4846,7 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
/* Under multipath it's possible for one rule to be currently
* un-offloaded while the other rule is offloaded.
*/
- peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
- if (!peer_esw)
+ if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS))
goto out;
if (flow_flag_test(flow, DUP)) {
@@ -4869,7 +4877,7 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
}
no_peer_counter:
- mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
out:
flow_stats_update(&f->stats, bytes, packets, 0, lastuse,
FLOW_ACTION_HW_STATS_DELAYED);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 1ba03e219111..bea7cc645461 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -647,22 +647,35 @@ mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr,
}
static struct mlx5_flow_handle *
-mlx5_esw_bridge_ingress_flow_peer_create(u16 vport_num, const unsigned char *addr,
+mlx5_esw_bridge_ingress_flow_peer_create(u16 vport_num, u16 esw_owner_vhca_id,
+ const unsigned char *addr,
struct mlx5_esw_bridge_vlan *vlan, u32 counter_id,
struct mlx5_esw_bridge *bridge)
{
struct mlx5_devcom *devcom = bridge->br_offloads->esw->dev->priv.devcom;
+ struct mlx5_eswitch *tmp, *peer_esw = NULL;
static struct mlx5_flow_handle *handle;
- struct mlx5_eswitch *peer_esw;
+ int i;
- peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
- if (!peer_esw)
+ if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS))
return ERR_PTR(-ENODEV);
+ mlx5_devcom_for_each_peer_entry(devcom,
+ MLX5_DEVCOM_ESW_OFFLOADS,
+ tmp, i) {
+ if (mlx5_esw_is_owner(tmp, vport_num, esw_owner_vhca_id)) {
+ peer_esw = tmp;
+ break;
+ }
+ }
+ if (!peer_esw) {
+ mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ return ERR_PTR(-ENODEV);
+ }
+
handle = mlx5_esw_bridge_ingress_flow_with_esw_create(vport_num, addr, vlan, counter_id,
bridge, peer_esw);
-
- mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
return handle;
}
@@ -1369,8 +1382,9 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, u16 esw_ow
entry->ingress_counter = counter;
handle = peer ?
- mlx5_esw_bridge_ingress_flow_peer_create(vport_num, addr, vlan,
- mlx5_fc_id(counter), bridge) :
+ mlx5_esw_bridge_ingress_flow_peer_create(vport_num, esw_owner_vhca_id,
+ addr, vlan, mlx5_fc_id(counter),
+ bridge) :
mlx5_esw_bridge_ingress_flow_create(vport_num, addr, vlan,
mlx5_fc_id(counter), bridge);
if (IS_ERR(handle)) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c
index 2eae594a5e80..2455f8b93c1e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c
@@ -540,16 +540,29 @@ static struct mlx5_flow_handle *
mlx5_esw_bridge_mcast_filter_flow_peer_create(struct mlx5_esw_bridge_port *port)
{
struct mlx5_devcom *devcom = port->bridge->br_offloads->esw->dev->priv.devcom;
+ struct mlx5_eswitch *tmp, *peer_esw = NULL;
static struct mlx5_flow_handle *handle;
- struct mlx5_eswitch *peer_esw;
+ int i;
- peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
- if (!peer_esw)
+ if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS))
return ERR_PTR(-ENODEV);
+ mlx5_devcom_for_each_peer_entry(devcom,
+ MLX5_DEVCOM_ESW_OFFLOADS,
+ tmp, i) {
+ if (mlx5_esw_is_owner(tmp, port->vport_num, port->esw_owner_vhca_id)) {
+ peer_esw = tmp;
+ break;
+ }
+ }
+ if (!peer_esw) {
+ mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ return ERR_PTR(-ENODEV);
+ }
+
handle = mlx5_esw_bridge_mcast_flow_with_esw_create(port, peer_esw);
- mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+ mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
return handle;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index d6e4ca436f39..c42c16d9ccbc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -585,6 +585,13 @@ mlx5_esw_is_manager_vport(const struct mlx5_eswitch *esw, u16 vport_num)
return esw->manager_vport == vport_num;
}
+static inline bool mlx5_esw_is_owner(struct mlx5_eswitch *esw, u16 vport_num,
+ u16 esw_owner_vhca_id)
+{
+ return esw_owner_vhca_id == MLX5_CAP_GEN(esw->dev, vhca_id) ||
+ (vport_num == MLX5_VPORT_UPLINK && mlx5_lag_is_master(esw->dev));
+}
+
static inline u16 mlx5_eswitch_first_host_vport_num(struct mlx5_core_dev *dev)
{
return mlx5_core_is_ecpf_esw_manager(dev) ?
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
index 7446900a589e..96a3b7b9a5cd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
@@ -239,55 +239,92 @@ bool mlx5_devcom_comp_is_ready(struct mlx5_devcom *devcom,
return READ_ONCE(devcom->priv->components[id].ready);
}
-void *mlx5_devcom_get_peer_data(struct mlx5_devcom *devcom,
- enum mlx5_devcom_components id)
+bool mlx5_devcom_for_each_peer_begin(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id)
{
struct mlx5_devcom_component *comp;
- int i;
if (IS_ERR_OR_NULL(devcom))
- return NULL;
+ return false;
comp = &devcom->priv->components[id];
down_read(&comp->sem);
if (!READ_ONCE(comp->ready)) {
up_read(&comp->sem);
- return NULL;
+ return false;
}
- for (i = 0; i < MLX5_DEVCOM_PORTS_SUPPORTED; i++)
- if (i != devcom->idx)
- break;
+ return true;
+}
+
+void mlx5_devcom_for_each_peer_end(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id)
+{
+ struct mlx5_devcom_component *comp = &devcom->priv->components[id];
- return rcu_dereference_protected(comp->device[i].data, lockdep_is_held(&comp->sem));
+ up_read(&comp->sem);
}
-void *mlx5_devcom_get_peer_data_rcu(struct mlx5_devcom *devcom, enum mlx5_devcom_components id)
+void *mlx5_devcom_get_next_peer_data(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id,
+ int *i)
{
struct mlx5_devcom_component *comp;
- int i;
+ void *ret;
+ int idx;
- if (IS_ERR_OR_NULL(devcom))
- return NULL;
+ comp = &devcom->priv->components[id];
- for (i = 0; i < MLX5_DEVCOM_PORTS_SUPPORTED; i++)
- if (i != devcom->idx)
- break;
+ if (*i == MLX5_DEVCOM_PORTS_SUPPORTED)
+ return NULL;
+ for (idx = *i; idx < MLX5_DEVCOM_PORTS_SUPPORTED; idx++) {
+ if (idx != devcom->idx) {
+ ret = rcu_dereference_protected(comp->device[idx].data,
+ lockdep_is_held(&comp->sem));
+ if (ret)
+ break;
+ }
+ }
- comp = &devcom->priv->components[id];
- /* This can change concurrently, however 'data' pointer will remain
- * valid for the duration of RCU read section.
- */
- if (!READ_ONCE(comp->ready))
+ if (idx == MLX5_DEVCOM_PORTS_SUPPORTED) {
+ *i = idx;
return NULL;
+ }
+ *i = idx + 1;
- return rcu_dereference(comp->device[i].data);
+ return ret;
}
-void mlx5_devcom_release_peer_data(struct mlx5_devcom *devcom,
- enum mlx5_devcom_components id)
+void *mlx5_devcom_get_next_peer_data_rcu(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id,
+ int *i)
{
- struct mlx5_devcom_component *comp = &devcom->priv->components[id];
+ struct mlx5_devcom_component *comp;
+ void *ret;
+ int idx;
- up_read(&comp->sem);
+ comp = &devcom->priv->components[id];
+
+ if (*i == MLX5_DEVCOM_PORTS_SUPPORTED)
+ return NULL;
+ for (idx = *i; idx < MLX5_DEVCOM_PORTS_SUPPORTED; idx++) {
+ if (idx != devcom->idx) {
+ /* This can change concurrently, however 'data' pointer will remain
+ * valid for the duration of RCU read section.
+ */
+ if (!READ_ONCE(comp->ready))
+ return NULL;
+ ret = rcu_dereference(comp->device[idx].data);
+ if (ret)
+ break;
+ }
+ }
+
+ if (idx == MLX5_DEVCOM_PORTS_SUPPORTED) {
+ *i = idx;
+ return NULL;
+ }
+ *i = idx + 1;
+
+ return ret;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
index d465de8459b4..b7f72f1a5367 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
@@ -39,11 +39,24 @@ void mlx5_devcom_comp_set_ready(struct mlx5_devcom *devcom,
bool mlx5_devcom_comp_is_ready(struct mlx5_devcom *devcom,
enum mlx5_devcom_components id);
-void *mlx5_devcom_get_peer_data(struct mlx5_devcom *devcom,
- enum mlx5_devcom_components id);
-void *mlx5_devcom_get_peer_data_rcu(struct mlx5_devcom *devcom, enum mlx5_devcom_components id);
-void mlx5_devcom_release_peer_data(struct mlx5_devcom *devcom,
+bool mlx5_devcom_for_each_peer_begin(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id);
+void mlx5_devcom_for_each_peer_end(struct mlx5_devcom *devcom,
enum mlx5_devcom_components id);
+void *mlx5_devcom_get_next_peer_data(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id, int *i);
-#endif
+#define mlx5_devcom_for_each_peer_entry(devcom, id, data, i) \
+ for (i = 0, data = mlx5_devcom_get_next_peer_data(devcom, id, &i); \
+ data; \
+ data = mlx5_devcom_get_next_peer_data(devcom, id, &i))
+
+void *mlx5_devcom_get_next_peer_data_rcu(struct mlx5_devcom *devcom,
+ enum mlx5_devcom_components id, int *i);
+#define mlx5_devcom_for_each_peer_entry_rcu(devcom, id, data, i) \
+ for (i = 0, data = mlx5_devcom_get_next_peer_data_rcu(devcom, id, &i); \
+ data; \
+ data = mlx5_devcom_get_next_peer_data_rcu(devcom, id, &i))
+
+#endif
--
2.40.1
* [net-next 14/14] net/mlx5: Devcom, extend mlx5_devcom_send_event to work with more than two devices
2023-06-01 6:01 [pull request][net-next 00/14] mlx5 updates 2023-05-31 Saeed Mahameed
` (12 preceding siblings ...)
2023-06-01 6:01 ` [net-next 13/14] net/mlx5: Devcom, introduce devcom_for_each_peer_entry Saeed Mahameed
@ 2023-06-01 6:01 ` Saeed Mahameed
13 siblings, 0 replies; 18+ messages in thread
From: Saeed Mahameed @ 2023-06-01 6:01 UTC (permalink / raw)
To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Mark Bloch,
Roi Dayan
From: Shay Drory <shayd@nvidia.com>
mlx5_devcom_send_event is used to send an event from one eswitch to
the other. In other words, only one event is sent, which means no
error handling mechanism is needed.
However, in case devcom holds more than two eswitches, a proper error
mechanism is needed. Hence, in case of error, devcom performs the
unwind, since devcom knows how many events were delivered
successfully.
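A condensed sketch of the dispatch-with-unwind flow (the externs below
stand in for MLX5_DEVCOM_PORTS_SUPPORTED, devcom->idx, the component's
device array and its handler; see the devcom.c hunk for the real code):

extern int n_devices, self_idx;
extern void *device_data[];
extern int (*handler)(int event, void *data, void *event_data);

static int send_event_sketch(int event, int rollback_event, void *event_data)
{
	int err = 0;
	int i;

	for (i = 0; i < n_devices; i++) {
		if (i == self_idx || !device_data[i])
			continue;
		err = handler(event, device_data[i], event_data);
		if (err)
			goto rollback;
	}
	return 0;

rollback:
	/* replay the rollback event only to handlers that already ran */
	while (i--)
		if (i != self_idx && device_data[i])
			handler(rollback_event, device_data[i], event_data);
	return err;
}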
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
.../mellanox/mlx5/core/eswitch_offloads.c | 4 +++-
.../ethernet/mellanox/mlx5/core/lib/devcom.c | 17 +++++++++++++++--
.../ethernet/mellanox/mlx5/core/lib/devcom.h | 2 +-
3 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 09367a320741..29de4e759f4f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2892,7 +2892,8 @@ void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw)
esw->num_peers = 0;
mlx5_devcom_send_event(devcom,
MLX5_DEVCOM_ESW_OFFLOADS,
- ESW_OFFLOADS_DEVCOM_PAIR, esw);
+ ESW_OFFLOADS_DEVCOM_PAIR,
+ ESW_OFFLOADS_DEVCOM_UNPAIR, esw);
}
void mlx5_esw_offloads_devcom_cleanup(struct mlx5_eswitch *esw)
@@ -2906,6 +2907,7 @@ void mlx5_esw_offloads_devcom_cleanup(struct mlx5_eswitch *esw)
return;
mlx5_devcom_send_event(devcom, MLX5_DEVCOM_ESW_OFFLOADS,
+ ESW_OFFLOADS_DEVCOM_UNPAIR,
ESW_OFFLOADS_DEVCOM_UNPAIR, esw);
mlx5_devcom_unregister_component(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
index 96a3b7b9a5cd..8472bbb3cd58 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
@@ -193,7 +193,7 @@ void mlx5_devcom_unregister_component(struct mlx5_devcom *devcom,
int mlx5_devcom_send_event(struct mlx5_devcom *devcom,
enum mlx5_devcom_components id,
- int event,
+ int event, int rollback_event,
void *event_data)
{
struct mlx5_devcom_component *comp;
@@ -210,10 +210,23 @@ int mlx5_devcom_send_event(struct mlx5_devcom *devcom,
if (i != devcom->idx && data) {
err = comp->handler(event, data, event_data);
- break;
+ if (err)
+ goto rollback;
}
}
+ up_write(&comp->sem);
+ return 0;
+
+rollback:
+ while (i--) {
+ void *data = rcu_dereference_protected(comp->device[i].data,
+ lockdep_is_held(&comp->sem));
+
+ if (i != devcom->idx && data)
+ comp->handler(rollback_event, data, event_data);
+ }
+
up_write(&comp->sem);
return err;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
index b7f72f1a5367..bb1970ba8730 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
@@ -30,7 +30,7 @@ void mlx5_devcom_unregister_component(struct mlx5_devcom *devcom,
int mlx5_devcom_send_event(struct mlx5_devcom *devcom,
enum mlx5_devcom_components id,
- int event,
+ int event, int rollback_event,
void *event_data);
void mlx5_devcom_comp_set_ready(struct mlx5_devcom *devcom,
--
2.40.1
* Re: [net-next 03/14] net/mlx5e: rep, store send to vport rules per peer
2023-06-01 6:01 ` [net-next 03/14] net/mlx5e: rep, store send to vport rules per peer Saeed Mahameed
@ 2023-06-02 16:02 ` Simon Horman
2023-06-02 18:46 ` Saeed Mahameed
0 siblings, 1 reply; 18+ messages in thread
From: Simon Horman @ 2023-06-02 16:02 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Mark Bloch, Shay Drory,
Roi Dayan
On Wed, May 31, 2023 at 11:01:07PM -0700, Saeed Mahameed wrote:
> From: Mark Bloch <mbloch@nvidia.com>
>
> Each representor, for each send queue, is holding a
> send_to_vport rule for the peer eswitch.
>
> In order to support more than one peer, and to map between the peer
> rules and peer eswitches, refactor representor to hold both the peer
> rules and pointers to the peer eswitches.
> This enables mlx5 to store send_to_vport rules per peer, where each
> peer has a dedicated index via mlx5_get_dev_index().
>
> Signed-off-by: Mark Bloch <mbloch@nvidia.com>
> Signed-off-by: Shay Drory <shayd@nvidia.com>
> Reviewed-by: Roi Dayan <roid@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
...
> @@ -426,15 +437,24 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
> rep_sq->sqn = sqns_array[i];
>
> if (peer_esw) {
> + int peer_rule_idx = mlx5_get_dev_index(peer_esw->dev);
> +
> + sq_peer = kzalloc(sizeof(*sq_peer), GFP_KERNEL);
> + if (!sq_peer)
> + goto out_sq_peer_err;
Hi Mark and Saeed,
Jumping to out_sq_peer_err will return err.
But err seems to be uninitialised here.
> +
> flow_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw,
> rep, sqns_array[i]);
> if (IS_ERR(flow_rule)) {
> err = PTR_ERR(flow_rule);
> - mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule);
> - kfree(rep_sq);
> - goto out_err;
> + goto out_flow_rule_err;
> }
> - rep_sq->send_to_vport_rule_peer = flow_rule;
> +
> + sq_peer->rule = flow_rule;
> + sq_peer->peer = peer_esw;
> + err = xa_insert(&rep_sq->sq_peer, peer_rule_idx, sq_peer, GFP_KERNEL);
> + if (err)
> + goto out_xa_err;
> }
>
> list_add(&rep_sq->list, &rpriv->vport_sqs_list);
> @@ -445,6 +465,14 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
>
> return 0;
>
> +out_xa_err:
> + mlx5_eswitch_del_send_to_vport_rule(flow_rule);
> +out_flow_rule_err:
> + kfree(sq_peer);
> +out_sq_peer_err:
> + mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule);
> + xa_destroy(&rep_sq->sq_peer);
> + kfree(rep_sq);
> out_err:
> mlx5e_sqs2vport_stop(esw, rep);
>
...
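One minimal way to address this (an illustrative sketch, not
necessarily the committed fix) would be to set err before the jump:

	sq_peer = kzalloc(sizeof(*sq_peer), GFP_KERNEL);
	if (!sq_peer) {
		err = -ENOMEM;
		goto out_sq_peer_err;
	}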
* Re: [net-next 03/14] net/mlx5e: rep, store send to vport rules per peer
2023-06-02 16:02 ` Simon Horman
@ 2023-06-02 18:46 ` Saeed Mahameed
2023-06-03 7:25 ` Simon Horman
0 siblings, 1 reply; 18+ messages in thread
From: Saeed Mahameed @ 2023-06-02 18:46 UTC (permalink / raw)
To: Simon Horman
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Mark Bloch, Shay Drory,
Roi Dayan
On 02 Jun 18:02, Simon Horman wrote:
>On Wed, May 31, 2023 at 11:01:07PM -0700, Saeed Mahameed wrote:
>> From: Mark Bloch <mbloch@nvidia.com>
>>
>> Each representor, for each send queue, is holding a
>> send_to_vport rule for the peer eswitch.
>>
>> In order to support more than one peer, and to map between the peer
>> rules and peer eswitches, refactor representor to hold both the peer
>> rules and pointers to the peer eswitches.
>> This enables mlx5 to store send_to_vport rules per peer, where each
>> peer has a dedicated index via mlx5_get_dev_index().
>>
>> Signed-off-by: Mark Bloch <mbloch@nvidia.com>
>> Signed-off-by: Shay Drory <shayd@nvidia.com>
>> Reviewed-by: Roi Dayan <roid@nvidia.com>
>> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
>
>...
>
>> @@ -426,15 +437,24 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
>> rep_sq->sqn = sqns_array[i];
>>
>> if (peer_esw) {
>> + int peer_rule_idx = mlx5_get_dev_index(peer_esw->dev);
>> +
>> + sq_peer = kzalloc(sizeof(*sq_peer), GFP_KERNEL);
>> + if (!sq_peer)
>> + goto out_sq_peer_err;
>
>Hi Mark and Saeed,
>
>Jumping to out_sq_peer_err will return err.
>But err seems to be uninitialised here.
>
Thanks Simon, we change this logic in a later refactoring patch:
"net/mlx5: Devcom, introduce devcom_for_each_peer_entry"
where this issue doesn't exist anymore, but I will fix it anyway.
Thanks,
Saeed.
* Re: [net-next 03/14] net/mlx5e: rep, store send to vport rules per peer
2023-06-02 18:46 ` Saeed Mahameed
@ 2023-06-03 7:25 ` Simon Horman
0 siblings, 0 replies; 18+ messages in thread
From: Simon Horman @ 2023-06-03 7:25 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
Saeed Mahameed, netdev, Tariq Toukan, Mark Bloch, Shay Drory,
Roi Dayan
On Fri, Jun 02, 2023 at 11:46:31AM -0700, Saeed Mahameed wrote:
> On 02 Jun 18:02, Simon Horman wrote:
> > On Wed, May 31, 2023 at 11:01:07PM -0700, Saeed Mahameed wrote:
> > > From: Mark Bloch <mbloch@nvidia.com>
> > >
> > > Each representor, for each send queue, is holding a
> > > send_to_vport rule for the peer eswitch.
> > >
> > > In order to support more than one peer, and to map between the peer
> > > rules and peer eswitches, refactor representor to hold both the peer
> > > rules and pointers to the peer eswitches.
> > > This enables mlx5 to store send_to_vport rules per peer, where each
> > > peer has a dedicated index via mlx5_get_dev_index().
> > >
> > > Signed-off-by: Mark Bloch <mbloch@nvidia.com>
> > > Signed-off-by: Shay Drory <shayd@nvidia.com>
> > > Reviewed-by: Roi Dayan <roid@nvidia.com>
> > > Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> >
> > ...
> >
> > > @@ -426,15 +437,24 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
> > > rep_sq->sqn = sqns_array[i];
> > >
> > > if (peer_esw) {
> > > + int peer_rule_idx = mlx5_get_dev_index(peer_esw->dev);
> > > +
> > > + sq_peer = kzalloc(sizeof(*sq_peer), GFP_KERNEL);
> > > + if (!sq_peer)
> > > + goto out_sq_peer_err;
> >
> > Hi Mark and Saeed,
> >
> > Jumping to out_sq_peer_err will return err.
> > But err seems to be uninitialised here.
> >
>
> Thanks Simon, we change this logic in a later refactoring patch:
> "net/mlx5: Devcom, introduce devcom_for_each_peer_entry" where this issue
> doesn't exist anymore, but I will fix it anyway.
Thanks, much appreciated.