Netdev List
* Re: [iproute2, master 1/2] devlink: Print health reporter's dump time-stamp in a helper function
From: Stephen Hemminger @ 2019-08-29 23:25 UTC (permalink / raw)
  To: Aya Levin; +Cc: netdev, Jiri Pirko, Moshe Shemesh
In-Reply-To: <1566471942-28529-2-git-send-email-ayal@mellanox.com>

On Thu, 22 Aug 2019 14:05:41 +0300
Aya Levin <ayal@mellanox.com> wrote:

> Add pr_out_dump_reporter prefix to the helper function's name and
> encapsulate the print in it.
> 
> Fixes: 2f1242efe9d0 ("devlink: Add devlink health show command")
> Signed-off-by: Aya Levin <ayal@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>


Looks fine, but devlink needs to be converted from doing JSON
printing its own way to using the common iproute2 libraries.


* Re: [PATCH net-next] net: dsa: mv88e6xxx: keep CMODE writable code private
From: David Miller @ 2019-08-30  0:20 UTC (permalink / raw)
  To: vivien.didelot; +Cc: netdev, marek.behun, f.fainelli, andrew
In-Reply-To: <20190828162659.10306-1-vivien.didelot@gmail.com>

From: Vivien Didelot <vivien.didelot@gmail.com>
Date: Wed, 28 Aug 2019 12:26:59 -0400

> This is a follow-up patch for commit 7a3007d22e8d ("net: dsa:
> mv88e6xxx: fully support SERDES on Topaz family").
> 
> Since .port_set_cmode is only called from mv88e6xxx_port_setup_mac and
> mv88e6xxx_phylink_mac_config, it is fine to keep this "make writable"
> code private to the mv88e6341_port_set_cmode implementation, instead
> of adding yet another operation to the switch info structure.
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>

Applied.


* Re: [PATCH net-next] net: dsa: mv88e6xxx: get serdes lane after lock
From: David Miller @ 2019-08-30  0:20 UTC (permalink / raw)
  To: vivien.didelot; +Cc: netdev, marek.behun, f.fainelli, andrew
In-Reply-To: <20190828162611.10064-1-vivien.didelot@gmail.com>

From: Vivien Didelot <vivien.didelot@gmail.com>
Date: Wed, 28 Aug 2019 12:26:11 -0400

> This is a follow-up patch for commit 17deaf5cb37a ("net: dsa:
> mv88e6xxx: create serdes_get_lane chip operation").
> 
> The .serdes_get_lane implementations access the CMODE of a port.
> Even though it is cached at the moment, it is safer to call them
> after the mutex is locked, not before.
> 
> At the same time, check for an eventual error and return IRQ_DONE,
> instead of blindly ignoring it.
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>

Applied.


* Re: [PATCH bpf-next 00/13] bpf: adding map batch processing support
From: Jakub Kicinski @ 2019-08-30  0:15 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Yonghong Song, Alexei Starovoitov, bpf, netdev, Daniel Borkmann,
	kernel-team
In-Reply-To: <CAMzD94S87BD0HnjjHVmhMPQ3UijS+oNu+H7NtMN8z8EAexgFtg@mail.gmail.com>

On Thu, 29 Aug 2019 16:13:59 -0700, Brian Vazquez wrote:
> > We need a per-map implementation of the exec side, but roughly maps
> > would do:
> >
> >         LIST_HEAD(deleted);
> >
> >         for entry in map {
> >                 struct map_op_ctx {
> >                         .key    = entry->key,
> >                         .value  = entry->value,
> >                 };
> >
> >                 act = BPF_PROG_RUN(filter, &map_op_ctx);
> >                 if (act & ~ACT_BITS)
> >                         return -EINVAL;
> >
> >                 if (act & DELETE) {
> >                         map_unlink(entry);
> >                         list_add(entry, &deleted);
> >                 }
> >                 if (act & STOP)
> >                         break;
> >         }
> >
> >         synchronize_rcu();
> >
> >         for entry in deleted {
> >                 struct map_op_ctx {
> >                         .key    = entry->key,
> >                         .value  = entry->value,
> >                 };
> >
> >                 BPF_PROG_RUN(dumper, &map_op_ctx);
> >                 map_free(entry);
> >         }
> >  
> Hi Jakub,
> 
> how would that approach support percpu maps?
> 
> I'm thinking of a scenario where you want to do some calculations on
> percpu maps and you are interested in the info on all the cpus, not
> just the one that is running the bpf program. Currently on a pcpu map
> the bpf_map_lookup_elem helper only returns the pointer to the data of
> the executing cpu.

Right, we need to have the iteration outside of the bpf program itself,
and pass the element in through the context. That way we can feed each
per cpu entry into the program separately.
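
A minimal userspace sketch of that idea, not the kernel implementation: the
loop over CPUs lives in the caller, and the program only ever sees one
per-CPU slot at a time through its context. The names pcpu_entry,
run_on_each_cpu, sum_prog and NR_CPUS_DEMO, and the cpu field added to
map_op_ctx, are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_DEMO 4	/* hypothetical fixed CPU count for this sketch */

/* Hypothetical context handed to the program for one per-CPU slot. */
struct map_op_ctx {
	const uint32_t *key;
	const uint64_t *value;
	unsigned int cpu;
};

/* Hypothetical per-CPU map entry: one value slot per CPU. */
struct pcpu_entry {
	uint32_t key;
	uint64_t values[NR_CPUS_DEMO];
};

/* Stand-in for a filter/dumper program; priv is caller-owned state. */
typedef int (*prog_fn)(const struct map_op_ctx *ctx, void *priv);

/*
 * The iteration is done here, outside the program: every CPU's slot is
 * fed to the program separately. A non-zero return stops the walk.
 */
static int run_on_each_cpu(const struct pcpu_entry *e, prog_fn prog, void *priv)
{
	assert(e != NULL && prog != NULL);

	for (unsigned int cpu = 0; cpu < NR_CPUS_DEMO; cpu++) {
		struct map_op_ctx ctx = {
			.key	= &e->key,
			.value	= &e->values[cpu],
			.cpu	= cpu,
		};
		int ret = prog(&ctx, priv);

		if (ret)
			return ret;
	}
	return 0;
}

/* Example program: accumulate the values seen on all CPUs. */
static int sum_prog(const struct map_op_ctx *ctx, void *priv)
{
	*(uint64_t *)priv += *ctx->value;
	return 0;
}
```

Calling run_on_each_cpu(&entry, sum_prog, &total) on an entry whose slots
hold 1, 2, 3 and 4 leaves total at 10, without the program needing a helper
that can reach other CPUs' data.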


* Re: [PATCH net 1/3] taprio: Fix kernel panic in taprio_destroy
From: David Miller @ 2019-08-30  0:07 UTC (permalink / raw)
  To: olteanv
  Cc: jhs, xiyou.wangcong, jiri, vinicius.gomes, vedang.patel,
	leandro.maciel.dorileo, netdev
In-Reply-To: <20190828144829.32570-2-olteanv@gmail.com>

From: Vladimir Oltean <olteanv@gmail.com>
Date: Wed, 28 Aug 2019 17:48:27 +0300

> taprio_init may fail earlier than this line:
> 
> 	list_add(&q->taprio_list, &taprio_list);
> 
> e.g. due to the net device not being multi queue.
> 
> Attempting to remove q from the global taprio_list when it is not part
> of it will result in a kernel panic.
> 
> Fix it by iterating through the list and removing it only if found.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

I don't like this solution for two reasons: I think it's actually
error prone, and now every taprio_destroy() eats the cost of traversing
the entire list.

The whole reason to use a list head is O(1) removal.

Just init the list head early in the creation path; then the list_del()
just works.
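
A userspace sketch of that suggestion, assuming a simplified re-creation of
the kernel's circular doubly linked list. The demo_* names are hypothetical,
and list_del() here re-initializes the node instead of poisoning its
pointers as the kernel version does. A node that was initialized but never
added points at itself, so unlinking it is a harmless O(1) no-op.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

/* O(1) unlink; also safe on a self-linked (never added) node. */
static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);	/* the kernel poisons the pointers instead */
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Hypothetical qdisc-like object with a node on a global list. */
struct demo_sched {
	struct list_head node;
};

static struct list_head demo_global_list = {
	&demo_global_list, &demo_global_list
};

static int demo_init(struct demo_sched *q, bool multiqueue)
{
	init_list_head(&q->node);	/* done first, before any failure path */

	if (!multiqueue)
		return -1;		/* early failure: node never added */

	list_add(&q->node, &demo_global_list);
	return 0;
}

static void demo_destroy(struct demo_sched *q)
{
	list_del(&q->node);	/* no list traversal, no "was it added?" check */
}
```

With this ordering, demo_destroy() behaves the same after a failed
demo_init() as after a successful one, which is the property the reply asks
for.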


* Re: [PATCH net-next 00/12] net: hns3: add some cleanups and optimizations
From: David Miller @ 2019-08-29 23:58 UTC (permalink / raw)
  To: tanhuazhong; +Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm
In-Reply-To: <1567002196-63242-1-git-send-email-tanhuazhong@huawei.com>

From: Huazhong Tan <tanhuazhong@huawei.com>
Date: Wed, 28 Aug 2019 22:23:04 +0800

> This patch-set includes cleanups, optimizations and a bugfix for
> the HNS3 ethernet controller driver.
 ...

Series applied, thanks.


* Re: [PATCH net-next v3 3/3] dpaa2-eth: Add pause frame support
From: David Miller @ 2019-08-29 23:54 UTC (permalink / raw)
  To: ruxandra.radulescu; +Cc: netdev, andrew, ioana.ciornei
In-Reply-To: <1567001295-31801-3-git-send-email-ruxandra.radulescu@nxp.com>

From: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Date: Wed, 28 Aug 2019 17:08:15 +0300

> Starting with firmware version MC10.18.0, we have support for
> L2 flow control. Asymmetrical configuration (Rx or Tx only) is
> supported, but not pause frame autonegotiation.
> 
> Pause frame configuration is done via ethtool. By default, we start
> with flow control enabled on both Rx and Tx. Changes are propagated
> to hardware through firmware commands, using two flags (PAUSE,
> ASYM_PAUSE) to specify Rx and Tx pause configuration, as follows:
> 
> PAUSE | ASYM_PAUSE | Rx pause | Tx pause
> ----------------------------------------
>   0   |     0      | disabled | disabled
>   0   |     1      | disabled | enabled
>   1   |     0      | enabled  | enabled
>   1   |     1      | enabled  | disabled
> 
> The hardware can automatically send pause frames when the number
> of buffers in the pool goes below a predefined threshold. Due to
> this, flow control is incompatible with Rx frame queue taildrop
> (both mechanisms target the case when processing of ingress
> frames can't keep up with the Rx rate; for large frames, the number
> of buffers in the pool may never get low enough to trigger pause
> frames as long as taildrop is enabled). So we set pause frame
> generation and Rx FQ taildrop as mutually exclusive.
> 
> Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Applied.
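
The flag table in the quoted commit message reduces to two expressions: Rx
pause follows PAUSE, and Tx pause is PAUSE xor ASYM_PAUSE. The sketch below
only restates that table; struct pause_cfg, decode_pause and encode_pause
are hypothetical names, not functions from the dpaa2-eth driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical holder for the resulting Rx/Tx pause state. */
struct pause_cfg {
	bool rx;
	bool tx;
};

/* Flags -> Rx/Tx state, row for row as in the table above. */
static struct pause_cfg decode_pause(bool pause, bool asym_pause)
{
	struct pause_cfg cfg = {
		.rx = pause,
		.tx = pause != asym_pause,	/* boolean xor */
	};

	return cfg;
}

/* Rx/Tx state -> flags, the inverse of decode_pause(). */
static void encode_pause(bool rx, bool tx, bool *pause, bool *asym_pause)
{
	assert(pause != NULL && asym_pause != NULL);

	*pause = rx;
	*asym_pause = rx != tx;
}
```

For example, decode_pause(true, true) gives Rx enabled and Tx disabled,
matching the last row of the table.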


* Re: [PATCH net-next v3 2/3] dpaa2-eth: Use stored link settings
From: David Miller @ 2019-08-29 23:54 UTC (permalink / raw)
  To: ruxandra.radulescu; +Cc: netdev, andrew, ioana.ciornei
In-Reply-To: <1567001295-31801-2-git-send-email-ruxandra.radulescu@nxp.com>

From: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Date: Wed, 28 Aug 2019 17:08:14 +0300

> Whenever a link state change occurs, we get notified and save
> the new link settings in the device's private data. In ethtool
> get_link_ksettings, use the stored state instead of interrogating
> the firmware each time.
> 
> Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Applied.


* Re: [PATCH net-next v3 1/3] dpaa2-eth: Remove support for changing link settings
From: David Miller @ 2019-08-29 23:54 UTC (permalink / raw)
  To: ruxandra.radulescu; +Cc: netdev, andrew, ioana.ciornei
In-Reply-To: <1567001295-31801-1-git-send-email-ruxandra.radulescu@nxp.com>

From: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Date: Wed, 28 Aug 2019 17:08:13 +0300

> We only support fixed-link for now, so there is no point in
> offering users the option to change link settings via ethtool.
> 
> Functionally there is no change, since firmware prevents us from
> changing link parameters anyway.
> 
> Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Applied.


* Re: pull-request: mac80211 2019-08-29
From: David Miller @ 2019-08-29 23:44 UTC (permalink / raw)
  To: johannes; +Cc: netdev, linux-wireless
In-Reply-To: <20190829150011.10512-1-johannes@sipsolutions.net>

From: Johannes Berg <johannes@sipsolutions.net>
Date: Thu, 29 Aug 2019 17:00:10 +0200

> We have just three more fixes now, and one of those is a driver fix
> because Kalle is on vacation and I'm covering for him in the meantime.
> 
> Please pull and let me know if there's any problem.

Ok, pulled, thanks.


* [PATCH mlx5-next 5/5] net/mlx5: Set only stag for match untagged packets
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
  To: Saeed Mahameed, Leon Romanovsky
  Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch
In-Reply-To: <20190829234151.9958-1-saeedm@mellanox.com>

From: Mark Bloch <markb@mellanox.com>

cvlan_tag enabled in match criteria and disabled in
match value means both S & C tags don't exist (untagged of both).

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index cc096f6011d9..9e9b41ab392b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -1593,7 +1593,10 @@ static int __parse_cls_flower(struct mlx5e_priv *priv,
 			*match_level = MLX5_MATCH_L2;
 		}
 	} else if (*match_level != MLX5_MATCH_NONE) {
-		MLX5_SET(fte_match_set_lyr_2_4, headers_c, svlan_tag, 1);
+		/* cvlan_tag enabled in match criteria and
+		 * disabled in match value means both S & C tags
+		 * don't exist (untagged of both)
+		 */
 		MLX5_SET(fte_match_set_lyr_2_4, headers_c, cvlan_tag, 1);
 		*match_level = MLX5_MATCH_L2;
 	}
-- 
2.21.0



* [PATCH mlx5-next 4/5] net/mlx5: Add stub for mlx5_eswitch_mode
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
  To: Saeed Mahameed, Leon Romanovsky
  Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Maor Gottlieb,
	Mark Bloch
In-Reply-To: <20190829234151.9958-1-saeedm@mellanox.com>

From: Maor Gottlieb <maorg@mellanox.com>

Return MLX5_ESWITCH_NONE when CONFIG_MLX5_ESWITCH
is not selected.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 include/linux/mlx5/eswitch.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/mlx5/eswitch.h b/include/linux/mlx5/eswitch.h
index 46b5ba029802..825920d3ca40 100644
--- a/include/linux/mlx5/eswitch.h
+++ b/include/linux/mlx5/eswitch.h
@@ -61,7 +61,6 @@ void *mlx5_eswitch_get_proto_dev(struct mlx5_eswitch *esw,
 struct mlx5_eswitch_rep *mlx5_eswitch_vport_rep(struct mlx5_eswitch *esw,
 						u16 vport_num);
 void *mlx5_eswitch_uplink_get_proto_dev(struct mlx5_eswitch *esw, u8 rep_type);
-u8 mlx5_eswitch_mode(struct mlx5_eswitch *esw);
 struct mlx5_flow_handle *
 mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *esw,
 				    u16 vport_num, u32 sqn);
@@ -75,7 +74,14 @@ mlx5_eswitch_get_encap_mode(const struct mlx5_core_dev *dev);
 bool mlx5_eswitch_vport_match_metadata_enabled(const struct mlx5_eswitch *esw);
 u32 mlx5_eswitch_get_vport_metadata_for_match(const struct mlx5_eswitch *esw,
 					      u16 vport_num);
+u8 mlx5_eswitch_mode(struct mlx5_eswitch *esw);
 #else  /* CONFIG_MLX5_ESWITCH */
+
+static inline u8 mlx5_eswitch_mode(struct mlx5_eswitch *esw)
+{
+	return MLX5_ESWITCH_NONE;
+}
+
 static inline enum devlink_eswitch_encap_mode
 mlx5_eswitch_get_encap_mode(const struct mlx5_core_dev *dev)
 {
-- 
2.21.0
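
For readers unfamiliar with the pattern used in the patch above, here is a
standalone sketch of a config-gated stub. DEMO_CONFIG_ESWITCH, the DEMO_*
enum values and demo_eswitch_mode are hypothetical stand-ins for the real
Kconfig symbol and mlx5 definitions; the point is only that callers compile
unchanged whether or not the feature is built.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mode values standing in for the MLX5_ESWITCH_* constants. */
enum {
	DEMO_ESWITCH_NONE,
	DEMO_ESWITCH_LEGACY,
	DEMO_ESWITCH_OFFLOADS,
};

struct demo_eswitch;	/* opaque to callers */

#ifdef DEMO_CONFIG_ESWITCH
/* Feature built in: the real implementation lives in another file. */
unsigned char demo_eswitch_mode(struct demo_eswitch *esw);
#else
/* Feature compiled out: the stub reports "no eswitch" at zero cost. */
static inline unsigned char demo_eswitch_mode(struct demo_eswitch *esw)
{
	(void)esw;
	return DEMO_ESWITCH_NONE;
}
#endif
```

Built without DEMO_CONFIG_ESWITCH defined, demo_eswitch_mode(NULL) returns
DEMO_ESWITCH_NONE, so callers need no #ifdef of their own.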



* [PATCH mlx5-next 3/5] net/mlx5: Avoid disabling RoCE when uninitialized
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
  To: Saeed Mahameed, Leon Romanovsky
  Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Maor Gottlieb,
	Mark Bloch
In-Reply-To: <20190829234151.9958-1-saeedm@mellanox.com>

From: Maor Gottlieb <maorg@mellanox.com>

Move the check for whether RoCE steering is initialized into the
disable RoCE function. This ensures that we disable
RoCE only if we succeeded in enabling it before.

Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/rdma.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/rdma.c b/drivers/net/ethernet/mellanox/mlx5/core/rdma.c
index 18af6981e0be..0fc7de4aa572 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/rdma.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/rdma.c
@@ -14,9 +14,6 @@ static void mlx5_rdma_disable_roce_steering(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_roce *roce = &dev->priv.roce;
 
-	if (!roce->ft)
-		return;
-
 	mlx5_del_flow_rules(roce->allow_rule);
 	mlx5_destroy_flow_group(roce->fg);
 	mlx5_destroy_flow_table(roce->ft);
@@ -145,6 +142,11 @@ static int mlx5_rdma_add_roce_addr(struct mlx5_core_dev *dev)
 
 void mlx5_rdma_disable_roce(struct mlx5_core_dev *dev)
 {
+	struct mlx5_core_roce *roce = &dev->priv.roce;
+
+	if (!roce->ft)
+		return;
+
 	mlx5_rdma_disable_roce_steering(dev);
 	mlx5_rdma_del_roce_addr(dev);
 	mlx5_nic_vport_disable_roce(dev);
-- 
2.21.0



* [PATCH mlx5-next 2/5] net/mlx5: Add HW bits and definitions required for SW steering
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
  To: Saeed Mahameed, Leon Romanovsky
  Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Alex Vesker,
	Yevgeny Klitenik, Mark Bloch
In-Reply-To: <20190829234151.9958-1-saeedm@mellanox.com>

From: Alex Vesker <valex@mellanox.com>

Add the required Software Steering hardware definitions and
bits to mlx5_ifc.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Yevgeny Klitenik <kliten@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 include/linux/mlx5/device.h   |   7 +
 include/linux/mlx5/mlx5_ifc.h | 235 ++++++++++++++++++++++++++++------
 2 files changed, 205 insertions(+), 37 deletions(-)

diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index ce9839c8bc1a..5767d7fab5f3 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -1162,6 +1162,9 @@ enum mlx5_qcam_feature_groups {
 #define MLX5_CAP_FLOWTABLE(mdev, cap) \
 	MLX5_GET(flow_table_nic_cap, mdev->caps.hca_cur[MLX5_CAP_FLOW_TABLE], cap)
 
+#define MLX5_CAP64_FLOWTABLE(mdev, cap) \
+	MLX5_GET64(flow_table_nic_cap, (mdev)->caps.hca_cur[MLX5_CAP_FLOW_TABLE], cap)
+
 #define MLX5_CAP_FLOWTABLE_MAX(mdev, cap) \
 	MLX5_GET(flow_table_nic_cap, mdev->caps.hca_max[MLX5_CAP_FLOW_TABLE], cap)
 
@@ -1225,6 +1228,10 @@ enum mlx5_qcam_feature_groups {
 	MLX5_GET(e_switch_cap, \
 		 mdev->caps.hca_cur[MLX5_CAP_ESWITCH], cap)
 
+#define MLX5_CAP64_ESW_FLOWTABLE(mdev, cap) \
+	MLX5_GET64(flow_table_eswitch_cap, \
+		(mdev)->caps.hca_cur[MLX5_CAP_ESWITCH_FLOW_TABLE], cap)
+
 #define MLX5_CAP_ESW_MAX(mdev, cap) \
 	MLX5_GET(e_switch_cap, \
 		 mdev->caps.hca_max[MLX5_CAP_ESWITCH], cap)
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 4e278114d8b3..76e945dbc7ed 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -282,6 +282,7 @@ enum {
 	MLX5_CMD_OP_ALLOC_MODIFY_HEADER_CONTEXT   = 0x940,
 	MLX5_CMD_OP_DEALLOC_MODIFY_HEADER_CONTEXT = 0x941,
 	MLX5_CMD_OP_QUERY_MODIFY_HEADER_CONTEXT   = 0x942,
+	MLX5_CMD_OP_SYNC_STEERING                 = 0xb00,
 	MLX5_CMD_OP_FPGA_CREATE_QP                = 0x960,
 	MLX5_CMD_OP_FPGA_MODIFY_QP                = 0x961,
 	MLX5_CMD_OP_FPGA_QUERY_QP                 = 0x962,
@@ -485,7 +486,11 @@ union mlx5_ifc_gre_key_bits {
 };
 
 struct mlx5_ifc_fte_match_set_misc_bits {
-	u8         reserved_at_0[0x8];
+	u8         gre_c_present[0x1];
+	u8         reserved_auto1[0x1];
+	u8         gre_k_present[0x1];
+	u8         gre_s_present[0x1];
+	u8         source_vhca_port[0x4];
 	u8         source_sqn[0x18];
 
 	u8         source_eswitch_owner_vhca_id[0x10];
@@ -565,12 +570,38 @@ struct mlx5_ifc_fte_match_set_misc2_bits {
 
 	u8         metadata_reg_a[0x20];
 
-	u8         reserved_at_1a0[0x60];
+	u8         metadata_reg_b[0x20];
+
+	u8         reserved_at_1c0[0x40];
 };
 
 struct mlx5_ifc_fte_match_set_misc3_bits {
-	u8         reserved_at_0[0x120];
+	u8         inner_tcp_seq_num[0x20];
+
+	u8         outer_tcp_seq_num[0x20];
+
+	u8         inner_tcp_ack_num[0x20];
+
+	u8         outer_tcp_ack_num[0x20];
+
+	u8	   reserved_at_80[0x8];
+	u8         outer_vxlan_gpe_vni[0x18];
+
+	u8         outer_vxlan_gpe_next_protocol[0x8];
+	u8         outer_vxlan_gpe_flags[0x8];
+	u8	   reserved_at_b0[0x10];
+
+	u8	   icmp_header_data[0x20];
+
+	u8	   icmpv6_header_data[0x20];
+
+	u8	   icmp_type[0x8];
+	u8	   icmp_code[0x8];
+	u8	   icmpv6_type[0x8];
+	u8	   icmpv6_code[0x8];
+
 	u8         geneve_tlv_option_0_data[0x20];
+
 	u8         reserved_at_140[0xc0];
 };
 
@@ -666,7 +697,15 @@ struct mlx5_ifc_flow_table_nic_cap_bits {
 
 	struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_transmit_sniffer;
 
-	u8         reserved_at_e00[0x7200];
+	u8         reserved_at_e00[0x1200];
+
+	u8         sw_steering_nic_rx_action_drop_icm_address[0x40];
+
+	u8         sw_steering_nic_tx_action_drop_icm_address[0x40];
+
+	u8         sw_steering_nic_tx_action_allow_icm_address[0x40];
+
+	u8         reserved_at_20c0[0x5f40];
 };
 
 enum {
@@ -698,7 +737,17 @@ struct mlx5_ifc_flow_table_eswitch_cap_bits {
 
 	struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_esw_acl_egress;
 
-	u8      reserved_at_800[0x7800];
+	u8      reserved_at_800[0x1000];
+
+	u8      sw_steering_fdb_action_drop_icm_address_rx[0x40];
+
+	u8      sw_steering_fdb_action_drop_icm_address_tx[0x40];
+
+	u8      sw_steering_uplink_icm_address_rx[0x40];
+
+	u8      sw_steering_uplink_icm_address_tx[0x40];
+
+	u8      reserved_at_1900[0x6700];
 };
 
 enum {
@@ -849,6 +898,25 @@ struct mlx5_ifc_roce_cap_bits {
 	u8         reserved_at_100[0x700];
 };
 
+struct mlx5_ifc_sync_steering_in_bits {
+	u8         opcode[0x10];
+	u8         uid[0x10];
+
+	u8         reserved_at_20[0x10];
+	u8         op_mod[0x10];
+
+	u8         reserved_at_40[0xc0];
+};
+
+struct mlx5_ifc_sync_steering_out_bits {
+	u8         status[0x8];
+	u8         reserved_at_8[0x18];
+
+	u8         syndrome[0x20];
+
+	u8         reserved_at_40[0x40];
+};
+
 struct mlx5_ifc_device_mem_cap_bits {
 	u8         memic[0x1];
 	u8         reserved_at_1[0x1f];
@@ -1041,6 +1109,12 @@ enum {
 	MLX5_CAP_UMR_FENCE_NONE		= 0x2,
 };
 
+enum {
+	MLX5_FLEX_PARSER_VXLAN_GPE_ENABLED	= 1 << 7,
+	MLX5_FLEX_PARSER_ICMP_V4_ENABLED	= 1 << 8,
+	MLX5_FLEX_PARSER_ICMP_V6_ENABLED	= 1 << 9,
+};
+
 enum {
 	MLX5_UCTX_CAP_RAW_TX = 1UL << 0,
 	MLX5_UCTX_CAP_INTERNAL_DEV_RES = 1UL << 1,
@@ -1414,7 +1488,14 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 
 	u8         reserved_at_6c0[0x4];
 	u8         flex_parser_id_geneve_tlv_option_0[0x4];
-	u8	   reserved_at_6c8[0x28];
+	u8         flex_parser_id_icmp_dw1[0x4];
+	u8         flex_parser_id_icmp_dw0[0x4];
+	u8         flex_parser_id_icmpv6_dw1[0x4];
+	u8         flex_parser_id_icmpv6_dw0[0x4];
+	u8         flex_parser_id_outer_first_mpls_over_gre[0x4];
+	u8         flex_parser_id_outer_first_mpls_over_udp_label[0x4];
+
+	u8	   reserved_at_6e0[0x10];
 	u8	   sf_base_id[0x10];
 
 	u8	   reserved_at_700[0x80];
@@ -2652,6 +2733,7 @@ union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_debug_cap_bits debug_cap;
 	struct mlx5_ifc_fpga_cap_bits fpga_cap;
 	struct mlx5_ifc_tls_cap_bits tls_cap;
+	struct mlx5_ifc_device_mem_cap_bits device_mem_cap;
 	u8         reserved_at_0[0x8000];
 };
 
@@ -3255,7 +3337,11 @@ struct mlx5_ifc_esw_vport_context_bits {
 	u8         cvlan_pcp[0x3];
 	u8         cvlan_id[0xc];
 
-	u8         reserved_at_60[0x7a0];
+	u8         reserved_at_60[0x720];
+
+	u8         sw_steering_vport_icm_address_rx[0x40];
+
+	u8         sw_steering_vport_icm_address_tx[0x40];
 };
 
 enum {
@@ -4941,23 +5027,98 @@ struct mlx5_ifc_query_hca_cap_in_bits {
 	u8         reserved_at_20[0x10];
 	u8         op_mod[0x10];
 
-	u8         reserved_at_40[0x40];
+	u8         other_function[0x1];
+	u8         reserved_at_41[0xf];
+	u8         function_id[0x10];
+
+	u8         reserved_at_60[0x20];
 };
 
-struct mlx5_ifc_query_flow_table_out_bits {
+struct mlx5_ifc_other_hca_cap_bits {
+	u8         roce[0x1];
+	u8         reserved_0[0x27f];
+};
+
+struct mlx5_ifc_query_other_hca_cap_out_bits {
 	u8         status[0x8];
-	u8         reserved_at_8[0x18];
+	u8         reserved_0[0x18];
 
 	u8         syndrome[0x20];
 
-	u8         reserved_at_40[0x80];
+	u8         reserved_1[0x40];
 
-	u8         reserved_at_c0[0x8];
+	struct     mlx5_ifc_other_hca_cap_bits other_capability;
+};
+
+struct mlx5_ifc_query_other_hca_cap_in_bits {
+	u8         opcode[0x10];
+	u8         reserved_0[0x10];
+
+	u8         reserved_1[0x10];
+	u8         op_mod[0x10];
+
+	u8         reserved_2[0x10];
+	u8         function_id[0x10];
+
+	u8         reserved_3[0x20];
+};
+
+struct mlx5_ifc_modify_other_hca_cap_out_bits {
+	u8         status[0x8];
+	u8         reserved_0[0x18];
+
+	u8         syndrome[0x20];
+
+	u8         reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_other_hca_cap_in_bits {
+	u8         opcode[0x10];
+	u8         reserved_0[0x10];
+
+	u8         reserved_1[0x10];
+	u8         op_mod[0x10];
+
+	u8         reserved_2[0x10];
+	u8         function_id[0x10];
+	u8         field_select[0x20];
+
+	struct     mlx5_ifc_other_hca_cap_bits other_capability;
+};
+
+struct mlx5_ifc_flow_table_context_bits {
+	u8         reformat_en[0x1];
+	u8         decap_en[0x1];
+	u8         sw_owner[0x1];
+	u8         termination_table[0x1];
+	u8         table_miss_action[0x4];
 	u8         level[0x8];
-	u8         reserved_at_d0[0x8];
+	u8         reserved_at_10[0x8];
 	u8         log_size[0x8];
 
-	u8         reserved_at_e0[0x120];
+	u8         reserved_at_20[0x8];
+	u8         table_miss_id[0x18];
+
+	u8         reserved_at_40[0x8];
+	u8         lag_master_next_table_id[0x18];
+
+	u8         reserved_at_60[0x60];
+
+	u8         sw_owner_icm_root_1[0x40];
+
+	u8         sw_owner_icm_root_0[0x40];
+
+};
+
+struct mlx5_ifc_query_flow_table_out_bits {
+	u8         status[0x8];
+	u8         reserved_at_8[0x18];
+
+	u8         syndrome[0x20];
+
+	u8         reserved_at_40[0x80];
+
+	struct mlx5_ifc_flow_table_context_bits flow_table_context;
 };
 
 struct mlx5_ifc_query_flow_table_in_bits {
@@ -5227,7 +5388,7 @@ struct mlx5_ifc_alloc_packet_reformat_context_out_bits {
 	u8         reserved_at_60[0x20];
 };
 
-enum {
+enum mlx5_reformat_ctx_type {
 	MLX5_REFORMAT_TYPE_L2_TO_VXLAN = 0x0,
 	MLX5_REFORMAT_TYPE_L2_TO_NVGRE = 0x1,
 	MLX5_REFORMAT_TYPE_L2_TO_L2_TUNNEL = 0x2,
@@ -5323,7 +5484,16 @@ enum {
 	MLX5_ACTION_IN_FIELD_OUT_DIPV4         = 0x16,
 	MLX5_ACTION_IN_FIELD_OUT_FIRST_VID     = 0x17,
 	MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT = 0x47,
+	MLX5_ACTION_IN_FIELD_METADATA_REG_A    = 0x49,
+	MLX5_ACTION_IN_FIELD_METADATA_REG_B    = 0x50,
 	MLX5_ACTION_IN_FIELD_METADATA_REG_C_0  = 0x51,
+	MLX5_ACTION_IN_FIELD_METADATA_REG_C_1  = 0x52,
+	MLX5_ACTION_IN_FIELD_METADATA_REG_C_2  = 0x53,
+	MLX5_ACTION_IN_FIELD_METADATA_REG_C_3  = 0x54,
+	MLX5_ACTION_IN_FIELD_METADATA_REG_C_4  = 0x55,
+	MLX5_ACTION_IN_FIELD_METADATA_REG_C_5  = 0x56,
+	MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM   = 0x59,
+	MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM   = 0x5B,
 };
 
 struct mlx5_ifc_alloc_modify_header_context_out_bits {
@@ -7369,35 +7539,26 @@ struct mlx5_ifc_create_mkey_in_bits {
 	u8         klm_pas_mtt[0][0x20];
 };
 
+enum {
+	MLX5_FLOW_TABLE_TYPE_NIC_RX		= 0x0,
+	MLX5_FLOW_TABLE_TYPE_NIC_TX		= 0x1,
+	MLX5_FLOW_TABLE_TYPE_ESW_EGRESS_ACL	= 0x2,
+	MLX5_FLOW_TABLE_TYPE_ESW_INGRESS_ACL	= 0x3,
+	MLX5_FLOW_TABLE_TYPE_FDB		= 0X4,
+	MLX5_FLOW_TABLE_TYPE_SNIFFER_RX		= 0X5,
+	MLX5_FLOW_TABLE_TYPE_SNIFFER_TX		= 0X6,
+};
+
 struct mlx5_ifc_create_flow_table_out_bits {
 	u8         status[0x8];
-	u8         reserved_at_8[0x18];
+	u8         icm_address_63_40[0x18];
 
 	u8         syndrome[0x20];
 
-	u8         reserved_at_40[0x8];
+	u8         icm_address_39_32[0x8];
 	u8         table_id[0x18];
 
-	u8         reserved_at_60[0x20];
-};
-
-struct mlx5_ifc_flow_table_context_bits {
-	u8         reformat_en[0x1];
-	u8         decap_en[0x1];
-	u8         reserved_at_2[0x1];
-	u8         termination_table[0x1];
-	u8         table_miss_action[0x4];
-	u8         level[0x8];
-	u8         reserved_at_10[0x8];
-	u8         log_size[0x8];
-
-	u8         reserved_at_20[0x8];
-	u8         table_miss_id[0x18];
-
-	u8         reserved_at_40[0x8];
-	u8         lag_master_next_table_id[0x18];
-
-	u8         reserved_at_60[0xe0];
+	u8         icm_address_31_0[0x20];
 };
 
 struct mlx5_ifc_create_flow_table_in_bits {
-- 
2.21.0
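
The create_flow_table output in the patch above carries the ICM address in
three pieces: icm_address_63_40 (0x18 bits), icm_address_39_32 (0x8 bits)
and icm_address_31_0 (0x20 bits). Assuming the suffixes name the bit ranges
of a single 64-bit address, a consumer could reassemble it roughly as
follows. icm_addr_join is a hypothetical helper, not part of this series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Join the three fields into one 64-bit value. The asserts document the
 * field widths: 24 bits for [63:40] and 8 bits for [39:32].
 */
static uint64_t icm_addr_join(uint32_t bits_63_40, uint32_t bits_39_32,
			      uint32_t bits_31_0)
{
	assert(bits_63_40 <= 0xffffffu);
	assert(bits_39_32 <= 0xffu);

	return ((uint64_t)bits_63_40 << 40) |
	       ((uint64_t)bits_39_32 << 32) |
	       (uint64_t)bits_31_0;
}
```

In driver code the three inputs would presumably be read from the command
output with MLX5_GET() before being joined.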



* [PATCH mlx5-next 1/5] net/mlx5: Move device memory management to mlx5_core
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
  To: Saeed Mahameed, Leon Romanovsky
  Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	Ariel Levkovich, Mark Bloch
In-Reply-To: <20190829234151.9958-1-saeedm@mellanox.com>

From: Ariel Levkovich <lariel@mellanox.com>

Move the device memory allocation and deallocation commands
for SW ICM memory to mlx5_core to expose this API for all
mlx5_core users.

This comes as preparation for supporting SW steering in kernel
where it will be required to allocate and register device
memory for direct rule insertion.

In addition, an API to register this device memory for future
remote access operations is introduced using the create_mkey
commands.

Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/infiniband/hw/mlx5/cmd.c              | 130 ----------
 drivers/infiniband/hw/mlx5/cmd.h              |   4 -
 drivers/infiniband/hw/mlx5/main.c             | 102 +++-----
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |   2 -
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 .../net/ethernet/mellanox/mlx5/core/lib/dm.c  | 223 ++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/main.c    |   5 +
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   3 +
 include/linux/mlx5/driver.h                   |  14 ++
 9 files changed, 276 insertions(+), 209 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/dm.c

diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
index 6c8645033102..4937947400cd 100644
--- a/drivers/infiniband/hw/mlx5/cmd.c
+++ b/drivers/infiniband/hw/mlx5/cmd.c
@@ -186,136 +186,6 @@ int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length)
 	return err;
 }
 
-int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
-			  u16 uid, phys_addr_t *addr, u32 *obj_id)
-{
-	struct mlx5_core_dev *dev = dm->dev;
-	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
-	u32 in[MLX5_ST_SZ_DW(create_sw_icm_in)] = {};
-	unsigned long *block_map;
-	u64 icm_start_addr;
-	u32 log_icm_size;
-	u32 num_blocks;
-	u32 max_blocks;
-	u64 block_idx;
-	void *sw_icm;
-	int ret;
-
-	MLX5_SET(general_obj_in_cmd_hdr, in, opcode,
-		 MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
-	MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_SW_ICM);
-	MLX5_SET(general_obj_in_cmd_hdr, in, uid, uid);
-
-	switch (type) {
-	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
-		icm_start_addr = MLX5_CAP64_DEV_MEM(dev,
-						steering_sw_icm_start_address);
-		log_icm_size = MLX5_CAP_DEV_MEM(dev, log_steering_sw_icm_size);
-		block_map = dm->steering_sw_icm_alloc_blocks;
-		break;
-	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
-		icm_start_addr = MLX5_CAP64_DEV_MEM(dev,
-					header_modify_sw_icm_start_address);
-		log_icm_size = MLX5_CAP_DEV_MEM(dev,
-						log_header_modify_sw_icm_size);
-		block_map = dm->header_modify_sw_icm_alloc_blocks;
-		break;
-	default:
-		return -EINVAL;
-	}
-
-	num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
-		     MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
-	max_blocks = BIT(log_icm_size - MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
-	spin_lock(&dm->lock);
-	block_idx = bitmap_find_next_zero_area(block_map,
-					       max_blocks,
-					       0,
-					       num_blocks, 0);
-
-	if (block_idx < max_blocks)
-		bitmap_set(block_map,
-			   block_idx, num_blocks);
-
-	spin_unlock(&dm->lock);
-
-	if (block_idx >= max_blocks)
-		return -ENOMEM;
-
-	sw_icm = MLX5_ADDR_OF(create_sw_icm_in, in, sw_icm);
-	icm_start_addr += block_idx << MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
-	MLX5_SET64(sw_icm, sw_icm, sw_icm_start_addr,
-		   icm_start_addr);
-	MLX5_SET(sw_icm, sw_icm, log_sw_icm_size, ilog2(length));
-
-	ret = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
-	if (ret) {
-		spin_lock(&dm->lock);
-		bitmap_clear(block_map,
-			     block_idx, num_blocks);
-		spin_unlock(&dm->lock);
-
-		return ret;
-	}
-
-	*addr = icm_start_addr;
-	*obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
-
-	return 0;
-}
-
-int mlx5_cmd_dealloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
-			    u16 uid, phys_addr_t addr, u32 obj_id)
-{
-	struct mlx5_core_dev *dev = dm->dev;
-	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
-	u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
-	unsigned long *block_map;
-	u32 num_blocks;
-	u64 start_idx;
-	int err;
-
-	num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
-		     MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
-
-	switch (type) {
-	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
-		start_idx =
-			(addr - MLX5_CAP64_DEV_MEM(
-					dev, steering_sw_icm_start_address)) >>
-			MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
-		block_map = dm->steering_sw_icm_alloc_blocks;
-		break;
-	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
-		start_idx =
-			(addr -
-			 MLX5_CAP64_DEV_MEM(
-				 dev, header_modify_sw_icm_start_address)) >>
-			MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
-		block_map = dm->header_modify_sw_icm_alloc_blocks;
-		break;
-	default:
-		return -EINVAL;
-	}
-
-	MLX5_SET(general_obj_in_cmd_hdr, in, opcode,
-		 MLX5_CMD_OP_DESTROY_GENERAL_OBJECT);
-	MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_SW_ICM);
-	MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, obj_id);
-	MLX5_SET(general_obj_in_cmd_hdr, in, uid, uid);
-
-	err =  mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
-	if (err)
-		return err;
-
-	spin_lock(&dm->lock);
-	bitmap_clear(block_map,
-		     start_idx, num_blocks);
-	spin_unlock(&dm->lock);
-
-	return 0;
-}
-
 int mlx5_cmd_query_ext_ppcnt_counters(struct mlx5_core_dev *dev, void *out)
 {
 	u32 in[MLX5_ST_SZ_DW(ppcnt_reg)] = {};
diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h
index 0572dcba6eae..169cab4915e3 100644
--- a/drivers/infiniband/hw/mlx5/cmd.h
+++ b/drivers/infiniband/hw/mlx5/cmd.h
@@ -65,8 +65,4 @@ int mlx5_cmd_alloc_q_counter(struct mlx5_core_dev *dev, u16 *counter_id,
 			     u16 uid);
 int mlx5_cmd_mad_ifc(struct mlx5_core_dev *dev, const void *inb, void *outb,
 		     u16 opmod, u8 port);
-int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
-			  u16 uid, phys_addr_t *addr, u32 *obj_id);
-int mlx5_cmd_dealloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
-			    u16 uid, phys_addr_t addr, u32 obj_id);
 #endif /* MLX5_IB_CMD_H */
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index c2a5780cb394..42fdbbea06f0 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2280,6 +2280,7 @@ static inline int check_dm_type_support(struct mlx5_ib_dev *dev,
 			return -EOPNOTSUPP;
 		break;
 	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
+	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
 		if (!capable(CAP_SYS_RAWIO) ||
 		    !capable(CAP_NET_RAW))
 			return -EPERM;
@@ -2344,20 +2345,20 @@ static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
 				  struct uverbs_attr_bundle *attrs,
 				  int type)
 {
-	struct mlx5_dm *dm_db = &to_mdev(ctx->device)->dm;
+	struct mlx5_core_dev *dev = to_mdev(ctx->device)->mdev;
 	u64 act_size;
 	int err;
 
 	/* Allocation size must a multiple of the basic block size
 	 * and a power of 2.
 	 */
-	act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
+	act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dev));
 	act_size = roundup_pow_of_two(act_size);
 
 	dm->size = act_size;
-	err = mlx5_cmd_alloc_sw_icm(dm_db, type, act_size,
-				    to_mucontext(ctx)->devx_uid, &dm->dev_addr,
-				    &dm->icm_dm.obj_id);
+	err = mlx5_dm_sw_icm_alloc(dev, type, act_size,
+				   to_mucontext(ctx)->devx_uid, &dm->dev_addr,
+				   &dm->icm_dm.obj_id);
 	if (err)
 		return err;
 
@@ -2365,9 +2366,9 @@ static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
 			     MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET,
 			     &dm->dev_addr, sizeof(dm->dev_addr));
 	if (err)
-		mlx5_cmd_dealloc_sw_icm(dm_db, type, dm->size,
-					to_mucontext(ctx)->devx_uid,
-					dm->dev_addr, dm->icm_dm.obj_id);
+		mlx5_dm_sw_icm_dealloc(dev, type, dm->size,
+				       to_mucontext(ctx)->devx_uid, dm->dev_addr,
+				       dm->icm_dm.obj_id);
 
 	return err;
 }
@@ -2407,8 +2408,14 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev,
 					    attrs);
 		break;
 	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
+		err = handle_alloc_dm_sw_icm(context, dm,
+					     attr, attrs,
+					     MLX5_SW_ICM_TYPE_STEERING);
+		break;
 	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
-		err = handle_alloc_dm_sw_icm(context, dm, attr, attrs, type);
+		err = handle_alloc_dm_sw_icm(context, dm,
+					     attr, attrs,
+					     MLX5_SW_ICM_TYPE_HEADER_MODIFY);
 		break;
 	default:
 		err = -EOPNOTSUPP;
@@ -2428,6 +2435,7 @@ int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
 {
 	struct mlx5_ib_ucontext *ctx = rdma_udata_to_drv_context(
 		&attrs->driver_udata, struct mlx5_ib_ucontext, ibucontext);
+	struct mlx5_core_dev *dev = to_mdev(ibdm->device)->mdev;
 	struct mlx5_dm *dm_db = &to_mdev(ibdm->device)->dm;
 	struct mlx5_ib_dm *dm = to_mdm(ibdm);
 	u32 page_idx;
@@ -2439,19 +2447,23 @@ int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs)
 		if (ret)
 			return ret;
 
-		page_idx = (dm->dev_addr -
-			    pci_resource_start(dm_db->dev->pdev, 0) -
-			    MLX5_CAP64_DEV_MEM(dm_db->dev,
-					       memic_bar_start_addr)) >>
-			   PAGE_SHIFT;
+		page_idx = (dm->dev_addr - pci_resource_start(dev->pdev, 0) -
+			    MLX5_CAP64_DEV_MEM(dev, memic_bar_start_addr)) >>
+			    PAGE_SHIFT;
 		bitmap_clear(ctx->dm_pages, page_idx,
 			     DIV_ROUND_UP(dm->size, PAGE_SIZE));
 		break;
 	case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
+		ret = mlx5_dm_sw_icm_dealloc(dev, MLX5_SW_ICM_TYPE_STEERING,
+					     dm->size, ctx->devx_uid, dm->dev_addr,
+					     dm->icm_dm.obj_id);
+		if (ret)
+			return ret;
+		break;
 	case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM:
-		ret = mlx5_cmd_dealloc_sw_icm(dm_db, dm->type, dm->size,
-					      ctx->devx_uid, dm->dev_addr,
-					      dm->icm_dm.obj_id);
+		ret = mlx5_dm_sw_icm_dealloc(dev, MLX5_SW_ICM_TYPE_HEADER_MODIFY,
+					     dm->size, ctx->devx_uid, dm->dev_addr,
+					     dm->icm_dm.obj_id);
 		if (ret)
 			return ret;
 		break;
@@ -6097,8 +6109,6 @@ static struct ib_counters *mlx5_ib_create_counters(struct ib_device *device,
 
 static void mlx5_ib_stage_init_cleanup(struct mlx5_ib_dev *dev)
 {
-	struct mlx5_core_dev *mdev = dev->mdev;
-
 	mlx5_ib_cleanup_multiport_master(dev);
 	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
 		srcu_barrier(&dev->mr_srcu);
@@ -6106,29 +6116,11 @@ static void mlx5_ib_stage_init_cleanup(struct mlx5_ib_dev *dev)
 	}
 
 	WARN_ON(!bitmap_empty(dev->dm.memic_alloc_pages, MLX5_MAX_MEMIC_PAGES));
-
-	WARN_ON(dev->dm.steering_sw_icm_alloc_blocks &&
-		!bitmap_empty(
-			dev->dm.steering_sw_icm_alloc_blocks,
-			BIT(MLX5_CAP_DEV_MEM(mdev, log_steering_sw_icm_size) -
-			    MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev))));
-
-	kfree(dev->dm.steering_sw_icm_alloc_blocks);
-
-	WARN_ON(dev->dm.header_modify_sw_icm_alloc_blocks &&
-		!bitmap_empty(dev->dm.header_modify_sw_icm_alloc_blocks,
-			      BIT(MLX5_CAP_DEV_MEM(
-					  mdev, log_header_modify_sw_icm_size) -
-				  MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev))));
-
-	kfree(dev->dm.header_modify_sw_icm_alloc_blocks);
 }
 
 static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
 {
 	struct mlx5_core_dev *mdev = dev->mdev;
-	u64 header_modify_icm_blocks = 0;
-	u64 steering_icm_blocks = 0;
 	int err;
 	int i;
 
@@ -6173,51 +6165,17 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
 	INIT_LIST_HEAD(&dev->qp_list);
 	spin_lock_init(&dev->reset_flow_resource_lock);
 
-	if (MLX5_CAP_GEN_64(mdev, general_obj_types) &
-	    MLX5_GENERAL_OBJ_TYPES_CAP_SW_ICM) {
-		if (MLX5_CAP64_DEV_MEM(mdev, steering_sw_icm_start_address)) {
-			steering_icm_blocks =
-				BIT(MLX5_CAP_DEV_MEM(mdev,
-						     log_steering_sw_icm_size) -
-				    MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev));
-
-			dev->dm.steering_sw_icm_alloc_blocks =
-				kcalloc(BITS_TO_LONGS(steering_icm_blocks),
-					sizeof(unsigned long), GFP_KERNEL);
-			if (!dev->dm.steering_sw_icm_alloc_blocks)
-				goto err_mp;
-		}
-
-		if (MLX5_CAP64_DEV_MEM(mdev,
-				       header_modify_sw_icm_start_address)) {
-			header_modify_icm_blocks = BIT(
-				MLX5_CAP_DEV_MEM(
-					mdev, log_header_modify_sw_icm_size) -
-				MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev));
-
-			dev->dm.header_modify_sw_icm_alloc_blocks =
-				kcalloc(BITS_TO_LONGS(header_modify_icm_blocks),
-					sizeof(unsigned long), GFP_KERNEL);
-			if (!dev->dm.header_modify_sw_icm_alloc_blocks)
-				goto err_dm;
-		}
-	}
-
 	spin_lock_init(&dev->dm.lock);
 	dev->dm.dev = mdev;
 
 	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
 		err = init_srcu_struct(&dev->mr_srcu);
 		if (err)
-			goto err_dm;
+			goto err_mp;
 	}
 
 	return 0;
 
-err_dm:
-	kfree(dev->dm.steering_sw_icm_alloc_blocks);
-	kfree(dev->dm.header_modify_sw_icm_alloc_blocks);
-
 err_mp:
 	mlx5_ib_cleanup_multiport_master(dev);
 
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index c482f19958b3..afd69ba33b2b 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -880,8 +880,6 @@ struct mlx5_dm {
 	 */
 	spinlock_t lock;
 	DECLARE_BITMAP(memic_alloc_pages, MLX5_MAX_MEMIC_PAGES);
-	unsigned long *steering_sw_icm_alloc_blocks;
-	unsigned long *header_modify_sw_icm_alloc_blocks;
 };
 
 struct mlx5_read_counters_attr {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 57d2cc666fe3..4eb52e8500c3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -15,7 +15,7 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 		health.o mcg.o cq.o alloc.o qp.o port.o mr.o pd.o \
 		transobj.o vport.o sriov.o fs_cmd.o fs_core.o pci_irq.o \
 		fs_counters.o rl.o lag.o dev.o events.o wq.o lib/gid.o \
-		lib/devcom.o lib/pci_vsc.o diag/fs_tracepoint.o \
+		lib/devcom.o lib/pci_vsc.o lib/dm.o diag/fs_tracepoint.o \
 		diag/fw_tracer.o diag/crdump.o devlink.o
 
 #
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/dm.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/dm.c
new file mode 100644
index 000000000000..e065c2f68f5a
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/dm.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+// Copyright (c) 2019 Mellanox Technologies
+
+#include <linux/mlx5/driver.h>
+#include <linux/mlx5/device.h>
+
+#include "mlx5_core.h"
+#include "lib/mlx5.h"
+
+struct mlx5_dm {
+	/* protect access to icm bitmask */
+	spinlock_t lock;
+	unsigned long *steering_sw_icm_alloc_blocks;
+	unsigned long *header_modify_sw_icm_alloc_blocks;
+};
+
+struct mlx5_dm *mlx5_dm_create(struct mlx5_core_dev *dev)
+{
+	u64 header_modify_icm_blocks = 0;
+	u64 steering_icm_blocks = 0;
+	struct mlx5_dm *dm;
+
+	if (!(MLX5_CAP_GEN_64(dev, general_obj_types) & MLX5_GENERAL_OBJ_TYPES_CAP_SW_ICM))
+		return 0;
+
+	dm = kzalloc(sizeof(*dm), GFP_KERNEL);
+	if (!dm)
+		return ERR_PTR(-ENOMEM);
+
+	spin_lock_init(&dm->lock);
+
+	if (MLX5_CAP64_DEV_MEM(dev, steering_sw_icm_start_address)) {
+		steering_icm_blocks =
+			BIT(MLX5_CAP_DEV_MEM(dev, log_steering_sw_icm_size) -
+			    MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
+
+		dm->steering_sw_icm_alloc_blocks =
+			kcalloc(BITS_TO_LONGS(steering_icm_blocks),
+				sizeof(unsigned long), GFP_KERNEL);
+		if (!dm->steering_sw_icm_alloc_blocks)
+			goto err_steering;
+	}
+
+	if (MLX5_CAP64_DEV_MEM(dev, header_modify_sw_icm_start_address)) {
+		header_modify_icm_blocks =
+			BIT(MLX5_CAP_DEV_MEM(dev, log_header_modify_sw_icm_size) -
+			    MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
+
+		dm->header_modify_sw_icm_alloc_blocks =
+			kcalloc(BITS_TO_LONGS(header_modify_icm_blocks),
+				sizeof(unsigned long), GFP_KERNEL);
+		if (!dm->header_modify_sw_icm_alloc_blocks)
+			goto err_modify_hdr;
+	}
+
+	return dm;
+
+err_modify_hdr:
+	kfree(dm->steering_sw_icm_alloc_blocks);
+
+err_steering:
+	kfree(dm);
+
+	return ERR_PTR(-ENOMEM);
+}
+
+void mlx5_dm_cleanup(struct mlx5_core_dev *dev)
+{
+	struct mlx5_dm *dm = dev->dm;
+
+	if (!dev->dm)
+		return;
+
+	if (dm->steering_sw_icm_alloc_blocks) {
+		WARN_ON(!bitmap_empty(dm->steering_sw_icm_alloc_blocks,
+				      BIT(MLX5_CAP_DEV_MEM(dev, log_steering_sw_icm_size) -
+					  MLX5_LOG_SW_ICM_BLOCK_SIZE(dev))));
+		kfree(dm->steering_sw_icm_alloc_blocks);
+	}
+
+	if (dm->header_modify_sw_icm_alloc_blocks) {
+		WARN_ON(!bitmap_empty(dm->header_modify_sw_icm_alloc_blocks,
+				      BIT(MLX5_CAP_DEV_MEM(dev,
+							   log_header_modify_sw_icm_size) -
+				      MLX5_LOG_SW_ICM_BLOCK_SIZE(dev))));
+		kfree(dm->header_modify_sw_icm_alloc_blocks);
+	}
+
+	kfree(dm);
+}
+
+int mlx5_dm_sw_icm_alloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type,
+			 u64 length, u16 uid, phys_addr_t *addr, u32 *obj_id)
+{
+	u32 num_blocks = DIV_ROUND_UP_ULL(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
+	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
+	u32 in[MLX5_ST_SZ_DW(create_sw_icm_in)] = {};
+	struct mlx5_dm *dm = dev->dm;
+	unsigned long *block_map;
+	u64 icm_start_addr;
+	u32 log_icm_size;
+	u32 max_blocks;
+	u64 block_idx;
+	void *sw_icm;
+	int ret;
+
+	if (!dev->dm)
+		return -EOPNOTSUPP;
+
+	if (!length || (length & (length - 1)) ||
+	    length & (MLX5_SW_ICM_BLOCK_SIZE(dev) - 1))
+		return -EINVAL;
+
+	MLX5_SET(general_obj_in_cmd_hdr, in, opcode,
+		 MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
+	MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_SW_ICM);
+	MLX5_SET(general_obj_in_cmd_hdr, in, uid, uid);
+
+	switch (type) {
+	case MLX5_SW_ICM_TYPE_STEERING:
+		icm_start_addr = MLX5_CAP64_DEV_MEM(dev, steering_sw_icm_start_address);
+		log_icm_size = MLX5_CAP_DEV_MEM(dev, log_steering_sw_icm_size);
+		block_map = dm->steering_sw_icm_alloc_blocks;
+		break;
+	case MLX5_SW_ICM_TYPE_HEADER_MODIFY:
+		icm_start_addr = MLX5_CAP64_DEV_MEM(dev, header_modify_sw_icm_start_address);
+		log_icm_size = MLX5_CAP_DEV_MEM(dev,
+						log_header_modify_sw_icm_size);
+		block_map = dm->header_modify_sw_icm_alloc_blocks;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (!block_map)
+		return -EOPNOTSUPP;
+
+	max_blocks = BIT(log_icm_size - MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
+	spin_lock(&dm->lock);
+	block_idx = bitmap_find_next_zero_area(block_map,
+					       max_blocks,
+					       0,
+					       num_blocks, 0);
+
+	if (block_idx < max_blocks)
+		bitmap_set(block_map,
+			   block_idx, num_blocks);
+
+	spin_unlock(&dm->lock);
+
+	if (block_idx >= max_blocks)
+		return -ENOMEM;
+
+	sw_icm = MLX5_ADDR_OF(create_sw_icm_in, in, sw_icm);
+	icm_start_addr += block_idx << MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
+	MLX5_SET64(sw_icm, sw_icm, sw_icm_start_addr,
+		   icm_start_addr);
+	MLX5_SET(sw_icm, sw_icm, log_sw_icm_size, ilog2(length));
+
+	ret = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+	if (ret) {
+		spin_lock(&dm->lock);
+		bitmap_clear(block_map,
+			     block_idx, num_blocks);
+		spin_unlock(&dm->lock);
+
+		return ret;
+	}
+
+	*addr = icm_start_addr;
+	*obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_dm_sw_icm_alloc);
+
+int mlx5_dm_sw_icm_dealloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type,
+			   u64 length, u16 uid, phys_addr_t addr, u32 obj_id)
+{
+	u32 num_blocks = DIV_ROUND_UP_ULL(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
+	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
+	u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
+	struct mlx5_dm *dm = dev->dm;
+	unsigned long *block_map;
+	u64 icm_start_addr;
+	u64 start_idx;
+	int err;
+
+	if (!dev->dm)
+		return -EOPNOTSUPP;
+
+	switch (type) {
+	case MLX5_SW_ICM_TYPE_STEERING:
+		icm_start_addr = MLX5_CAP64_DEV_MEM(dev, steering_sw_icm_start_address);
+		block_map = dm->steering_sw_icm_alloc_blocks;
+		break;
+	case MLX5_SW_ICM_TYPE_HEADER_MODIFY:
+		icm_start_addr = MLX5_CAP64_DEV_MEM(dev, header_modify_sw_icm_start_address);
+		block_map = dm->header_modify_sw_icm_alloc_blocks;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	MLX5_SET(general_obj_in_cmd_hdr, in, opcode,
+		 MLX5_CMD_OP_DESTROY_GENERAL_OBJECT);
+	MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_SW_ICM);
+	MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, obj_id);
+	MLX5_SET(general_obj_in_cmd_hdr, in, uid, uid);
+
+	err =  mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+	if (err)
+		return err;
+
+	start_idx = (addr - icm_start_addr) >> MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
+	spin_lock(&dm->lock);
+	bitmap_clear(block_map,
+		     start_idx, num_blocks);
+	spin_unlock(&dm->lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_dm_sw_icm_dealloc);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 7f70ecb1db6d..c1679d11d71f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -879,6 +879,10 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
 		goto err_eswitch_cleanup;
 	}
 
+	dev->dm = mlx5_dm_create(dev);
+	if (IS_ERR(dev->dm))
+		mlx5_core_warn(dev, "Failed to init device memory%d\n", err);
+
 	dev->tracer = mlx5_fw_tracer_create(dev);
 
 	return 0;
@@ -912,6 +916,7 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
 static void mlx5_cleanup_once(struct mlx5_core_dev *dev)
 {
 	mlx5_fw_tracer_destroy(dev->tracer);
+	mlx5_dm_cleanup(dev);
 	mlx5_fpga_cleanup(dev);
 	mlx5_eswitch_cleanup(dev->priv.eswitch);
 	mlx5_sriov_cleanup(dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 471bbc48bc1f..bbcf4ee40ad5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -198,6 +198,9 @@ int mlx5_set_mtpps(struct mlx5_core_dev *mdev, u32 *mtpps, u32 mtpps_size);
 int mlx5_query_mtppse(struct mlx5_core_dev *mdev, u8 pin, u8 *arm, u8 *mode);
 int mlx5_set_mtppse(struct mlx5_core_dev *mdev, u8 pin, u8 arm, u8 mode);
 
+struct mlx5_dm *mlx5_dm_create(struct mlx5_core_dev *dev);
+void mlx5_dm_cleanup(struct mlx5_core_dev *dev);
+
 #define MLX5_PPS_CAP(mdev) (MLX5_CAP_GEN((mdev), pps) &&		\
 			    MLX5_CAP_GEN((mdev), pps_modify) &&		\
 			    MLX5_CAP_MCAM_FEATURE((mdev), mtpps_fs) &&	\
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 0acd28f2e62c..72bc6ce44b55 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -622,6 +622,11 @@ struct mlx5e_resources {
 	struct mlx5_sq_bfreg       bfreg;
 };
 
+enum mlx5_sw_icm_type {
+	MLX5_SW_ICM_TYPE_STEERING,
+	MLX5_SW_ICM_TYPE_HEADER_MODIFY,
+};
+
 #define MLX5_MAX_RESERVED_GIDS 8
 
 struct mlx5_rsvd_gids {
@@ -653,10 +658,14 @@ struct mlx5_clock {
 	struct mlx5_pps            pps_info;
 };
 
+struct mlx5_dm;
 struct mlx5_fw_tracer;
 struct mlx5_vxlan;
 struct mlx5_geneve;
 
+#define MLX5_LOG_SW_ICM_BLOCK_SIZE(dev) (MLX5_CAP_DEV_MEM(dev, log_sw_icm_alloc_granularity))
+#define MLX5_SW_ICM_BLOCK_SIZE(dev) (1 << MLX5_LOG_SW_ICM_BLOCK_SIZE(dev))
+
 struct mlx5_core_dev {
 	struct device *device;
 	enum mlx5_coredev_type coredev_type;
@@ -690,6 +699,7 @@ struct mlx5_core_dev {
 	atomic_t		num_qps;
 	u32			issi;
 	struct mlx5e_resources  mlx5e_res;
+	struct mlx5_dm          *dm;
 	struct mlx5_vxlan       *vxlan;
 	struct mlx5_geneve      *geneve;
 	struct {
@@ -1072,6 +1082,10 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
 				 size_t *offsets);
 struct mlx5_uars_page *mlx5_get_uars_page(struct mlx5_core_dev *mdev);
 void mlx5_put_uars_page(struct mlx5_core_dev *mdev, struct mlx5_uars_page *up);
+int mlx5_dm_sw_icm_alloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type,
+			 u64 length, u16 uid, phys_addr_t *addr, u32 *obj_id);
+int mlx5_dm_sw_icm_dealloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type,
+			   u64 length, u16 uid, phys_addr_t addr, u32 obj_id);
 
 #ifdef CONFIG_MLX5_CORE_IPOIB
 struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
-- 
2.21.0
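
For readers skimming the diff, the block bookkeeping that this patch moves
into mlx5_core can be sketched in plain userspace C. This is only an
illustration under stated assumptions: the 64-byte block size, the 64-block
region, the single 64-bit word used as the bitmap, and all function names
(length_ok, find_zero_area, icm_alloc, icm_free) are hypothetical and are not
part of the driver. The sketch keeps the same length check as
mlx5_dm_sw_icm_alloc() (non-zero, power of two, multiple of the block size)
and a first-fit search standing in for bitmap_find_next_zero_area(). It omits
the spinlock and the firmware command.

```c
#include <assert.h>
#include <stdint.h>

#define LOG_BLOCK_SIZE 6u                    /* assumed granularity: 64 bytes */
#define BLOCK_SIZE     (1u << LOG_BLOCK_SIZE)
#define MAX_BLOCKS     64u                   /* assumed region: 64 blocks */

static uint64_t block_map;                   /* one bit per block, 1 = in use */

/* Non-zero, power of two, multiple of the block size, fits in the region. */
static int length_ok(uint64_t length)
{
	return length && !(length & (length - 1)) &&
	       !(length & (BLOCK_SIZE - 1)) &&
	       length <= (uint64_t)MAX_BLOCKS * BLOCK_SIZE;
}

/* First-fit search for nblocks clear bits; returns MAX_BLOCKS if none. */
static uint32_t find_zero_area(uint32_t nblocks)
{
	uint32_t start, i;

	for (start = 0; start + nblocks <= MAX_BLOCKS; start++) {
		for (i = 0; i < nblocks; i++)
			if (block_map & (1ULL << (start + i)))
				break;
		if (i == nblocks)
			return start;
	}
	return MAX_BLOCKS;
}

/* Mark nblocks starting at start as used (used != 0) or free (used == 0). */
static void set_area(uint32_t start, uint32_t nblocks, int used)
{
	uint32_t i;

	for (i = 0; i < nblocks; i++) {
		if (used)
			block_map |= 1ULL << (start + i);
		else
			block_map &= ~(1ULL << (start + i));
	}
}

/* Returns the byte offset of the allocation in the region, or -1 on failure. */
static int64_t icm_alloc(uint64_t length)
{
	uint32_t nblocks, idx;

	if (!length_ok(length))
		return -1;

	nblocks = (uint32_t)(length >> LOG_BLOCK_SIZE);
	idx = find_zero_area(nblocks);
	if (idx >= MAX_BLOCKS)
		return -1;

	set_area(idx, nblocks, 1);
	return (int64_t)idx << LOG_BLOCK_SIZE;
}

/* Releases a range previously returned by icm_alloc() with the same length. */
static void icm_free(int64_t offset, uint64_t length)
{
	assert(offset >= 0 && !(offset & (BLOCK_SIZE - 1)));
	assert(length_ok(length));
	set_area((uint32_t)(offset >> LOG_BLOCK_SIZE),
		 (uint32_t)(length >> LOG_BLOCK_SIZE), 0);
}
```

In the patch itself the bits are set before the firmware command is issued
and cleared again if the command fails, so the bitmap never advertises a
block that the device did not actually hand out.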


^ permalink raw reply related

* [PATCH mlx5-next 0/5] Mellanox, mlx5 next updates 2019-09-29
From: Saeed Mahameed @ 2019-08-29 23:42 UTC (permalink / raw)
  To: Saeed Mahameed, Leon Romanovsky
  Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org

Hi All,

This series includes misc updates for mlx5-next shared branch required
for upcoming software steering feature.

1) Alex adds HW bits and definitions required for SW steering
2) Ariel moves device memory management to mlx5_core (From mlx5_ib)
3) Maor, Cleanups and fixups for eswitch mode and RoCE
4) Mar, Set only stag for match untagged packets

In case of no objection this series will be applied to mlx5-next branch
and sent later as pull request to both rdma-next and net-next branches.

Thanks,
Saeed.

---

Alex Vesker (1):
  net/mlx5: Add HW bits and definitions required for SW steering

Ariel Levkovich (1):
  net/mlx5: Move device memory management to mlx5_core

Maor Gottlieb (2):
  net/mlx5: Avoid disabling RoCE when uninitialized
  net/mlx5: Add stub for mlx5_eswitch_mode

Mark Bloch (1):
  net/mlx5: Set only stag for match untagged packets

 drivers/infiniband/hw/mlx5/cmd.c              | 130 ----------
 drivers/infiniband/hw/mlx5/cmd.h              |   4 -
 drivers/infiniband/hw/mlx5/main.c             | 102 +++-----
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |   2 -
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en_tc.c   |   5 +-
 .../net/ethernet/mellanox/mlx5/core/lib/dm.c  | 223 +++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/main.c    |   5 +
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   3 +
 .../net/ethernet/mellanox/mlx5/core/rdma.c    |   8 +-
 include/linux/mlx5/device.h                   |   7 +
 include/linux/mlx5/driver.h                   |  14 ++
 include/linux/mlx5/eswitch.h                  |   8 +-
 include/linux/mlx5/mlx5_ifc.h                 | 235 +++++++++++++++---
 14 files changed, 497 insertions(+), 251 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/dm.c

-- 
2.21.0


^ permalink raw reply

* Re: [PATCH bpf-next 04/13] bpf: refactor map_get_next_key()
From: Song Liu @ 2019-08-29 23:39 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, netdev@vger.kernel.org, Alexei Starovoitov, Brian Vazquez,
	Daniel Borkmann, Kernel Team
In-Reply-To: <20190829064506.2750717-1-yhs@fb.com>



> On Aug 28, 2019, at 11:45 PM, Yonghong Song <yhs@fb.com> wrote:
> 
> Refactor function map_get_next_key() with a new helper
> bpf_map_get_next_key(), which will be used later
> for batched map lookup/lookup_and_delete/delete operations.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>


^ permalink raw reply

* Re: [PATCH bpf-next 03/13] bpf: refactor map_delete_elem()
From: Song Liu @ 2019-08-29 23:39 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, netdev@vger.kernel.org, Alexei Starovoitov, Brian Vazquez,
	Daniel Borkmann, Kernel Team
In-Reply-To: <20190829064505.2750541-1-yhs@fb.com>



> On Aug 28, 2019, at 11:45 PM, Yonghong Song <yhs@fb.com> wrote:
> 
> Refactor function map_delete_elem() with a new helper
> bpf_map_delete_elem(), which will be used later
> for batched lookup_and_delete and delete operations.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>


^ permalink raw reply

* Re: [PATCH v6 net-next 16/19] ionic: Add netdev-event handling
From: Jakub Kicinski @ 2019-08-29 23:37 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: netdev, davem
In-Reply-To: <20190829182720.68419-17-snelson@pensando.io>

On Thu, 29 Aug 2019 11:27:17 -0700, Shannon Nelson wrote:
> When the netdev gets a new name from userland, pass that name
> down to the NIC for internal tracking.
> 
> Signed-off-by: Shannon Nelson <snelson@pensando.io>

There is a precedent in ACPI for telling the FW what OS is running, but
I can't really tell how the interface name is useful to the firmware.

^ permalink raw reply

* Re: [PATCH bpf-next 02/13] bpf: refactor map_update_elem()
From: Song Liu @ 2019-08-29 23:37 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Networking, Alexei Starovoitov, Brian Vazquez,
	Daniel Borkmann, Kernel Team
In-Reply-To: <20190829064504.2750444-1-yhs@fb.com>



> On Aug 28, 2019, at 11:45 PM, Yonghong Song <yhs@fb.com> wrote:
> 
> Refactor function map_update_elem() by creating a
> helper function bpf_map_update_elem() which will be
> used later by batched map update operation.
> 
> Also reuse function bpf_map_value_size()
> in map_update_elem().
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>


^ permalink raw reply

* Re: [PATCH] ipv6: Not to probe neighbourless routes
From: David Miller @ 2019-08-29 23:37 UTC (permalink / raw)
  To: lkp
  Cc: wang.yi59, kbuild-all, kuznet, yoshfuji, netdev, linux-kernel,
	xue.zhihong, wang.liang82, cheng.lin130
In-Reply-To: <201908300657.DY647BSw%lkp@intel.com>


So yeah, this is one instance where the kbuild test robot's report is
making more rather than less work for us.

We identified the build problem within hours of this patch being
posted and the updated version was posted more than 24 hours ago.

The kbuild robot should really have a way to either:

1) Report build problems faster; humans find the obvious cases like
   this one within a day or less.

2) Notice that a new version of the patch was posted or that a human
   responded to the patch pointing out the build problem.

Otherwise we get postings like this, which are just more noise to
delete.

Thanks.

^ permalink raw reply

* Re: [PATCH v6 net-next 15/19] ionic: Add Tx and Rx handling
From: Jakub Kicinski @ 2019-08-29 23:33 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: netdev, davem
In-Reply-To: <20190829182720.68419-16-snelson@pensando.io>

On Thu, 29 Aug 2019 11:27:16 -0700, Shannon Nelson wrote:
> +static int ionic_tx_tso(struct ionic_queue *q, struct sk_buff *skb)
> +{
> +	struct ionic_tx_stats *stats = q_to_tx_stats(q);
> +	struct ionic_desc_info *abort = q->head;
> +	struct device *dev = q->lif->ionic->dev;
> +	struct ionic_desc_info *rewind = abort;
> +	struct ionic_txq_sg_elem *elem;
> +	struct ionic_txq_desc *desc;
> +	unsigned int frag_left = 0;
> +	unsigned int offset = 0;
> +	unsigned int len_left;
> +	dma_addr_t desc_addr;
> +	unsigned int hdrlen;
> +	unsigned int nfrags;
> +	unsigned int seglen;
> +	u64 total_bytes = 0;
> +	u64 total_pkts = 0;
> +	unsigned int left;
> +	unsigned int len;
> +	unsigned int mss;
> +	skb_frag_t *frag;
> +	bool start, done;
> +	bool outer_csum;
> +	bool has_vlan;
> +	u16 desc_len;
> +	u8 desc_nsge;
> +	u16 vlan_tci;
> +	bool encap;
> +	int err;
> +
> +	mss = skb_shinfo(skb)->gso_size;
> +	nfrags = skb_shinfo(skb)->nr_frags;
> +	len_left = skb->len - skb_headlen(skb);
> +	outer_csum = (skb_shinfo(skb)->gso_type & SKB_GSO_GRE_CSUM) ||
> +		     (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM);
> +	has_vlan = !!skb_vlan_tag_present(skb);
> +	vlan_tci = skb_vlan_tag_get(skb);
> +	encap = skb->encapsulation;
> +
> +	/* Preload inner-most TCP csum field with IP pseudo hdr
> +	 * calculated with IP length set to zero.  HW will later
> +	 * add in length to each TCP segment resulting from the TSO.
> +	 */
> +
> +	if (encap)
> +		err = ionic_tx_tcp_inner_pseudo_csum(skb);
> +	else
> +		err = ionic_tx_tcp_pseudo_csum(skb);
> +	if (err)
> +		return err;
> +
> +	if (encap)
> +		hdrlen = skb_inner_transport_header(skb) - skb->data +
> +			 inner_tcp_hdrlen(skb);
> +	else
> +		hdrlen = skb_transport_offset(skb) + tcp_hdrlen(skb);
> +
> +	seglen = hdrlen + mss;
> +	left = skb_headlen(skb);
> +
> +	desc = ionic_tx_tso_next(q, &elem);
> +	start = true;
> +
> +	/* Chop skb->data up into desc segments */
> +
> +	while (left > 0) {
> +		len = min(seglen, left);
> +		frag_left = seglen - len;
> +		desc_addr = ionic_tx_map_single(q, skb->data + offset, len);
> +		if (dma_mapping_error(dev, desc_addr))
> +			goto err_out_abort;
> +		desc_len = len;
> +		desc_nsge = 0;
> +		left -= len;
> +		offset += len;
> +		if (nfrags > 0 && frag_left > 0)
> +			continue;
> +		done = (nfrags == 0 && left == 0);
> +		ionic_tx_tso_post(q, desc, skb,
> +				  desc_addr, desc_nsge, desc_len,
> +				  hdrlen, mss,
> +				  outer_csum,
> +				  vlan_tci, has_vlan,
> +				  start, done);
> +		total_pkts++;
> +		total_bytes += start ? len : len + hdrlen;
> +		desc = ionic_tx_tso_next(q, &elem);
> +		start = false;
> +		seglen = mss;
> +	}
> +
> +	/* Chop skb frags into desc segments */
> +
> +	for (frag = skb_shinfo(skb)->frags; len_left; frag++) {
> +		offset = 0;
> +		left = skb_frag_size(frag);
> +		len_left -= left;
> +		nfrags--;
> +		stats->frags++;
> +
> +		while (left > 0) {
> +			if (frag_left > 0) {
> +				len = min(frag_left, left);
> +				frag_left -= len;
> +				elem->addr =
> +				    cpu_to_le64(ionic_tx_map_frag(q, frag,
> +								  offset, len));
> +				if (dma_mapping_error(dev, elem->addr))
> +					goto err_out_abort;
> +				elem->len = cpu_to_le16(len);
> +				elem++;
> +				desc_nsge++;
> +				left -= len;
> +				offset += len;
> +				if (nfrags > 0 && frag_left > 0)
> +					continue;
> +				done = (nfrags == 0 && left == 0);
> +				ionic_tx_tso_post(q, desc, skb, desc_addr,
> +						  desc_nsge, desc_len,
> +						  hdrlen, mss, outer_csum,
> +						  vlan_tci, has_vlan,
> +						  start, done);
> +				total_pkts++;
> +				total_bytes += start ? len : len + hdrlen;
> +				desc = ionic_tx_tso_next(q, &elem);
> +				start = false;
> +			} else {
> +				len = min(mss, left);
> +				frag_left = mss - len;
> +				desc_addr = ionic_tx_map_frag(q, frag,
> +							      offset, len);
> +				if (dma_mapping_error(dev, desc_addr))
> +					goto err_out_abort;
> +				desc_len = len;
> +				desc_nsge = 0;
> +				left -= len;
> +				offset += len;
> +				if (nfrags > 0 && frag_left > 0)
> +					continue;
> +				done = (nfrags == 0 && left == 0);
> +				ionic_tx_tso_post(q, desc, skb, desc_addr,
> +						  desc_nsge, desc_len,
> +						  hdrlen, mss, outer_csum,
> +						  vlan_tci, has_vlan,
> +						  start, done);
> +				total_pkts++;
> +				total_bytes += start ? len : len + hdrlen;
> +				desc = ionic_tx_tso_next(q, &elem);
> +				start = false;
> +			}
> +		}
> +	}
> +
> +	stats->pkts += total_pkts;
> +	stats->bytes += total_bytes;
> +	stats->tso++;
> +
> +	return 0;
> +
> +err_out_abort:
> +	while (rewind->desc != q->head->desc) {
> +		ionic_tx_clean(q, rewind, NULL, NULL);
> +		rewind = rewind->next;
> +	}
> +	q->head = abort;
> +
> +	return -ENOMEM;
> +}

There's definitely a function for helping drivers which can't do full
TSO slice up the packet, but I can't find it now 😫😫

Eric would definitely know.

Did you have a look? Would it be useful here?
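
For reference, the accounting that the quoted function does while slicing
(total_pkts, and total_bytes with hdrlen added once per segment) reduces to
the arithmetic below. This is a minimal userspace sketch, assuming one
contiguous payload; struct tso_totals and tso_account() are hypothetical
names for illustration, not driver or kernel API, and no DMA mapping or
descriptor handling is modelled.

```c
#include <assert.h>
#include <stdint.h>

struct tso_totals {
	uint64_t pkts;   /* number of wire segments produced */
	uint64_t bytes;  /* bytes on the wire, headers included */
};

/*
 * payload_len: TCP payload bytes to send
 * hdrlen:      header bytes replicated in front of every segment
 * mss:         maximum payload bytes per segment (must be non-zero)
 */
static struct tso_totals tso_account(uint32_t payload_len, uint32_t hdrlen,
				     uint32_t mss)
{
	struct tso_totals t = { 0, 0 };

	assert(mss > 0);

	while (payload_len > 0) {
		/* Each segment carries at most mss bytes of payload. */
		uint32_t seg = payload_len < mss ? payload_len : mss;

		t.pkts++;
		t.bytes += (uint64_t)hdrlen + seg;
		payload_len -= seg;
	}

	return t;
}
```

With a 3000-byte payload, 54 bytes of headers and an MSS of 1448, this gives
three segments and 3000 + 3 * 54 = 3162 bytes on the wire.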

^ permalink raw reply

* Re: [iproute2, master 2/2] devlink: Add a new time-stamp format for health reporter's dump
From: Stephen Hemminger @ 2019-08-29 23:27 UTC (permalink / raw)
  To: Aya Levin; +Cc: netdev, Jiri Pirko, Moshe Shemesh
In-Reply-To: <1566471942-28529-3-git-send-email-ayal@mellanox.com>

On Thu, 22 Aug 2019 14:05:42 +0300
Aya Levin <ayal@mellanox.com> wrote:

> diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
> index fc195cbd66f4..3f8532711315 100644
> --- a/include/uapi/linux/devlink.h
> +++ b/include/uapi/linux/devlink.h
> @@ -348,6 +348,8 @@ enum devlink_attr {
>  	DEVLINK_ATTR_PORT_PCI_PF_NUMBER,	/* u16 */
>  	DEVLINK_ATTR_PORT_PCI_VF_NUMBER,	/* u16 */
>  
> +	DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TSPEC,
> +
>  	/* add new attributes above here, update the policy in devlink.c */
>  
>  	__DEVLINK_ATTR_MAX,
> -- 

Since this is not upstream, this patch needs to go to iproute2-next.
That means if you want the other bug fix, send it again against master.

^ permalink raw reply

* Re: [v1] iproute2: police: support 64bit rate and peakrate in tc utility
From: Stephen Hemminger @ 2019-08-29 23:21 UTC (permalink / raw)
  To: David Dai; +Cc: jhs, xiyou.wangcong, jiri, netdev, linux-kernel, zdai
In-Reply-To: <1567032776-1118-1-git-send-email-zdai@linux.vnet.ibm.com>

On Wed, 28 Aug 2019 17:52:56 -0500
David Dai <zdai@linux.vnet.ibm.com> wrote:

> For a high speed adapter like the Mellanox CX-5 card, it can reach up to
> 100 Gbits per second bandwidth. Currently htb already supports 64bit rate
> in tc utility. However police action rate and peakrate are still limited
> to 32bit value (up to 32 Gbits per second). Taking advantage of the 2 new
> attributes TCA_POLICE_RATE64 and TCA_POLICE_PEAKRATE64 from kernel,
> tc can use them to break the 32bit limit, and still keep the backward
> binary compatibility.
> 
> Tested-by: David Dai <zdai@linux.vnet.ibm.com>
> Signed-off-by: David Dai <zdai@linux.vnet.ibm.com>

This needs to go to iproute2-next, not iproute2.

^ permalink raw reply

* Re: [PATCH v6 net-next 15/19] ionic: Add Tx and Rx handling
From: Jakub Kicinski @ 2019-08-29 23:18 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: netdev, davem
In-Reply-To: <20190829182720.68419-16-snelson@pensando.io>

On Thu, 29 Aug 2019 11:27:16 -0700, Shannon Nelson wrote:
> +netdev_tx_t ionic_start_xmit(struct sk_buff *skb, struct net_device *netdev)
> +{
> +	u16 queue_index = skb_get_queue_mapping(skb);
> +	struct ionic_lif *lif = netdev_priv(netdev);
> +	struct ionic_queue *q;
> +	int ndescs;
> +	int err;
> +
> +	if (unlikely(!test_bit(IONIC_LIF_UP, lif->state))) {
> +		dev_kfree_skb(skb);
> +		return NETDEV_TX_OK;
> +	}
> +
> +	if (likely(lif_to_txqcq(lif, queue_index)))
> +		q = lif_to_txq(lif, queue_index);
> +	else
> +		q = lif_to_txq(lif, 0);
> +
> +	ndescs = ionic_tx_descs_needed(q, skb);
> +	if (ndescs < 0)
> +		goto err_out_drop;
> +
> +	if (!ionic_q_has_space(q, ndescs)) {

You should stop the queue in advance, whenever you can't ensure that a
max-size frame can be placed on the ring. Requeuing is very expensive,
so modern drivers should try to never return NETDEV_TX_BUSY.

> +		netif_stop_subqueue(netdev, queue_index);
> +		q->stop++;
> +
> +		/* Might race with ionic_tx_clean, check again */
> +		smp_rmb();
> +		if (ionic_q_has_space(q, ndescs)) {
> +			netif_wake_subqueue(netdev, queue_index);
> +			q->wake++;
> +		} else {
> +			return NETDEV_TX_BUSY;
> +		}
> +	}
> +
> +	if (skb_is_gso(skb))
> +		err = ionic_tx_tso(q, skb);
> +	else
> +		err = ionic_tx(q, skb);
> +
> +	if (err)
> +		goto err_out_drop;
> +
> +	return NETDEV_TX_OK;
> +
> +err_out_drop:
> +	netif_stop_subqueue(netdev, queue_index);

This stopping of the queue is suspicious: if ionic_tx() fails, there's
no guarantee the queue will ever be woken up, no?

> +	q->stop++;
> +	q->drop++;
> +	dev_kfree_skb(skb);
> +	return NETDEV_TX_OK;
> +}
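
The "stop the queue in advance" suggestion above can be sketched roughly as
follows. This is a single-threaded userspace model under assumptions, not
driver code: RING_SIZE, MAX_FRAME_DESC, struct txq and the txq_* helpers are
hypothetical, and the stopped flag merely stands in for what
netif_stop_subqueue() and netif_wake_subqueue() would do. A real driver also
needs the usual memory barriers and a re-check after stopping, because the
completion path runs concurrently.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE      16u   /* assumed ring depth */
#define MAX_FRAME_DESC 4u    /* assumed worst-case descriptors per frame */

struct txq {
	unsigned int used;   /* descriptors currently owned by hardware */
	bool stopped;        /* stand-in for the stopped state of the subqueue */
};

static unsigned int txq_space(const struct txq *q)
{
	return RING_SIZE - q->used;
}

/*
 * Transmit path: post the frame, then stop the queue right away if a
 * worst-case frame would no longer fit. As long as callers never ask for
 * more than MAX_FRAME_DESC descriptors, a running queue always has room,
 * so the "busy" return is not reached in normal operation.
 */
static bool txq_xmit(struct txq *q, unsigned int ndescs)
{
	if (q->stopped || ndescs > txq_space(q))
		return false;

	q->used += ndescs;
	if (txq_space(q) < MAX_FRAME_DESC)
		q->stopped = true;

	return true;
}

/* Completion path: reclaim descriptors, wake once a worst-case frame fits. */
static void txq_clean(struct txq *q, unsigned int ndescs)
{
	assert(ndescs <= q->used);

	q->used -= ndescs;
	if (q->stopped && txq_space(q) >= MAX_FRAME_DESC)
		q->stopped = false;
}
```

The point of the pattern is that the stack stops handing frames to the driver
before the ring is actually full, so the expensive requeue path is avoided.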

^ permalink raw reply

