Netdev List
 help / color / mirror / Atom feed
* [net-next 04/15] net/mlx5: Add support for QUERY_VNIC_ENV command
From: Saeed Mahameed @ 2018-03-23 22:39 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Moshe Shemesh, Saeed Mahameed
In-Reply-To: <20180323223925.21678-1-saeedm@mellanox.com>

From: Moshe Shemesh <moshe@mellanox.com>

Add support for new FW command QUERY_VNIC_ENV.
The command is used by the driver to query vnic diagnostic statistics
from FW.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |  2 ++
 include/linux/mlx5/mlx5_ifc.h                 | 50 ++++++++++++++++++++++++++-
 2 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index e9a1fbcc4adf..fe5428667ad1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -359,6 +359,7 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op,
 	case MLX5_CMD_OP_MODIFY_HCA_VPORT_CONTEXT:
 	case MLX5_CMD_OP_QUERY_HCA_VPORT_GID:
 	case MLX5_CMD_OP_QUERY_HCA_VPORT_PKEY:
+	case MLX5_CMD_OP_QUERY_VNIC_ENV:
 	case MLX5_CMD_OP_QUERY_VPORT_COUNTER:
 	case MLX5_CMD_OP_ALLOC_Q_COUNTER:
 	case MLX5_CMD_OP_QUERY_Q_COUNTER:
@@ -501,6 +502,7 @@ const char *mlx5_command_str(int command)
 	MLX5_COMMAND_STR_CASE(MODIFY_HCA_VPORT_CONTEXT);
 	MLX5_COMMAND_STR_CASE(QUERY_HCA_VPORT_GID);
 	MLX5_COMMAND_STR_CASE(QUERY_HCA_VPORT_PKEY);
+	MLX5_COMMAND_STR_CASE(QUERY_VNIC_ENV);
 	MLX5_COMMAND_STR_CASE(QUERY_VPORT_COUNTER);
 	MLX5_COMMAND_STR_CASE(ALLOC_Q_COUNTER);
 	MLX5_COMMAND_STR_CASE(DEALLOC_Q_COUNTER);
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index f3200a9696d6..52e373dd2679 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -143,6 +143,7 @@ enum {
 	MLX5_CMD_OP_MODIFY_HCA_VPORT_CONTEXT      = 0x763,
 	MLX5_CMD_OP_QUERY_HCA_VPORT_GID           = 0x764,
 	MLX5_CMD_OP_QUERY_HCA_VPORT_PKEY          = 0x765,
+	MLX5_CMD_OP_QUERY_VNIC_ENV                = 0x76f,
 	MLX5_CMD_OP_QUERY_VPORT_COUNTER           = 0x770,
 	MLX5_CMD_OP_ALLOC_Q_COUNTER               = 0x771,
 	MLX5_CMD_OP_DEALLOC_Q_COUNTER             = 0x772,
@@ -875,7 +876,7 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         vhca_group_manager[0x1];
 	u8         ib_virt[0x1];
 	u8         eth_virt[0x1];
-	u8         reserved_at_1a4[0x1];
+	u8         vnic_env_queue_counters[0x1];
 	u8         ets[0x1];
 	u8         nic_flow_table[0x1];
 	u8         eswitch_flow_table[0x1];
@@ -2386,6 +2387,24 @@ struct mlx5_ifc_xrc_srqc_bits {
 	u8         reserved_at_180[0x80];
 };
 
+struct mlx5_ifc_vnic_diagnostic_statistics_bits {
+	u8         counter_error_queues[0x20];
+
+	u8         total_error_queues[0x20];
+
+	u8         send_queue_priority_update_flow[0x20];
+
+	u8         reserved_at_60[0x20];
+
+	u8         nic_receive_steering_discard[0x40];
+
+	u8         receive_discard_vport_down[0x40];
+
+	u8         transmit_discard_vport_down[0x40];
+
+	u8         reserved_at_140[0xec0];
+};
+
 struct mlx5_ifc_traffic_counter_bits {
 	u8         packets[0x40];
 
@@ -3661,6 +3680,35 @@ struct mlx5_ifc_query_vport_state_in_bits {
 	u8         reserved_at_60[0x20];
 };
 
+struct mlx5_ifc_query_vnic_env_out_bits {
+	u8         status[0x8];
+	u8         reserved_at_8[0x18];
+
+	u8         syndrome[0x20];
+
+	u8         reserved_at_40[0x40];
+
+	struct mlx5_ifc_vnic_diagnostic_statistics_bits vport_env;
+};
+
+enum {
+	MLX5_QUERY_VNIC_ENV_IN_OP_MOD_VPORT_DIAG_STATISTICS  = 0x0,
+};
+
+struct mlx5_ifc_query_vnic_env_in_bits {
+	u8         opcode[0x10];
+	u8         reserved_at_10[0x10];
+
+	u8         reserved_at_20[0x10];
+	u8         op_mod[0x10];
+
+	u8         other_vport[0x1];
+	u8         reserved_at_41[0xf];
+	u8         vport_number[0x10];
+
+	u8         reserved_at_60[0x20];
+};
+
 struct mlx5_ifc_query_vport_counter_out_bits {
 	u8         status[0x8];
 	u8         reserved_at_8[0x18];
-- 
2.14.3

^ permalink raw reply related

* [pull request][net-next 00/15] Mellanox, mlx5 misc updates 2018-03-22
From: Saeed Mahameed @ 2018-03-23 22:39 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Saeed Mahameed

Hi Dave,

This series includes some misc updates to mlx5 core and netdev driver,
please note that there is a small change to net/core/ethtool.c and 
include/uapi/linux/ethtool.h that adds new tunable for PFC stall
prevention on/off support, which was already reviewed as RFC [1].

For more information please review and see the tag log below.

Please pull and let me know if there's any problem.

P.S.: This series doesn't introduce any conflict with the ongoing
mlx5 fixes series, mlx5-fixes-2018-03-23.

[1] https://patchwork.ozlabs.org/cover/838314/

Thanks,
Saeed.

---

The following changes since commit 6686c459e1449a3ee5f3fd313b0a559ace7a700e:

  Merge branch 'hns3-VF-reset' (2018-03-22 15:29:05 -0400)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2018-03-22

for you to fetch changes up to 50f0e3bc5e4b5079b1e9b829db1a2246757965d3:

  net/mlx5e: Add VLAN offload features to hw_enc_features (2018-03-22 16:28:31 -0700)

----------------------------------------------------------------
mlx5-updates-2018-03-22 (Misc updates)

This series includes misc updates for mlx5 core and netdev dirver,

Highlights:

>From Inbar, three patches to add support for PFC stall prevention
statistics and enable/disable through new ethtool tunable, as requested
from previous submission.

>From Moshe, four patches, added more drop counters:
	- drop counter for netdev steering miss
	- drop counter for when VF logical link is down
        - drop counter for when netdev logical link is down.

>From Or, three patches to support vlan push/pop offload via tc HW action,
for newer HW (Connectx-5 and onward) via HW steering flow actions rather
than the emulated path for the older HW brands.

And five more misc small trivial patches.

----------------------------------------------------------------
Aviv Heller (1):
      net/mlx5e: Add VLAN offload features to hw_enc_features

Gal Pressman (3):
      net/mlx5e: Remove redundant check in get ethtool stats
      net/mlx5e: Make choose LRO timeout function static
      net/mlx5e: Add a helper macro in set features ndo

Inbar Karmy (3):
      net/mlx5e: Expose PFC stall prevention counters
      ethtool: Add support for configuring PFC stall prevention in ethtool
      net/mlx5e: PFC stall prevention support

Leon Romanovsky (1):
      net/mlx5: Protect from command bit overflow

Moshe Shemesh (4):
      net/mlx5: Add support for QUERY_VNIC_ENV command
      net/mlx5e: Add vnic steering drop statistics
      net/mlx5: Add packet dropped while vport down statistics
      net/mlx5e: Add interface down dropped packets statistics

Or Gerlitz (3):
      net/mlx5: E-Switch, Use same source for offloaded actions check
      net/mlx5: Add core support for vlan push/pop steering action
      net/mlx5e: Offload tc vlan push/pop using HW action

 drivers/net/ethernet/mellanox/mlx5/core/cmd.c      |   4 +-
 .../mellanox/mlx5/core/diag/fs_tracepoint.h        |   2 +
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |   4 +-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  60 +++++++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  77 +++++++------
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 126 +++++++++++++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h |   6 +
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    |  15 ++-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  |  31 ++++-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.h  |  13 ++-
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c |  30 +++--
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c   |  10 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  |   4 +-
 drivers/net/ethernet/mellanox/mlx5/core/fw.c       |   3 +
 drivers/net/ethernet/mellanox/mlx5/core/port.c     |  64 +++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/vport.c    |  26 +++++
 include/linux/mlx5/device.h                        |   4 +
 include/linux/mlx5/fs.h                            |   7 ++
 include/linux/mlx5/mlx5_ifc.h                      | 116 +++++++++++++++++--
 include/linux/mlx5/port.h                          |   6 +
 include/linux/mlx5/vport.h                         |   3 +
 include/uapi/linux/ethtool.h                       |   4 +
 net/core/ethtool.c                                 |   6 +
 23 files changed, 531 insertions(+), 90 deletions(-)

^ permalink raw reply

* [net-next 02/15] ethtool: Add support for configuring PFC stall prevention in ethtool
From: Saeed Mahameed @ 2018-03-23 22:39 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Inbar Karmy, Michal Kubecek, Andrew Lunn, Saeed Mahameed
In-Reply-To: <20180323223925.21678-1-saeedm@mellanox.com>

From: Inbar Karmy <inbark@mellanox.com>

In the event where the device unexpectedly becomes unresponsive
for a long period of time, flow control mechanism may propagate
pause frames which will cause congestion spreading to the entire
network.
To prevent this scenario, when the device is stalled for a period
longer than a pre-configured timeout, flow control mechanisms are
automatically disabled.

This patch adds support for the ETHTOOL_PFC_STALL_PREVENTION
as a tunable.
This API provides support for configuring flow control storm prevention
timeout (msec).

Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Cc: Michal Kubecek <mkubecek@suse.cz>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 include/uapi/linux/ethtool.h | 4 ++++
 net/core/ethtool.c           | 6 ++++++
 2 files changed, 10 insertions(+)

diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 20da156aaf64..9dc63a14a747 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -217,10 +217,14 @@ struct ethtool_value {
 	__u32	data;
 };
 
+#define PFC_STORM_PREVENTION_AUTO	0xffff
+#define PFC_STORM_PREVENTION_DISABLE	0
+
 enum tunable_id {
 	ETHTOOL_ID_UNSPEC,
 	ETHTOOL_RX_COPYBREAK,
 	ETHTOOL_TX_COPYBREAK,
+	ETHTOOL_PFC_PREVENTION_TOUT,
 	/*
 	 * Add your fresh new tubale attribute above and remember to update
 	 * tunable_strings[] in net/core/ethtool.c
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 157cd9efa4be..bb6e498c6e3d 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -121,6 +121,7 @@ tunable_strings[__ETHTOOL_TUNABLE_COUNT][ETH_GSTRING_LEN] = {
 	[ETHTOOL_ID_UNSPEC]     = "Unspec",
 	[ETHTOOL_RX_COPYBREAK]	= "rx-copybreak",
 	[ETHTOOL_TX_COPYBREAK]	= "tx-copybreak",
+	[ETHTOOL_PFC_PREVENTION_TOUT] = "pfc-prevention-tout",
 };
 
 static const char
@@ -2311,6 +2312,11 @@ static int ethtool_tunable_valid(const struct ethtool_tunable *tuna)
 		    tuna->type_id != ETHTOOL_TUNABLE_U32)
 			return -EINVAL;
 		break;
+	case ETHTOOL_PFC_PREVENTION_TOUT:
+		if (tuna->len != sizeof(u16) ||
+		    tuna->type_id != ETHTOOL_TUNABLE_U16)
+			return -EINVAL;
+		break;
 	default:
 		return -EINVAL;
 	}
-- 
2.14.3

^ permalink raw reply related

* [net-next 01/15] net/mlx5e: Expose PFC stall prevention counters
From: Saeed Mahameed @ 2018-03-23 22:39 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Inbar Karmy, Saeed Mahameed
In-Reply-To: <20180323223925.21678-1-saeedm@mellanox.com>

From: Inbar Karmy <inbark@mellanox.com>

Add the needed capability bit and counters to device spec description.
Expose the following two counters in ethtool:

tx_pause_storm_warning_events: when the device is stalled for a period
longer than a pre-configured watermark, the counter increase, allowing
the debug utility an insight into current device status.

tx_pause_storm_error_events: when the device is stalled for a period
longer than a pre-configured timeout, the pause transmission is disabled,
and the counter increase.

Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 19 ++++++++++++++-
 drivers/net/ethernet/mellanox/mlx5/core/fw.c       |  3 +++
 include/linux/mlx5/device.h                        |  4 ++++
 include/linux/mlx5/mlx5_ifc.h                      | 28 +++++++++++++++++++---
 4 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 5f0f3493d747..2553c58dcf1c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -754,7 +754,15 @@ static const struct counter_desc pport_per_prio_pfc_stats_desc[] = {
 	{ "rx_%s_pause_transition", PPORT_PER_PRIO_OFF(rx_pause_transition) },
 };
 
+static const struct counter_desc pport_pfc_stall_stats_desc[] = {
+	{ "tx_pause_storm_warning_events ", PPORT_PER_PRIO_OFF(device_stall_minor_watermark_cnt) },
+	{ "tx_pause_storm_error_events", PPORT_PER_PRIO_OFF(device_stall_critical_watermark_cnt) },
+};
+
 #define NUM_PPORT_PER_PRIO_PFC_COUNTERS		ARRAY_SIZE(pport_per_prio_pfc_stats_desc)
+#define NUM_PPORT_PFC_STALL_COUNTERS(priv)	(ARRAY_SIZE(pport_pfc_stall_stats_desc) * \
+						 MLX5_CAP_PCAM_FEATURE((priv)->mdev, pfcc_mask) * \
+						 MLX5_CAP_DEBUG((priv)->mdev, stall_detect))
 
 static unsigned long mlx5e_query_pfc_combined(struct mlx5e_priv *priv)
 {
@@ -790,7 +798,8 @@ static int mlx5e_grp_per_prio_pfc_get_num_stats(struct mlx5e_priv *priv)
 {
 	return (mlx5e_query_global_pause_combined(priv) +
 		hweight8(mlx5e_query_pfc_combined(priv))) *
-		NUM_PPORT_PER_PRIO_PFC_COUNTERS;
+		NUM_PPORT_PER_PRIO_PFC_COUNTERS +
+		NUM_PPORT_PFC_STALL_COUNTERS(priv);
 }
 
 static int mlx5e_grp_per_prio_pfc_fill_strings(struct mlx5e_priv *priv,
@@ -818,6 +827,10 @@ static int mlx5e_grp_per_prio_pfc_fill_strings(struct mlx5e_priv *priv,
 		}
 	}
 
+	for (i = 0; i < NUM_PPORT_PFC_STALL_COUNTERS(priv); i++)
+		strcpy(data + (idx++) * ETH_GSTRING_LEN,
+		       pport_pfc_stall_stats_desc[i].format);
+
 	return idx;
 }
 
@@ -845,6 +858,10 @@ static int mlx5e_grp_per_prio_pfc_fill_stats(struct mlx5e_priv *priv,
 		}
 	}
 
+	for (i = 0; i < NUM_PPORT_PFC_STALL_COUNTERS(priv); i++)
+		data[idx++] = MLX5E_READ_CTR64_BE(&priv->stats.pport.per_prio_counters[0],
+						  pport_pfc_stall_stats_desc, i);
+
 	return idx;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
index 9d11e92fb541..d7bb10ab2173 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
@@ -183,6 +183,9 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev)
 			return err;
 	}
 
+	if (MLX5_CAP_GEN(dev, debug))
+		mlx5_core_get_caps(dev, MLX5_CAP_DEBUG);
+
 	if (MLX5_CAP_GEN(dev, pcam_reg))
 		mlx5_get_pcam_reg(dev);
 
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index e5258ee4e38b..4b5939c78cdd 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -1013,6 +1013,7 @@ enum mlx5_cap_type {
 	MLX5_CAP_RESERVED,
 	MLX5_CAP_VECTOR_CALC,
 	MLX5_CAP_QOS,
+	MLX5_CAP_DEBUG,
 	/* NUM OF CAP Types */
 	MLX5_CAP_NUM
 };
@@ -1140,6 +1141,9 @@ enum mlx5_qcam_feature_groups {
 #define MLX5_CAP_QOS(mdev, cap)\
 	MLX5_GET(qos_cap, mdev->caps.hca_cur[MLX5_CAP_QOS], cap)
 
+#define MLX5_CAP_DEBUG(mdev, cap)\
+	MLX5_GET(debug_cap, mdev->caps.hca_cur[MLX5_CAP_DEBUG], cap)
+
 #define MLX5_CAP_PCAM_FEATURE(mdev, fld) \
 	MLX5_GET(pcam_reg, (mdev)->caps.pcam, feature_cap_mask.enhanced_features.fld)
 
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 14ad84afe8ba..c7d50eccff9e 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -593,6 +593,16 @@ struct mlx5_ifc_qos_cap_bits {
 	u8         reserved_at_100[0x700];
 };
 
+struct mlx5_ifc_debug_cap_bits {
+	u8         reserved_at_0[0x20];
+
+	u8         reserved_at_20[0x2];
+	u8         stall_detect[0x1];
+	u8         reserved_at_23[0x1d];
+
+	u8         reserved_at_40[0x7c0];
+};
+
 struct mlx5_ifc_per_protocol_networking_offload_caps_bits {
 	u8         csum_cap[0x1];
 	u8         vlan_cap[0x1];
@@ -855,7 +865,7 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         out_of_seq_cnt[0x1];
 	u8         vport_counters[0x1];
 	u8         retransmission_q_counters[0x1];
-	u8         reserved_at_183[0x1];
+	u8         debug[0x1];
 	u8         modify_rq_counter_set_id[0x1];
 	u8         rq_delay_drop[0x1];
 	u8         max_qp_cnt[0xa];
@@ -1572,7 +1582,17 @@ struct mlx5_ifc_eth_per_prio_grp_data_layout_bits {
 
 	u8         rx_pause_transition_low[0x20];
 
-	u8         reserved_at_3c0[0x400];
+	u8         reserved_at_3c0[0x40];
+
+	u8         device_stall_minor_watermark_cnt_high[0x20];
+
+	u8         device_stall_minor_watermark_cnt_low[0x20];
+
+	u8         device_stall_critical_watermark_cnt_high[0x20];
+
+	u8         device_stall_critical_watermark_cnt_low[0x20];
+
+	u8         reserved_at_480[0x340];
 };
 
 struct mlx5_ifc_eth_extended_cntrs_grp_data_layout_bits {
@@ -7874,8 +7894,10 @@ struct mlx5_ifc_peir_reg_bits {
 };
 
 struct mlx5_ifc_pcam_enhanced_features_bits {
-	u8         reserved_at_0[0x7b];
+	u8         reserved_at_0[0x76];
 
+	u8         pfcc_mask[0x1];
+	u8         reserved_at_77[0x4];
 	u8         rx_buffer_fullness_counters[0x1];
 	u8         ptys_connector_type[0x1];
 	u8         reserved_at_7d[0x1];
-- 
2.14.3

^ permalink raw reply related

* Re: [PATCH v6 6/6] bnxt_en: Eliminate duplicate barriers on weakly-ordered archs
From: Michael Chan @ 2018-03-23 22:36 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
	open list
In-Reply-To: <1521843791-21201-7-git-send-email-okaya@codeaurora.org>

On Fri, Mar 23, 2018 at 3:23 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
> Code includes wmb() followed by writel(). writel() already has a barrier on
> some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Create a new wrapper function with relaxed write operator. Use the new
> wrapper when a write is following a wmb().
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> Also add mmiowb() so that write code doesn't move outside of scope.
>
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++-
>  drivers/net/ethernet/broadcom/bnxt/bnxt.h | 9 +++++++++
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 1500243..fc8ea0d 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -1922,7 +1922,8 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
>                 /* Sync BD data before updating doorbell */
>                 wmb();
>
> -               bnxt_db_write(bp, db, DB_KEY_TX | prod);
> +               bnxt_db_write_relaxed(bp, db, DB_KEY_TX | prod);
> +               mmiowb();

Sorry for the late review.  mmiowb() is not required here because we
are in NAPI context, so only one CPU will be updating this doorbell.

Other than that, it looks good.

>         }
>
>         cpr->cp_raw_cons = raw_cons;
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> index 1989c47..5e453b9 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> @@ -1401,6 +1401,15 @@ static inline u32 bnxt_tx_avail(struct bnxt *bp, struct bnxt_tx_ring_info *txr)
>                 ((txr->tx_prod - txr->tx_cons) & bp->tx_ring_mask);
>  }
>
> +/* For TX and RX ring doorbells with no ordering guarantee*/
> +static inline void bnxt_db_write_relaxed(struct bnxt *bp, void __iomem *db,
> +                                        u32 val)
> +{
> +       writel_relaxed(val, db);
> +       if (bp->flags & BNXT_FLAG_DOUBLE_DB)
> +               writel_relaxed(val, db);
> +}
> +
>  /* For TX and RX ring doorbells */
>  static inline void bnxt_db_write(struct bnxt *bp, void __iomem *db, u32 val)
>  {
> --
> 2.7.4
>

^ permalink raw reply

* [net 8/8] net/mlx5e: Sync netdev vxlan ports at open
From: Saeed Mahameed @ 2018-03-23 22:05 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Shahar Klein, Saeed Mahameed
In-Reply-To: <20180323220534.19353-1-saeedm@mellanox.com>

From: Shahar Klein <shahark@mellanox.com>

When mlx5_core is loaded it is expected to sync ports
with all vxlan devices so it can support vxlan encap/decap.
This is done via udp_tunnel_get_rx_info(). Currently this
call is set in mlx5e_nic_enable() and if the netdev is not in
NETREG_REGISTERED state it will not be called.

Normally on load the netdev state is not NETREG_REGISTERED
so udp_tunnel_get_rx_info() will not be called.

Moving udp_tunnel_get_rx_info() to mlx5e_open() so
it will be called on netdev UP event and allow encap/decap.

Fixes: 610e89e05c3f ("net/mlx5e: Don't sync netdev state when not registered")
Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index e35859657217..9b4827d36e3e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2572,6 +2572,9 @@ int mlx5e_open(struct net_device *netdev)
 		mlx5_set_port_admin_status(priv->mdev, MLX5_PORT_UP);
 	mutex_unlock(&priv->state_lock);
 
+	if (mlx5e_vxlan_allowed(priv->mdev))
+		udp_tunnel_get_rx_info(netdev);
+
 	return err;
 }
 
@@ -4327,12 +4330,6 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv)
 #ifdef CONFIG_MLX5_CORE_EN_DCB
 	mlx5e_dcbnl_init_app(priv);
 #endif
-	/* Device already registered: sync netdev system state */
-	if (mlx5e_vxlan_allowed(mdev)) {
-		rtnl_lock();
-		udp_tunnel_get_rx_info(netdev);
-		rtnl_unlock();
-	}
 
 	queue_work(priv->wq, &priv->set_rx_mode_work);
 
-- 
2.14.3

^ permalink raw reply related

* [net 7/8] net/mlx5e: Avoid using the ipv6 stub in the TC offload neigh update path
From: Saeed Mahameed @ 2018-03-23 22:05 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Saeed Mahameed
In-Reply-To: <20180323220534.19353-1-saeedm@mellanox.com>

From: Or Gerlitz <ogerlitz@mellanox.com>

Currently we use the global ipv6_stub var to access the ipv6 global
nd table. This practice gets us to troubles when the stub is only partially
set e.g when ipv6 is loaded under the disabled policy. In this case, as of commit
343d60aada5a "ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument"
the stub is not null, but stub->nd_tbl is and we crash.

As we can access directly the ipv6 nd_tbl, the fix is just avoid the
reference through the stub. There is one place in the code were we
issue ipv6 route lookup and keep doing it through the stub, but that
mentioned commit makes sure we get -EAFNOSUPPORT from the stack.

Fixes: 232c001398ae ('net/mlx5e: Add support to neighbour update flow')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 6 +++---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c  | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 9a5a2a7eeab3..500d817d2b0a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -293,7 +293,7 @@ void mlx5e_remove_sqs_fwd_rules(struct mlx5e_priv *priv)
 static void mlx5e_rep_neigh_update_init_interval(struct mlx5e_rep_priv *rpriv)
 {
 #if IS_ENABLED(CONFIG_IPV6)
-	unsigned long ipv6_interval = NEIGH_VAR(&ipv6_stub->nd_tbl->parms,
+	unsigned long ipv6_interval = NEIGH_VAR(&nd_tbl.parms,
 						DELAY_PROBE_TIME);
 #else
 	unsigned long ipv6_interval = ~0UL;
@@ -429,7 +429,7 @@ static int mlx5e_rep_netevent_event(struct notifier_block *nb,
 	case NETEVENT_NEIGH_UPDATE:
 		n = ptr;
 #if IS_ENABLED(CONFIG_IPV6)
-		if (n->tbl != ipv6_stub->nd_tbl && n->tbl != &arp_tbl)
+		if (n->tbl != &nd_tbl && n->tbl != &arp_tbl)
 #else
 		if (n->tbl != &arp_tbl)
 #endif
@@ -477,7 +477,7 @@ static int mlx5e_rep_netevent_event(struct notifier_block *nb,
 		 * done per device delay prob time parameter.
 		 */
 #if IS_ENABLED(CONFIG_IPV6)
-		if (!p->dev || (p->tbl != ipv6_stub->nd_tbl && p->tbl != &arp_tbl))
+		if (!p->dev || (p->tbl != &nd_tbl && p->tbl != &arp_tbl))
 #else
 		if (!p->dev || p->tbl != &arp_tbl)
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index ae11678a31e8..43234cabf444 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -963,7 +963,7 @@ void mlx5e_tc_update_neigh_used_value(struct mlx5e_neigh_hash_entry *nhe)
 		tbl = &arp_tbl;
 #if IS_ENABLED(CONFIG_IPV6)
 	else if (m_neigh->family == AF_INET6)
-		tbl = ipv6_stub->nd_tbl;
+		tbl = &nd_tbl;
 #endif
 	else
 		return;
-- 
2.14.3

^ permalink raw reply related

* [net 5/8] net/mlx5e: Fix traffic being dropped on VF representor
From: Saeed Mahameed @ 2018-03-23 22:05 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Roi Dayan, Saeed Mahameed
In-Reply-To: <20180323220534.19353-1-saeedm@mellanox.com>

From: Roi Dayan <roid@mellanox.com>

Increase representor netdev RQ size to avoid dropped packets.
The current size (two) is just too small to keep up with
conventional slow path traffic patterns.
Also match the SQ size to the RQ size.

Fixes: cb67b832921c ("net/mlx5e: Introduce SRIOV VF representors")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 5ece289548ad..9a5a2a7eeab3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -44,6 +44,11 @@
 #include "en_tc.h"
 #include "fs_core.h"
 
+#define MLX5E_REP_PARAMS_LOG_SQ_SIZE \
+	max(0x6, MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE)
+#define MLX5E_REP_PARAMS_LOG_RQ_SIZE \
+	max(0x6, MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE)
+
 static const char mlx5e_rep_driver_name[] = "mlx5e_rep";
 
 static void mlx5e_rep_get_drvinfo(struct net_device *dev,
@@ -878,9 +883,9 @@ static void mlx5e_build_rep_params(struct mlx5_core_dev *mdev,
 					 MLX5_CQ_PERIOD_MODE_START_FROM_CQE :
 					 MLX5_CQ_PERIOD_MODE_START_FROM_EQE;
 
-	params->log_sq_size = MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE;
+	params->log_sq_size = MLX5E_REP_PARAMS_LOG_SQ_SIZE;
 	params->rq_wq_type  = MLX5_WQ_TYPE_LINKED_LIST;
-	params->log_rq_size = MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE;
+	params->log_rq_size = MLX5E_REP_PARAMS_LOG_RQ_SIZE;
 
 	params->rx_dim_enabled = MLX5_CAP_GEN(mdev, cq_moderation);
 	mlx5e_set_rx_cq_mode_params(params, cq_period_mode);
-- 
2.14.3

^ permalink raw reply related

* [net 2/8] net/mlx5e: Use 32 bits to store VF representor SQ number
From: Saeed Mahameed @ 2018-03-23 22:05 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Saeed Mahameed
In-Reply-To: <20180323220534.19353-1-saeedm@mellanox.com>

From: Or Gerlitz <ogerlitz@mellanox.com>

SQs are 32 and not 16 bits, hence it's wrong to use only 16 bits to
store the sq number for which are going to set steering rule, fix that.

Fixes: cb67b832921c ('net/mlx5e: Introduce SRIOV VF representors')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 0273c233bc85..738554a6c69f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -209,7 +209,7 @@ static void mlx5e_sqs2vport_stop(struct mlx5_eswitch *esw,
 
 static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
 				 struct mlx5_eswitch_rep *rep,
-				 u16 *sqns_array, int sqns_num)
+				 u32 *sqns_array, int sqns_num)
 {
 	struct mlx5_flow_handle *flow_rule;
 	struct mlx5e_rep_priv *rpriv;
@@ -255,9 +255,9 @@ int mlx5e_add_sqs_fwd_rules(struct mlx5e_priv *priv)
 	struct mlx5e_channel *c;
 	int n, tc, num_sqs = 0;
 	int err = -ENOMEM;
-	u16 *sqs;
+	u32 *sqs;
 
-	sqs = kcalloc(priv->channels.num * priv->channels.params.num_tc, sizeof(u16), GFP_KERNEL);
+	sqs = kcalloc(priv->channels.num * priv->channels.params.num_tc, sizeof(*sqs), GFP_KERNEL);
 	if (!sqs)
 		goto out;
 
-- 
2.14.3

^ permalink raw reply related

* [net 6/8] net/mlx5e: Fix memory usage issues in offloading TC flows
From: Saeed Mahameed @ 2018-03-23 22:05 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Jianbo Liu, Saeed Mahameed
In-Reply-To: <20180323220534.19353-1-saeedm@mellanox.com>

From: Jianbo Liu <jianbol@mellanox.com>

For NIC flows, the parsed attributes are not freed when we exit
successfully from mlx5e_configure_flower().

There is possible double free for eswitch flows. If error is returned
from rhashtable_insert_fast(), the parse attrs will be freed in
mlx5e_tc_del_flow(), but they will be freed again before exiting
mlx5e_configure_flower().

To fix both issues we do the following:
(1) change the condition that determines if to issue the free call to
    check if this flow is NIC flow, or it does not have encap action.
(2) reorder the code such that that the check and free calls are done
    before we attempt to add into the hash table.

Fixes: 232c001398ae ('net/mlx5e: Add support to neighbour update flow')
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index fa86a1466718..ae11678a31e8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -2608,19 +2608,19 @@ int mlx5e_configure_flower(struct mlx5e_priv *priv,
 	if (err != -EAGAIN)
 		flow->flags |= MLX5E_TC_FLOW_OFFLOADED;
 
+	if (!(flow->flags & MLX5E_TC_FLOW_ESWITCH) ||
+	    !(flow->esw_attr->action & MLX5_FLOW_CONTEXT_ACTION_ENCAP))
+		kvfree(parse_attr);
+
 	err = rhashtable_insert_fast(&tc->ht, &flow->node,
 				     tc->ht_params);
-	if (err)
-		goto err_del_rule;
+	if (err) {
+		mlx5e_tc_del_flow(priv, flow);
+		kfree(flow);
+	}
 
-	if (flow->flags & MLX5E_TC_FLOW_ESWITCH &&
-	    !(flow->esw_attr->action & MLX5_FLOW_CONTEXT_ACTION_ENCAP))
-		kvfree(parse_attr);
 	return err;
 
-err_del_rule:
-	mlx5e_tc_del_flow(priv, flow);
-
 err_free:
 	kvfree(parse_attr);
 	kfree(flow);
-- 
2.14.3

^ permalink raw reply related

* [net 3/8] net/mlx5: Make eswitch support to depend on switchdev
From: Saeed Mahameed @ 2018-03-23 22:05 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Saeed Mahameed
In-Reply-To: <20180323220534.19353-1-saeedm@mellanox.com>

From: Or Gerlitz <ogerlitz@mellanox.com>

Add dependancy for switchdev to be congfigured as any user-space control
plane SW is expected to use the HW switchdev ID to locate the representors
related to VFs of a certain PF and apply SW/offloaded switching on them.

Fixes: e80541ecabd5 ('net/mlx5: Add CONFIG_MLX5_ESWITCH Kconfig')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig   | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c  | 2 --
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
index 25deaa5a534c..c032319f1cb9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
@@ -46,7 +46,7 @@ config MLX5_MPFS
 
 config MLX5_ESWITCH
 	bool "Mellanox Technologies MLX5 SRIOV E-Switch support"
-	depends on MLX5_CORE_EN
+	depends on MLX5_CORE_EN && NET_SWITCHDEV
 	default y
 	---help---
 	  Mellanox Technologies Ethernet SRIOV E-Switch support in ConnectX NIC.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index da94c8cba5ee..e35859657217 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4069,7 +4069,7 @@ static void mlx5e_set_netdev_dev_addr(struct net_device *netdev)
 	}
 }
 
-#if IS_ENABLED(CONFIG_NET_SWITCHDEV) && IS_ENABLED(CONFIG_MLX5_ESWITCH)
+#if IS_ENABLED(CONFIG_MLX5_ESWITCH)
 static const struct switchdev_ops mlx5e_switchdev_ops = {
 	.switchdev_port_attr_get	= mlx5e_attr_get,
 };
@@ -4175,7 +4175,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
 
 	mlx5e_set_netdev_dev_addr(netdev);
 
-#if IS_ENABLED(CONFIG_NET_SWITCHDEV) && IS_ENABLED(CONFIG_MLX5_ESWITCH)
+#if IS_ENABLED(CONFIG_MLX5_ESWITCH)
 	if (MLX5_VPORT_MANAGER(mdev))
 		netdev->switchdev_ops = &mlx5e_switchdev_ops;
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 738554a6c69f..5ece289548ad 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -900,9 +900,7 @@ static void mlx5e_build_rep_netdev(struct net_device *netdev)
 
 	netdev->ethtool_ops	  = &mlx5e_rep_ethtool_ops;
 
-#ifdef CONFIG_NET_SWITCHDEV
 	netdev->switchdev_ops = &mlx5e_rep_switchdev_ops;
-#endif
 
 	netdev->features	 |= NETIF_F_VLAN_CHALLENGED | NETIF_F_HW_TC | NETIF_F_NETNS_LOCAL;
 	netdev->hw_features      |= NETIF_F_HW_TC;
-- 
2.14.3

^ permalink raw reply related

* [pull request][net 0/8] Mellanox, mlx5 fixes 2018-03-23
From: Saeed Mahameed @ 2018-03-23 22:05 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Saeed Mahameed

Hi Dave,

The following series includes fixes for mlx5 netdev and eswitch.

For -stable v4.12
    ('net/mlx5e: Avoid using the ipv6 stub in the TC offload neigh update path')
    ('net/mlx5e: Fix traffic being dropped on VF representor')

For -stable v4.13
    ('net/mlx5e: Fix memory usage issues in offloading TC flows')
    ('net/mlx5e: Verify coalescing parameters in range')

For -stable v4.14
    ('net/mlx5e: Don't override vport admin link state in switchdev mode')

For -stable v4.15
    ('108b2b6d5c02 net/mlx5e: Sync netdev vxlan ports at open')

Please pull and let me know if there's any problem.

Thanks,
Saeed.

---

The following changes since commit 1bfa26ff8c4b7512f4e4efa6df211239223033d4:

  ipv6: fix possible deadlock in rt6_age_examine_exception() (2018-03-23 13:40:34 -0400)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2018-03-23

for you to fetch changes up to dcbfde231207f7b6e1f5628d59b182198e1cfae2:

  net/mlx5e: Sync netdev vxlan ports at open (2018-03-23 14:23:50 -0700)

----------------------------------------------------------------
mlx5-fixes-2018-03-23

----------------------------------------------------------------
Jianbo Liu (2):
      net/mlx5e: Don't override vport admin link state in switchdev mode
      net/mlx5e: Fix memory usage issues in offloading TC flows

Moshe Shemesh (1):
      net/mlx5e: Verify coalescing parameters in range

Or Gerlitz (3):
      net/mlx5e: Use 32 bits to store VF representor SQ number
      net/mlx5: Make eswitch support to depend on switchdev
      net/mlx5e: Avoid using the ipv6 stub in the TC offload neigh update path

Roi Dayan (1):
      net/mlx5e: Fix traffic being dropped on VF representor

Shahar Klein (1):
      net/mlx5e: Sync netdev vxlan ports at open

 drivers/net/ethernet/mellanox/mlx5/core/Kconfig    |  2 +-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   | 17 +++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 13 ++++-----
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   | 34 ++++++++++++----------
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    | 18 ++++++------
 5 files changed, 51 insertions(+), 33 deletions(-)

^ permalink raw reply

* [net 1/8] net/mlx5e: Don't override vport admin link state in switchdev mode
From: Saeed Mahameed @ 2018-03-23 22:05 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Jianbo Liu, Saeed Mahameed
In-Reply-To: <20180323220534.19353-1-saeedm@mellanox.com>

From: Jianbo Liu <jianbol@mellanox.com>

The vport admin original link state will be re-applied after returning
back to legacy mode, it is not right to change the admin link state value
when in switchdev mode.

Use direct vport commands to alter logical vport state in netdev
representor open/close flows rather than the administrative eswitch API.

Fixes: 20a1ea674783 ('net/mlx5e: Support VF vport link state control for SRIOV switchdev mode')
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 363d8dcb7f17..0273c233bc85 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -668,7 +668,6 @@ static int mlx5e_rep_open(struct net_device *dev)
 	struct mlx5e_priv *priv = netdev_priv(dev);
 	struct mlx5e_rep_priv *rpriv = priv->ppriv;
 	struct mlx5_eswitch_rep *rep = rpriv->rep;
-	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
 	int err;
 
 	mutex_lock(&priv->state_lock);
@@ -676,8 +675,9 @@ static int mlx5e_rep_open(struct net_device *dev)
 	if (err)
 		goto unlock;
 
-	if (!mlx5_eswitch_set_vport_state(esw, rep->vport,
-					  MLX5_ESW_VPORT_ADMIN_STATE_UP))
+	if (!mlx5_modify_vport_admin_state(priv->mdev,
+			MLX5_QUERY_VPORT_STATE_IN_OP_MOD_ESW_VPORT,
+			rep->vport, MLX5_ESW_VPORT_ADMIN_STATE_UP))
 		netif_carrier_on(dev);
 
 unlock:
@@ -690,11 +690,12 @@ static int mlx5e_rep_close(struct net_device *dev)
 	struct mlx5e_priv *priv = netdev_priv(dev);
 	struct mlx5e_rep_priv *rpriv = priv->ppriv;
 	struct mlx5_eswitch_rep *rep = rpriv->rep;
-	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
 	int ret;
 
 	mutex_lock(&priv->state_lock);
-	(void)mlx5_eswitch_set_vport_state(esw, rep->vport, MLX5_ESW_VPORT_ADMIN_STATE_DOWN);
+	mlx5_modify_vport_admin_state(priv->mdev,
+			MLX5_QUERY_VPORT_STATE_IN_OP_MOD_ESW_VPORT,
+			rep->vport, MLX5_ESW_VPORT_ADMIN_STATE_DOWN);
 	ret = mlx5e_close_locked(dev);
 	mutex_unlock(&priv->state_lock);
 	return ret;
-- 
2.14.3

^ permalink raw reply related

* [net 4/8] net/mlx5e: Verify coalescing parameters in range
From: Saeed Mahameed @ 2018-03-23 22:05 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Moshe Shemesh, Saeed Mahameed
In-Reply-To: <20180323220534.19353-1-saeedm@mellanox.com>

From: Moshe Shemesh <moshe@mellanox.com>

Add check of coalescing parameters received through ethtool are within
range of values supported by the HW.
Driver gets the coalescing rx/tx-usecs and rx/tx-frames as set by the
users through ethtool. The ethtool support up to 32 bit value for each.
However, mlx5 modify cq limits the coalescing time parameter to 12 bit
and coalescing frames parameters to 16 bits.
Return out of range error if user tries to set these parameters to
higher values.

Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index cc8048f68f11..59ebfdae6695 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -477,6 +477,9 @@ static int mlx5e_get_coalesce(struct net_device *netdev,
 	return mlx5e_ethtool_get_coalesce(priv, coal);
 }
 
+#define MLX5E_MAX_COAL_TIME		MLX5_MAX_CQ_PERIOD
+#define MLX5E_MAX_COAL_FRAMES		MLX5_MAX_CQ_COUNT
+
 static void
 mlx5e_set_priv_channels_coalesce(struct mlx5e_priv *priv, struct ethtool_coalesce *coal)
 {
@@ -511,6 +514,20 @@ int mlx5e_ethtool_set_coalesce(struct mlx5e_priv *priv,
 	if (!MLX5_CAP_GEN(mdev, cq_moderation))
 		return -EOPNOTSUPP;
 
+	if (coal->tx_coalesce_usecs > MLX5E_MAX_COAL_TIME ||
+	    coal->rx_coalesce_usecs > MLX5E_MAX_COAL_TIME) {
+		netdev_info(priv->netdev, "%s: maximum coalesce time supported is %lu usecs\n",
+			    __func__, MLX5E_MAX_COAL_TIME);
+		return -ERANGE;
+	}
+
+	if (coal->tx_max_coalesced_frames > MLX5E_MAX_COAL_FRAMES ||
+	    coal->rx_max_coalesced_frames > MLX5E_MAX_COAL_FRAMES) {
+		netdev_info(priv->netdev, "%s: maximum coalesced frames supported is %lu\n",
+			    __func__, MLX5E_MAX_COAL_FRAMES);
+		return -ERANGE;
+	}
+
 	mutex_lock(&priv->state_lock);
 	new_channels.params = priv->channels.params;
 
-- 
2.14.3

^ permalink raw reply related

* [PATCH v6 6/6] bnxt_en: Eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-23 22:23 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Michael Chan,
	linux-kernel
In-Reply-To: <1521843791-21201-1-git-send-email-okaya@codeaurora.org>

Code includes wmb() followed by writel(). writel() already has a barrier on
some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a wmb().

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Also add mmiowb() so that write code doesn't move outside of scope.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 9 +++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 1500243..fc8ea0d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1922,7 +1922,8 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
 		/* Sync BD data before updating doorbell */
 		wmb();
 
-		bnxt_db_write(bp, db, DB_KEY_TX | prod);
+		bnxt_db_write_relaxed(bp, db, DB_KEY_TX | prod);
+		mmiowb();
 	}
 
 	cpr->cp_raw_cons = raw_cons;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 1989c47..5e453b9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1401,6 +1401,15 @@ static inline u32 bnxt_tx_avail(struct bnxt *bp, struct bnxt_tx_ring_info *txr)
 		((txr->tx_prod - txr->tx_cons) & bp->tx_ring_mask);
 }
 
+/* For TX and RX ring doorbells with no ordering guarantee*/
+static inline void bnxt_db_write_relaxed(struct bnxt *bp, void __iomem *db,
+					 u32 val)
+{
+	writel_relaxed(val, db);
+	if (bp->flags & BNXT_FLAG_DOUBLE_DB)
+		writel_relaxed(val, db);
+}
+
 /* For TX and RX ring doorbells */
 static inline void bnxt_db_write(struct bnxt *bp, void __iomem *db, u32 val)
 {
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 5/6] net: qlge: Eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-23 22:23 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Harish Patil,
	Manish Chopra, Dept-GELinuxNICDev, linux-kernel
In-Reply-To: <1521843791-21201-1-git-send-email-okaya@codeaurora.org>

Code includes wmb() followed by writel(). writel() already has a barrier on
some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a wmb().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/qlogic/qlge/qlge.h      | 16 ++++++++++++++++
 drivers/net/ethernet/qlogic/qlge/qlge_main.c |  3 ++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qlge/qlge.h b/drivers/net/ethernet/qlogic/qlge/qlge.h
index 84ac50f..3e71b65 100644
--- a/drivers/net/ethernet/qlogic/qlge/qlge.h
+++ b/drivers/net/ethernet/qlogic/qlge/qlge.h
@@ -2185,6 +2185,22 @@ static inline void ql_write_db_reg(u32 val, void __iomem *addr)
 }
 
 /*
+ * Doorbell Registers:
+ * Doorbell registers are virtual registers in the PCI memory space.
+ * The space is allocated by the chip during PCI initialization.  The
+ * device driver finds the doorbell address in BAR 3 in PCI config space.
+ * The registers are used to control outbound and inbound queues. For
+ * example, the producer index for an outbound queue.  Each queue uses
+ * 1 4k chunk of memory.  The lower half of the space is for outbound
+ * queues. The upper half is for inbound queues.
+ * Caller has to guarantee ordering.
+ */
+static inline void ql_write_db_reg_relaxed(u32 val, void __iomem *addr)
+{
+	writel_relaxed(val, addr);
+}
+
+/*
  * Shadow Registers:
  * Outbound queues have a consumer index that is maintained by the chip.
  * Inbound queues have a producer index that is maintained by the chip.
diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_main.c b/drivers/net/ethernet/qlogic/qlge/qlge_main.c
index 50038d9..8293c202 100644
--- a/drivers/net/ethernet/qlogic/qlge/qlge_main.c
+++ b/drivers/net/ethernet/qlogic/qlge/qlge_main.c
@@ -2700,7 +2700,8 @@ static netdev_tx_t qlge_send(struct sk_buff *skb, struct net_device *ndev)
 		tx_ring->prod_idx = 0;
 	wmb();
 
-	ql_write_db_reg(tx_ring->prod_idx, tx_ring->prod_idx_db_reg);
+	ql_write_db_reg_relaxed(tx_ring->prod_idx, tx_ring->prod_idx_db_reg);
+	mmiowb();
 	netif_printk(qdev, tx_queued, KERN_DEBUG, qdev->ndev,
 		     "tx queued, slot %d, len %d\n",
 		     tx_ring->prod_idx, skb->len);
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 4/6] bnx2x: Eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-23 22:23 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Ariel Elior,
	everest-linux-l2, linux-kernel
In-Reply-To: <1521843791-21201-1-git-send-email-okaya@codeaurora.org>

Code includes wmb() followed by writel(). writel() already has a
barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing
the register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h         | 12 ++++++++----
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c     |  2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h     |  4 ++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c |  2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c    |  4 ++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c    |  4 +++-
 6 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 352beff..d847e1b 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -166,6 +166,12 @@ do {						\
 #define REG_RD8(bp, offset)		readb(REG_ADDR(bp, offset))
 #define REG_RD16(bp, offset)		readw(REG_ADDR(bp, offset))
 
+#define REG_WR_RELAXED(bp, offset, val)	\
+	writel_relaxed((u32)val, REG_ADDR(bp, offset))
+
+#define REG_WR16_RELAXED(bp, offset, val) \
+	writew_relaxed((u16)val, REG_ADDR(bp, offset))
+
 #define REG_WR(bp, offset, val)		writel((u32)val, REG_ADDR(bp, offset))
 #define REG_WR8(bp, offset, val)	writeb((u8)val, REG_ADDR(bp, offset))
 #define REG_WR16(bp, offset, val)	writew((u16)val, REG_ADDR(bp, offset))
@@ -758,10 +764,8 @@ struct bnx2x_fastpath {
 #if (BNX2X_DB_SHIFT < BNX2X_DB_MIN_SHIFT)
 #error "Min DB doorbell stride is 8"
 #endif
-#define DOORBELL(bp, cid, val) \
-	do { \
-		writel((u32)(val), bp->doorbells + (bp->db_size * (cid))); \
-	} while (0)
+#define DOORBELL_RELAXED(bp, cid, val) \
+	writel_relaxed((u32)(val), (bp)->doorbells + ((bp)->db_size * (cid)))
 
 /* TX CSUM helpers */
 #define SKB_CS_OFF(skb)		(offsetof(struct tcphdr, check) - \
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index b97820f..91d2de6 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -4156,7 +4156,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* make sur edescriptor update is observed by HW */
 	wmb();
 
-	DOORBELL(bp, txdata->cid, txdata->tx_db.raw);
+	DOORBELL_RELAXED(bp, txdata->cid, txdata->tx_db.raw);
 
 	mmiowb();
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
index a5265e1..a8ce5c5 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
@@ -522,8 +522,8 @@ static inline void bnx2x_update_rx_prod(struct bnx2x *bp,
 	wmb();
 
 	for (i = 0; i < sizeof(rx_prods)/4; i++)
-		REG_WR(bp, fp->ustorm_rx_prods_offset + i*4,
-		       ((u32 *)&rx_prods)[i]);
+		REG_WR_RELAXED(bp, fp->ustorm_rx_prods_offset + i * 4,
+			       ((u32 *)&rx_prods)[i]);
 
 	mmiowb(); /* keep prod updates ordered */
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index 39af4f8..da18aa2 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -2593,7 +2593,7 @@ static int bnx2x_run_loopback(struct bnx2x *bp, int loopback_mode)
 	txdata->tx_db.data.prod += 2;
 	/* make sure descriptor update is observed by the HW */
 	wmb();
-	DOORBELL(bp, txdata->cid, txdata->tx_db.raw);
+	DOORBELL_RELAXED(bp, txdata->cid, txdata->tx_db.raw);
 
 	mmiowb();
 	barrier();
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 74fc9af..146c40d 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -3817,8 +3817,8 @@ static void bnx2x_sp_prod_update(struct bnx2x *bp)
 	 */
 	mb();
 
-	REG_WR16(bp, BAR_XSTRORM_INTMEM + XSTORM_SPQ_PROD_OFFSET(func),
-		 bp->spq_prod_idx);
+	REG_WR16_RELAXED(bp, BAR_XSTRORM_INTMEM + XSTORM_SPQ_PROD_OFFSET(func),
+			 bp->spq_prod_idx);
 	mmiowb();
 }
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
index 76a4668..8e0a317 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
@@ -170,7 +170,9 @@ static int bnx2x_send_msg2pf(struct bnx2x *bp, u8 *done, dma_addr_t msg_mapping)
 	wmb();
 
 	/* Trigger the PF FW */
-	writeb(1, &zone_data->trigger.vf_pf_channel.addr_valid);
+	writeb_relaxed(1, &zone_data->trigger.vf_pf_channel.addr_valid);
+
+	mmiowb();
 
 	/* Wait for PF to complete */
 	while ((tout >= 0) && (!*done)) {
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 3/6] bnx2x: Replace doorbell barrier() with wmb()
From: Sinan Kaya @ 2018-03-23 22:23 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-kernel, Sinan Kaya, Ariel Elior,
	everest-linux-l2, linux-arm-kernel
In-Reply-To: <1521843791-21201-1-git-send-email-okaya@codeaurora.org>

barrier() doesn't guarantee memory writes to be observed by the hardware on
all architectures. barrier() only tells compiler not to move this code
with respect to other read/writes.

If memory write needs to be observed by the HW, wmb() is the right choice.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c     | 3 ++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index d7c98e8..b97820f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -4153,7 +4153,8 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	wmb();
 
 	txdata->tx_db.data.prod += nbd;
-	barrier();
+	/* make sur edescriptor update is observed by HW */
+	wmb();
 
 	DOORBELL(bp, txdata->cid, txdata->tx_db.raw);
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index 1e33abd..39af4f8 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -2591,7 +2591,8 @@ static int bnx2x_run_loopback(struct bnx2x *bp, int loopback_mode)
 	wmb();
 
 	txdata->tx_db.data.prod += 2;
-	barrier();
+	/* make sure descriptor update is observed by the HW */
+	wmb();
 	DOORBELL(bp, txdata->cid, txdata->tx_db.raw);
 
 	mmiowb();
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 2/6] qlcnic: Eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-23 22:23 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: Dept-GELinuxNICDev, linux-arm-msm, linux-kernel, Sinan Kaya,
	Harish Patil, linux-arm-kernel, Manish Chopra
In-Reply-To: <1521843791-21201-1-git-send-email-okaya@codeaurora.org>

Code includes wmb() followed by writel(). writel() already has a
barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing
the register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Acked-by: Manish Chopra <manish.chopra@cavium.com>
---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
index 46b0372..97c146e7 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
@@ -478,7 +478,7 @@ irqreturn_t qlcnic_83xx_clear_legacy_intr(struct qlcnic_adapter *adapter)
 	wmb();
 
 	/* clear the interrupt trigger control register */
-	writel(0, adapter->isr_int_vec);
+	writel_relaxed(0, adapter->isr_int_vec);
 	intr_val = readl(adapter->isr_int_vec);
 	do {
 		intr_val = readl(adapter->tgt_status_reg);
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 1/6] net: qla3xxx: Eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-23 22:23 UTC (permalink / raw)
  To: netdev, timur, sulrich
  Cc: linux-arm-msm, linux-arm-kernel, Sinan Kaya, Dept-GELinuxNICDev,
	linux-kernel
In-Reply-To: <1521843791-21201-1-git-send-email-okaya@codeaurora.org>

Code includes wmb() followed by writel(). writel() already has a
barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing
the register write.

Since code already has an explicit barrier call, changing code to

wmb()
writel_relaxed()
mmiowb()

for multi-arch support.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/net/ethernet/qlogic/qla3xxx.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qla3xxx.c b/drivers/net/ethernet/qlogic/qla3xxx.c
index 9e5264d..b48f761 100644
--- a/drivers/net/ethernet/qlogic/qla3xxx.c
+++ b/drivers/net/ethernet/qlogic/qla3xxx.c
@@ -1858,8 +1858,9 @@ static void ql_update_small_bufq_prod_index(struct ql3_adapter *qdev)
 			qdev->small_buf_release_cnt -= 8;
 		}
 		wmb();
-		writel(qdev->small_buf_q_producer_index,
-			&port_regs->CommonRegs.rxSmallQProducerIndex);
+		writel_relaxed(qdev->small_buf_q_producer_index,
+			       &port_regs->CommonRegs.rxSmallQProducerIndex);
+		mmiowb();
 	}
 }
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH v6 0/6] netdev: Eliminate duplicate barriers on weakly-ordered archs
From: Sinan Kaya @ 2018-03-23 22:23 UTC (permalink / raw)
  To: netdev, timur, sulrich; +Cc: Sinan Kaya, linux-arm-msm, linux-arm-kernel

Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

I did a regex search for wmb() followed by writel() in each drivers
directory.
I scrubbed the ones I care about in this series.

I considered "ease of change", "popular usage" and "performance critical
path" as the determining criteria for my filtering.

We used relaxed API heavily on ARM for a long time but
it did not exist on other architectures. For this reason, relaxed
architectures have been paying double penalty in order to use the common
drivers.

Now that relaxed API is present on all architectures, we can go and scrub
all drivers to see what needs to change and what can remain.

We start with mostly used ones and hope to increase the coverage over time.
It will take a while to cover all drivers.

Feel free to apply patches individually.

Changes since v5:
- add mmiowb() for PPC architecture
- collect reviewed/acked bys
- bn2x: change doorbell barrier to wmb for observability guarantee across
all architectures.

Sinan Kaya (6):
  net: qla3xxx: Eliminate duplicate barriers on weakly-ordered archs
  qlcnic: Eliminate duplicate barriers on weakly-ordered archs
  bnx2x: Replace doorbell barrier() with wmb()
  bnx2x: Eliminate duplicate barriers on weakly-ordered archs
  net: qlge: Eliminate duplicate barriers on weakly-ordered archs
  bnxt_en: Eliminate duplicate barriers on weakly-ordered archs

 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h         | 12 ++++++++----
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c     |  5 +++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h     |  4 ++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c |  5 +++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c    |  4 ++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c    |  4 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c           |  3 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h           |  9 +++++++++
 drivers/net/ethernet/qlogic/qla3xxx.c               |  5 +++--
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c |  2 +-
 drivers/net/ethernet/qlogic/qlge/qlge.h             | 16 ++++++++++++++++
 drivers/net/ethernet/qlogic/qlge/qlge_main.c        |  3 ++-
 12 files changed, 54 insertions(+), 18 deletions(-)

-- 
2.7.4

^ permalink raw reply

* Re: [PATCH net-next 6/8] MIPS: mscc: Add switch to ocelot
From: Florian Fainelli @ 2018-03-23 22:11 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Alexandre Belloni, David S . Miller, Allan Nielsen,
	razvan.stefanescu, po.liu, Thomas Petazzoni, netdev, devicetree,
	linux-kernel, linux-mips, James Hogan
In-Reply-To: <20180323220657.GY24361@lunn.ch>

On 03/23/2018 03:06 PM, Andrew Lunn wrote:
>>> That is the trade off of having a standalone MDIO bus driver.  Maybe
>>> add a phandle to the internal MDIO bus? The switch driver could then
>>> follow the phandle, and direct connect the internal PHYs?
>>
>> This is more or less what patch 7 does, right?
> 
> Patch 7 does it in DT. I'm suggesting it could be done in C. It is
> hard wired, so there is no need to describe it in DT. Use the phandle
> to get the mdio bus, mdiobus_get_phy(, port) to get the phydev and
> then use phy_connect().

That does not sound like a great idea. And to go back to your example
about DSA, it is partially true, you will see some switch bindings
defining the internal PHYs (e.g: qca8k), and most not doing it (b53,
mv88e6xxx, etc.). In either case, this resolves to the same thing
though. Being able to parse a phy-handle property is a lot more
flexible, and if it does matter that the PHY truly is internal, then the
'phy-mode' property can help reflect that.
-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next 6/8] MIPS: mscc: Add switch to ocelot
From: Andrew Lunn @ 2018-03-23 22:06 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Alexandre Belloni, David S . Miller, Allan Nielsen,
	razvan.stefanescu, po.liu, Thomas Petazzoni, netdev, devicetree,
	linux-kernel, linux-mips, James Hogan
In-Reply-To: <dcac43b7-2eb7-d409-a77c-4f671a8cfc3d@gmail.com>

> > That is the trade off of having a standalone MDIO bus driver.  Maybe
> > add a phandle to the internal MDIO bus? The switch driver could then
> > follow the phandle, and direct connect the internal PHYs?
> 
> This is more or less what patch 7 does, right?

Patch 7 does it in DT. I'm suggesting it could be done in C. It is
hard wired, so there is no need to describe it in DT. Use the phandle
to get the mdio bus, mdiobus_get_phy(, port) to get the phydev and
then use phy_connect().

     Andrew

^ permalink raw reply

* Re: [PATCH v7 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
From: Alexander Duyck @ 2018-03-23 21:53 UTC (permalink / raw)
  To: Sinan Kaya, intel-wired-lan
  Cc: sulrich, Netdev, Timur Tabi, Jeff Kirsher, linux-arm-msm,
	linux-arm-kernel
In-Reply-To: <1521831180-25014-1-git-send-email-okaya@codeaurora.org>

On Fri, Mar 23, 2018 at 11:52 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> Code includes wmb() followed by writel() in multiple places. writel()
> already has a barrier on some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> I did a regex search for wmb() followed by writel() in each drivers
> directory.
> I scrubbed the ones I care about in this series.
>
> I considered "ease of change", "popular usage" and "performance critical
> path" as the determining criteria for my filtering.
>
> We used relaxed API heavily on ARM for a long time but
> it did not exist on other architectures. For this reason, relaxed
> architectures have been paying double penalty in order to use the common
> drivers.
>
> Now that relaxed API is present on all architectures, we can go and scrub
> all drivers to see what needs to change and what can remain.
>
> We start with mostly used ones and hope to increase the coverage over time.
> It will take a while to cover all drivers.
>
> Feel free to apply patches individually.

I looked over the set and they seem good.

Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>

>
> Changes since v6:
> clean up between 2..6 and then make your Alex's changes on 1 and 7
>     The mmiowb shouldn't be needed for Rx. Only one CPU will be running
>     NAPI for the queue and we will synchronize this with a full writel
>     anyway when we re-enable the interrupts.
>
> Sinan Kaya (7):
>   i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs
>   ixgbe: eliminate duplicate barriers on weakly-ordered archs
>   igbvf: eliminate duplicate barriers on weakly-ordered archs
>   igb: eliminate duplicate barriers on weakly-ordered archs
>   fm10k: Eliminate duplicate barriers on weakly-ordered archs
>   ixgbevf: keep writel() closer to wmb()
>   ixgbevf: eliminate duplicate barriers on weakly-ordered archs
>
>  drivers/net/ethernet/intel/fm10k/fm10k_main.c     |  4 ++--
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c       | 14 ++++++++++----
>  drivers/net/ethernet/intel/i40evf/i40e_txrx.c     |  4 ++--
>  drivers/net/ethernet/intel/igb/igb_main.c         |  4 ++--
>  drivers/net/ethernet/intel/igbvf/netdev.c         |  4 ++--
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |  8 ++++----
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |  5 -----
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 11 ++++++++---
>  8 files changed, 30 insertions(+), 24 deletions(-)
>
> --
> 2.7.4
>

^ permalink raw reply

* Re: [PATCH net-next 3/8] net: mscc: Add MDIO driver
From: Florian Fainelli @ 2018-03-23 21:51 UTC (permalink / raw)
  To: Alexandre Belloni, David S . Miller
  Cc: Allan Nielsen, razvan.stefanescu, po.liu, Thomas Petazzoni,
	Andrew Lunn, netdev, devicetree, linux-kernel, linux-mips
In-Reply-To: <20180323201117.8416-4-alexandre.belloni@bootlin.com>

On 03/23/2018 01:11 PM, Alexandre Belloni wrote:
> Add a driver for the Microsemi MII Management controller (MIIM) found on
> Microsemi SoCs.
> On Ocelot, there are two controllers, one is connected to the internal
> PHYs, the other one can communicate with external PHYs.
> 
> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
> ---
>  drivers/net/ethernet/Kconfig          |   1 +
>  drivers/net/ethernet/Makefile         |   1 +
>  drivers/net/ethernet/mscc/Kconfig     |  22 ++++
>  drivers/net/ethernet/mscc/Makefile    |   2 +
>  drivers/net/ethernet/mscc/mscc_miim.c | 210 ++++++++++++++++++++++++++++++++++
>  5 files changed, 236 insertions(+)
>  create mode 100644 drivers/net/ethernet/mscc/Kconfig
>  create mode 100644 drivers/net/ethernet/mscc/Makefile
>  create mode 100644 drivers/net/ethernet/mscc/mscc_miim.c
> 
> diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
> index b6cf4b6962f5..adf643484198 100644
> --- a/drivers/net/ethernet/Kconfig
> +++ b/drivers/net/ethernet/Kconfig
> @@ -115,6 +115,7 @@ source "drivers/net/ethernet/mediatek/Kconfig"
>  source "drivers/net/ethernet/mellanox/Kconfig"
>  source "drivers/net/ethernet/micrel/Kconfig"
>  source "drivers/net/ethernet/microchip/Kconfig"
> +source "drivers/net/ethernet/mscc/Kconfig"
>  source "drivers/net/ethernet/moxa/Kconfig"
>  source "drivers/net/ethernet/myricom/Kconfig"
>  
> diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
> index 3cdf01e96e0b..ed7df22de7ff 100644
> --- a/drivers/net/ethernet/Makefile
> +++ b/drivers/net/ethernet/Makefile
> @@ -56,6 +56,7 @@ obj-$(CONFIG_NET_VENDOR_MEDIATEK) += mediatek/
>  obj-$(CONFIG_NET_VENDOR_MELLANOX) += mellanox/
>  obj-$(CONFIG_NET_VENDOR_MICREL) += micrel/
>  obj-$(CONFIG_NET_VENDOR_MICROCHIP) += microchip/
> +obj-$(CONFIG_NET_VENDOR_MICROSEMI) += mscc/
>  obj-$(CONFIG_NET_VENDOR_MOXART) += moxa/
>  obj-$(CONFIG_NET_VENDOR_MYRI) += myricom/
>  obj-$(CONFIG_FEALNX) += fealnx.o
> diff --git a/drivers/net/ethernet/mscc/Kconfig b/drivers/net/ethernet/mscc/Kconfig
> new file mode 100644
> index 000000000000..2330de6e7bb6
> --- /dev/null
> +++ b/drivers/net/ethernet/mscc/Kconfig
> @@ -0,0 +1,22 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR MIT)
> +config NET_VENDOR_MICROSEMI
> +	bool "Microsemi devices"
> +	default y
> +	help
> +	  If you have a network (Ethernet) card belonging to this class, say Y.
> +
> +	  Note that the answer to this question doesn't directly affect the
> +	  kernel: saying N will just cause the configurator to skip all
> +	  the questions about Microsemi devices.
> +
> +if NET_VENDOR_MICROSEMI
> +
> +config MSCC_MIIM
> +	tristate "Microsemi MIIM interface support"
> +	depends on HAS_IOMEM
> +	select PHYLIB
> +	help
> +	  This driver supports the MIIM (MDIO) interface found in the network
> +	  switches of the Microsemi SoCs
> +
> +endif # NET_VENDOR_MICROSEMI
> diff --git a/drivers/net/ethernet/mscc/Makefile b/drivers/net/ethernet/mscc/Makefile
> new file mode 100644
> index 000000000000..4570e8fa4711
> --- /dev/null
> +++ b/drivers/net/ethernet/mscc/Makefile
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR MIT)
> +obj-$(CONFIG_MSCC_MIIM) += mscc_miim.o
> diff --git a/drivers/net/ethernet/mscc/mscc_miim.c b/drivers/net/ethernet/mscc/mscc_miim.c
> new file mode 100644
> index 000000000000..95b8d102c90f
> --- /dev/null
> +++ b/drivers/net/ethernet/mscc/mscc_miim.c
> @@ -0,0 +1,210 @@
> +// SPDX-License-Identifier: (GPL-2.0 OR MIT)
> +/*
> + * Driver for the MDIO interface of Microsemi network switches.
> + *
> + * Author: Alexandre Belloni <alexandre.belloni@bootlin.com>
> + * Copyright (c) 2017 Microsemi Corporation
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/phy.h>
> +#include <linux/platform_device.h>
> +#include <linux/bitops.h>
> +#include <linux/io.h>
> +#include <linux/iopoll.h>
> +#include <linux/of_mdio.h>
> +
> +#define MSCC_MIIM_REG_STATUS		0x0
> +#define		MSCC_MIIM_STATUS_STAT_BUSY	BIT(3)
> +#define MSCC_MIIM_REG_CMD		0x8
> +#define		MSCC_MIIM_CMD_OPR_WRITE		BIT(1)
> +#define		MSCC_MIIM_CMD_OPR_READ		BIT(2)
> +#define		MSCC_MIIM_CMD_WRDATA_SHIFT	4
> +#define		MSCC_MIIM_CMD_REGAD_SHIFT	20
> +#define		MSCC_MIIM_CMD_PHYAD_SHIFT	25
> +#define		MSCC_MIIM_CMD_VLD		BIT(31)
> +#define MSCC_MIIM_REG_DATA		0xC
> +#define		MSCC_MIIM_DATA_ERROR		(BIT(16) | BIT(17))
> +
> +#define MSCC_PHY_REG_PHY_CFG	0x0
> +#define		PHY_CFG_PHY_ENA		(BIT(0) | BIT(1) | BIT(2) | BIT(3))
> +#define		PHY_CFG_PHY_COMMON_RESET BIT(4)
> +#define		PHY_CFG_PHY_RESET	(BIT(5) | BIT(6) | BIT(7) | BIT(8))
> +#define MSCC_PHY_REG_PHY_STATUS	0x4
> +
> +struct mscc_miim_dev {
> +	struct mutex lock;
> +	void __iomem *regs;
> +	void __iomem *phy_regs;
> +};
> +
> +static int mscc_miim_wait_ready(struct mii_bus *bus)
> +{
> +	struct mscc_miim_dev *miim = bus->priv;
> +	u32 val;
> +
> +	readl_poll_timeout(miim->regs + MSCC_MIIM_REG_STATUS, val,
> +			   !(val & MSCC_MIIM_STATUS_STAT_BUSY), 100, 250000);
> +	if (val & MSCC_MIIM_STATUS_STAT_BUSY)
> +		return -ETIMEDOUT;
> +
> +	return 0;
> +}
> +
> +static int mscc_miim_read(struct mii_bus *bus, int mii_id, int regnum)
> +{
> +	struct mscc_miim_dev *miim = bus->priv;
> +	u32 val;
> +	int ret;
> +
> +	mutex_lock(&miim->lock);

What is this lock for considering that bus->lock should always be
acquired when doing these operations? As Andrew pointed out, needs to be
initialized with mutex_init(), but likely you would drop it.

> +
> +	ret = mscc_miim_wait_ready(bus);
> +	if (ret)
> +		goto out;
> +
> +	writel(MSCC_MIIM_CMD_VLD | (mii_id << MSCC_MIIM_CMD_PHYAD_SHIFT) |
> +	       (regnum << MSCC_MIIM_CMD_REGAD_SHIFT) | MSCC_MIIM_CMD_OPR_READ,
> +	       miim->regs + MSCC_MIIM_REG_CMD);
> +
> +	ret = mscc_miim_wait_ready(bus);
> +	if (ret)
> +		goto out;

Your example had an interrupt specified, can't you use that instead of
polling?

> +
> +	val = readl(miim->regs + MSCC_MIIM_REG_DATA);
> +	if (val & MSCC_MIIM_DATA_ERROR) {
> +		ret = -EIO;
> +		goto out;
> +	}
> +
> +	ret = val & 0xFFFF;
> +out:
> +	mutex_unlock(&miim->lock);
> +	return ret;
> +}
> +
> +static int mscc_miim_write(struct mii_bus *bus, int mii_id,
> +			   int regnum, u16 value)
> +{
> +	struct mscc_miim_dev *miim = bus->priv;
> +	int ret;
> +
> +	mutex_lock(&miim->lock);
> +
> +	ret = mscc_miim_wait_ready(bus);
> +	if (ret < 0)
> +		goto out;
> +
> +	writel(MSCC_MIIM_CMD_VLD | (mii_id << MSCC_MIIM_CMD_PHYAD_SHIFT) |
> +	       (regnum << MSCC_MIIM_CMD_REGAD_SHIFT) |
> +	       (value << MSCC_MIIM_CMD_WRDATA_SHIFT) |
> +	       MSCC_MIIM_CMD_OPR_WRITE,
> +	       miim->regs + MSCC_MIIM_REG_CMD);
> +
> +out:
> +	mutex_unlock(&miim->lock);
> +	return ret;
> +}
> +
> +static int mscc_miim_reset(struct mii_bus *bus)
> +{
> +	struct mscc_miim_dev *miim = bus->priv;
> +	int i;

unsigned int i

> +
> +	if (miim->phy_regs) {
> +		writel(0, miim->phy_regs + MSCC_PHY_REG_PHY_CFG);
> +		writel(0x1ff, miim->phy_regs + MSCC_PHY_REG_PHY_CFG);
> +		mdelay(500);
> +	}
> +
> +	for (i = 0; i < PHY_MAX_ADDR; i++) {
> +		if (mscc_miim_read(bus, i, MII_PHYSID1) < 0)
> +			bus->phy_mask |= BIT(i);
> +	}

What is this used for? You have an OF MDIO bus which would create a
phy_device for each node specified, is this a similar workaround to what
drivers/net/phy/mdio-bcm-unimac.c has to do? If so, please document it
as such.

Other than that, this looks quite good!
-- 
Florian

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox