* [PATCH net 01/10] net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
@ 2016-10-27 13:27 ` Tariq Toukan
2016-10-27 13:27 ` [PATCH net 02/10] net/mlx4_core: Avoid setting ports to auto when only one port type is supported Tariq Toukan
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2016-10-27 13:27 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Eran Ben Elisha, Jack Morgenstein, Moshe Shemesh,
Tariq Toukan
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
The resource type enum in the resource tracker was incorrect.
RES_EQ was put in the position of RES_NPORT_ID (a FC resource).
Since the remaining resources maintain their current values,
and RES_EQ is not passed from slaves to the hypervisor in any
FW command, this change affects only the hypervisor.
Therefore, there is no backwards-compatibility issue.
Fixes: 623ed84b1f95 ("mlx4_core: initial header-file changes for SRIOV support")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index e4878f31e45d..7aad07644289 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -145,9 +145,10 @@ enum mlx4_resource {
RES_MTT,
RES_MAC,
RES_VLAN,
- RES_EQ,
+ RES_NPORT_ID,
RES_COUNTER,
RES_FS_RULE,
+ RES_EQ,
MLX4_NUM_OF_RESOURCE_TYPE
};
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH net 02/10] net/mlx4_core: Avoid setting ports to auto when only one port type is supported
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
2016-10-27 13:27 ` [PATCH net 01/10] net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec Tariq Toukan
@ 2016-10-27 13:27 ` Tariq Toukan
2016-10-27 13:27 ` [PATCH net 03/10] net/mlx4_core: Change the default value of enable_qos Tariq Toukan
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2016-10-27 13:27 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Eran Ben Elisha, Maor Gottlieb, Moshe Shemesh,
Tariq Toukan
From: Maor Gottlieb <maorg@mellanox.com>
When only one port type is supported, it should be read only.
We reject changing requests, even to the auto sense mode.
Fixes: 27bf91d6a0d5 ("mlx4_core: Add link type autosensing")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/main.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 7183ac4135d2..6f4e67bc3538 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -1102,6 +1102,14 @@ static int __set_port_type(struct mlx4_port_info *info,
int i;
int err = 0;
+ if ((port_type & mdev->caps.supported_type[info->port]) != port_type) {
+ mlx4_err(mdev,
+ "Requested port type for port %d is not supported on this HCA\n",
+ info->port);
+ err = -EINVAL;
+ goto err_sup;
+ }
+
mlx4_stop_sense(mdev);
mutex_lock(&priv->port_mutex);
info->tmp_type = port_type;
@@ -1147,7 +1155,7 @@ static int __set_port_type(struct mlx4_port_info *info,
out:
mlx4_start_sense(mdev);
mutex_unlock(&priv->port_mutex);
-
+err_sup:
return err;
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH net 03/10] net/mlx4_core: Change the default value of enable_qos
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
2016-10-27 13:27 ` [PATCH net 01/10] net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec Tariq Toukan
2016-10-27 13:27 ` [PATCH net 02/10] net/mlx4_core: Avoid setting ports to auto when only one port type is supported Tariq Toukan
@ 2016-10-27 13:27 ` Tariq Toukan
2016-10-27 13:27 ` [PATCH net 04/10] net/mlx4_en: Resolve dividing by zero in 32-bit system Tariq Toukan
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2016-10-27 13:27 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Moshe Lazer, Tariq Toukan
From: Moshe Lazer <moshel@mellanox.com>
Change the default status of quality of service back to disabled,
as it hurts performance in some cases.
Fixes: 38438f7c7e8c ("net/mlx4: Set enhanced QoS support by default when ...")
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/fw.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index c41ab31a39f8..84bab9f0732e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -49,9 +49,9 @@ enum {
extern void __buggy_use_of_MLX4_GET(void);
extern void __buggy_use_of_MLX4_PUT(void);
-static bool enable_qos = true;
+static bool enable_qos;
module_param(enable_qos, bool, 0444);
-MODULE_PARM_DESC(enable_qos, "Enable Enhanced QoS support (default: on)");
+MODULE_PARM_DESC(enable_qos, "Enable Enhanced QoS support (default: off)");
#define MLX4_GET(dest, source, offset) \
do { \
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH net 04/10] net/mlx4_en: Resolve dividing by zero in 32-bit system
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
` (2 preceding siblings ...)
2016-10-27 13:27 ` [PATCH net 03/10] net/mlx4_core: Change the default value of enable_qos Tariq Toukan
@ 2016-10-27 13:27 ` Tariq Toukan
2016-10-27 13:27 ` [PATCH net 05/10] net/mlx4_en: Process all completions in RX rings after port goes up Tariq Toukan
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2016-10-27 13:27 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Eugenia Emantayev, Tariq Toukan
From: Eugenia Emantayev <eugenia@mellanox.com>
When doing roundup_pow_of_two for large enough number with
bit 31, an overflow will occur and a value equal to 1 will
be returned. In this case 1 will be subtracted from the return
value and division by zero will be reached.
Fixes: 31c128b66e5b ("net/mlx4_en: Choose time-stamping shift value according to HW frequency")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_clock.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_clock.c b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
index 08fc5fc56d43..a5fc46bbcbe2 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_clock.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
@@ -245,8 +245,11 @@ static u32 freq_to_shift(u16 freq)
{
u32 freq_khz = freq * 1000;
u64 max_val_cycles = freq_khz * 1000 * MLX4_EN_WRAP_AROUND_SEC;
+ u64 tmp_rounded =
+ roundup_pow_of_two(max_val_cycles) > max_val_cycles ?
+ roundup_pow_of_two(max_val_cycles) - 1 : UINT_MAX;
u64 max_val_cycles_rounded = is_power_of_2(max_val_cycles + 1) ?
- max_val_cycles : roundup_pow_of_two(max_val_cycles) - 1;
+ max_val_cycles : tmp_rounded;
/* calculate max possible multiplier in order to fit in 64bit */
u64 max_mul = div_u64(0xffffffffffffffffULL, max_val_cycles_rounded);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH net 05/10] net/mlx4_en: Process all completions in RX rings after port goes up
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
` (3 preceding siblings ...)
2016-10-27 13:27 ` [PATCH net 04/10] net/mlx4_en: Resolve dividing by zero in 32-bit system Tariq Toukan
@ 2016-10-27 13:27 ` Tariq Toukan
2016-10-27 13:27 ` [PATCH net 06/10] net/mlx4_en: Fix panic during reboot Tariq Toukan
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2016-10-27 13:27 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Eran Ben Elisha, Erez Shitrit, Eugenia Emantayev,
Tariq Toukan
From: Erez Shitrit <erezsh@mellanox.com>
Currently there is a race between incoming traffic and
initialization flow. HW is able to receive the packets
after INIT_PORT is done and unicast steering is configured.
Before we set priv->port_up NAPI is not scheduled and
receive queues become full. Therefore we never get
new interrupts about the completions.
This issue could happen if running heavy traffic during
bringing port up.
The resolution is to schedule NAPI once port_up is set.
If receive queues were full this will process all cqes
and release them.
Fixes: c27a02cd94d6 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 7e703bed7b82..e25c11dff525 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1733,6 +1733,13 @@ int mlx4_en_start_port(struct net_device *dev)
udp_tunnel_get_rx_info(dev);
priv->port_up = true;
+
+ /* Process all completions if exist to prevent
+ * the queues freezing if they are full
+ */
+ for (i = 0; i < priv->rx_ring_num; i++)
+ napi_schedule(&priv->rx_cq[i]->napi);
+
netif_tx_start_all_queues(dev);
netif_device_attach(dev);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH net 06/10] net/mlx4_en: Fix panic during reboot
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
` (4 preceding siblings ...)
2016-10-27 13:27 ` [PATCH net 05/10] net/mlx4_en: Process all completions in RX rings after port goes up Tariq Toukan
@ 2016-10-27 13:27 ` Tariq Toukan
2016-10-27 13:27 ` [PATCH net 07/10] net/mlx4_core: Do not access comm channel if it has not yet been initialized Tariq Toukan
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2016-10-27 13:27 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Eugenia Emantayev, Tariq Toukan
From: Eugenia Emantayev <eugenia@mellanox.com>
Fix a kernel panic that occurs as a result of an asynchronous event
handled in roce_gid_mgmt:
mlx4_en_get_drvinfo is called and accesses freed resources.
This happens in a shutdown flow only, since pci device is destroyed
while netdevice is still alive.
Fixes: c27a02cd94d6 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index e25c11dff525..314f54c8dbed 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2201,6 +2201,7 @@ void mlx4_en_destroy_netdev(struct net_device *dev)
if (!shutdown)
free_netdev(dev);
+ dev->ethtool_ops = NULL;
}
static int mlx4_en_change_mtu(struct net_device *dev, int new_mtu)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH net 07/10] net/mlx4_core: Do not access comm channel if it has not yet been initialized
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
` (5 preceding siblings ...)
2016-10-27 13:27 ` [PATCH net 06/10] net/mlx4_en: Fix panic during reboot Tariq Toukan
@ 2016-10-27 13:27 ` Tariq Toukan
2016-10-27 13:27 ` [PATCH net 08/10] net/mlx4: Fix firmware command timeout during interrupt test Tariq Toukan
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2016-10-27 13:27 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Jack Morgenstein, Tariq Toukan
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
In the Hypervisor, there are several FW commands which are invoked
before the comm channel is initialized (in mlx4_multi_func_init).
These include MOD_STAT_CONFIG, QUERY_DEV_CAP, INIT_HCA, and others.
If any of these commands fails, say with a timeout, the Hypervisor
driver enters the internal error reset flow. In this flow, the driver
attempts to notify all slaves via the comm channel that an internal error
has occurred.
Since the comm channel has not yet been initialized (i.e., mapped via
ioremap), this will cause dereferencing a NULL pointer.
To fix this, do not access the comm channel in the internal error flow
if it has not yet been initialized.
Fixes: 55ad359225b2 ("net/mlx4_core: Enable device recovery flow with SRIOV")
Fixes: ab9c17a009ee ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/cmd.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index b1cef7a0f7ca..e36bebcab3f2 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -2469,6 +2469,7 @@ int mlx4_multi_func_init(struct mlx4_dev *dev)
kfree(priv->mfunc.master.slave_state);
err_comm:
iounmap(priv->mfunc.comm);
+ priv->mfunc.comm = NULL;
err_vhcr:
dma_free_coherent(&dev->persist->pdev->dev, PAGE_SIZE,
priv->mfunc.vhcr,
@@ -2537,6 +2538,13 @@ void mlx4_report_internal_err_comm_event(struct mlx4_dev *dev)
int slave;
u32 slave_read;
+ /* If the comm channel has not yet been initialized,
+ * skip reporting the internal error event to all
+ * the communication channels.
+ */
+ if (!priv->mfunc.comm)
+ return;
+
/* Report an internal error event to all
* communication channels.
*/
@@ -2571,6 +2579,7 @@ void mlx4_multi_func_cleanup(struct mlx4_dev *dev)
}
iounmap(priv->mfunc.comm);
+ priv->mfunc.comm = NULL;
}
void mlx4_cmd_cleanup(struct mlx4_dev *dev, int cleanup_mask)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH net 08/10] net/mlx4: Fix firmware command timeout during interrupt test
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
` (6 preceding siblings ...)
2016-10-27 13:27 ` [PATCH net 07/10] net/mlx4_core: Do not access comm channel if it has not yet been initialized Tariq Toukan
@ 2016-10-27 13:27 ` Tariq Toukan
2016-10-27 13:27 ` [PATCH net 09/10] net/mlx4_en: Fix potential deadlock in port statistics flow Tariq Toukan
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2016-10-27 13:27 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Eugenia Emantayev, Tariq Toukan
From: Eugenia Emantayev <eugenia@mellanox.com>
Currently interrupt test that is part of ethtool selftest runs the
check over all interrupt vectors of the device.
In mlx4_en package part of interrupt vectors are uninitialized since
mlx4_ib doesn't exist. This causes NOP FW command to time out.
Change logic to test current port interrupt vectors only.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_selftest.c | 26 +++++++++-
drivers/net/ethernet/mellanox/mlx4/eq.c | 62 +++++++++++-------------
include/linux/mlx4/device.h | 3 +-
3 files changed, 55 insertions(+), 36 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_selftest.c b/drivers/net/ethernet/mellanox/mlx4/en_selftest.c
index b66e03d9711f..c06346a82496 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_selftest.c
@@ -118,6 +118,29 @@ static int mlx4_en_test_loopback(struct mlx4_en_priv *priv)
return !loopback_ok;
}
+static int mlx4_en_test_interrupts(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err = 0;
+ int i = 0;
+
+ err = mlx4_test_async(mdev->dev);
+ /* When not in MSI_X or slave, test only async */
+ if (!(mdev->dev->flags & MLX4_FLAG_MSI_X) || mlx4_is_slave(mdev->dev))
+ return err;
+
+ /* A loop over all completion vectors of current port,
+ * for each vector check whether it works by mapping command
+ * completions to that vector and performing a NOP command
+ */
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ err = mlx4_test_interrupt(mdev->dev, priv->rx_cq[i]->vector);
+ if (err)
+ break;
+ }
+
+ return err;
+}
static int mlx4_en_test_link(struct mlx4_en_priv *priv)
{
@@ -151,7 +174,6 @@ static int mlx4_en_test_speed(struct mlx4_en_priv *priv)
void mlx4_en_ex_selftest(struct net_device *dev, u32 *flags, u64 *buf)
{
struct mlx4_en_priv *priv = netdev_priv(dev);
- struct mlx4_en_dev *mdev = priv->mdev;
int i, carrier_ok;
memset(buf, 0, sizeof(u64) * MLX4_EN_NUM_SELF_TEST);
@@ -177,7 +199,7 @@ void mlx4_en_ex_selftest(struct net_device *dev, u32 *flags, u64 *buf)
netif_carrier_on(dev);
}
- buf[0] = mlx4_test_interrupts(mdev->dev);
+ buf[0] = mlx4_en_test_interrupts(priv);
buf[1] = mlx4_en_test_link(priv);
buf[2] = mlx4_en_test_speed(priv);
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index cf8f8a72a801..cd3638e6fe25 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -1361,53 +1361,49 @@ void mlx4_cleanup_eq_table(struct mlx4_dev *dev)
kfree(priv->eq_table.uar_map);
}
-/* A test that verifies that we can accept interrupts on all
- * the irq vectors of the device.
+/* A test that verifies that we can accept interrupts
+ * on the vector allocated for asynchronous events
+ */
+int mlx4_test_async(struct mlx4_dev *dev)
+{
+ return mlx4_NOP(dev);
+}
+EXPORT_SYMBOL(mlx4_test_async);
+
+/* A test that verifies that we can accept interrupts
+ * on the given irq vector of the tested port.
* Interrupts are checked using the NOP command.
*/
-int mlx4_test_interrupts(struct mlx4_dev *dev)
+int mlx4_test_interrupt(struct mlx4_dev *dev, int vector)
{
struct mlx4_priv *priv = mlx4_priv(dev);
- int i;
int err;
- err = mlx4_NOP(dev);
- /* When not in MSI_X, there is only one irq to check */
- if (!(dev->flags & MLX4_FLAG_MSI_X) || mlx4_is_slave(dev))
- return err;
-
- /* A loop over all completion vectors, for each vector we will check
- * whether it works by mapping command completions to that vector
- * and performing a NOP command
- */
- for(i = 0; !err && (i < dev->caps.num_comp_vectors); ++i) {
- /* Make sure request_irq was called */
- if (!priv->eq_table.eq[i].have_irq)
- continue;
-
- /* Temporary use polling for command completions */
- mlx4_cmd_use_polling(dev);
-
- /* Map the new eq to handle all asynchronous events */
- err = mlx4_MAP_EQ(dev, get_async_ev_mask(dev), 0,
- priv->eq_table.eq[i].eqn);
- if (err) {
- mlx4_warn(dev, "Failed mapping eq for interrupt test\n");
- mlx4_cmd_use_events(dev);
- break;
- }
+ /* Temporary use polling for command completions */
+ mlx4_cmd_use_polling(dev);
- /* Go back to using events */
- mlx4_cmd_use_events(dev);
- err = mlx4_NOP(dev);
+ /* Map the new eq to handle all asynchronous events */
+ err = mlx4_MAP_EQ(dev, get_async_ev_mask(dev), 0,
+ priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(vector)].eqn);
+ if (err) {
+ mlx4_warn(dev, "Failed mapping eq for interrupt test\n");
+ goto out;
}
+ /* Go back to using events */
+ mlx4_cmd_use_events(dev);
+ err = mlx4_NOP(dev);
+
/* Return to default */
+ mlx4_cmd_use_polling(dev);
+out:
mlx4_MAP_EQ(dev, get_async_ev_mask(dev), 0,
priv->eq_table.eq[MLX4_EQ_ASYNC].eqn);
+ mlx4_cmd_use_events(dev);
+
return err;
}
-EXPORT_SYMBOL(mlx4_test_interrupts);
+EXPORT_SYMBOL(mlx4_test_interrupt);
bool mlx4_is_eq_vector_valid(struct mlx4_dev *dev, u8 port, int vector)
{
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index f6a164297358..3be7abd6e722 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -1399,7 +1399,8 @@ void mlx4_fmr_unmap(struct mlx4_dev *dev, struct mlx4_fmr *fmr,
u32 *lkey, u32 *rkey);
int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr);
int mlx4_SYNC_TPT(struct mlx4_dev *dev);
-int mlx4_test_interrupts(struct mlx4_dev *dev);
+int mlx4_test_interrupt(struct mlx4_dev *dev, int vector);
+int mlx4_test_async(struct mlx4_dev *dev);
int mlx4_query_diag_counters(struct mlx4_dev *dev, u8 op_modifier,
const u32 offset[], u32 value[],
size_t array_len, u8 port);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH net 09/10] net/mlx4_en: Fix potential deadlock in port statistics flow
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
` (7 preceding siblings ...)
2016-10-27 13:27 ` [PATCH net 08/10] net/mlx4: Fix firmware command timeout during interrupt test Tariq Toukan
@ 2016-10-27 13:27 ` Tariq Toukan
2016-10-27 13:27 ` [PATCH net 10/10] net/mlx4_en: Save slave ethtool stats command Tariq Toukan
2016-10-29 20:24 ` [PATCH net 00/10] mlx4 misc fixes for 4.9 David Miller
10 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2016-10-27 13:27 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Jack Morgenstein, Tariq Toukan
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
mlx4_en_DUMP_ETH_STATS took the *counter mutex* and then
called the FW command, with WRAPPED attribute. As a result, the fw command
is wrapped on the Hypervisor when it calls mlx4_en_DUMP_ETH_STATS.
The FW command wrapper flow on the hypervisor takes the *slave_cmd_mutex*
during processing.
At the same time, a VF could be in the process of coming up, and could
call mlx4_QUERY_FUNC_CAP. On the hypervisor, the command flow takes the
*slave_cmd_mutex*, then executes mlx4_QUERY_FUNC_CAP_wrapper.
mlx4_QUERY_FUNC_CAP wrapper calls mlx4_get_default_counter_index(),
which takes the *counter mutex*. DEADLOCK.
The fix is that the DUMP_ETH_STATS fw command should be called with
the NATIVE attribute, so that on the hypervisor, this command does not
enter the wrapper flow.
Since the Hypervisor no longer goes through the wrapper code, we also
simply return 0 in mlx4_DUMP_ETH_STATS_wrapper (i.e.the function succeeds,
but the returned data will be all zeroes).
No need to test if it is the Hypervisor going through the wrapper.
Fixes: f9baff509f8a ("mlx4_core: Add "native" argument to mlx4_cmd ...")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_port.c | 4 ++--
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 2 --
drivers/net/ethernet/mellanox/mlx4/port.c | 13 +------------
3 files changed, 3 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_port.c b/drivers/net/ethernet/mellanox/mlx4/en_port.c
index 5aa8b751f417..59473a0ebcdf 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
@@ -166,7 +166,7 @@ int mlx4_en_DUMP_ETH_STATS(struct mlx4_en_dev *mdev, u8 port, u8 reset)
return PTR_ERR(mailbox);
err = mlx4_cmd_box(mdev->dev, 0, mailbox->dma, in_mod, 0,
MLX4_CMD_DUMP_ETH_STATS, MLX4_CMD_TIME_CLASS_B,
- MLX4_CMD_WRAPPED);
+ MLX4_CMD_NATIVE);
if (err)
goto out;
@@ -322,7 +322,7 @@ int mlx4_en_DUMP_ETH_STATS(struct mlx4_en_dev *mdev, u8 port, u8 reset)
err = mlx4_cmd_box(mdev->dev, 0, mailbox->dma,
in_mod | MLX4_DUMP_ETH_STATS_FLOW_CONTROL,
0, MLX4_CMD_DUMP_ETH_STATS,
- MLX4_CMD_TIME_CLASS_B, MLX4_CMD_WRAPPED);
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
if (err)
goto out;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 7aad07644289..88ee7d8a5923 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -1330,8 +1330,6 @@ int mlx4_SET_VLAN_FLTR_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_cmd_info *cmd);
int mlx4_common_set_vlan_fltr(struct mlx4_dev *dev, int function,
int port, void *buf);
-int mlx4_common_dump_eth_stats(struct mlx4_dev *dev, int slave, u32 in_mod,
- struct mlx4_cmd_mailbox *outbox);
int mlx4_DUMP_ETH_STATS_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_vhcr *vhcr,
struct mlx4_cmd_mailbox *inbox,
diff --git a/drivers/net/ethernet/mellanox/mlx4/port.c b/drivers/net/ethernet/mellanox/mlx4/port.c
index c5b2064297a1..b656dd5772e5 100644
--- a/drivers/net/ethernet/mellanox/mlx4/port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/port.c
@@ -1728,24 +1728,13 @@ int mlx4_SET_VLAN_FLTR_wrapper(struct mlx4_dev *dev, int slave,
return err;
}
-int mlx4_common_dump_eth_stats(struct mlx4_dev *dev, int slave,
- u32 in_mod, struct mlx4_cmd_mailbox *outbox)
-{
- return mlx4_cmd_box(dev, 0, outbox->dma, in_mod, 0,
- MLX4_CMD_DUMP_ETH_STATS, MLX4_CMD_TIME_CLASS_B,
- MLX4_CMD_NATIVE);
-}
-
int mlx4_DUMP_ETH_STATS_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_vhcr *vhcr,
struct mlx4_cmd_mailbox *inbox,
struct mlx4_cmd_mailbox *outbox,
struct mlx4_cmd_info *cmd)
{
- if (slave != dev->caps.function)
- return 0;
- return mlx4_common_dump_eth_stats(dev, slave,
- vhcr->in_modifier, outbox);
+ return 0;
}
int mlx4_get_slave_from_roce_gid(struct mlx4_dev *dev, int port, u8 *gid,
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH net 10/10] net/mlx4_en: Save slave ethtool stats command
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
` (8 preceding siblings ...)
2016-10-27 13:27 ` [PATCH net 09/10] net/mlx4_en: Fix potential deadlock in port statistics flow Tariq Toukan
@ 2016-10-27 13:27 ` Tariq Toukan
2016-10-29 20:24 ` [PATCH net 00/10] mlx4 misc fixes for 4.9 David Miller
10 siblings, 0 replies; 12+ messages in thread
From: Tariq Toukan @ 2016-10-27 13:27 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Tariq Toukan
Following the previous patch, as an optimization, the slave will
not even bother sending the DUMP_ETH_STATS command over the
comm channel.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 314f54c8dbed..12c99a2655f2 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1917,8 +1917,9 @@ static void mlx4_en_clear_stats(struct net_device *dev)
struct mlx4_en_dev *mdev = priv->mdev;
int i;
- if (mlx4_en_DUMP_ETH_STATS(mdev, priv->port, 1))
- en_dbg(HW, priv, "Failed dumping statistics\n");
+ if (!mlx4_is_slave(mdev->dev))
+ if (mlx4_en_DUMP_ETH_STATS(mdev, priv->port, 1))
+ en_dbg(HW, priv, "Failed dumping statistics\n");
memset(&priv->pstats, 0, sizeof(priv->pstats));
memset(&priv->pkstats, 0, sizeof(priv->pkstats));
--
1.8.3.1
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH net 00/10] mlx4 misc fixes for 4.9
2016-10-27 13:27 [PATCH net 00/10] mlx4 misc fixes for 4.9 Tariq Toukan
` (9 preceding siblings ...)
2016-10-27 13:27 ` [PATCH net 10/10] net/mlx4_en: Save slave ethtool stats command Tariq Toukan
@ 2016-10-29 20:24 ` David Miller
10 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2016-10-29 20:24 UTC (permalink / raw)
To: tariqt; +Cc: netdev, eranbe
From: Tariq Toukan <tariqt@mellanox.com>
Date: Thu, 27 Oct 2016 16:27:12 +0300
> This patchset contains several bug fixes from the team to the
> mlx4 Eth and Core drivers.
>
> Series generated against net commit:
> ecc515d7238f 'sctp: fix the panic caused by route update'
Series applied, thanks Tariq.
^ permalink raw reply [flat|nested] 12+ messages in thread