From: Saeed Mahameed <saeed@kernel.org>
To: "David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org, Leon Romanovsky <leonro@nvidia.com>,
Moshe Shemesh <moshe@nvidia.com>,
Saeed Mahameed <saeedm@nvidia.com>
Subject: [net-next 06/15] net/mlx5: Check returned value from health recover sequence
Date: Thu, 11 Mar 2021 14:37:14 -0800
Message-ID: <20210311223723.361301-7-saeed@kernel.org>
In-Reply-To: <20210311223723.361301-1-saeed@kernel.org>
From: Leon Romanovsky <leonro@nvidia.com>
MLX5_INTERFACE_STATE_UP is not a reliable indicator of a successful
recovery, because it can change at any time and the health logic holds
no lock to protect against that.

No lock is needed here, because health recovery is nice to have but is
not required to succeed. Rely instead on the value returned from
mlx5_recover_device() as the marker of success or failure.
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
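Note for readers: the pattern this patch moves to can be shown outside the
driver. The following is only a minimal userspace sketch, not driver code.
struct demo_dev, demo_recover_device() and demo_health_try_recover() are
hypothetical names, and the three fields merely simulate the outcome of the
PCI slot reset, the reload step and the fatal sensor check. The point is that
the caller decides from return values alone and never reads a shared state
bit that another context may change underneath it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; not struct mlx5_core_dev. */
struct demo_dev {
	bool slot_reset_ok;	/* simulated: slot reset reported "recovered" */
	int load_err;		/* simulated: return value of the reload step */
	bool fatal_sensor;	/* simulated: a fatal sensor is still asserted */
};

/* Same shape as the new recover helper: 0 on success, negative errno otherwise. */
static int demo_recover_device(struct demo_dev *dev)
{
	int ret = -EIO;

	/* Only attempt the reload when the slot reset succeeded. */
	if (dev->slot_reset_ok)
		ret = dev->load_err;
	return ret;
}

/* The caller trusts the returned value instead of testing a state flag. */
static int demo_health_try_recover(struct demo_dev *dev)
{
	if (demo_recover_device(dev) || dev->fatal_sensor)
		return -EIO;
	return 0;
}
```

With these definitions, a device whose simulated reset and reload both succeed
and whose sensor is clear yields 0, and every other combination yields -EIO.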
drivers/net/ethernet/mellanox/mlx5/core/health.c | 6 +++---
drivers/net/ethernet/mellanox/mlx5/core/main.c | 7 +++++--
drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h | 2 +-
3 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 0c32c485eb58..a0a851640804 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -335,12 +335,12 @@ static int mlx5_health_try_recover(struct mlx5_core_dev *dev)
 		return -EIO;
 	}
 	mlx5_core_err(dev, "starting health recovery flow\n");
-	mlx5_recover_device(dev);
-	if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state) ||
-	    mlx5_health_check_fatal_sensors(dev)) {
+	if (mlx5_recover_device(dev) || mlx5_health_check_fatal_sensors(dev)) {
 		mlx5_core_err(dev, "health recovery failed\n");
 		return -EIO;
 	}
+
+	mlx5_core_info(dev, "health recovery succeeded\n");
 	return 0;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 363bc3e917c2..e3a417d17707 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1721,11 +1721,14 @@ void mlx5_disable_device(struct mlx5_core_dev *dev)
 	mlx5_unload_one(dev);
 }
-void mlx5_recover_device(struct mlx5_core_dev *dev)
+int mlx5_recover_device(struct mlx5_core_dev *dev)
 {
+	int ret = -EIO;
+
 	mlx5_pci_disable_device(dev);
 	if (mlx5_pci_slot_reset(dev->pdev) == PCI_ERS_RESULT_RECOVERED)
-		mlx5_pci_resume(dev->pdev);
+		ret = mlx5_load_one(dev);
+	return ret;
 }
static struct pci_driver mlx5_core_driver = {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 02993a51b114..37c8ec7d2217 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -134,7 +134,7 @@ void mlx5_error_sw_reset(struct mlx5_core_dev *dev);
u32 mlx5_health_check_fatal_sensors(struct mlx5_core_dev *dev);
int mlx5_health_wait_pci_up(struct mlx5_core_dev *dev);
void mlx5_disable_device(struct mlx5_core_dev *dev);
-void mlx5_recover_device(struct mlx5_core_dev *dev);
+int mlx5_recover_device(struct mlx5_core_dev *dev);
int mlx5_sriov_init(struct mlx5_core_dev *dev);
void mlx5_sriov_cleanup(struct mlx5_core_dev *dev);
int mlx5_sriov_attach(struct mlx5_core_dev *dev);
--
2.29.2
Thread overview: 17+ messages
2021-03-11 22:37 [pull request][net-next 00/15] mlx5 updates 2021-03-11 Saeed Mahameed
2021-03-11 22:37 ` [net-next 01/15] net/mlx5: Don't skip vport check Saeed Mahameed
2021-03-12 0:30 ` patchwork-bot+netdevbpf
2021-03-11 22:37 ` [net-next 02/15] net/mlx5: Remove impossible checks of interface state Saeed Mahameed
2021-03-11 22:37 ` [net-next 03/15] net/mlx5: Separate probe vs. reload flows Saeed Mahameed
2021-03-11 22:37 ` [net-next 04/15] net/mlx5: Remove second FW tracer check Saeed Mahameed
2021-03-11 22:37 ` [net-next 05/15] net/mlx5: Don't rely on interface state bit Saeed Mahameed
2021-03-11 22:37 ` Saeed Mahameed [this message]
2021-03-11 22:37 ` [net-next 07/15] net/mlx5e: CT, Avoid false lock dependency warning Saeed Mahameed
2021-03-11 22:37 ` [net-next 08/15] net/mlx5e: fix mlx5e_tc_tun_update_header_ipv6 dummy definition Saeed Mahameed
2021-03-11 22:37 ` [net-next 09/15] net/mlx5e: Add missing include Saeed Mahameed
2021-03-11 22:37 ` [net-next 10/15] net/mlx5: Fix indir stable stubs Saeed Mahameed
2021-03-11 22:37 ` [net-next 11/15] net/mlx5e: mlx5_tc_ct_init does not fail Saeed Mahameed
2021-03-11 22:37 ` [net-next 12/15] net/mlx5: SF, Fix return type Saeed Mahameed
2021-03-11 22:37 ` [net-next 13/15] net/mlx5e: rep: Improve reg_cX conditions Saeed Mahameed
2021-03-11 22:37 ` [net-next 14/15] net/mlx5: Avoid unnecessary operation Saeed Mahameed
2021-03-11 22:37 ` [net-next 15/15] net/mlx5e: Alloc flow spec using kvzalloc instead of kzalloc Saeed Mahameed