public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
From: Ido Schimmel <idosch@nvidia.com>
To: <netdev@vger.kernel.org>, <linux-pci@vger.kernel.org>
Cc: <davem@davemloft.net>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<edumazet@google.com>, <bhelgaas@google.com>,
	<alex.williamson@redhat.com>, <lukas@wunner.de>,
	<petrm@nvidia.com>, <jiri@nvidia.com>, <mlxsw@nvidia.com>,
	Ido Schimmel <idosch@nvidia.com>
Subject: [RFC PATCH net-next 10/12] mlxsw: pci: Add support for new reset flow
Date: Tue, 17 Oct 2023 10:42:55 +0300	[thread overview]
Message-ID: <20231017074257.3389177-11-idosch@nvidia.com> (raw)
In-Reply-To: <20231017074257.3389177-1-idosch@nvidia.com>

The driver resets the device during probe and during a devlink reload.
The current reset method reloads the current firmware version or a
pending one, if one was previously flashed using devlink. However, the
current reset method does not result in a PCI hot reset, preventing the
PCI firmware from being upgraded, unless the system is rebooted.

To solve this problem, a new reset command (6) was implemented in the
firmware. Unlike the current command (1), after issuing the new command
the device will not start the reset immediately, but only after a PCI
hot reset.

Implement the new reset method by first verifying that it is supported
by the current firmware version by querying the Management Capabilities
Mask (MCAM) register. If supported, issue the new reset command (6) via
MRSR register followed by a PCI reset by calling
__pci_reset_function_locked().

Once the PCI firmware is operational, go back to the regular reset flow
and wait for the entire device to become ready. That is, repeatedly read
the "system_status" register from the BAR until a value of "FW_READY"
(0x5E) appears.

Tested:

 # for i in $(seq 1 10); do devlink dev reload pci/0000:01:00.0; done

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlxsw/pci.c | 44 ++++++++++++++++++++++-
 drivers/net/ethernet/mellanox/mlxsw/reg.h |  2 ++
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index 1980343ff873..b5bb47b0215f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -1464,6 +1464,33 @@ static int mlxsw_pci_sys_ready_wait(struct mlxsw_pci *mlxsw_pci,
 	return -EBUSY;
 }
 
+static int mlxsw_pci_reset_at_pci_disable(struct mlxsw_pci *mlxsw_pci)
+{
+	struct pci_dev *pdev = mlxsw_pci->pdev;
+	char mrsr_pl[MLXSW_REG_MRSR_LEN];
+	int err;
+
+	mlxsw_reg_mrsr_pack(mrsr_pl,
+			    MLXSW_REG_MRSR_COMMAND_RESET_AT_PCI_DISABLE);
+	err = mlxsw_reg_write(mlxsw_pci->core, MLXSW_REG(mrsr), mrsr_pl);
+	if (err)
+		return err;
+
+	device_lock_assert(&pdev->dev);
+
+	pci_cfg_access_lock(pdev);
+	pci_save_state(pdev);
+
+	err = __pci_reset_function_locked(pdev);
+	if (err)
+		pci_err(pdev, "PCI function reset failed with %d\n", err);
+
+	pci_restore_state(pdev);
+	pci_cfg_access_unlock(pdev);
+
+	return err;
+}
+
 static int mlxsw_pci_reset_sw(struct mlxsw_pci *mlxsw_pci)
 {
 	char mrsr_pl[MLXSW_REG_MRSR_LEN];
@@ -1476,6 +1503,8 @@ static int
 mlxsw_pci_reset(struct mlxsw_pci *mlxsw_pci, const struct pci_device_id *id)
 {
 	struct pci_dev *pdev = mlxsw_pci->pdev;
+	char mcam_pl[MLXSW_REG_MCAM_LEN];
+	bool pci_reset_supported;
 	u32 sys_status;
 	int err;
 
@@ -1486,10 +1515,23 @@ mlxsw_pci_reset(struct mlxsw_pci *mlxsw_pci, const struct pci_device_id *id)
 		return err;
 	}
 
-	err = mlxsw_pci_reset_sw(mlxsw_pci);
+	mlxsw_reg_mcam_pack(mcam_pl,
+			    MLXSW_REG_MCAM_FEATURE_GROUP_ENHANCED_FEATURES);
+	err = mlxsw_reg_query(mlxsw_pci->core, MLXSW_REG(mcam), mcam_pl);
 	if (err)
 		return err;
 
+	mlxsw_reg_mcam_unpack(mcam_pl, MLXSW_REG_MCAM_PCI_RESET,
+			      &pci_reset_supported);
+
+	if (pci_reset_supported) {
+		pci_dbg(pdev, "Starting PCI reset flow\n");
+		err = mlxsw_pci_reset_at_pci_disable(mlxsw_pci);
+	} else {
+		pci_dbg(pdev, "Starting software reset flow\n");
+		err = mlxsw_pci_reset_sw(mlxsw_pci);
+	}
+
 	err = mlxsw_pci_sys_ready_wait(mlxsw_pci, id, &sys_status);
 	if (err) {
 		dev_err(&pdev->dev, "Failed to reach system ready status after reset. Status is 0x%x\n",
diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index 44f528326394..c314afd4a8ff 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -10594,6 +10594,8 @@ MLXSW_ITEM32(reg, mcam, feature_group, 0x00, 16, 8);
 enum mlxsw_reg_mcam_mng_feature_cap_mask_bits {
 	/* If set, MCIA supports 128 bytes payloads. Otherwise, 48 bytes. */
 	MLXSW_REG_MCAM_MCIA_128B = 34,
+	/* If set, MRSR.command=6 is supported. */
+	MLXSW_REG_MCAM_PCI_RESET = 48,
 };
 
 #define MLXSW_REG_BYTES_PER_DWORD 0x4
-- 
2.40.1


  parent reply	other threads:[~2023-10-17  7:44 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-17  7:42 [RFC PATCH net-next 00/12] mlxsw: Add support for new reset flow Ido Schimmel
2023-10-17  7:42 ` [RFC PATCH net-next 01/12] netdevsim: Block until all devices are released Ido Schimmel
2023-10-19  0:53   ` Jakub Kicinski
2023-10-17  7:42 ` [RFC PATCH net-next 02/12] devlink: Hold a reference on parent device Ido Schimmel
2023-10-17  7:56   ` Jiri Pirko
2023-10-17  8:11     ` Ido Schimmel
2023-10-17  9:01       ` Jiri Pirko
2023-10-17  7:42 ` [RFC PATCH net-next 03/12] devlink: Acquire device lock during reload Ido Schimmel
2023-10-17  8:04   ` Jiri Pirko
2023-10-17  7:42 ` [RFC PATCH net-next 04/12] PCI: Add no PM reset quirk for NVIDIA Spectrum devices Ido Schimmel
2023-10-18 19:40   ` Bjorn Helgaas
2023-10-22  8:23     ` Ido Schimmel
2023-10-17  7:42 ` [RFC PATCH net-next 05/12] PCI: Add device-specific reset " Ido Schimmel
2023-10-17 10:00   ` Lukas Wunner
2023-10-18 20:08   ` Bjorn Helgaas
2023-10-25 11:05     ` Ido Schimmel
2023-10-17  7:42 ` [RFC PATCH net-next 06/12] PCI: Add debug print for device ready delay Ido Schimmel
2023-10-18 19:41   ` Bjorn Helgaas
2023-10-17  7:42 ` [RFC PATCH net-next 07/12] mlxsw: Extend MRSR pack() function to support new commands Ido Schimmel
2023-10-17  7:42 ` [RFC PATCH net-next 08/12] mlxsw: pci: Rename mlxsw_pci_sw_reset() Ido Schimmel
2023-10-17  7:42 ` [RFC PATCH net-next 09/12] mlxsw: pci: Move software reset code to a separate function Ido Schimmel
2023-10-17  7:42 ` Ido Schimmel [this message]
2023-10-17  7:42 ` [RFC PATCH net-next 11/12] mlxsw: pci: Implement PCI reset handlers Ido Schimmel
2023-10-17  7:42 ` [RFC PATCH net-next 12/12] selftests: mlxsw: Add PCI reset test Ido Schimmel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231017074257.3389177-11-idosch@nvidia.com \
    --to=idosch@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jiri@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=mlxsw@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=petrm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox