[PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice)

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice)
@ 2024-05-28 22:06 Jacob Keller
  2024-05-28 22:06 ` [PATCH net 1/8] e1000e: move force SMBUS near the end of enable_ulp function Jacob Keller
                   ` (8 more replies)
  0 siblings, 9 replies; 23+ messages in thread
From: Jacob Keller @ 2024-05-28 22:06 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller, netdev
  Cc: Jacob Keller, Hui Wang, Vitaly Lifshits, Naama Meir, Simon Horman,
	Paul Menzel, Tony Nguyen, Zhang Rui, Thinh Tran, Robert Thomas,
	Pucha Himasekhar Reddy, Michal Kubiak, Wojciech Drewek,
	George Kuruvinakunnel, Maciej Fijalkowski, Paul Greenwalt,
	Michal Swiatkowski, Brett Creeley, Przemek Kitszel, Dave Ertman,
	Lukasz Czapnik

This series includes a variety of fixes that have been accumulating on the
Intel Wired LAN dev-queue.

Hui Wang provides a fix for suspend/resume on e1000e due to failure
to correctly setup the SMBUS in enable_ulp().

Thinh Tran provides a fix for EEH I/O suspend/resume on i40e to
ensure that I/O operations can continue after a resume. To avoid duplicate
code, the common logic is factored out of i40e_suspend and i40e_resume.

Michal Kubiak provides a fix for i40e XDP in if the user tries to rmmod the
i40e driver while an XDP program is loaded.

Paul Greenwalt provides a fix to correctly map the 200G PHY types to link
speeds in the ice driver.

Wojciech Drewek provides a fix to ice to resolve sporadic issues with
downloading the firmware package over the Admin queue.

Jacob provides a fix for the ice driver to correct reading the Shadow RAM
portion of the NVM for some of the newer devices including E830 and E825-C
devices.

Dave Ertman provides a fix correcting devlink parameter unregistration in
the event that the driver loads in safe mode and some of the parameters
were not registered.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
Dave Ertman (1):
      ice: check for unregistering correct number of devlink params

Hui Wang (1):
      e1000e: move force SMBUS near the end of enable_ulp function

Jacob Keller (1):
      ice: fix reads from NVM Shadow RAM on E830 and E825-C devices

Michal Kubiak (1):
      i40e: Fix XDP program unloading while removing the driver

Paul Greenwalt (1):
      ice: fix 200G PHY types to link speed mapping

Thinh Tran (2):
      i40e: factoring out i40e_suspend/i40e_resume
      i40e: Fully suspend and resume IO operations in EEH case

Wojciech Drewek (1):
      ice: implement AQ download pkg retry

 drivers/net/ethernet/intel/e1000e/ich8lan.c      |  22 ++
 drivers/net/ethernet/intel/e1000e/netdev.c       |  18 --
 drivers/net/ethernet/intel/i40e/i40e_main.c      | 277 +++++++++++++----------
 drivers/net/ethernet/intel/ice/devlink/devlink.c |  31 ++-
 drivers/net/ethernet/intel/ice/ice_common.c      |  10 +
 drivers/net/ethernet/intel/ice/ice_ddp.c         |  19 +-
 drivers/net/ethernet/intel/ice/ice_nvm.c         |  88 ++++++-
 drivers/net/ethernet/intel/ice/ice_type.h        |  14 +-
 8 files changed, 319 insertions(+), 160 deletions(-)
---
base-commit: 56a5cf538c3f2d935b0d81040a8303b6e7fc5fd8
change-id: 20240528-net-2024-05-28-intel-net-fixes-6d50e8e27b76

Best regards,
-- 
Jacob Keller <jacob.e.keller@intel.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net 1/8] e1000e: move force SMBUS near the end of enable_ulp function
  2024-05-28 22:06 [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) Jacob Keller
@ 2024-05-28 22:06 ` Jacob Keller
  2024-05-28 22:06 ` [PATCH net 2/8] i40e: factoring out i40e_suspend/i40e_resume Jacob Keller
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Jacob Keller @ 2024-05-28 22:06 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller, netdev
  Cc: Jacob Keller, Hui Wang, Vitaly Lifshits, Naama Meir, Simon Horman,
	Paul Menzel, Tony Nguyen, Zhang Rui

From: Hui Wang <hui.wang@canonical.com>

The commit 861e8086029e ("e1000e: move force SMBUS from enable ulp
function to avoid PHY loss issue") introduces a regression on
PCH_MTP_I219_LM18 (PCIID: 0x8086550A). Without the referred commit, the
ethernet works well after suspend and resume, but after applying the
commit, the ethernet couldn't work anymore after the resume and the
dmesg shows that the NIC link changes to 10Mbps (1000Mbps originally):

    [   43.305084] e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 10 Mbps Full Duplex, Flow Control: Rx/Tx

Without the commit, the force SMBUS code will not be executed if
"return 0" or "goto out" is executed in the enable_ulp(), and in my
case, the "goto out" is executed since FWSM_FW_VALID is set. But after
applying the commit, the force SMBUS code will be ran unconditionally.

Here move the force SMBUS code back to enable_ulp() and put it
immediately ahead of hw->phy.ops.release(hw), this could allow the
longest settling time as possible for interface in this function and
doesn't change the original code logic.

The issue was found on a Lenovo laptop with the ethernet hw as below:
00:1f.6 Ethernet controller [0200]: Intel Corporation Device [8086:550a]
(rev 20).

And this patch is verified (cable plug and unplug, system suspend
and resume) on Lenovo laptops with ethernet hw: [8086:550a],
[8086:550b], [8086:15bb], [8086:15be], [8086:1a1f], [8086:1a1c] and
[8086:0dc7].

Fixes: 861e8086029e ("e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue")
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 22 ++++++++++++++++++++++
 drivers/net/ethernet/intel/e1000e/netdev.c  | 18 ------------------
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index f9e94be36e97..2e98a2a0bead 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -1225,6 +1225,28 @@ s32 e1000_enable_ulp_lpt_lp(struct e1000_hw *hw, bool to_sx)
 	}
 
 release:
+	/* Switching PHY interface always returns MDI error
+	 * so disable retry mechanism to avoid wasting time
+	 */
+	e1000e_disable_phy_retry(hw);
+
+	/* Force SMBus mode in PHY */
+	ret_val = e1000_read_phy_reg_hv_locked(hw, CV_SMB_CTRL, &phy_reg);
+	if (ret_val) {
+		e1000e_enable_phy_retry(hw);
+		hw->phy.ops.release(hw);
+		goto out;
+	}
+	phy_reg |= CV_SMB_CTRL_FORCE_SMBUS;
+	e1000_write_phy_reg_hv_locked(hw, CV_SMB_CTRL, phy_reg);
+
+	e1000e_enable_phy_retry(hw);
+
+	/* Force SMBus mode in MAC */
+	mac_reg = er32(CTRL_EXT);
+	mac_reg |= E1000_CTRL_EXT_FORCE_SMBUS;
+	ew32(CTRL_EXT, mac_reg);
+
 	hw->phy.ops.release(hw);
 out:
 	if (ret_val)
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 220d62fca55d..da5c59daf8ba 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -6623,7 +6623,6 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool runtime)
 	struct e1000_hw *hw = &adapter->hw;
 	u32 ctrl, ctrl_ext, rctl, status, wufc;
 	int retval = 0;
-	u16 smb_ctrl;
 
 	/* Runtime suspend should only enable wakeup for link changes */
 	if (runtime)
@@ -6697,23 +6696,6 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool runtime)
 			if (retval)
 				return retval;
 		}
-
-		/* Force SMBUS to allow WOL */
-		/* Switching PHY interface always returns MDI error
-		 * so disable retry mechanism to avoid wasting time
-		 */
-		e1000e_disable_phy_retry(hw);
-
-		e1e_rphy(hw, CV_SMB_CTRL, &smb_ctrl);
-		smb_ctrl |= CV_SMB_CTRL_FORCE_SMBUS;
-		e1e_wphy(hw, CV_SMB_CTRL, smb_ctrl);
-
-		e1000e_enable_phy_retry(hw);
-
-		/* Force SMBus mode in MAC */
-		ctrl_ext = er32(CTRL_EXT);
-		ctrl_ext |= E1000_CTRL_EXT_FORCE_SMBUS;
-		ew32(CTRL_EXT, ctrl_ext);
 	}
 
 	/* Ensure that the appropriate bits are set in LPI_CTRL

-- 
2.44.0.53.g0f9d4d28b7e6


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 2/8] i40e: factoring out i40e_suspend/i40e_resume
  2024-05-28 22:06 [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) Jacob Keller
  2024-05-28 22:06 ` [PATCH net 1/8] e1000e: move force SMBUS near the end of enable_ulp function Jacob Keller
@ 2024-05-28 22:06 ` Jacob Keller
  2024-05-28 22:06 ` [PATCH net 3/8] i40e: Fully suspend and resume IO operations in EEH case Jacob Keller
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Jacob Keller @ 2024-05-28 22:06 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller, netdev
  Cc: Jacob Keller, Thinh Tran, Robert Thomas, Simon Horman,
	Pucha Himasekhar Reddy

From: Thinh Tran <thinhtr@linux.ibm.com>

Two new functions, i40e_io_suspend() and i40e_io_resume(), have been
introduced.  These functions were factored out from the existing
i40e_suspend() and i40e_resume() respectively.  This factoring was
done due to concerns about the logic of the I40E_SUSPENSED state, which
caused the device to be unable to recover.  The functions are now used
in the EEH handling for device suspend/resume callbacks.

The function i40e_enable_mc_magic_wake() has been moved ahead of
i40e_io_suspend() to ensure it is declared before being used.

Tested-by: Robert Thomas <rob.thomas@ibm.com>
Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 249 +++++++++++++++-------------
 1 file changed, 135 insertions(+), 114 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 1f188c052828..d5f25ea304bf 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -16334,6 +16334,139 @@ static void i40e_remove(struct pci_dev *pdev)
 	pci_disable_device(pdev);
 }
 
+/**
+ * i40e_enable_mc_magic_wake - enable multicast magic packet wake up
+ * using the mac_address_write admin q function
+ * @pf: pointer to i40e_pf struct
+ **/
+static void i40e_enable_mc_magic_wake(struct i40e_pf *pf)
+{
+	struct i40e_vsi *main_vsi = i40e_pf_get_main_vsi(pf);
+	struct i40e_hw *hw = &pf->hw;
+	u8 mac_addr[6];
+	u16 flags = 0;
+	int ret;
+
+	/* Get current MAC address in case it's an LAA */
+	if (main_vsi && main_vsi->netdev) {
+		ether_addr_copy(mac_addr, main_vsi->netdev->dev_addr);
+	} else {
+		dev_err(&pf->pdev->dev,
+			"Failed to retrieve MAC address; using default\n");
+		ether_addr_copy(mac_addr, hw->mac.addr);
+	}
+
+	/* The FW expects the mac address write cmd to first be called with
+	 * one of these flags before calling it again with the multicast
+	 * enable flags.
+	 */
+	flags = I40E_AQC_WRITE_TYPE_LAA_WOL;
+
+	if (hw->func_caps.flex10_enable && hw->partition_id != 1)
+		flags = I40E_AQC_WRITE_TYPE_LAA_ONLY;
+
+	ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
+	if (ret) {
+		dev_err(&pf->pdev->dev,
+			"Failed to update MAC address registers; cannot enable Multicast Magic packet wake up");
+		return;
+	}
+
+	flags = I40E_AQC_MC_MAG_EN
+			| I40E_AQC_WOL_PRESERVE_ON_PFR
+			| I40E_AQC_WRITE_TYPE_UPDATE_MC_MAG;
+	ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
+	if (ret)
+		dev_err(&pf->pdev->dev,
+			"Failed to enable Multicast Magic Packet wake up\n");
+}
+
+/**
+ * i40e_io_suspend - suspend all IO operations
+ * @pf: pointer to i40e_pf struct
+ *
+ **/
+static int i40e_io_suspend(struct i40e_pf *pf)
+{
+	struct i40e_hw *hw = &pf->hw;
+
+	set_bit(__I40E_DOWN, pf->state);
+
+	/* Ensure service task will not be running */
+	del_timer_sync(&pf->service_timer);
+	cancel_work_sync(&pf->service_task);
+
+	/* Client close must be called explicitly here because the timer
+	 * has been stopped.
+	 */
+	i40e_notify_client_of_netdev_close(pf, false);
+
+	if (test_bit(I40E_HW_CAP_WOL_MC_MAGIC_PKT_WAKE, pf->hw.caps) &&
+	    pf->wol_en)
+		i40e_enable_mc_magic_wake(pf);
+
+	/* Since we're going to destroy queues during the
+	 * i40e_clear_interrupt_scheme() we should hold the RTNL lock for this
+	 * whole section
+	 */
+	rtnl_lock();
+
+	i40e_prep_for_reset(pf);
+
+	wr32(hw, I40E_PFPM_APM, (pf->wol_en ? I40E_PFPM_APM_APME_MASK : 0));
+	wr32(hw, I40E_PFPM_WUFC, (pf->wol_en ? I40E_PFPM_WUFC_MAG_MASK : 0));
+
+	/* Clear the interrupt scheme and release our IRQs so that the system
+	 * can safely hibernate even when there are a large number of CPUs.
+	 * Otherwise hibernation might fail when mapping all the vectors back
+	 * to CPU0.
+	 */
+	i40e_clear_interrupt_scheme(pf);
+
+	rtnl_unlock();
+
+	return 0;
+}
+
+/**
+ * i40e_io_resume - resume IO operations
+ * @pf: pointer to i40e_pf struct
+ *
+ **/
+static int i40e_io_resume(struct i40e_pf *pf)
+{
+	struct device *dev = &pf->pdev->dev;
+	int err;
+
+	/* We need to hold the RTNL lock prior to restoring interrupt schemes,
+	 * since we're going to be restoring queues
+	 */
+	rtnl_lock();
+
+	/* We cleared the interrupt scheme when we suspended, so we need to
+	 * restore it now to resume device functionality.
+	 */
+	err = i40e_restore_interrupt_scheme(pf);
+	if (err) {
+		dev_err(dev, "Cannot restore interrupt scheme: %d\n",
+			err);
+	}
+
+	clear_bit(__I40E_DOWN, pf->state);
+	i40e_reset_and_rebuild(pf, false, true);
+
+	rtnl_unlock();
+
+	/* Clear suspended state last after everything is recovered */
+	clear_bit(__I40E_SUSPENDED, pf->state);
+
+	/* Restart the service task */
+	mod_timer(&pf->service_timer,
+		  round_jiffies(jiffies + pf->service_timer_period));
+
+	return 0;
+}
+
 /**
  * i40e_pci_error_detected - warning that something funky happened in PCI land
  * @pdev: PCI device information struct
@@ -16446,53 +16579,6 @@ static void i40e_pci_error_resume(struct pci_dev *pdev)
 	i40e_handle_reset_warning(pf, false);
 }
 
-/**
- * i40e_enable_mc_magic_wake - enable multicast magic packet wake up
- * using the mac_address_write admin q function
- * @pf: pointer to i40e_pf struct
- **/
-static void i40e_enable_mc_magic_wake(struct i40e_pf *pf)
-{
-	struct i40e_vsi *main_vsi = i40e_pf_get_main_vsi(pf);
-	struct i40e_hw *hw = &pf->hw;
-	u8 mac_addr[6];
-	u16 flags = 0;
-	int ret;
-
-	/* Get current MAC address in case it's an LAA */
-	if (main_vsi && main_vsi->netdev) {
-		ether_addr_copy(mac_addr, main_vsi->netdev->dev_addr);
-	} else {
-		dev_err(&pf->pdev->dev,
-			"Failed to retrieve MAC address; using default\n");
-		ether_addr_copy(mac_addr, hw->mac.addr);
-	}
-
-	/* The FW expects the mac address write cmd to first be called with
-	 * one of these flags before calling it again with the multicast
-	 * enable flags.
-	 */
-	flags = I40E_AQC_WRITE_TYPE_LAA_WOL;
-
-	if (hw->func_caps.flex10_enable && hw->partition_id != 1)
-		flags = I40E_AQC_WRITE_TYPE_LAA_ONLY;
-
-	ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
-	if (ret) {
-		dev_err(&pf->pdev->dev,
-			"Failed to update MAC address registers; cannot enable Multicast Magic packet wake up");
-		return;
-	}
-
-	flags = I40E_AQC_MC_MAG_EN
-			| I40E_AQC_WOL_PRESERVE_ON_PFR
-			| I40E_AQC_WRITE_TYPE_UPDATE_MC_MAG;
-	ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
-	if (ret)
-		dev_err(&pf->pdev->dev,
-			"Failed to enable Multicast Magic Packet wake up\n");
-}
-
 /**
  * i40e_shutdown - PCI callback for shutting down
  * @pdev: PCI device information struct
@@ -16552,48 +16638,11 @@ static void i40e_shutdown(struct pci_dev *pdev)
 static int i40e_suspend(struct device *dev)
 {
 	struct i40e_pf *pf = dev_get_drvdata(dev);
-	struct i40e_hw *hw = &pf->hw;
 
 	/* If we're already suspended, then there is nothing to do */
 	if (test_and_set_bit(__I40E_SUSPENDED, pf->state))
 		return 0;
-
-	set_bit(__I40E_DOWN, pf->state);
-
-	/* Ensure service task will not be running */
-	del_timer_sync(&pf->service_timer);
-	cancel_work_sync(&pf->service_task);
-
-	/* Client close must be called explicitly here because the timer
-	 * has been stopped.
-	 */
-	i40e_notify_client_of_netdev_close(pf, false);
-
-	if (test_bit(I40E_HW_CAP_WOL_MC_MAGIC_PKT_WAKE, pf->hw.caps) &&
-	    pf->wol_en)
-		i40e_enable_mc_magic_wake(pf);
-
-	/* Since we're going to destroy queues during the
-	 * i40e_clear_interrupt_scheme() we should hold the RTNL lock for this
-	 * whole section
-	 */
-	rtnl_lock();
-
-	i40e_prep_for_reset(pf);
-
-	wr32(hw, I40E_PFPM_APM, (pf->wol_en ? I40E_PFPM_APM_APME_MASK : 0));
-	wr32(hw, I40E_PFPM_WUFC, (pf->wol_en ? I40E_PFPM_WUFC_MAG_MASK : 0));
-
-	/* Clear the interrupt scheme and release our IRQs so that the system
-	 * can safely hibernate even when there are a large number of CPUs.
-	 * Otherwise hibernation might fail when mapping all the vectors back
-	 * to CPU0.
-	 */
-	i40e_clear_interrupt_scheme(pf);
-
-	rtnl_unlock();
-
-	return 0;
+	return i40e_io_suspend(pf);
 }
 
 /**
@@ -16603,39 +16652,11 @@ static int i40e_suspend(struct device *dev)
 static int i40e_resume(struct device *dev)
 {
 	struct i40e_pf *pf = dev_get_drvdata(dev);
-	int err;
 
 	/* If we're not suspended, then there is nothing to do */
 	if (!test_bit(__I40E_SUSPENDED, pf->state))
 		return 0;
-
-	/* We need to hold the RTNL lock prior to restoring interrupt schemes,
-	 * since we're going to be restoring queues
-	 */
-	rtnl_lock();
-
-	/* We cleared the interrupt scheme when we suspended, so we need to
-	 * restore it now to resume device functionality.
-	 */
-	err = i40e_restore_interrupt_scheme(pf);
-	if (err) {
-		dev_err(dev, "Cannot restore interrupt scheme: %d\n",
-			err);
-	}
-
-	clear_bit(__I40E_DOWN, pf->state);
-	i40e_reset_and_rebuild(pf, false, true);
-
-	rtnl_unlock();
-
-	/* Clear suspended state last after everything is recovered */
-	clear_bit(__I40E_SUSPENDED, pf->state);
-
-	/* Restart the service task */
-	mod_timer(&pf->service_timer,
-		  round_jiffies(jiffies + pf->service_timer_period));
-
-	return 0;
+	return i40e_io_resume(pf);
 }
 
 static const struct pci_error_handlers i40e_err_handler = {

-- 
2.44.0.53.g0f9d4d28b7e6


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 3/8] i40e: Fully suspend and resume IO operations in EEH case
  2024-05-28 22:06 [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) Jacob Keller
  2024-05-28 22:06 ` [PATCH net 1/8] e1000e: move force SMBUS near the end of enable_ulp function Jacob Keller
  2024-05-28 22:06 ` [PATCH net 2/8] i40e: factoring out i40e_suspend/i40e_resume Jacob Keller
@ 2024-05-28 22:06 ` Jacob Keller
  2024-05-28 22:06 ` [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver Jacob Keller
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Jacob Keller @ 2024-05-28 22:06 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller, netdev
  Cc: Jacob Keller, Thinh Tran, Robert Thomas, Simon Horman,
	Pucha Himasekhar Reddy

From: Thinh Tran <thinhtr@linux.ibm.com>

When EEH events occurs, the callback functions in the i40e, which are
managed by the EEH driver, will completely suspend and resume all IO
operations.

- In the PCI error detected callback, replaced i40e_prep_for_reset()
  with i40e_io_suspend(). The change is to fully suspend all I/O
  operations
- In the PCI error slot reset callback, replaced pci_enable_device_mem()
  with pci_enable_device(). This change enables both I/O and memory of
  the device.
- In the PCI error resume callback, replaced i40e_handle_reset_warning()
  with i40e_io_resume(). This change allows the system to resume I/O
  operations

Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Robert Thomas <rob.thomas@ibm.com>
Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index d5f25ea304bf..284c3fad5a6e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11171,6 +11171,8 @@ static void i40e_reset_and_rebuild(struct i40e_pf *pf, bool reinit,
 	ret = i40e_reset(pf);
 	if (!ret)
 		i40e_rebuild(pf, reinit, lock_acquired);
+	else
+		dev_err(&pf->pdev->dev, "%s: i40e_reset() FAILED", __func__);
 }
 
 /**
@@ -16491,7 +16493,7 @@ static pci_ers_result_t i40e_pci_error_detected(struct pci_dev *pdev,
 
 	/* shutdown all operations */
 	if (!test_bit(__I40E_SUSPENDED, pf->state))
-		i40e_prep_for_reset(pf);
+		i40e_io_suspend(pf);
 
 	/* Request a slot reset */
 	return PCI_ERS_RESULT_NEED_RESET;
@@ -16513,7 +16515,8 @@ static pci_ers_result_t i40e_pci_error_slot_reset(struct pci_dev *pdev)
 	u32 reg;
 
 	dev_dbg(&pdev->dev, "%s\n", __func__);
-	if (pci_enable_device_mem(pdev)) {
+	/* enable I/O and memory of the device  */
+	if (pci_enable_device(pdev)) {
 		dev_info(&pdev->dev,
 			 "Cannot re-enable PCI device after reset.\n");
 		result = PCI_ERS_RESULT_DISCONNECT;
@@ -16576,7 +16579,7 @@ static void i40e_pci_error_resume(struct pci_dev *pdev)
 	if (test_bit(__I40E_SUSPENDED, pf->state))
 		return;
 
-	i40e_handle_reset_warning(pf, false);
+	i40e_io_resume(pf);
 }
 
 /**

-- 
2.44.0.53.g0f9d4d28b7e6


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver
  2024-05-28 22:06 [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) Jacob Keller
                   ` (2 preceding siblings ...)
  2024-05-28 22:06 ` [PATCH net 3/8] i40e: Fully suspend and resume IO operations in EEH case Jacob Keller
@ 2024-05-28 22:06 ` Jacob Keller
  2024-05-30  1:54   ` Jakub Kicinski
  2024-05-28 22:06 ` [PATCH net 5/8] ice: fix 200G PHY types to link speed mapping Jacob Keller
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Jacob Keller @ 2024-05-28 22:06 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller, netdev
  Cc: Jacob Keller, Michal Kubiak, Wojciech Drewek,
	George Kuruvinakunnel, Maciej Fijalkowski

From: Michal Kubiak <michal.kubiak@intel.com>

The commit 6533e558c650 ("i40e: Fix reset path while removing
the driver") introduced a new PF state "__I40E_IN_REMOVE" to block
modifying the XDP program while the driver is being removed.
Unfortunately, such a change is useful only if the ".ndo_bpf()"
callback was called out of the rmmod context because unloading the
existing XDP program is also a part of driver removing procedure.
In other words, from the rmmod context the driver is expected to
unload the XDP program without reporting any errors. Otherwise,
the kernel warning with callstack is printed out to dmesg.

Example failing scenario:
 1. Load the i40e driver.
 2. Load the XDP program.
 3. Unload the i40e driver (using "rmmod" command).

Fix this by improving checks in ".ndo_bpf()" to determine if that
callback was called from the removing context and if the kernel
wants to unload the XDP program. Allow for unloading the XDP program
in such a case.

Fixes: 6533e558c650 ("i40e: Fix reset path while removing the driver")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 284c3fad5a6e..2f478edb9f9f 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -13293,6 +13293,20 @@ static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
 	bool need_reset;
 	int i;
 
+	/* Called from netdev unregister context. Unload the XDP program. */
+	if (vsi->netdev->reg_state == NETREG_UNREGISTERING) {
+		xdp_features_clear_redirect_target(vsi->netdev);
+		old_prog = xchg(&vsi->xdp_prog, NULL);
+		if (old_prog)
+			bpf_prog_put(old_prog);
+
+		return 0;
+	}
+
+	/* VSI shall be deleted in a moment, just return EINVAL */
+	if (test_bit(__I40E_IN_REMOVE, pf->state))
+		return -EINVAL;
+
 	/* Don't allow frames that span over multiple buffers */
 	if (vsi->netdev->mtu > frame_size - I40E_PACKET_HDR_PAD) {
 		NL_SET_ERR_MSG_MOD(extack, "MTU too large for linear frames and XDP prog does not support frags");
@@ -13301,14 +13315,9 @@ static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
 
 	/* When turning XDP on->off/off->on we reset and rebuild the rings. */
 	need_reset = (i40e_enabled_xdp_vsi(vsi) != !!prog);
-
 	if (need_reset)
 		i40e_prep_for_reset(pf);
 
-	/* VSI shall be deleted in a moment, just return EINVAL */
-	if (test_bit(__I40E_IN_REMOVE, pf->state))
-		return -EINVAL;
-
 	old_prog = xchg(&vsi->xdp_prog, prog);
 
 	if (need_reset) {

-- 
2.44.0.53.g0f9d4d28b7e6


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 5/8] ice: fix 200G PHY types to link speed mapping
  2024-05-28 22:06 [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) Jacob Keller
                   ` (3 preceding siblings ...)
  2024-05-28 22:06 ` [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver Jacob Keller
@ 2024-05-28 22:06 ` Jacob Keller
  2024-05-28 22:06 ` [PATCH net 6/8] ice: implement AQ download pkg retry Jacob Keller
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Jacob Keller @ 2024-05-28 22:06 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller, netdev
  Cc: Jacob Keller, Paul Greenwalt, Michal Swiatkowski,
	Pucha Himasekhar Reddy

From: Paul Greenwalt <paul.greenwalt@intel.com>

Commit 24407a01e57c ("ice: Add 200G speed/phy type use") added support
for 200G PHY speeds, but did not include the mapping of 200G PHY types
to link speed. As a result the driver is returning UNKNOWN link speed
when setting 200G ethtool advertised link modes.

To fix this add 200G PHY types to link speed mapping to
ice_get_link_speed_based_on_phy_type().

Fixes: 24407a01e57c ("ice: Add 200G speed/phy type use")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_common.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 5649b257e631..24716a3b494c 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -3148,6 +3148,16 @@ ice_get_link_speed_based_on_phy_type(u64 phy_type_low, u64 phy_type_high)
 	case ICE_PHY_TYPE_HIGH_100G_AUI2:
 		speed_phy_type_high = ICE_AQ_LINK_SPEED_100GB;
 		break;
+	case ICE_PHY_TYPE_HIGH_200G_CR4_PAM4:
+	case ICE_PHY_TYPE_HIGH_200G_SR4:
+	case ICE_PHY_TYPE_HIGH_200G_FR4:
+	case ICE_PHY_TYPE_HIGH_200G_LR4:
+	case ICE_PHY_TYPE_HIGH_200G_DR4:
+	case ICE_PHY_TYPE_HIGH_200G_KR4_PAM4:
+	case ICE_PHY_TYPE_HIGH_200G_AUI4_AOC_ACC:
+	case ICE_PHY_TYPE_HIGH_200G_AUI4:
+		speed_phy_type_high = ICE_AQ_LINK_SPEED_200GB;
+		break;
 	default:
 		speed_phy_type_high = ICE_AQ_LINK_SPEED_UNKNOWN;
 		break;

-- 
2.44.0.53.g0f9d4d28b7e6


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 6/8] ice: implement AQ download pkg retry
  2024-05-28 22:06 [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) Jacob Keller
                   ` (4 preceding siblings ...)
  2024-05-28 22:06 ` [PATCH net 5/8] ice: fix 200G PHY types to link speed mapping Jacob Keller
@ 2024-05-28 22:06 ` Jacob Keller
  2024-05-30  1:51   ` Jakub Kicinski
  2024-05-28 22:06 ` [PATCH net 7/8] ice: fix reads from NVM Shadow RAM on E830 and E825-C devices Jacob Keller
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Jacob Keller @ 2024-05-28 22:06 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller, netdev
  Cc: Jacob Keller, Wojciech Drewek, Michal Swiatkowski, Brett Creeley,
	Pucha Himasekhar Reddy

From: Wojciech Drewek <wojciech.drewek@intel.com>

ice_aqc_opc_download_pkg (0x0C40) AQ sporadically returns error due
to FW issue. Fix this by retrying five times before moving to
Safe Mode.

Fixes: c76488109616 ("ice: Implement Dynamic Device Personalization (DDP) download")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_ddp.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_ddp.c b/drivers/net/ethernet/intel/ice/ice_ddp.c
index ce5034ed2b24..77b81e5a5a44 100644
--- a/drivers/net/ethernet/intel/ice/ice_ddp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ddp.c
@@ -1339,6 +1339,7 @@ ice_dwnld_cfg_bufs_no_lock(struct ice_hw *hw, struct ice_buf *bufs, u32 start,
 
 	for (i = 0; i < count; i++) {
 		bool last = false;
+		int try_cnt = 0;
 		int status;
 
 		bh = (struct ice_buf_hdr *)(bufs + start + i);
@@ -1346,8 +1347,22 @@ ice_dwnld_cfg_bufs_no_lock(struct ice_hw *hw, struct ice_buf *bufs, u32 start,
 		if (indicate_last)
 			last = ice_is_last_download_buffer(bh, i, count);
 
-		status = ice_aq_download_pkg(hw, bh, ICE_PKG_BUF_SIZE, last,
-					     &offset, &info, NULL);
+		while (try_cnt < 5) {
+			status = ice_aq_download_pkg(hw, bh, ICE_PKG_BUF_SIZE,
+						     last, &offset, &info,
+						     NULL);
+			if (hw->adminq.sq_last_status != ICE_AQ_RC_ENOSEC &&
+			    hw->adminq.sq_last_status != ICE_AQ_RC_EBADSIG)
+				break;
+
+			try_cnt++;
+			msleep(20);
+		}
+
+		if (try_cnt)
+			dev_dbg(ice_hw_to_dev(hw),
+				"ice_aq_download_pkg number of retries: %d\n",
+				try_cnt);
 
 		/* Save AQ status from download package */
 		if (status) {

-- 
2.44.0.53.g0f9d4d28b7e6


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 7/8] ice: fix reads from NVM Shadow RAM on E830 and E825-C devices
  2024-05-28 22:06 [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) Jacob Keller
                   ` (5 preceding siblings ...)
  2024-05-28 22:06 ` [PATCH net 6/8] ice: implement AQ download pkg retry Jacob Keller
@ 2024-05-28 22:06 ` Jacob Keller
  2024-05-28 22:06 ` [PATCH net 8/8] ice: check for unregistering correct number of devlink params Jacob Keller
  2024-05-30  2:00 ` [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) patchwork-bot+netdevbpf
  8 siblings, 0 replies; 23+ messages in thread
From: Jacob Keller @ 2024-05-28 22:06 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller, netdev
  Cc: Jacob Keller, Paul Greenwalt, Przemek Kitszel,
	Pucha Himasekhar Reddy

The ice driver reads data from the Shadow RAM portion of the NVM during
initialization, including data used to identify the NVM image and device,
such as the ETRACK ID used to populate devlink dev info fw.bundle.

Currently it is using a fixed offset defined by ICE_CSS_HEADER_LENGTH to
compute the appropriate offset. This worked fine for E810 and E822 devices
which both have CSS header length of 330 words.

Other devices, including both E825-C and E830 devices have different sizes
for their CSS header. The use of a hard coded value results in the driver
reading from the wrong block in the NVM when attempting to access the
Shadow RAM copy. This results in the driver reporting the fw.bundle as 0x0
in both the devlink dev info and ethtool -i output.

The first E830 support was introduced by commit ba20ecb1d1bb ("ice: Hook up
4 E830 devices by adding their IDs") and the first E825-C support was
introducted by commit f64e18944233 ("ice: introduce new E825C devices
family")

The NVM actually contains the CSS header length embedded in it. Remove the
hard coded value and replace it with logic to read the length from the NVM
directly. This is more resilient against all existing and future hardware,
vs looking up the expected values from a table. It ensures the driver will
read from the appropriate place when determining the ETRACK ID value used
for populating the fw.bundle_id and for reporting in ethtool -i.

The CSS header length for both the active and inactive flash bank is stored
in the ice_bank_info structure to avoid unnecessary duplicate work when
accessing multiple words of the Shadow RAM. Both banks are read in the
unlikely event that the header length is different for the NVM in the
inactive bank, rather than being different only by the overall device
family.

Fixes: ba20ecb1d1bb ("ice: Hook up 4 E830 devices by adding their IDs")
Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_nvm.c  | 88 ++++++++++++++++++++++++++++++-
 drivers/net/ethernet/intel/ice/ice_type.h | 14 +++--
 2 files changed, 93 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c b/drivers/net/ethernet/intel/ice/ice_nvm.c
index 84eab92dc03c..5968011e8c7e 100644
--- a/drivers/net/ethernet/intel/ice/ice_nvm.c
+++ b/drivers/net/ethernet/intel/ice/ice_nvm.c
@@ -374,11 +374,25 @@ ice_read_nvm_module(struct ice_hw *hw, enum ice_bank_select bank, u32 offset, u1
  *
  * Read the specified word from the copy of the Shadow RAM found in the
  * specified NVM module.
+ *
+ * Note that the Shadow RAM copy is always located after the CSS header, and
+ * is aligned to 64-byte (32-word) offsets.
  */
 static int
 ice_read_nvm_sr_copy(struct ice_hw *hw, enum ice_bank_select bank, u32 offset, u16 *data)
 {
-	return ice_read_nvm_module(hw, bank, ICE_NVM_SR_COPY_WORD_OFFSET + offset, data);
+	u32 sr_copy;
+
+	switch (bank) {
+	case ICE_ACTIVE_FLASH_BANK:
+		sr_copy = roundup(hw->flash.banks.active_css_hdr_len, 32);
+		break;
+	case ICE_INACTIVE_FLASH_BANK:
+		sr_copy = roundup(hw->flash.banks.inactive_css_hdr_len, 32);
+		break;
+	}
+
+	return ice_read_nvm_module(hw, bank, sr_copy + offset, data);
 }
 
 /**
@@ -1009,6 +1023,72 @@ static int ice_determine_active_flash_banks(struct ice_hw *hw)
 	return 0;
 }
 
+/**
+ * ice_get_nvm_css_hdr_len - Read the CSS header length from the NVM CSS header
+ * @hw: pointer to the HW struct
+ * @bank: whether to read from the active or inactive flash bank
+ * @hdr_len: storage for header length in words
+ *
+ * Read the CSS header length from the NVM CSS header and add the Authentication
+ * header size, and then convert to words.
+ *
+ * Return: zero on success, or a negative error code on failure.
+ */
+static int
+ice_get_nvm_css_hdr_len(struct ice_hw *hw, enum ice_bank_select bank,
+			u32 *hdr_len)
+{
+	u16 hdr_len_l, hdr_len_h;
+	u32 hdr_len_dword;
+	int status;
+
+	status = ice_read_nvm_module(hw, bank, ICE_NVM_CSS_HDR_LEN_L,
+				     &hdr_len_l);
+	if (status)
+		return status;
+
+	status = ice_read_nvm_module(hw, bank, ICE_NVM_CSS_HDR_LEN_H,
+				     &hdr_len_h);
+	if (status)
+		return status;
+
+	/* CSS header length is in DWORD, so convert to words and add
+	 * authentication header size
+	 */
+	hdr_len_dword = hdr_len_h << 16 | hdr_len_l;
+	*hdr_len = (hdr_len_dword * 2) + ICE_NVM_AUTH_HEADER_LEN;
+
+	return 0;
+}
+
+/**
+ * ice_determine_css_hdr_len - Discover CSS header length for the device
+ * @hw: pointer to the HW struct
+ *
+ * Determine the size of the CSS header at the start of the NVM module. This
+ * is useful for locating the Shadow RAM copy in the NVM, as the Shadow RAM is
+ * always located just after the CSS header.
+ *
+ * Return: zero on success, or a negative error code on failure.
+ */
+static int ice_determine_css_hdr_len(struct ice_hw *hw)
+{
+	struct ice_bank_info *banks = &hw->flash.banks;
+	int status;
+
+	status = ice_get_nvm_css_hdr_len(hw, ICE_ACTIVE_FLASH_BANK,
+					 &banks->active_css_hdr_len);
+	if (status)
+		return status;
+
+	status = ice_get_nvm_css_hdr_len(hw, ICE_INACTIVE_FLASH_BANK,
+					 &banks->inactive_css_hdr_len);
+	if (status)
+		return status;
+
+	return 0;
+}
+
 /**
  * ice_init_nvm - initializes NVM setting
  * @hw: pointer to the HW struct
@@ -1055,6 +1135,12 @@ int ice_init_nvm(struct ice_hw *hw)
 		return status;
 	}
 
+	status = ice_determine_css_hdr_len(hw);
+	if (status) {
+		ice_debug(hw, ICE_DBG_NVM, "Failed to determine Shadow RAM copy offsets.\n");
+		return status;
+	}
+
 	status = ice_get_nvm_ver_info(hw, ICE_ACTIVE_FLASH_BANK, &flash->nvm);
 	if (status) {
 		ice_debug(hw, ICE_DBG_INIT, "Failed to read NVM info.\n");
diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
index f0796a93f428..eef397e5baa0 100644
--- a/drivers/net/ethernet/intel/ice/ice_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_type.h
@@ -482,6 +482,8 @@ struct ice_bank_info {
 	u32 orom_size;				/* Size of OROM bank */
 	u32 netlist_ptr;			/* Pointer to 1st Netlist bank */
 	u32 netlist_size;			/* Size of Netlist bank */
+	u32 active_css_hdr_len;			/* Active CSS header length */
+	u32 inactive_css_hdr_len;		/* Inactive CSS header length */
 	enum ice_flash_bank nvm_bank;		/* Active NVM bank */
 	enum ice_flash_bank orom_bank;		/* Active OROM bank */
 	enum ice_flash_bank netlist_bank;	/* Active Netlist bank */
@@ -1087,17 +1089,13 @@ struct ice_aq_get_set_rss_lut_params {
 #define ICE_SR_SECTOR_SIZE_IN_WORDS	0x800
 
 /* CSS Header words */
+#define ICE_NVM_CSS_HDR_LEN_L			0x02
+#define ICE_NVM_CSS_HDR_LEN_H			0x03
 #define ICE_NVM_CSS_SREV_L			0x14
 #define ICE_NVM_CSS_SREV_H			0x15
 
-/* Length of CSS header section in words */
-#define ICE_CSS_HEADER_LENGTH			330
-
-/* Offset of Shadow RAM copy in the NVM bank area. */
-#define ICE_NVM_SR_COPY_WORD_OFFSET		roundup(ICE_CSS_HEADER_LENGTH, 32)
-
-/* Size in bytes of Option ROM trailer */
-#define ICE_NVM_OROM_TRAILER_LENGTH		(2 * ICE_CSS_HEADER_LENGTH)
+/* Length of Authentication header section in words */
+#define ICE_NVM_AUTH_HEADER_LEN			0x08
 
 /* The Link Topology Netlist section is stored as a series of words. It is
  * stored in the NVM as a TLV, with the first two words containing the type

-- 
2.44.0.53.g0f9d4d28b7e6


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH net 8/8] ice: check for unregistering correct number of devlink params
  2024-05-28 22:06 [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) Jacob Keller
                   ` (6 preceding siblings ...)
  2024-05-28 22:06 ` [PATCH net 7/8] ice: fix reads from NVM Shadow RAM on E830 and E825-C devices Jacob Keller
@ 2024-05-28 22:06 ` Jacob Keller
  2024-05-30  2:00 ` [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) patchwork-bot+netdevbpf
  8 siblings, 0 replies; 23+ messages in thread
From: Jacob Keller @ 2024-05-28 22:06 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller, netdev
  Cc: Jacob Keller, Dave Ertman, Lukasz Czapnik, Przemek Kitszel,
	Pucha Himasekhar Reddy

From: Dave Ertman <david.m.ertman@intel.com>

On module load, the ice driver checks for the lack of a specific PF
capability to determine if it should reduce the number of devlink params
to register.  One situation when this test returns true is when the
driver loads in safe mode.  The same check is not present on the unload
path when devlink params are unregistered.  This results in the driver
triggering a WARN_ON in the kernel devlink code.

The current check and code path uses a reduction in the number of elements
reported in the list of params.  This is fragile and not good for future
maintaining.

Change the parameters to be held in two lists, one always registered and
one dependent on the check.

Add a symmetrical check in the unload path so that the correct parameters
are unregistered as well.

Fixes: 109eb2917284 ("ice: Add tx_scheduling_layers devlink param")
CC: Lukasz Czapnik <lukasz.czapnik@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/devlink/devlink.c | 31 +++++++++++++++++-------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/devlink/devlink.c b/drivers/net/ethernet/intel/ice/devlink/devlink.c
index c4b69655cdf5..704e9ad5144e 100644
--- a/drivers/net/ethernet/intel/ice/devlink/devlink.c
+++ b/drivers/net/ethernet/intel/ice/devlink/devlink.c
@@ -1388,7 +1388,7 @@ enum ice_param_id {
 	ICE_DEVLINK_PARAM_ID_TX_SCHED_LAYERS,
 };
 
-static const struct devlink_param ice_devlink_params[] = {
+static const struct devlink_param ice_dvl_rdma_params[] = {
 	DEVLINK_PARAM_GENERIC(ENABLE_ROCE, BIT(DEVLINK_PARAM_CMODE_RUNTIME),
 			      ice_devlink_enable_roce_get,
 			      ice_devlink_enable_roce_set,
@@ -1397,6 +1397,9 @@ static const struct devlink_param ice_devlink_params[] = {
 			      ice_devlink_enable_iw_get,
 			      ice_devlink_enable_iw_set,
 			      ice_devlink_enable_iw_validate),
+};
+
+static const struct devlink_param ice_dvl_sched_params[] = {
 	DEVLINK_PARAM_DRIVER(ICE_DEVLINK_PARAM_ID_TX_SCHED_LAYERS,
 			     "tx_scheduling_layers",
 			     DEVLINK_PARAM_TYPE_U8,
@@ -1464,21 +1467,31 @@ int ice_devlink_register_params(struct ice_pf *pf)
 {
 	struct devlink *devlink = priv_to_devlink(pf);
 	struct ice_hw *hw = &pf->hw;
-	size_t params_size;
+	int status;
 
-	params_size =  ARRAY_SIZE(ice_devlink_params);
+	status = devl_params_register(devlink, ice_dvl_rdma_params,
+				      ARRAY_SIZE(ice_dvl_rdma_params));
+	if (status)
+		return status;
 
-	if (!hw->func_caps.common_cap.tx_sched_topo_comp_mode_en)
-		params_size--;
+	if (hw->func_caps.common_cap.tx_sched_topo_comp_mode_en)
+		status = devl_params_register(devlink, ice_dvl_sched_params,
+					      ARRAY_SIZE(ice_dvl_sched_params));
 
-	return devl_params_register(devlink, ice_devlink_params,
-				    params_size);
+	return status;
 }
 
 void ice_devlink_unregister_params(struct ice_pf *pf)
 {
-	devl_params_unregister(priv_to_devlink(pf), ice_devlink_params,
-			       ARRAY_SIZE(ice_devlink_params));
+	struct devlink *devlink = priv_to_devlink(pf);
+	struct ice_hw *hw = &pf->hw;
+
+	devl_params_unregister(devlink, ice_dvl_rdma_params,
+			       ARRAY_SIZE(ice_dvl_rdma_params));
+
+	if (hw->func_caps.common_cap.tx_sched_topo_comp_mode_en)
+		devl_params_unregister(devlink, ice_dvl_sched_params,
+				       ARRAY_SIZE(ice_dvl_sched_params));
 }
 
 #define ICE_DEVLINK_READ_BLK_SIZE (1024 * 1024)

-- 
2.44.0.53.g0f9d4d28b7e6


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH net 6/8] ice: implement AQ download pkg retry
  2024-05-28 22:06 ` [PATCH net 6/8] ice: implement AQ download pkg retry Jacob Keller
@ 2024-05-30  1:51   ` Jakub Kicinski
  2024-05-30 16:50     ` Jacob Keller
  0 siblings, 1 reply; 23+ messages in thread
From: Jakub Kicinski @ 2024-05-30  1:51 UTC (permalink / raw)
  To: Jacob Keller
  Cc: David Miller, netdev, Wojciech Drewek, Michal Swiatkowski,
	Brett Creeley, Pucha Himasekhar Reddy

On Tue, 28 May 2024 15:06:09 -0700 Jacob Keller wrote:
> +		while (try_cnt < 5) {
> +			status = ice_aq_download_pkg(hw, bh, ICE_PKG_BUF_SIZE,
> +						     last, &offset, &info,
> +						     NULL);
> +			if (hw->adminq.sq_last_status != ICE_AQ_RC_ENOSEC &&
> +			    hw->adminq.sq_last_status != ICE_AQ_RC_EBADSIG)
> +				break;
> +
> +			try_cnt++;
> +			msleep(20);
> +		}
> +
> +		if (try_cnt)
> +			dev_dbg(ice_hw_to_dev(hw),
> +				"ice_aq_download_pkg number of retries: %d\n",
> +				try_cnt);
>  

That's not a great wait loop. Last iteration sleeps for 20msec and then
gives up without checking the condition.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver
  2024-05-28 22:06 ` [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver Jacob Keller
@ 2024-05-30  1:54   ` Jakub Kicinski
  2024-05-30 16:38     ` Jacob Keller
  0 siblings, 1 reply; 23+ messages in thread
From: Jakub Kicinski @ 2024-05-30  1:54 UTC (permalink / raw)
  To: Jacob Keller
  Cc: David Miller, netdev, Michal Kubiak, Wojciech Drewek,
	George Kuruvinakunnel, Maciej Fijalkowski

On Tue, 28 May 2024 15:06:07 -0700 Jacob Keller wrote:
> +	/* Called from netdev unregister context. Unload the XDP program. */
> +	if (vsi->netdev->reg_state == NETREG_UNREGISTERING) {
> +		xdp_features_clear_redirect_target(vsi->netdev);
> +		old_prog = xchg(&vsi->xdp_prog, NULL);
> +		if (old_prog)
> +			bpf_prog_put(old_prog);
> +
> +		return 0;
> +	}

This is not great. The netdevice is closed at this stage, why is the xdp
setup try to do work if the device is closed even when not
unregistering?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice)
  2024-05-28 22:06 [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) Jacob Keller
                   ` (7 preceding siblings ...)
  2024-05-28 22:06 ` [PATCH net 8/8] ice: check for unregistering correct number of devlink params Jacob Keller
@ 2024-05-30  2:00 ` patchwork-bot+netdevbpf
  2024-05-30 16:45   ` Jacob Keller
  8 siblings, 1 reply; 23+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-05-30  2:00 UTC (permalink / raw)
  To: Keller, Jacob E
  Cc: kuba, davem, netdev, hui.wang, vitaly.lifshits, naamax.meir,
	horms, pmenzel, anthony.l.nguyen, rui.zhang, thinhtr, rob.thomas,
	himasekharx.reddy.pucha, michal.kubiak, wojciech.drewek,
	george.kuruvinakunnel, maciej.fijalkowski, paul.greenwalt,
	michal.swiatkowski, brett.creeley, przemyslaw.kitszel,
	david.m.ertman, lukasz.czapnik

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 28 May 2024 15:06:03 -0700 you wrote:
> This series includes a variety of fixes that have been accumulating on the
> Intel Wired LAN dev-queue.
> 
> Hui Wang provides a fix for suspend/resume on e1000e due to failure
> to correctly setup the SMBUS in enable_ulp().
> 
> Thinh Tran provides a fix for EEH I/O suspend/resume on i40e to
> ensure that I/O operations can continue after a resume. To avoid duplicate
> code, the common logic is factored out of i40e_suspend and i40e_resume.
> 
> [...]

Here is the summary with links:
  - [net,1/8] e1000e: move force SMBUS near the end of enable_ulp function
    https://git.kernel.org/netdev/net/c/bfd546a552e1
  - [net,2/8] i40e: factoring out i40e_suspend/i40e_resume
    https://git.kernel.org/netdev/net/c/218ed820d364
  - [net,3/8] i40e: Fully suspend and resume IO operations in EEH case
    https://git.kernel.org/netdev/net/c/c80b6538d35a
  - [net,4/8] i40e: Fix XDP program unloading while removing the driver
    (no matching commit)
  - [net,5/8] ice: fix 200G PHY types to link speed mapping
    https://git.kernel.org/netdev/net/c/2a6d8f2de222
  - [net,6/8] ice: implement AQ download pkg retry
    (no matching commit)
  - [net,7/8] ice: fix reads from NVM Shadow RAM on E830 and E825-C devices
    (no matching commit)
  - [net,8/8] ice: check for unregistering correct number of devlink params
    https://git.kernel.org/netdev/net/c/a51c9b1c9ab2

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver
  2024-05-30  1:54   ` Jakub Kicinski
@ 2024-05-30 16:38     ` Jacob Keller
  2024-06-05 15:00       ` Michal Kubiak
  0 siblings, 1 reply; 23+ messages in thread
From: Jacob Keller @ 2024-05-30 16:38 UTC (permalink / raw)
  To: Jakub Kicinski, Michal Kubiak
  Cc: David Miller, netdev, Michal Kubiak, Wojciech Drewek,
	George Kuruvinakunnel, Maciej Fijalkowski



On 5/29/2024 6:54 PM, Jakub Kicinski wrote:
> On Tue, 28 May 2024 15:06:07 -0700 Jacob Keller wrote:
>> +	/* Called from netdev unregister context. Unload the XDP program. */
>> +	if (vsi->netdev->reg_state == NETREG_UNREGISTERING) {
>> +		xdp_features_clear_redirect_target(vsi->netdev);
>> +		old_prog = xchg(&vsi->xdp_prog, NULL);
>> +		if (old_prog)
>> +			bpf_prog_put(old_prog);
>> +
>> +		return 0;
>> +	}
> 
> This is not great. The netdevice is closed at this stage, why is the xdp
> setup try to do work if the device is closed even when not
> unregistering?

The comment makes this seem like its happening during unregistration. It
looks like i40e_xdp_setup() is only called from i40e_xdp(), which is if
xdp->command is XDP_SETUP_PROG

From the looks of things, ndo_bpf is called both for setup and teardown?

>    7 >-------/* Set or clear a bpf program used in the earliest stages of packet
>    6 >------- * rx. The prog will have been loaded as BPF_PROG_TYPE_XDP. The callee
>    5 >------- * is responsible for calling bpf_prog_put on any old progs that are
>    4 >------- * stored. In case of error, the callee need not release the new prog
>    3 >------- * reference, but on success it takes ownership and must bpf_prog_put
>    2 >------- * when it is no longer used.
>    1 >------- */

Indeed, dev_xdp_uninstall calls dev_xdp_install in a loop to remove
programs.

As far as I can tell, it looks like the .ndo_bpf call is made with a
program set to NULL during uninstall:

>    30 static void dev_xdp_uninstall(struct net_device *dev)
>    29 {
>    28 >-------struct bpf_xdp_link *link;
>    27 >-------struct bpf_prog *prog;
>    26 >-------enum bpf_xdp_mode mode;
>    25 >-------bpf_op_t bpf_op;
>    24
>    23 >-------ASSERT_RTNL();
>    22
>    21 >-------for (mode = XDP_MODE_SKB; mode < __MAX_XDP_MODE; mode++) {
>    20 >------->-------prog = dev_xdp_prog(dev, mode);
>    19 >------->-------if (!prog)
>    18 >------->------->-------continue;
>    17
>    16 >------->-------bpf_op = dev_xdp_bpf_op(dev, mode);
>    15 >------->-------if (!bpf_op)
>    14 >------->------->-------continue;
>    13
>    12 >------->-------WARN_ON(dev_xdp_install(dev, mode, bpf_op, NULL, 0, NULL));
>    11

Here, dev_xdp_install is called with a prog of NULL

>    10 >------->-------/* auto-detach link from net device */
>     9 >------->-------link = dev_xdp_link(dev, mode);
>     8 >------->-------if (link)
>     7 >------->------->-------link->dev = NULL;
>     6 >------->-------else
>     5 >------->------->-------bpf_prog_put(prog);
>     4
>     3 >------->-------dev_xdp_set_link(dev, mode, NULL);
>     2 >-------}
>     1 }

I think the semantics are confusing here.

Basically, the issue is this function needs to remove the XDP properly
when called by the netdev unregister flow but has a check against adding
a new program if its called during remove...

I think this is confusing and could be improved by refactoring how the
i40e flow works. If the passed-in prog is NULL, its a request to remove.
If its otherwise, its a request to add a new program (possibly replacing
an existing one?).

I think we ought to just be checking NULL and not needing to bother with
the netdev_unregister state at all here?

Hopefully Michal can chime in with a better understanding.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice)
  2024-05-30  2:00 ` [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) patchwork-bot+netdevbpf
@ 2024-05-30 16:45   ` Jacob Keller
  2024-05-30 16:58     ` Jakub Kicinski
  0 siblings, 1 reply; 23+ messages in thread
From: Jacob Keller @ 2024-05-30 16:45 UTC (permalink / raw)
  To: patchwork-bot+netdevbpf, Jakub Kicinski
  Cc: kuba, davem, netdev, hui.wang, vitaly.lifshits, naamax.meir,
	horms, pmenzel, anthony.l.nguyen, rui.zhang, thinhtr, rob.thomas,
	himasekharx.reddy.pucha, michal.kubiak, wojciech.drewek,
	george.kuruvinakunnel, maciej.fijalkowski, paul.greenwalt,
	michal.swiatkowski, brett.creeley, przemyslaw.kitszel,
	david.m.ertman, lukasz.czapnik



On 5/29/2024 7:00 PM, patchwork-bot+netdevbpf@kernel.org wrote:
> Hello:
> 
> This series was applied to netdev/net.git (main)
> by Jakub Kicinski <kuba@kernel.org>:
> 
> On Tue, 28 May 2024 15:06:03 -0700 you wrote:
>> This series includes a variety of fixes that have been accumulating on the
>> Intel Wired LAN dev-queue.
>>
>> Hui Wang provides a fix for suspend/resume on e1000e due to failure
>> to correctly setup the SMBUS in enable_ulp().
>>
>> Thinh Tran provides a fix for EEH I/O suspend/resume on i40e to
>> ensure that I/O operations can continue after a resume. To avoid duplicate
>> code, the common logic is factored out of i40e_suspend and i40e_resume.
>>
>> [...]
> 

Some of the series didn't get applied. The two you commented on with
issues or questions make sense.

> Here is the summary with links:
>   - [net,1/8] e1000e: move force SMBUS near the end of enable_ulp function
>     https://git.kernel.org/netdev/net/c/bfd546a552e1
>   - [net,2/8] i40e: factoring out i40e_suspend/i40e_resume
>     https://git.kernel.org/netdev/net/c/218ed820d364
>   - [net,3/8] i40e: Fully suspend and resume IO operations in EEH case
>     https://git.kernel.org/netdev/net/c/c80b6538d35a
>   - [net,4/8] i40e: Fix XDP program unloading while removing the driver
>     (no matching commit)
>   - [net,5/8] ice: fix 200G PHY types to link speed mapping
>     https://git.kernel.org/netdev/net/c/2a6d8f2de222
>   - [net,6/8] ice: implement AQ download pkg retry
>     (no matching commit)
>   - [net,7/8] ice: fix reads from NVM Shadow RAM on E830 and E825-C devices
>     (no matching commit)

I saw this one didn't get applied either, but don't see any comment on
the list regarding if you have any objections or questions.

Thanks,
Jake

>   - [net,8/8] ice: check for unregistering correct number of devlink params
>     https://git.kernel.org/netdev/net/c/a51c9b1c9ab2

> 
> You are awesome, thank you!

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 6/8] ice: implement AQ download pkg retry
  2024-05-30  1:51   ` Jakub Kicinski
@ 2024-05-30 16:50     ` Jacob Keller
  2024-05-31  8:25       ` Wojciech Drewek
  0 siblings, 1 reply; 23+ messages in thread
From: Jacob Keller @ 2024-05-30 16:50 UTC (permalink / raw)
  To: Jakub Kicinski, Wojciech Drewek
  Cc: David Miller, netdev, Wojciech Drewek, Michal Swiatkowski,
	Brett Creeley, Pucha Himasekhar Reddy



On 5/29/2024 6:51 PM, Jakub Kicinski wrote:
> On Tue, 28 May 2024 15:06:09 -0700 Jacob Keller wrote:
>> +		while (try_cnt < 5) {
>> +			status = ice_aq_download_pkg(hw, bh, ICE_PKG_BUF_SIZE,
>> +						     last, &offset, &info,
>> +						     NULL);
>> +			if (hw->adminq.sq_last_status != ICE_AQ_RC_ENOSEC &&
>> +			    hw->adminq.sq_last_status != ICE_AQ_RC_EBADSIG)
>> +				break;
>> +
>> +			try_cnt++;
>> +			msleep(20);
>> +		}
>> +
>> +		if (try_cnt)
>> +			dev_dbg(ice_hw_to_dev(hw),
>> +				"ice_aq_download_pkg number of retries: %d\n",
>> +				try_cnt);
>>  
> 
> That's not a great wait loop. Last iteration sleeps for 20msec and then
> gives up without checking the condition.

Yea, that seems rather silly.

@Wojciech, would you please look into this?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice)
  2024-05-30 16:45   ` Jacob Keller
@ 2024-05-30 16:58     ` Jakub Kicinski
  2024-05-30 17:04       ` Jacob Keller
  0 siblings, 1 reply; 23+ messages in thread
From: Jakub Kicinski @ 2024-05-30 16:58 UTC (permalink / raw)
  To: Jacob Keller
  Cc: patchwork-bot+netdevbpf, davem, netdev, hui.wang, vitaly.lifshits,
	naamax.meir, horms, pmenzel, anthony.l.nguyen, rui.zhang, thinhtr,
	rob.thomas, himasekharx.reddy.pucha, michal.kubiak,
	wojciech.drewek, george.kuruvinakunnel, maciej.fijalkowski,
	paul.greenwalt, michal.swiatkowski, brett.creeley,
	przemyslaw.kitszel, david.m.ertman, lukasz.czapnik

On Thu, 30 May 2024 09:45:29 -0700 Jacob Keller wrote:
> >   - [net,7/8] ice: fix reads from NVM Shadow RAM on E830 and E825-C devices
> >     (no matching commit)  
> 
> I saw this one didn't get applied either, but don't see any comment on
> the list regarding if you have any objections or questions.

I wasn't 100% sure there's no dependency on 6, better safe than sorry?
:)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice)
  2024-05-30 16:58     ` Jakub Kicinski
@ 2024-05-30 17:04       ` Jacob Keller
  0 siblings, 0 replies; 23+ messages in thread
From: Jacob Keller @ 2024-05-30 17:04 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: patchwork-bot+netdevbpf, davem, netdev, hui.wang, vitaly.lifshits,
	naamax.meir, horms, pmenzel, anthony.l.nguyen, rui.zhang, thinhtr,
	rob.thomas, himasekharx.reddy.pucha, michal.kubiak,
	wojciech.drewek, george.kuruvinakunnel, maciej.fijalkowski,
	paul.greenwalt, michal.swiatkowski, brett.creeley,
	przemyslaw.kitszel, david.m.ertman, lukasz.czapnik



On 5/30/2024 9:58 AM, Jakub Kicinski wrote:
> On Thu, 30 May 2024 09:45:29 -0700 Jacob Keller wrote:
>>>   - [net,7/8] ice: fix reads from NVM Shadow RAM on E830 and E825-C devices
>>>     (no matching commit)  
>>
>> I saw this one didn't get applied either, but don't see any comment on
>> the list regarding if you have any objections or questions.
> 
> I wasn't 100% sure there's no dependency on 6, better safe than sorry?
> :)

Sure. I can include it in the next batch of fixes. I think we just got a
few more through testing yesterday.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 6/8] ice: implement AQ download pkg retry
  2024-05-30 16:50     ` Jacob Keller
@ 2024-05-31  8:25       ` Wojciech Drewek
  0 siblings, 0 replies; 23+ messages in thread
From: Wojciech Drewek @ 2024-05-31  8:25 UTC (permalink / raw)
  To: Jacob Keller, Jakub Kicinski
  Cc: David Miller, netdev, Michal Swiatkowski, Brett Creeley,
	Pucha Himasekhar Reddy



On 30.05.2024 18:50, Jacob Keller wrote:
> 
> 
> On 5/29/2024 6:51 PM, Jakub Kicinski wrote:
>> On Tue, 28 May 2024 15:06:09 -0700 Jacob Keller wrote:
>>> +		while (try_cnt < 5) {
>>> +			status = ice_aq_download_pkg(hw, bh, ICE_PKG_BUF_SIZE,
>>> +						     last, &offset, &info,
>>> +						     NULL);
>>> +			if (hw->adminq.sq_last_status != ICE_AQ_RC_ENOSEC &&
>>> +			    hw->adminq.sq_last_status != ICE_AQ_RC_EBADSIG)
>>> +				break;
>>> +
>>> +			try_cnt++;
>>> +			msleep(20);
>>> +		}
>>> +
>>> +		if (try_cnt)
>>> +			dev_dbg(ice_hw_to_dev(hw),
>>> +				"ice_aq_download_pkg number of retries: %d\n",
>>> +				try_cnt);
>>>  
>>
>> That's not a great wait loop. Last iteration sleeps for 20msec and then
>> gives up without checking the condition.
> 
> Yea, that seems rather silly.
> 
> @Wojciech, would you please look into this?

Sure, I'll send next version

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver
  2024-05-30 16:38     ` Jacob Keller
@ 2024-06-05 15:00       ` Michal Kubiak
  2024-06-05 19:29         ` Jakub Kicinski
  0 siblings, 1 reply; 23+ messages in thread
From: Michal Kubiak @ 2024-06-05 15:00 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Jakub Kicinski, David Miller, netdev, Wojciech Drewek,
	George Kuruvinakunnel, Maciej Fijalkowski

On Thu, May 30, 2024 at 09:38:04AM -0700, Jacob Keller wrote:
> 
> 
> On 5/29/2024 6:54 PM, Jakub Kicinski wrote:
> > On Tue, 28 May 2024 15:06:07 -0700 Jacob Keller wrote:
> >> +	/* Called from netdev unregister context. Unload the XDP program. */
> >> +	if (vsi->netdev->reg_state == NETREG_UNREGISTERING) {
> >> +		xdp_features_clear_redirect_target(vsi->netdev);
> >> +		old_prog = xchg(&vsi->xdp_prog, NULL);
> >> +		if (old_prog)
> >> +			bpf_prog_put(old_prog);
> >> +
> >> +		return 0;
> >> +	}
> > 
> > This is not great. The netdevice is closed at this stage, why is the xdp
> > setup try to do work if the device is closed even when not
> > unregistering?
> 
> The comment makes this seem like its happening during unregistration. It
> looks like i40e_xdp_setup() is only called from i40e_xdp(), which is if
> xdp->command is XDP_SETUP_PROG
> 
> From the looks of things, ndo_bpf is called both for setup and teardown?

Exactly, ndo_bpf with the command XDP_SETUP_PROG can be called for both
loading and unloading the XDP program. When the XDP program is unloaded,
the callback is simply called with NULL as the pointer to the program.
Also, unloading the XDP program can be initiated not only from the user
space directly, but also from the kernel core.

In this specific case we are handling that last case. Calling ndo_bpf
when reg_state == NETREG_UNREGISTERING is the case when unloading the
XDP program is an immanent part of the netdevice unregistering process.

In such a case we have to unconditionally decrease the reference counter for
the XDP program using bpf_prog_put() call and exit with no error to
assure the consistency between BPF core code and our driver.

> 
> >    7 >-------/* Set or clear a bpf program used in the earliest stages of packet
> >    6 >------- * rx. The prog will have been loaded as BPF_PROG_TYPE_XDP. The callee
> >    5 >------- * is responsible for calling bpf_prog_put on any old progs that are
> >    4 >------- * stored. In case of error, the callee need not release the new prog
> >    3 >------- * reference, but on success it takes ownership and must bpf_prog_put
> >    2 >------- * when it is no longer used.
> >    1 >------- */
> 
> Indeed, dev_xdp_uninstall calls dev_xdp_install in a loop to remove
> programs.
> 
> As far as I can tell, it looks like the .ndo_bpf call is made with a
> program set to NULL during uninstall:
> 
> >    30 static void dev_xdp_uninstall(struct net_device *dev)
> >    29 {
> >    28 >-------struct bpf_xdp_link *link;
> >    27 >-------struct bpf_prog *prog;
> >    26 >-------enum bpf_xdp_mode mode;
> >    25 >-------bpf_op_t bpf_op;
> >    24
> >    23 >-------ASSERT_RTNL();
> >    22
> >    21 >-------for (mode = XDP_MODE_SKB; mode < __MAX_XDP_MODE; mode++) {
> >    20 >------->-------prog = dev_xdp_prog(dev, mode);
> >    19 >------->-------if (!prog)
> >    18 >------->------->-------continue;
> >    17
> >    16 >------->-------bpf_op = dev_xdp_bpf_op(dev, mode);
> >    15 >------->-------if (!bpf_op)
> >    14 >------->------->-------continue;
> >    13
> >    12 >------->-------WARN_ON(dev_xdp_install(dev, mode, bpf_op, NULL, 0, NULL));
> >    11
> 
> Here, dev_xdp_install is called with a prog of NULL
> 
> >    10 >------->-------/* auto-detach link from net device */
> >     9 >------->-------link = dev_xdp_link(dev, mode);
> >     8 >------->-------if (link)
> >     7 >------->------->-------link->dev = NULL;
> >     6 >------->-------else
> >     5 >------->------->-------bpf_prog_put(prog);
> >     4
> >     3 >------->-------dev_xdp_set_link(dev, mode, NULL);
> >     2 >-------}
> >     1 }
> 

I confirm. The current design of netdevice unregistering algorithm
includes checking (in a loop) if the netdevice has any XDP program
attached and forces unloading that program because it won't be used
anymore on that device.

> I think the semantics are confusing here.
> 
> Basically, the issue is this function needs to remove the XDP properly
> when called by the netdev unregister flow but has a check against adding
> a new program if its called during remove...

The check for __I40E_IN_REMOVE has been introduced by the commit
6533e558c650 ("i40e: Fix reset path while removing the driver").
Similar checks have been added in other callbacks. I believe the
intention was to fix some synchronization issues by exiting from callbacks
or reset immediately if the driver is being removed.
Unfortunately, although it could work for other callbacks, we cannot do that
in ndo_bpf because we need to leave kernel structures and ref counters
consistent.
I decided to keep the check for IN_REMOVE because I believe it covers
the case when NETREG_UNREGISTERING flag is not set yet but we started to
destroy our internal data structures.

> 
> I think this is confusing and could be improved by refactoring how the
> i40e flow works. If the passed-in prog is NULL, its a request to remove.
> If its otherwise, its a request to add a new program (possibly replacing
> an existing one?).
> 
> I think we ought to just be checking NULL and not needing to bother with
> the netdev_unregister state at all here?

I am afraid checking for NULL won't be enough here.
Normally, when ndo_bpf is called from the user space application, that
callback can be called with NULL too (when the user just wants to unload
the XDP program). In such a case, apart from calling bpf_prog_put(), we
have to rebuild our internal data structures (queues, pointers, counters
etc.) to restore the i40e driver working back in normal mode (with no
XDP program).
My intention of adding the check for NETREG_REGISTERING was to implement
a quick handler for unloading the XDP program from the netdev
unregistering context only, when our internal data structures are
already destroyed but we need to leave kernel's ref counters in a
consistent state.

> 
> Hopefully Michal can chime in with a better understanding.

Thanks,
Michal

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver
  2024-06-05 15:00       ` Michal Kubiak
@ 2024-06-05 19:29         ` Jakub Kicinski
  2024-06-06 10:02           ` Michal Kubiak
  0 siblings, 1 reply; 23+ messages in thread
From: Jakub Kicinski @ 2024-06-05 19:29 UTC (permalink / raw)
  To: Michal Kubiak
  Cc: Jacob Keller, David Miller, netdev, Wojciech Drewek,
	George Kuruvinakunnel, Maciej Fijalkowski

On Wed, 5 Jun 2024 17:00:02 +0200 Michal Kubiak wrote:
> I am afraid checking for NULL won't be enough here.
> Normally, when ndo_bpf is called from the user space application, that
> callback can be called with NULL too (when the user just wants to unload
> the XDP program). In such a case, apart from calling bpf_prog_put(), we
> have to rebuild our internal data structures (queues, pointers, counters
> etc.) to restore the i40e driver working back in normal mode (with no
> XDP program).

Apologizes for asking a question which can be answered by studying 
the code longer, but why do you need to rebuild internal data
structures for a device which is *down*. Unregistering or not.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver
  2024-06-05 19:29         ` Jakub Kicinski
@ 2024-06-06 10:02           ` Michal Kubiak
  2024-06-06 13:43             ` Jakub Kicinski
  0 siblings, 1 reply; 23+ messages in thread
From: Michal Kubiak @ 2024-06-06 10:02 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jacob Keller, David Miller, netdev, Wojciech Drewek,
	George Kuruvinakunnel, Maciej Fijalkowski

On Wed, Jun 05, 2024 at 12:29:57PM -0700, Jakub Kicinski wrote:
> On Wed, 5 Jun 2024 17:00:02 +0200 Michal Kubiak wrote:
> > I am afraid checking for NULL won't be enough here.
> > Normally, when ndo_bpf is called from the user space application, that
> > callback can be called with NULL too (when the user just wants to unload
> > the XDP program). In such a case, apart from calling bpf_prog_put(), we
> > have to rebuild our internal data structures (queues, pointers, counters
> > etc.) to restore the i40e driver working back in normal mode (with no
> > XDP program).
> 
> Apologizes for asking a question which can be answered by studying 
> the code longer, but why do you need to rebuild internal data
> structures for a device which is *down*. Unregistering or not.

Excuse me, but I don't understand why we should assume that a device is
*down* when that callback is being called?
Maybe I didn't make it clear, but the ndo_bpf can be called every time
when the userspace application wants to load or unload the XDP program.
It can happen when a device is *up* and also when the link is *up*.

Michal

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver
  2024-06-06 10:02           ` Michal Kubiak
@ 2024-06-06 13:43             ` Jakub Kicinski
  2024-06-11 18:52               ` Michal Kubiak
  0 siblings, 1 reply; 23+ messages in thread
From: Jakub Kicinski @ 2024-06-06 13:43 UTC (permalink / raw)
  To: Michal Kubiak
  Cc: Jacob Keller, David Miller, netdev, Wojciech Drewek,
	George Kuruvinakunnel, Maciej Fijalkowski

On Thu, 6 Jun 2024 12:02:27 +0200 Michal Kubiak wrote:
> > Apologizes for asking a question which can be answered by studying 
> > the code longer, but why do you need to rebuild internal data
> > structures for a device which is *down*. Unregistering or not.  
> 
> Excuse me, but I don't understand why we should assume that a device is
> *down* when that callback is being called?
> Maybe I didn't make it clear, but the ndo_bpf can be called every time
> when the userspace application wants to load or unload the XDP program.
> It can happen when a device is *up* and also when the link is *up*.

The patch was adding a special case for NETREG_UNREGISTERING,
at that point the device will be closed. Calling ndo_close is one
of the first things core does during unregistering.
Simplifying the handling for when the device is closed would be
better.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver
  2024-06-06 13:43             ` Jakub Kicinski
@ 2024-06-11 18:52               ` Michal Kubiak
  0 siblings, 0 replies; 23+ messages in thread
From: Michal Kubiak @ 2024-06-11 18:52 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jacob Keller, David Miller, netdev, Wojciech Drewek,
	George Kuruvinakunnel, Maciej Fijalkowski

On Thu, Jun 06, 2024 at 06:43:28AM -0700, Jakub Kicinski wrote:
> On Thu, 6 Jun 2024 12:02:27 +0200 Michal Kubiak wrote:
> > > Apologizes for asking a question which can be answered by studying 
> > > the code longer, but why do you need to rebuild internal data
> > > structures for a device which is *down*. Unregistering or not.  
> > 
> > Excuse me, but I don't understand why we should assume that a device is
> > *down* when that callback is being called?
> > Maybe I didn't make it clear, but the ndo_bpf can be called every time
> > when the userspace application wants to load or unload the XDP program.
> > It can happen when a device is *up* and also when the link is *up*.
> 
> The patch was adding a special case for NETREG_UNREGISTERING,
> at that point the device will be closed. Calling ndo_close is one
> of the first things core does during unregistering.
> Simplifying the handling for when the device is closed would be
> better.

I think I'm getting your point but moving the code for
NETREG_UNREGISTERING to ndo_stop wouldn't be enough and it seems to be
against the current design of 'unregister_netdevice_many_notify()'.

In 'unregister_netdevice_many_notify()' we have the call to
'dev_close_many()' (which calls ndo_stop on i40e driver) and then
'dev_xdp_uninstall()' is called (where there is the call to ndo_bpf).

'dev_xdp_uninstall()' seems to be the right function where all
activities related to XDP program unloading during unregistering are
expected from the driver.

Anyway, I analyzed that code one more time and I agree that the special
case for NETREG_UNREGISTERING makes the code more complex and I can
implement it in a simpler way.
The root cause of the problem I'm trying to fix is that __I40E_IN_REMOVE
flag is handled in a wrong way.

I will post the v2 then.

Thanks,
Michal

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-06-11 18:53 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-28 22:06 [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) Jacob Keller
2024-05-28 22:06 ` [PATCH net 1/8] e1000e: move force SMBUS near the end of enable_ulp function Jacob Keller
2024-05-28 22:06 ` [PATCH net 2/8] i40e: factoring out i40e_suspend/i40e_resume Jacob Keller
2024-05-28 22:06 ` [PATCH net 3/8] i40e: Fully suspend and resume IO operations in EEH case Jacob Keller
2024-05-28 22:06 ` [PATCH net 4/8] i40e: Fix XDP program unloading while removing the driver Jacob Keller
2024-05-30  1:54   ` Jakub Kicinski
2024-05-30 16:38     ` Jacob Keller
2024-06-05 15:00       ` Michal Kubiak
2024-06-05 19:29         ` Jakub Kicinski
2024-06-06 10:02           ` Michal Kubiak
2024-06-06 13:43             ` Jakub Kicinski
2024-06-11 18:52               ` Michal Kubiak
2024-05-28 22:06 ` [PATCH net 5/8] ice: fix 200G PHY types to link speed mapping Jacob Keller
2024-05-28 22:06 ` [PATCH net 6/8] ice: implement AQ download pkg retry Jacob Keller
2024-05-30  1:51   ` Jakub Kicinski
2024-05-30 16:50     ` Jacob Keller
2024-05-31  8:25       ` Wojciech Drewek
2024-05-28 22:06 ` [PATCH net 7/8] ice: fix reads from NVM Shadow RAM on E830 and E825-C devices Jacob Keller
2024-05-28 22:06 ` [PATCH net 8/8] ice: check for unregistering correct number of devlink params Jacob Keller
2024-05-30  2:00 ` [PATCH net 0/8] Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) patchwork-bot+netdevbpf
2024-05-30 16:45   ` Jacob Keller
2024-05-30 16:58     ` Jakub Kicinski
2024-05-30 17:04       ` Jacob Keller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).