* [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2023-05-09 (igc, igb)
@ 2023-05-09 17:09 Tony Nguyen
2023-05-09 17:09 ` [PATCH net 1/3] igc: Clean the TX buffer and TX descriptor ring Tony Nguyen
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Tony Nguyen @ 2023-05-09 17:09 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet, netdev; +Cc: Tony Nguyen
This series contains updates to igc and igb drivers.
Husaini clears Tx rings when interface is brought down for igc.
Vinicius disables PTM and PCI busmaster when removing igc driver.
Alex adds error check and path for NVM read error on igb.
The following are changes since commit 162bd18eb55adf464a0fa2b4144b8d61c75ff7c2:
linux/dim: Do nothing if no time delta between samples
and are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue 1GbE
Aleksandr Loktionov (1):
igb: fix nvm.ops.read() error handling
Muhammad Husaini Zulkifli (1):
igc: Clean the TX buffer and TX descriptor ring
Vinicius Costa Gomes (1):
igc: Fix possible system crash when loading module
drivers/net/ethernet/intel/igb/igb_ethtool.c | 4 +++-
drivers/net/ethernet/intel/igc/igc_main.c | 10 ++++++++++
2 files changed, 13 insertions(+), 1 deletion(-)
--
2.38.1
^ permalink raw reply [flat|nested] 8+ messages in thread* [PATCH net 1/3] igc: Clean the TX buffer and TX descriptor ring 2023-05-09 17:09 [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2023-05-09 (igc, igb) Tony Nguyen @ 2023-05-09 17:09 ` Tony Nguyen 2023-05-11 2:14 ` Jakub Kicinski 2023-05-09 17:09 ` [PATCH net 2/3] igc: Fix possible system crash when loading module Tony Nguyen 2023-05-09 17:09 ` [PATCH net 3/3] igb: fix nvm.ops.read() error handling Tony Nguyen 2 siblings, 1 reply; 8+ messages in thread From: Tony Nguyen @ 2023-05-09 17:09 UTC (permalink / raw) To: davem, kuba, pabeni, edumazet, netdev Cc: Muhammad Husaini Zulkifli, anthony.l.nguyen, sasha.neftin, Naama Meir From: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> There could be a race condition during link down where interrupt being generated and igc_clean_tx_irq() been called to perform the TX completion. Properly clear the TX buffer and TX descriptor ring to avoid those case. Fixes: 13b5b7fd6a4a ("igc: Add support for Tx/Rx rings") Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> --- drivers/net/ethernet/intel/igc/igc_main.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 1c4676882082..5767e691b401 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -254,6 +254,13 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring) /* reset BQL for queue */ netdev_tx_reset_queue(txring_txq(tx_ring)); + /* Zero out the buffer ring */ + memset(tx_ring->tx_buffer_info, 0, + sizeof(*tx_ring->tx_buffer_info) * tx_ring->count); + + /* Zero out the descriptor ring */ + memset(tx_ring->desc, 0, tx_ring->size); + /* reset next_to_use and next_to_clean */ tx_ring->next_to_use = 0; tx_ring->next_to_clean = 0; -- 2.38.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH net 1/3] igc: Clean the TX buffer and TX descriptor ring 2023-05-09 17:09 ` [PATCH net 1/3] igc: Clean the TX buffer and TX descriptor ring Tony Nguyen @ 2023-05-11 2:14 ` Jakub Kicinski 2023-05-12 8:51 ` Zulkifli, Muhammad Husaini 0 siblings, 1 reply; 8+ messages in thread From: Jakub Kicinski @ 2023-05-11 2:14 UTC (permalink / raw) To: Tony Nguyen Cc: davem, pabeni, edumazet, netdev, Muhammad Husaini Zulkifli, sasha.neftin, Naama Meir On Tue, 9 May 2023 10:09:33 -0700 Tony Nguyen wrote: > There could be a race condition during link down where interrupt > being generated and igc_clean_tx_irq() been called to perform the > TX completion. Properly clear the TX buffer and TX descriptor ring > to avoid those case. > + /* Zero out the buffer ring */ > + memset(tx_ring->tx_buffer_info, 0, > + sizeof(*tx_ring->tx_buffer_info) * tx_ring->count); > + > + /* Zero out the descriptor ring */ > + memset(tx_ring->desc, 0, tx_ring->size); Just from the diff and the commit description this does not seem obviously correct. Race condition means the two functions can run at the same time, and memset() is not atomic. -- pw-bot: cr ^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH net 1/3] igc: Clean the TX buffer and TX descriptor ring 2023-05-11 2:14 ` Jakub Kicinski @ 2023-05-12 8:51 ` Zulkifli, Muhammad Husaini 2023-05-12 15:30 ` Maciej Fijalkowski 0 siblings, 1 reply; 8+ messages in thread From: Zulkifli, Muhammad Husaini @ 2023-05-12 8:51 UTC (permalink / raw) To: Jakub Kicinski, Nguyen, Anthony L Cc: davem@davemloft.net, pabeni@redhat.com, edumazet@google.com, netdev@vger.kernel.org, Neftin, Sasha, Naama Meir Hi Jakub, > On Tue, 9 May 2023 10:09:33 -0700 Tony Nguyen wrote: > > There could be a race condition during link down where interrupt being > > generated and igc_clean_tx_irq() been called to perform the TX > > completion. Properly clear the TX buffer and TX descriptor ring to > > avoid those case. > > > + /* Zero out the buffer ring */ > > + memset(tx_ring->tx_buffer_info, 0, > > + sizeof(*tx_ring->tx_buffer_info) * tx_ring->count); > > + > > + /* Zero out the descriptor ring */ > > + memset(tx_ring->desc, 0, tx_ring->size); > > Just from the diff and the commit description this does not seem obviously > correct. Race condition means the two functions can run at the same time, > and memset() is not atomic. While a link is going up or down and a lot of packets(UDP) are being sent transmitted, we are observing some kernel panic issues. On my side, it was easily to reproduce. It's possible that igc_clean_tx_irq() was called to complete the TX during link up/down based on how the call trace looks. With this fix, I not observed the issue anymore. Almost similar issue reported before in here: https://lore.kernel.org/all/SJ1PR11MB6180CDB866753CFBC2F9AF75B8959@SJ1PR11MB6180.namprd11.prod.outlook.com/ > -- > pw-bot: cr ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net 1/3] igc: Clean the TX buffer and TX descriptor ring 2023-05-12 8:51 ` Zulkifli, Muhammad Husaini @ 2023-05-12 15:30 ` Maciej Fijalkowski 2023-05-15 3:30 ` Zulkifli, Muhammad Husaini 0 siblings, 1 reply; 8+ messages in thread From: Maciej Fijalkowski @ 2023-05-12 15:30 UTC (permalink / raw) To: Zulkifli, Muhammad Husaini Cc: Jakub Kicinski, Nguyen, Anthony L, davem@davemloft.net, pabeni@redhat.com, edumazet@google.com, netdev@vger.kernel.org, Neftin, Sasha, Naama Meir On Fri, May 12, 2023 at 08:51:23AM +0000, Zulkifli, Muhammad Husaini wrote: > Hi Jakub, > > > On Tue, 9 May 2023 10:09:33 -0700 Tony Nguyen wrote: > > > There could be a race condition during link down where interrupt being > > > generated and igc_clean_tx_irq() been called to perform the TX > > > completion. Properly clear the TX buffer and TX descriptor ring to > > > avoid those case. > > > > > + /* Zero out the buffer ring */ > > > + memset(tx_ring->tx_buffer_info, 0, > > > + sizeof(*tx_ring->tx_buffer_info) * tx_ring->count); > > > + > > > + /* Zero out the descriptor ring */ > > > + memset(tx_ring->desc, 0, tx_ring->size); > > > > Just from the diff and the commit description this does not seem obviously > > correct. Race condition means the two functions can run at the same time, > > and memset() is not atomic. > > While a link is going up or down and a lot of packets(UDP) are being sent transmitted, > we are observing some kernel panic issues. On my side, it was easily to reproduce. > It's possible that igc_clean_tx_irq() was called to complete the TX during link up/down > based on how the call trace looks. With this fix, I not observed the issue anymore. then include the splat you were getting in the commit msg as well as steps to repro. from a brief look it looks like ndo_stop() path does not disable Tx rings before cleaning them? This is being done when configuring xsk_pool on a given Tx ring, though. > > Almost similar issue reported before in here: > https://lore.kernel.org/all/SJ1PR11MB6180CDB866753CFBC2F9AF75B8959@SJ1PR11MB6180.namprd11.prod.outlook.com/ > > > -- > > pw-bot: cr > ^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH net 1/3] igc: Clean the TX buffer and TX descriptor ring 2023-05-12 15:30 ` Maciej Fijalkowski @ 2023-05-15 3:30 ` Zulkifli, Muhammad Husaini 0 siblings, 0 replies; 8+ messages in thread From: Zulkifli, Muhammad Husaini @ 2023-05-15 3:30 UTC (permalink / raw) To: Fijalkowski, Maciej Cc: Jakub Kicinski, Nguyen, Anthony L, davem@davemloft.net, pabeni@redhat.com, edumazet@google.com, netdev@vger.kernel.org, Neftin, Sasha, Naama Meir Hi Maciej, > On Fri, May 12, 2023 at 08:51:23AM +0000, Zulkifli, Muhammad Husaini > wrote: > > Hi Jakub, > > > > > On Tue, 9 May 2023 10:09:33 -0700 Tony Nguyen wrote: > > > > There could be a race condition during link down where interrupt > > > > being generated and igc_clean_tx_irq() been called to perform the > > > > TX completion. Properly clear the TX buffer and TX descriptor ring > > > > to avoid those case. > > > > > > > + /* Zero out the buffer ring */ > > > > + memset(tx_ring->tx_buffer_info, 0, > > > > + sizeof(*tx_ring->tx_buffer_info) * tx_ring->count); > > > > + > > > > + /* Zero out the descriptor ring */ > > > > + memset(tx_ring->desc, 0, tx_ring->size); > > > > > > Just from the diff and the commit description this does not seem > > > obviously correct. Race condition means the two functions can run at > > > the same time, and memset() is not atomic. > > > > While a link is going up or down and a lot of packets(UDP) are being > > sent transmitted, we are observing some kernel panic issues. On my side, it > was easily to reproduce. > > It's possible that igc_clean_tx_irq() was called to complete the TX > > during link up/down based on how the call trace looks. With this fix, I not > observed the issue anymore. > > then include the splat you were getting in the commit msg as well as steps to > repro. > > from a brief look it looks like ndo_stop() path does not disable Tx rings before > cleaning them? This is being done when configuring xsk_pool on a given Tx ring, > though. Yes you are right. We shall disable tx queue ring as well. I will submit the V2 to replace the igc_clean_tx_ring() to igc_disable_tx_ring() in igc_free_tx_resources() during ndo_stop callback while maintaining the memset to clear all the ring des/buffer during igc_clean_tx_ring(). I have tested this on my TGL setup with multiple loops iterations and no more Kernel panic observed. Will update this in commit message as well Thanks, Husaini > > > > > Almost similar issue reported before in here: > > > https://lore.kernel.org/all/SJ1PR11MB6180CDB866753CFBC2F9AF75B8959@S > J1 > > PR11MB6180.namprd11.prod.outlook.com/ > > > > > -- > > > pw-bot: cr > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH net 2/3] igc: Fix possible system crash when loading module 2023-05-09 17:09 [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2023-05-09 (igc, igb) Tony Nguyen 2023-05-09 17:09 ` [PATCH net 1/3] igc: Clean the TX buffer and TX descriptor ring Tony Nguyen @ 2023-05-09 17:09 ` Tony Nguyen 2023-05-09 17:09 ` [PATCH net 3/3] igb: fix nvm.ops.read() error handling Tony Nguyen 2 siblings, 0 replies; 8+ messages in thread From: Tony Nguyen @ 2023-05-09 17:09 UTC (permalink / raw) To: davem, kuba, pabeni, edumazet, netdev Cc: Vinicius Costa Gomes, anthony.l.nguyen, sasha.neftin, Muhammad Husaini Zulkifli, Naama Meir From: Vinicius Costa Gomes <vinicius.gomes@intel.com> Guarantee that when probe() is run again, PTM and PCI busmaster will be in the same state as it was if the driver was never loaded. Avoid an i225/i226 hardware issue that PTM requests can be made even though PCI bus mastering is not enabled. These unexpected PTM requests can crash some systems. So, "force" disable PTM and busmastering before removing the driver, so they can be re-enabled in the right order during probe(). This is more like a workaround and should be applicable for i225 and i226, in any platform. Fixes: 1b5d73fb8624 ("igc: Enable PCIe PTM") Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> --- drivers/net/ethernet/intel/igc/igc_main.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 5767e691b401..5213f3da1fe7 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -6730,6 +6730,9 @@ static void igc_remove(struct pci_dev *pdev) igc_ptp_stop(adapter); + pci_disable_ptm(pdev); + pci_clear_master(pdev); + set_bit(__IGC_DOWN, &adapter->state); del_timer_sync(&adapter->watchdog_timer); -- 2.38.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net 3/3] igb: fix nvm.ops.read() error handling 2023-05-09 17:09 [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2023-05-09 (igc, igb) Tony Nguyen 2023-05-09 17:09 ` [PATCH net 1/3] igc: Clean the TX buffer and TX descriptor ring Tony Nguyen 2023-05-09 17:09 ` [PATCH net 2/3] igc: Fix possible system crash when loading module Tony Nguyen @ 2023-05-09 17:09 ` Tony Nguyen 2 siblings, 0 replies; 8+ messages in thread From: Tony Nguyen @ 2023-05-09 17:09 UTC (permalink / raw) To: davem, kuba, pabeni, edumazet, netdev Cc: Aleksandr Loktionov, anthony.l.nguyen, Pucha Himasekhar Reddy From: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Add error handling into igb_set_eeprom() function, in case nvm.ops.read() fails just quit with error code asap. Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> --- drivers/net/ethernet/intel/igb/igb_ethtool.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c index 7d60da1b7bf4..99b6b21caa02 100644 --- a/drivers/net/ethernet/intel/igb/igb_ethtool.c +++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c @@ -822,6 +822,8 @@ static int igb_set_eeprom(struct net_device *netdev, */ ret_val = hw->nvm.ops.read(hw, last_word, 1, &eeprom_buff[last_word - first_word]); + if (ret_val) + goto out; } /* Device's eeprom is always little-endian, word addressable */ @@ -839,7 +841,7 @@ static int igb_set_eeprom(struct net_device *netdev, /* Update the checksum if nvm write succeeded */ if (ret_val == 0) hw->nvm.ops.update(hw); - +out: igb_set_fw_version(adapter); kfree(eeprom_buff); return ret_val; -- 2.38.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-05-15 3:32 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-05-09 17:09 [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2023-05-09 (igc, igb) Tony Nguyen 2023-05-09 17:09 ` [PATCH net 1/3] igc: Clean the TX buffer and TX descriptor ring Tony Nguyen 2023-05-11 2:14 ` Jakub Kicinski 2023-05-12 8:51 ` Zulkifli, Muhammad Husaini 2023-05-12 15:30 ` Maciej Fijalkowski 2023-05-15 3:30 ` Zulkifli, Muhammad Husaini 2023-05-09 17:09 ` [PATCH net 2/3] igc: Fix possible system crash when loading module Tony Nguyen 2023-05-09 17:09 ` [PATCH net 3/3] igb: fix nvm.ops.read() error handling Tony Nguyen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).