netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2021-10-06
@ 2021-10-06 16:56 Tony Nguyen
  2021-10-06 16:56 ` [PATCH net 1/3] i40e: fix endless loop under rtnl Tony Nguyen
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Tony Nguyen @ 2021-10-06 16:56 UTC (permalink / raw)
  To: davem, kuba; +Cc: Tony Nguyen, netdev, sassmann

This series contains updates to i40e and iavf drivers.

Jiri Benc expands an error check to prevent infinite loop for i40e.

Sylwester prevents freeing of uninitialized IRQ vector to resolve a
kernel oops for i40e.

Stefan Assmann fixes a double mutex unlock for iavf.

The following are changes since commit a50a0595230d38be15183699f7bbc963bf3d127a:
  dt-bindings: net: dsa: marvell: fix compatible in example
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue 40GbE

Jiri Benc (1):
  i40e: fix endless loop under rtnl

Stefan Assmann (1):
  iavf: fix double unlock of crit_lock

Sylwester Dziedziuch (1):
  i40e: Fix freeing of uninitialized misc IRQ vector

 drivers/net/ethernet/intel/i40e/i40e_main.c | 5 +++--
 drivers/net/ethernet/intel/iavf/iavf_main.c | 1 -
 2 files changed, 3 insertions(+), 3 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH net 1/3] i40e: fix endless loop under rtnl
  2021-10-06 16:56 [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2021-10-06 Tony Nguyen
@ 2021-10-06 16:56 ` Tony Nguyen
  2021-10-06 16:56 ` [PATCH net 2/3] i40e: Fix freeing of uninitialized misc IRQ vector Tony Nguyen
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Tony Nguyen @ 2021-10-06 16:56 UTC (permalink / raw)
  To: davem, kuba
  Cc: Jiri Benc, netdev, anthony.l.nguyen, sassmann, Jesse Brandeburg,
	Dave Switzer

From: Jiri Benc <jbenc@redhat.com>

The loop in i40e_get_capabilities can never end. The problem is that
although i40e_aq_discover_capabilities returns with an error if there's
a firmware problem, the returned error is not checked. There is a check for
pf->hw.aq.asq_last_status but that value is set to I40E_AQ_RC_OK on most
firmware problems.

When i40e_aq_discover_capabilities encounters a firmware problem, it will
encounter the same problem on its next invocation. As the result, the loop
becomes endless. We hit this with I40E_ERR_ADMIN_QUEUE_TIMEOUT but looking
at the code, it can happen with a range of other firmware errors.

I don't know what the correct behavior should be: whether the firmware
should be retried a few times, or whether pf->hw.aq.asq_last_status should
be always set to the encountered firmware error (but then it would be
pointless and can be just replaced by the i40e_aq_discover_capabilities
return value). However, the current behavior with an endless loop under the
rtnl mutex(!) is unacceptable and Intel has not submitted a fix, although we
explained the bug to them 7 months ago.

This may not be the best possible fix but it's better than hanging the whole
system on a firmware bug.

Fixes: 56a62fc86895 ("i40e: init code and hardware support")
Tested-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 2f20980dd9a5..b5b984754ec9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -10113,7 +10113,7 @@ static int i40e_get_capabilities(struct i40e_pf *pf,
 		if (pf->hw.aq.asq_last_status == I40E_AQ_RC_ENOMEM) {
 			/* retry with a larger buffer */
 			buf_len = data_size;
-		} else if (pf->hw.aq.asq_last_status != I40E_AQ_RC_OK) {
+		} else if (pf->hw.aq.asq_last_status != I40E_AQ_RC_OK || err) {
 			dev_info(&pf->pdev->dev,
 				 "capability discovery failed, err %s aq_err %s\n",
 				 i40e_stat_str(&pf->hw, err),
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH net 2/3] i40e: Fix freeing of uninitialized misc IRQ vector
  2021-10-06 16:56 [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2021-10-06 Tony Nguyen
  2021-10-06 16:56 ` [PATCH net 1/3] i40e: fix endless loop under rtnl Tony Nguyen
@ 2021-10-06 16:56 ` Tony Nguyen
  2021-10-06 16:56 ` [PATCH net 3/3] iavf: fix double unlock of crit_lock Tony Nguyen
  2021-10-07 11:50 ` [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2021-10-06 patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Tony Nguyen @ 2021-10-06 16:56 UTC (permalink / raw)
  To: davem, kuba
  Cc: Sylwester Dziedziuch, netdev, anthony.l.nguyen, sassmann,
	PJ Waskiewicz, Mateusz Palczewski, Dave Switzer

From: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>

When VSI set up failed in i40e_probe() as part of PF switch set up
driver was trying to free misc IRQ vectors in
i40e_clear_interrupt_scheme and produced a kernel Oops:

   Trying to free already-free IRQ 266
   WARNING: CPU: 0 PID: 5 at kernel/irq/manage.c:1731 __free_irq+0x9a/0x300
   Workqueue: events work_for_cpu_fn
   RIP: 0010:__free_irq+0x9a/0x300
   Call Trace:
   ? synchronize_irq+0x3a/0xa0
   free_irq+0x2e/0x60
   i40e_clear_interrupt_scheme+0x53/0x190 [i40e]
   i40e_probe.part.108+0x134b/0x1a40 [i40e]
   ? kmem_cache_alloc+0x158/0x1c0
   ? acpi_ut_update_ref_count.part.1+0x8e/0x345
   ? acpi_ut_update_object_reference+0x15e/0x1e2
   ? strstr+0x21/0x70
   ? irq_get_irq_data+0xa/0x20
   ? mp_check_pin_attr+0x13/0xc0
   ? irq_get_irq_data+0xa/0x20
   ? mp_map_pin_to_irq+0xd3/0x2f0
   ? acpi_register_gsi_ioapic+0x93/0x170
   ? pci_conf1_read+0xa4/0x100
   ? pci_bus_read_config_word+0x49/0x70
   ? do_pci_enable_device+0xcc/0x100
   local_pci_probe+0x41/0x90
   work_for_cpu_fn+0x16/0x20
   process_one_work+0x1a7/0x360
   worker_thread+0x1cf/0x390
   ? create_worker+0x1a0/0x1a0
   kthread+0x112/0x130
   ? kthread_flush_work_fn+0x10/0x10
   ret_from_fork+0x1f/0x40

The problem is that at that point misc IRQ vectors
were not allocated yet and we get a call trace
that driver is trying to free already free IRQ vectors.

Add a check in i40e_clear_interrupt_scheme for __I40E_MISC_IRQ_REQUESTED
PF state before calling i40e_free_misc_vector. This state is set only if
misc IRQ vectors were properly initialized.

Fixes: c17401a1dd21 ("i40e: use separate state bit for miscellaneous IRQ setup")
Reported-by: PJ Waskiewicz <pwaskiewicz@jumptrading.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index b5b984754ec9..e04b540cedc8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -4871,7 +4871,8 @@ static void i40e_clear_interrupt_scheme(struct i40e_pf *pf)
 {
 	int i;
 
-	i40e_free_misc_vector(pf);
+	if (test_bit(__I40E_MISC_IRQ_REQUESTED, pf->state))
+		i40e_free_misc_vector(pf);
 
 	i40e_put_lump(pf->irq_pile, pf->iwarp_base_vector,
 		      I40E_IWARP_IRQ_PILE_ID);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH net 3/3] iavf: fix double unlock of crit_lock
  2021-10-06 16:56 [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2021-10-06 Tony Nguyen
  2021-10-06 16:56 ` [PATCH net 1/3] i40e: fix endless loop under rtnl Tony Nguyen
  2021-10-06 16:56 ` [PATCH net 2/3] i40e: Fix freeing of uninitialized misc IRQ vector Tony Nguyen
@ 2021-10-06 16:56 ` Tony Nguyen
  2021-10-07 11:50 ` [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2021-10-06 patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Tony Nguyen @ 2021-10-06 16:56 UTC (permalink / raw)
  To: davem, kuba
  Cc: Stefan Assmann, netdev, anthony.l.nguyen, sassmann, Dan Carpenter

From: Stefan Assmann <sassmann@kpanic.de>

The crit_lock mutex could be unlocked twice as reported here
https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20210823/025525.html

Remove the superfluous unlock. Technically the problem was already
present before 5ac49f3c2702 as that commit only replaced the locking
primitive, but no functional change.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections")
Fixes: bac8486116b0 ("iavf: Refactor the watchdog state machine")
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 23762a7ef740..cada4e0e40b4 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -1965,7 +1965,6 @@ static void iavf_watchdog_task(struct work_struct *work)
 		}
 		adapter->aq_required = 0;
 		adapter->current_op = VIRTCHNL_OP_UNKNOWN;
-		mutex_unlock(&adapter->crit_lock);
 		queue_delayed_work(iavf_wq,
 				   &adapter->watchdog_task,
 				   msecs_to_jiffies(10));
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2021-10-06
  2021-10-06 16:56 [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2021-10-06 Tony Nguyen
                   ` (2 preceding siblings ...)
  2021-10-06 16:56 ` [PATCH net 3/3] iavf: fix double unlock of crit_lock Tony Nguyen
@ 2021-10-07 11:50 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-10-07 11:50 UTC (permalink / raw)
  To: Tony Nguyen; +Cc: davem, kuba, netdev, sassmann

Hello:

This series was applied to netdev/net.git (refs/heads/master)
by Tony Nguyen <anthony.l.nguyen@intel.com>:

On Wed,  6 Oct 2021 09:56:56 -0700 you wrote:
> This series contains updates to i40e and iavf drivers.
> 
> Jiri Benc expands an error check to prevent infinite loop for i40e.
> 
> Sylwester prevents freeing of uninitialized IRQ vector to resolve a
> kernel oops for i40e.
> 
> [...]

Here is the summary with links:
  - [net,1/3] i40e: fix endless loop under rtnl
    https://git.kernel.org/netdev/net/c/857b6c6f665c
  - [net,2/3] i40e: Fix freeing of uninitialized misc IRQ vector
    https://git.kernel.org/netdev/net/c/2e5a20573a92
  - [net,3/3] iavf: fix double unlock of crit_lock
    https://git.kernel.org/netdev/net/c/54ee39439acd

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-10-07 11:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-10-06 16:56 [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2021-10-06 Tony Nguyen
2021-10-06 16:56 ` [PATCH net 1/3] i40e: fix endless loop under rtnl Tony Nguyen
2021-10-06 16:56 ` [PATCH net 2/3] i40e: Fix freeing of uninitialized misc IRQ vector Tony Nguyen
2021-10-06 16:56 ` [PATCH net 3/3] iavf: fix double unlock of crit_lock Tony Nguyen
2021-10-07 11:50 ` [PATCH net 0/3][pull request] Intel Wired LAN Driver Updates 2021-10-06 patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).