* [PATCH V3 1/2] bnx2x: Adapter not recovery from EEH error injection
2014-06-03 19:14 [PATCH V3 0/2] Adapter not recovery from EEH error injection wenxiong
@ 2014-06-03 19:14 ` wenxiong
2014-06-03 19:14 ` [PATCH V3 2/2] bnx2x: Fix kernel crash and data miscompare after EEH recovery wenxiong
2014-06-04 1:38 ` [PATCH V3 0/2] Adapter not recovery from EEH error injection David Miller
2 siblings, 0 replies; 4+ messages in thread
From: wenxiong @ 2014-06-03 19:14 UTC (permalink / raw)
To: davem; +Cc: ariel.elior, netdev, Wen Xiong
[-- Attachment #1: bnx2x_eeh_fix --]
[-- Type: text/plain, Size: 985 bytes --]
When injecting EEH error to bnx2x adapter, adapter couldn't be recovery
and caused recursive EEH errors. The patch fixes the issue.
Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
===================================================================
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 2014-05-22 18:42:48.000000000 -0500
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 2014-05-22 18:44:36.757765539 -0500
@@ -13279,8 +13279,8 @@ static int bnx2x_eeh_nic_unload(struct b
netdev_reset_tc(bp->dev);
del_timer_sync(&bp->timer);
- cancel_delayed_work(&bp->sp_task);
- cancel_delayed_work(&bp->period_task);
+ cancel_delayed_work_sync(&bp->sp_task);
+ cancel_delayed_work_sync(&bp->period_task);
spin_lock_bh(&bp->stats_lock);
bp->stats_state = STATS_STATE_DISABLED;
--
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH V3 2/2] bnx2x: Fix kernel crash and data miscompare after EEH recovery
2014-06-03 19:14 [PATCH V3 0/2] Adapter not recovery from EEH error injection wenxiong
2014-06-03 19:14 ` [PATCH V3 1/2] bnx2x: " wenxiong
@ 2014-06-03 19:14 ` wenxiong
2014-06-04 1:38 ` [PATCH V3 0/2] Adapter not recovery from EEH error injection David Miller
2 siblings, 0 replies; 4+ messages in thread
From: wenxiong @ 2014-06-03 19:14 UTC (permalink / raw)
To: davem; +Cc: ariel.elior, netdev, Milton Miller, Wen Xiong
[-- Attachment #1: bnx2x_rmb_fix --]
[-- Type: text/plain, Size: 1791 bytes --]
A rmb() is required to ensure that the CQE is not read before it
is written by the adapter DMA. PCI ordering rules will make sure
the other fields are written before the marker at the end of struct
eth_fast_path_rx_cqe but without rmb() a weakly ordered processor can
process stale data.
Without the barrier we have observed various crashes including
bnx2x_tpa_start being called on queues not stopped (resulting in message
start of bin not in stop) and NULL pointer exceptions from bnx2x_rx_int.
Signed-off-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
Index: b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
===================================================================
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c 2014-05-23 10:34:21.000000000 -0500
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c 2014-06-03 11:55:56.747765400 -0500
@@ -906,6 +906,18 @@ static int bnx2x_rx_int(struct bnx2x_fas
bd_prod = RX_BD(bd_prod);
bd_cons = RX_BD(bd_cons);
+ /* A rmb() is required to ensure that the CQE is not read
+ * before it is written by the adapter DMA. PCI ordering
+ * rules will make sure the other fields are written before
+ * the marker at the end of struct eth_fast_path_rx_cqe
+ * but without rmb() a weakly ordered processor can process
+ * stale data. Without the barrier TPA state-machine might
+ * enter inconsistent state and kernel stack might be
+ * provided with incorrect packet description - these lead
+ * to various kernel crashed.
+ */
+ rmb();
+
cqe_fp_flags = cqe_fp->type_error_flags;
cqe_fp_type = cqe_fp_flags & ETH_FAST_PATH_RX_CQE_TYPE;
--
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH V3 0/2] Adapter not recovery from EEH error injection
2014-06-03 19:14 [PATCH V3 0/2] Adapter not recovery from EEH error injection wenxiong
2014-06-03 19:14 ` [PATCH V3 1/2] bnx2x: " wenxiong
2014-06-03 19:14 ` [PATCH V3 2/2] bnx2x: Fix kernel crash and data miscompare after EEH recovery wenxiong
@ 2014-06-04 1:38 ` David Miller
2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2014-06-04 1:38 UTC (permalink / raw)
To: wenxiong; +Cc: ariel.elior, netdev
From: wenxiong@linux.vnet.ibm.com
Date: Tue, 03 Jun 2014 14:14:44 -0500
> Re-submitted with space fix and new sign-off address for Milton.
Applied, thanks.
BTW, it isn't your fault, but this driver does not build very well
with BNX2X_SRIOV disabled:
CC [M] drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.o
In file included from drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h:27:0,
from drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c:62:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h:574:48: warning: ‘struct bnx2’ declared inside parameter list [enabled by default]
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h:574:48: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h: In function ‘bnx2x_vf_pci_dealloc’:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h:574:59: warning: ‘return’ with a value, in function returning void [enabled by default]
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c: In function ‘__bnx2x_remove’:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c:13241:4: warning: passing argument 1 of ‘bnx2x_vf_pci_dealloc’ from incompatible pointer type [enabled by default]
In file included from drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h:27:0,
from drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c:62:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h:574:20: note: expected ‘struct bnx2 *’ but argument is of type ‘struct bnx2x *’
CC [M] drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.o
In file included from drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h:27:0,
from drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c:29:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h:574:48: warning: ‘struct bnx2’ declared inside parameter list [enabled by default]
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h:574:48: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h: In function ‘bnx2x_vf_pci_dealloc’:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h:574:59: warning: ‘return’ with a value, in function returning void [enabled by default]
Also, if qlogic.com is the correct email address for the driver maintainer,
the top-level MAINTAINERS file needs to be updated.
^ permalink raw reply [flat|nested] 4+ messages in thread