* [PATCH] [BACKPORT] [3.14.56] bnx2x: Don't notify about scratchpad parities
@ 2015-11-05 10:18 Patrick Schaaf
2015-11-06 17:32 ` Greg KH
0 siblings, 1 reply; 5+ messages in thread
From: Patrick Schaaf @ 2015-11-05 10:18 UTC (permalink / raw)
To: netdev; +Cc: Greg KH, Yuval Mintz
bnx2x: Don't notify about scratchpad parities
This is a (trivial) "backport" of ad6afbe9578d1fa26680faf78c846bd8c00d1d6e to
stable kernel 3.14.56.
Original commit message:
The scratchpad is a shared block between all functions of a given device.
Due to HW limitations, we can't properly close its parity notifications
to all functions on legal flows.
E.g., it's possible that while taking a register dump from one function
a parity error would be triggered on other functions.
Today the driver doesn't consider this parity a 'real' parity unless it is
accompanied by additional indications [which would happen in a real
parity scenario]; but it does print notifications for such events in the
system logs.
This eliminates such prints - in case of real parities the driver would have
additional indications; but if this is the only signal, the user will not even
see a parity being logged in the system.
Signed-off-by: Patrick Schaaf <netdev@bof.de>
Tested-by: Patrick Schaaf <netdev@bof.de>
---
Related discussion + more info in http://marc.info/?l=linux-netdev&m=144663711626469
I experienced a production server network outage where over 1 million kernel
messages were produced within 8 seconds. This change is supposed to suppress
these messages.
I'm running the patched 3.14.56 on three production boxes now, and hope it
helps should the original issue reoccur....
--- linux-3.14.56-vanilla/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h	2015-10-27 01:46:24.000000000 +0100
+++ linux-3.14.56-eightball/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h	2015-11-05 09:44:45.126824041 +0100
@@ -2401,10 +2401,13 @@
AEU_INPUTS_ATTN_BITS_IGU_PARITY_ERROR | \
AEU_INPUTS_ATTN_BITS_MISC_PARITY_ERROR)
-#define HW_PRTY_ASSERT_SET_3 (AEU_INPUTS_ATTN_BITS_MCP_LATCHED_ROM_PARITY | \
- AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_RX_PARITY | \
- AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_TX_PARITY | \
- AEU_INPUTS_ATTN_BITS_MCP_LATCHED_SCPAD_PARITY)
+#define HW_PRTY_ASSERT_SET_3_WITHOUT_SCPAD \
+ (AEU_INPUTS_ATTN_BITS_MCP_LATCHED_ROM_PARITY | \
+ AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_RX_PARITY | \
+ AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_TX_PARITY)
+
+#define HW_PRTY_ASSERT_SET_3 (HW_PRTY_ASSERT_SET_3_WITHOUT_SCPAD | \
+ AEU_INPUTS_ATTN_BITS_MCP_LATCHED_SCPAD_PARITY)
#define HW_PRTY_ASSERT_SET_4 (AEU_INPUTS_ATTN_BITS_PGLUE_PARITY_ERROR | \
AEU_INPUTS_ATTN_BITS_ATC_PARITY_ERROR)
--- linux-3.14.56-vanilla/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c	2015-10-27 01:46:24.000000000 +0100
+++ linux-3.14.56-eightball/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c	2015-11-05 09:44:45.126824041 +0100
@@ -4631,9 +4631,7 @@
res |= true;
break;
case AEU_INPUTS_ATTN_BITS_MCP_LATCHED_SCPAD_PARITY:
- if (print)
- _print_next_block((*par_num)++,
- "MCP SCPAD");
+ (*par_num)++;
/* clear latched SCPAD PATIRY from MCP */
REG_WR(bp, MISC_REG_AEU_CLR_LATCH_SIGNAL,
1UL << 10);
@@ -4695,6 +4693,7 @@
(sig[3] & HW_PRTY_ASSERT_SET_3) ||
(sig[4] & HW_PRTY_ASSERT_SET_4)) {
int par_num = 0;
+
DP(NETIF_MSG_HW, "Was parity error: HW block parity attention:\n"
"[0]:0x%08x [1]:0x%08x [2]:0x%08x [3]:0x%08x [4]:0x%08x\n",
sig[0] & HW_PRTY_ASSERT_SET_0,
@@ -4702,9 +4701,18 @@
sig[2] & HW_PRTY_ASSERT_SET_2,
sig[3] & HW_PRTY_ASSERT_SET_3,
sig[4] & HW_PRTY_ASSERT_SET_4);
- if (print)
- netdev_err(bp->dev,
- "Parity errors detected in blocks: ");
+ if (print) {
+ if (((sig[0] & HW_PRTY_ASSERT_SET_0) ||
+ (sig[1] & HW_PRTY_ASSERT_SET_1) ||
+ (sig[2] & HW_PRTY_ASSERT_SET_2) ||
+ (sig[4] & HW_PRTY_ASSERT_SET_4)) ||
+ (sig[3] & HW_PRTY_ASSERT_SET_3_WITHOUT_SCPAD)) {
+ netdev_err(bp->dev,
+ "Parity errors detected in blocks: ");
+ } else {
+ print = false;
+ }
+ }
res |= bnx2x_check_blocks_with_parity0(bp,
sig[0] & HW_PRTY_ASSERT_SET_0, &par_num, print);
res |= bnx2x_check_blocks_with_parity1(bp,
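
For readers following the logic of the second hunk, here is a minimal standalone sketch of the decision it adds: a latched scratchpad (SCPAD) parity still counts as a pending parity attention, but the "Parity errors detected in blocks" line is only logged if some other parity bit is set as well. All DEMO_* names and bit values below are hypothetical and chosen for illustration; the real AEU_INPUTS_ATTN_BITS_* values come from the driver's register headers, and groups 0, 1, 2 and 4 are simplified to "any bit set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define DEMO_ROM_PARITY     (1u << 0)
#define DEMO_UMP_RX_PARITY  (1u << 1)
#define DEMO_UMP_TX_PARITY  (1u << 2)
#define DEMO_SCPAD_PARITY   (1u << 3)

/* Mirrors the split the patch makes in HW_PRTY_ASSERT_SET_3. */
#define DEMO_SET_3_WITHOUT_SCPAD \
	(DEMO_ROM_PARITY | DEMO_UMP_RX_PARITY | DEMO_UMP_TX_PARITY)
#define DEMO_SET_3 (DEMO_SET_3_WITHOUT_SCPAD | DEMO_SCPAD_PARITY)

/* Simplification: every bit of groups 0, 1, 2 and 4 is a parity bit. */
#define DEMO_SET_OTHER 0xffffffffu

/* True if any parity attention is pending, including a lone SCPAD one. */
bool demo_parity_pending(const uint32_t sig[5])
{
	assert(sig != NULL);
	return (sig[0] & DEMO_SET_OTHER) || (sig[1] & DEMO_SET_OTHER) ||
	       (sig[2] & DEMO_SET_OTHER) || (sig[3] & DEMO_SET_3) ||
	       (sig[4] & DEMO_SET_OTHER);
}

/*
 * True if the parity banner should be logged: the caller asked for
 * printing and at least one parity bit other than SCPAD is set.
 */
bool demo_should_log(const uint32_t sig[5], bool print)
{
	assert(sig != NULL);
	if (!print)
		return false;
	return (sig[0] & DEMO_SET_OTHER) || (sig[1] & DEMO_SET_OTHER) ||
	       (sig[2] & DEMO_SET_OTHER) || (sig[4] & DEMO_SET_OTHER) ||
	       (sig[3] & DEMO_SET_3_WITHOUT_SCPAD);
}
```

With only DEMO_SCPAD_PARITY set in sig[3], demo_parity_pending() is true while demo_should_log() is false, which corresponds to the patched code setting print = false and handling the attention silently.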
* Re: [PATCH] [BACKPORT] [3.14.56] bnx2x: Don't notify about scratchpad parities
2015-11-05 10:18 [PATCH] [BACKPORT] [3.14.56] bnx2x: Don't notify about scratchpad parities Patrick Schaaf
@ 2015-11-06 17:32 ` Greg KH
2015-11-06 17:40 ` Patrick Schaaf
2015-12-10 13:37 ` Patrick Schaaf
0 siblings, 2 replies; 5+ messages in thread
From: Greg KH @ 2015-11-06 17:32 UTC (permalink / raw)
To: Patrick Schaaf; +Cc: netdev, Yuval Mintz
On Thu, Nov 05, 2015 at 11:18:37AM +0100, Patrick Schaaf wrote:
> bnx2x: Don't notify about scratchpad parities
>
> This is a (trivial) "backport" of ad6afbe9578d1fa26680faf78c846bd8c00d1d6e to
> stable kernel 3.14.56.
This patch isn't in 4.1 either, do you want it there as well?
thanks,
greg k-h
* Re: [PATCH] [BACKPORT] [3.14.56] bnx2x: Don't notify about scratchpad parities
2015-11-06 17:32 ` Greg KH
@ 2015-11-06 17:40 ` Patrick Schaaf
2015-12-10 13:37 ` Patrick Schaaf
1 sibling, 0 replies; 5+ messages in thread
From: Patrick Schaaf @ 2015-11-06 17:40 UTC (permalink / raw)
To: Greg KH; +Cc: netdev, Yuval Mintz
On Friday 06 November 2015 09:32:46 Greg KH wrote:
> On Thu, Nov 05, 2015 at 11:18:37AM +0100, Patrick Schaaf wrote:
> > bnx2x: Don't notify about scratchpad parities
> >
> > This is a (trivial) "backport" of ad6afbe9578d1fa26680faf78c846bd8c00d1d6e
> > to stable kernel 3.14.56.
>
> This patch isn't in 4.1 either, do you want it there as well?
Personally I'll probably stay on 3.14 until 4.4 comes out, but the patch seems
to make a lot of sense in general, given the 1 million printk messages in
8 seconds that I experienced :)
My 3 servers with the patch applied to 3.14.56 have been running rock solid so
far (but the issue I had manifested only after 22 days of uptime, on one of them).
The cited commit from mainline applied nicely with some offset to 3.14.56, so
I think you could just pull it down to all stable kernels.
best regards
Patrick
* Re: [PATCH] [BACKPORT] [3.14.56] bnx2x: Don't notify about scratchpad parities
2015-11-06 17:32 ` Greg KH
2015-11-06 17:40 ` Patrick Schaaf
@ 2015-12-10 13:37 ` Patrick Schaaf
2016-03-01 6:25 ` Greg KH
1 sibling, 1 reply; 5+ messages in thread
From: Patrick Schaaf @ 2015-12-10 13:37 UTC (permalink / raw)
To: Greg KH; +Cc: netdev, Yuval Mintz
On Friday 06 November 2015 09:32:46 Greg KH wrote:
> On Thu, Nov 05, 2015 at 11:18:37AM +0100, Patrick Schaaf wrote:
> > bnx2x: Don't notify about scratchpad parities
> >
> > This is a (trivial) "backport" of ad6afbe9578d1fa26680faf78c846bd8c00d1d6e
> > to stable kernel 3.14.56.
>
> This patch isn't in 4.1 either, do you want it there as well?
Hi Greg,
I didn't see the patch in 3.14.57 or 3.14.58 - could you please consider it
again (for all stable kernels that don't have it)?
My three machines with bnx2x interfaces have been running fine with the patched
3.14.56 for the last 35 days. The original problematic event (spewing a
million messages, which that patch suppresses) has not reoccurred so far
(neither has any other issue; dmesg is completely empty since boot).
best regards
Patrick
Related earlier posts / reports, for reference:
http://marc.info/?l=linux-netdev&m=144663711626469
http://lists.openwall.net/netdev/2015/11/05/48
* Re: [PATCH] [BACKPORT] [3.14.56] bnx2x: Don't notify about scratchpad parities
2015-12-10 13:37 ` Patrick Schaaf
@ 2016-03-01 6:25 ` Greg KH
0 siblings, 0 replies; 5+ messages in thread
From: Greg KH @ 2016-03-01 6:25 UTC (permalink / raw)
To: Patrick Schaaf; +Cc: netdev, Yuval Mintz
On Thu, Dec 10, 2015 at 02:37:34PM +0100, Patrick Schaaf wrote:
> On Friday 06 November 2015 09:32:46 Greg KH wrote:
> > On Thu, Nov 05, 2015 at 11:18:37AM +0100, Patrick Schaaf wrote:
> > > bnx2x: Don't notify about scratchpad parities
> > >
> > > This is a (trivial) "backport" of ad6afbe9578d1fa26680faf78c846bd8c00d1d6e
> > > to stable kernel 3.14.56.
> >
> > This patch isn't in 4.1 either, do you want it there as well?
>
> Hi Greg,
>
> I didn't see the patch in 3.14.57 or 3.14.58 - could you please consider it
> again (for all stable kernels that don't have it)?
>
> My three machines with bnx2x interfaces have been running fine with the patched
> 3.14.56 for the last 35 days. The original problematic event (spewing a
> million messages, which that patch suppresses) has not reoccurred so far
> (neither has any other issue; dmesg is completely empty since boot).
>
> best regards
> Patrick
>
> Related earlier posts / reports, for reference:
>
> http://marc.info/?l=linux-netdev&m=144663711626469
> http://lists.openwall.net/netdev/2015/11/05/48
Sorry for the long delay, now queued up.
greg k-h