netdev.vger.kernel.org archive mirror
* sun neptune mis-detecting ethernet crc faults?
@ 2009-06-29 20:57 Chris Friesen
  2009-06-29 21:29 ` Rick Jones
  0 siblings, 1 reply; 3+ messages in thread
From: Chris Friesen @ 2009-06-29 20:57 UTC (permalink / raw)
  To: netdev


Hi all,

David Miller is busy and suggested someone on the list might be able to
help.

We have some boards using the Sun Neptune ethernet adapters.  We're
seeing behaviour that at this point looks like a hardware glitch
in the ethernet CRC validation on the receive path.  It appears to be
incorrectly detecting a corrupt CRC and dropping the frames.  (We've
enabled port mirroring on the switch and the frames are received without
errors on the eavesdropper board.)

The odd thing is that we're using a TCP connection and once the CRC
glitch shows up for a particular chunk of data it continues to drop all
the retransmissions for that chunk as having bad CRCs, even though their
CRC values are totally different due to different embedded timestamps.
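
A minimal sketch of why the retransmissions should carry different CRCs: the Ethernet FCS is a CRC-32 over the frame bytes, so changing a few timestamp bytes changes the result. It uses Python's zlib.crc32, which computes the same CRC-32 as the Ethernet FCS. The frame layout in make_frame() is a hypothetical stand-in (a 4-byte timestamp followed by the payload), not a real TCP/IP encoding, and the timestamp values are made up.

```python
import struct
import zlib


def fcs(frame: bytes) -> int:
    """CRC-32 over the frame bytes, as an unsigned 32-bit value."""
    return zlib.crc32(frame) & 0xFFFFFFFF


def make_frame(tsval: int, payload: bytes) -> bytes:
    """Hypothetical frame: 4-byte big-endian timestamp, then the payload."""
    return struct.pack("!I", tsval) + payload


payload = b"same chunk of data " * 50
original = make_frame(1000, payload)        # first transmission
retransmit = make_frame(1250, payload)      # same data, later timestamp

print(hex(fcs(original)), hex(fcs(retransmit)))
print("CRCs differ:", fcs(original) != fcs(retransmit))
```

If the receiver rejects both frames even though their CRCs differ, the CRC value itself is unlikely to be the trigger; the unchanged payload bytes are the common factor.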

Has anyone heard of anything like this on the Neptune hardware?  MTU is
set to 2000 if it matters, though we're planning on retesting with it
set to 1500.

I'm considering disabling the hardware CRC check as a
verification--looking at the niu driver I think I should be able to do
this by not including XMAC_CONFIG_RX_CRC_CHK_DIS in the big list of
flags being OR'd in niu_init_rx_xmac().
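
A rough sketch of how a "disable"-style bit behaves in a read-modify-write of a config register, since the direction matters: setting such a bit turns the check off, clearing it turns the check on, and leaving it out of both masks keeps whatever value the register already had. The code of niu_init_rx_xmac() is not shown in this message, so whether its flag list is used as a set mask or a clear mask is left open here. The bit positions and the names RX_MAC_ENABLE, PROMISCUOUS, RX_CRC_CHK_DIS and update_config() below are hypothetical placeholders, not the driver's definitions.

```python
# Hypothetical bit positions, for illustration only.
RX_MAC_ENABLE = 1 << 0
PROMISCUOUS = 1 << 1
RX_CRC_CHK_DIS = 1 << 2     # "disable" flag: 1 means the CRC check is off

REG_MASK = 0xFFFFFFFFFFFFFFFF  # model a 64-bit register


def update_config(reg: int, set_mask: int = 0, clear_mask: int = 0) -> int:
    """Read-modify-write: clear the bits in clear_mask, then set set_mask."""
    return ((reg & ~clear_mask) | set_mask) & REG_MASK


def crc_check_enabled(reg: int) -> bool:
    return (reg & RX_CRC_CHK_DIS) == 0


reg = RX_MAC_ENABLE  # assumed starting value with the CRC check on

# Flag in the set mask: the CRC check is turned off.
print(crc_check_enabled(update_config(reg, set_mask=RX_CRC_CHK_DIS)))

# Flag in the clear mask: the CRC check is (re)enabled.
print(crc_check_enabled(update_config(reg | RX_CRC_CHK_DIS,
                                      clear_mask=RX_CRC_CHK_DIS)))

# Flag in neither mask: the bit keeps its previous value.
print(crc_check_enabled(update_config(reg | RX_CRC_CHK_DIS,
                                      clear_mask=PROMISCUOUS)))
```

Reading the register back after the write is a cheap way to confirm which state the bit actually ended up in.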

Anyone have any suggestions?

Thanks,

Chris


* Re: sun neptune mis-detecting ethernet crc faults?
  2009-06-29 20:57 sun neptune mis-detecting ethernet crc faults? Chris Friesen
@ 2009-06-29 21:29 ` Rick Jones
  2009-06-29 23:13   ` Matheos Worku
  0 siblings, 1 reply; 3+ messages in thread
From: Rick Jones @ 2009-06-29 21:29 UTC (permalink / raw)
  To: Chris Friesen; +Cc: netdev

Chris Friesen wrote:
> Hi all,
> 
> David Miller is busy and suggested someone on the list might be able to
> help.
> 
> We have some boards using the Sun Neptune ethernet adapters.  We're
> seeing behaviour that at this point looks like a hardware glitch
> in the ethernet CRC validation on the receive path.  It appears to be
> incorrectly detecting a corrupt CRC and dropping the frames.  (We've
> enabled port mirroring on the switch and the frames are received without
> errors on the eavesdropper board.)

A simplistic question, but are you sure that the eavesdropper board is 
checking CRCs?

> The odd thing is that we're using a TCP connection and once the CRC
> glitch shows up for a particular chunk of data it continues to drop all
> the retransmissions for that chunk as having bad CRCs, even though their
> CRC values are totally different due to different embedded timestamps.

Do you mean TCP timestamp options?

> Has anyone heard of anything like this on the Neptune hardware? 

Can't say as I have, but the history of "networking" is littered with 
data pattern induced bugs in all manner of hardware.

> MTU is set to 2000 if it matters, though we're planning on retesting
> with it set to 1500.

An MTU of 2000 bytes means a TCP segment with timestamps enabled will be 
2032 plus the ethernet header (assuming no vlan tags) of 14 bytes for 
2046 and then there is the trailing CRC - which is getting very close to 
a magic power of two boundary, another place where history is replete
with examples of bugs.  One that comes to mind is that the old Alteon 
AceNICs got very unhappy if one crossed a 4G boundary with a DMA...
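
For reference, a small sketch of the frame-size arithmetic. It assumes IPv4 without IP options and the usual convention that the MTU bounds the whole IP datagram, headers included; under that convention a 2000-byte MTU gives a 2018-byte frame on the wire. The 2046 figure above comes from adding 32 bytes on top of the MTU before the Ethernet header. The constant and function names are illustrative.

```python
ETH_HEADER = 14      # destination MAC + source MAC + EtherType
ETH_FCS = 4          # trailing CRC
VLAN_TAG = 4         # optional 802.1Q tag
IPV4_HEADER = 20     # assumes no IP options
TCP_HEADER = 20
TCP_TS_OPTION = 12   # 10-byte timestamp option padded to 12 bytes


def frame_bytes(mtu: int, vlan: bool = False) -> int:
    """Largest frame on the wire, FCS included, when MTU bounds the IP datagram."""
    return ETH_HEADER + (VLAN_TAG if vlan else 0) + mtu + ETH_FCS


def tcp_payload_bytes(mtu: int, timestamps: bool = True) -> int:
    """TCP payload that fits in one MTU-sized IPv4 datagram."""
    options = TCP_TS_OPTION if timestamps else 0
    return mtu - IPV4_HEADER - TCP_HEADER - options


def estimate_from_text(mtu: int) -> int:
    """Figure used in the text above: MTU + 32 + Ethernet header, FCS excluded."""
    return mtu + 32 + ETH_HEADER


for mtu in (1500, 2000):
    print(mtu, frame_bytes(mtu), tcp_payload_bytes(mtu), estimate_from_text(mtu))
```

Either way the full frame lands within a few dozen bytes of 2048, so a retest at MTU 1500 (1518-byte frames) moves well away from that boundary.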

rick jones

> 
> I'm considering disabling the hardware CRC check as a
> verification--looking at the niu driver I think I should be able to do
> this by not including XMAC_CONFIG_RX_CRC_CHK_DIS in the big list of
> flags being OR'd in niu_init_rx_xmac().
> 
> Anyone have any suggestions?
> 
> Thanks,
> 
> Chris
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



* Re: sun neptune mis-detecting ethernet crc faults?
  2009-06-29 21:29 ` Rick Jones
@ 2009-06-29 23:13   ` Matheos Worku
  0 siblings, 0 replies; 3+ messages in thread
From: Matheos Worku @ 2009-06-29 23:13 UTC (permalink / raw)
  To: Rick Jones; +Cc: Chris Friesen, netdev

Rick Jones wrote:
> Chris Friesen wrote:
>> Hi all,
>>
>> David Miller is busy and suggested someone on the list might be able to
>> help.
>>
>> We have some boards using the Sun Neptune ethernet adapters. We're
>> seeing behaviour that at this point looks like a hardware glitch
>> in the ethernet CRC validation on the receive path. It appears to be
>> incorrectly detecting a corrupt CRC and dropping the frames. (We've
>> enabled port mirroring on the switch and the frames are received without
>> errors on the eavesdropper board.)
>
> A simplistic question, but are you sure that the eavesdropper board is 
> checking CRCs?
>
>> The odd thing is that we're using a TCP connection and once the CRC
>> glitch shows up for a particular chunk of data it continues to drop all
>> the retransmissions for that chunk as having bad CRCs, even though their
>> CRC values are totally different due to different embedded timestamps.
>
> Do you mean TCP timestamp options?
>
>> Has anyone heard of anything like this on the Neptune hardware?
At Sun, we haven't seen such an RX CRC error before.
>
> Can't say as I have, but the history of "networking" is littered with 
> data pattern induced bugs in all manner of hardware.
>
>> MTU is set to 2000 if it matters, though we're planning on retesting
>> with it set to 1500.
>
> An MTU of 2000 bytes means a TCP segment with timestamps enabled will 
> be 2032 plus the ethernet header (assuming no vlan tags) of 14 bytes 
> for 2046 and then there is the trailing CRC - which is getting very 
> close to a magic power of two boundary, another place where history is 
> replete with examples of bugs. One that comes to mind is that the old
> Alteon AceNICs got very unhappy if one crossed a 4G boundary with a 
> DMA...

>
> rick jones
>
>>
>> I'm considering disabling the hardware CRC check as a
>> verification--looking at the niu driver I think I should be able to do
>> this by not including XMAC_CONFIG_RX_CRC_CHK_DIS in the big list of
>> flags being OR'd in niu_init_rx_xmac().
That is right.

Regards,
Matheos

>>
>> Anyone have any suggestions?
>>
>> Thanks,
>>
>> Chris

