* ibm_newemac tx problem with jumbo frame enabled
@ 2011-11-18 5:03 Prashant Bhole
2011-11-25 5:25 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 7+ messages in thread
From: Prashant Bhole @ 2011-11-18 5:03 UTC (permalink / raw)
To: linuxppc-dev
Hi,
I have been facing a problem with the ibm_newemac driver (v3.54).
The board drops off the network and cannot be pinged during
heavy network traffic. In my case I am running IOmeter
"All-in-One" with 8 threads on the iSCSI target. MTU is 4088.
I found that after executing emac_full_tx_reset(), the board can
be pinged again. After another 5-6 seconds of heavy traffic,
traffic stops again. This repeats after every full TX reset.
Is this a known issue? What could cause this?
Any pointers would be greatly appreciated.
-
Prashant
* Re: ibm_newemac tx problem with jumbo frame enabled
2011-11-18 5:03 ibm_newemac tx problem with jumbo frame enabled Prashant Bhole
@ 2011-11-25 5:25 ` Benjamin Herrenschmidt
2011-12-07 8:05 ` Prashant Bhole
0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2011-11-25 5:25 UTC (permalink / raw)
To: Prashant Bhole; +Cc: linuxppc-dev
On Fri, 2011-11-18 at 10:33 +0530, Prashant Bhole wrote:
> Hi,
> I have been facing problem with ibm_newemac driver (v3.54).
> The board gets disconnected and can not be pinged in between
> some heavy network traffic. In my case I am running IOmeter
> "All-in-One" 8 threads on the iSCSI target. MTU is 4088.
>
> I found that after executing emac_full_tx_reset(), the board can
> be pinged again. Again after some heavy traffic of 5-6 seconds,
> traffic stops. This can be repeated after full tx reset.
>
> Is this a known issue? what could cause this?
> Any pointers would be greatly appreciated.
Not that I know of. Can you check whether any of the error reporting
registers show anything? Could it just be a FIFO overflow which we may
not be handling properly in the driver?
Cheers,
Ben.
* Re: ibm_newemac tx problem with jumbo frame enabled
2011-11-25 5:25 ` Benjamin Herrenschmidt
@ 2011-12-07 8:05 ` Prashant Bhole
2011-12-07 22:03 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 7+ messages in thread
From: Prashant Bhole @ 2011-12-07 8:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
On Fri, Nov 25, 2011 at 10:55 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Fri, 2011-11-18 at 10:33 +0530, Prashant Bhole wrote:
>> Hi,
>> I have been facing problem with ibm_newemac driver (v3.54).
>> The board gets disconnected and can not be pinged in between
>> some heavy network traffic. In my case I am running IOmeter
>> "All-in-One" 8 threads on the iSCSI target. MTU is 4088.
>>
>> I found that after executing emac_full_tx_reset(), the board can
>> be pinged again. Again after some heavy traffic of 5-6 seconds,
>> traffic stops. This can be repeated after full tx reset.
>>
>> Is this a known issue? what could cause this?
>> Any pointers would be greatly appreciated.
>
> Not that I know of. Can you check if any of the error reporting
> registers trip anything ? Could it just be a fifo overflow which we may
> not be handling properly in the driver ?
>
> Cheers,
> Ben.
I still couldn't find anything like a FIFO overflow.
I noticed one more thing: the problem happens only when the MTU on
the initiator (the other end) is set to 4088, regardless of the MTU set
for the EMAC.
-
Prashant
* Re: ibm_newemac tx problem with jumbo frame enabled
2011-12-07 8:05 ` Prashant Bhole
@ 2011-12-07 22:03 ` Benjamin Herrenschmidt
2011-12-08 13:01 ` Prashant Bhole
0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2011-12-07 22:03 UTC (permalink / raw)
To: Prashant Bhole; +Cc: linuxppc-dev
On Wed, 2011-12-07 at 13:35 +0530, Prashant Bhole wrote:
> Still couldn't find anything like fifo overflow...
> I noticed one more thing, this problem happens only when mtu size on
> the initiator (the other end) is set to 4088, regardless of any mtu
> size set for EMAC.
Did you check all the registers that may carry errors? Nothing showed
up? Did you check that things like pause frames were properly
negotiated on both sides? Have you tried playing with the pause and FIFO
thresholds?
Other than using the tx timeout to perform resets I don't see a good way
to fix that problem.
Cheers,
Ben.
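For readers following along: the tx timeout mentioned above relies on noticing that transmission has stopped advancing. A rough user-space approximation of that idea, useful for timestamping the stall while reproducing it, might look like the Python sketch below. It is not the driver's mechanism. The helper names and the choice of three flat intervals are assumptions for illustration; the sysfs path is the standard per-interface statistics file.

```python
from pathlib import Path


def read_tx_packets(iface="eth0"):
    """Read the kernel's tx_packets counter for an interface from sysfs."""
    path = Path("/sys/class/net") / iface / "statistics" / "tx_packets"
    return int(path.read_text().strip())


def first_stall(samples, flat_intervals=3):
    """Return the index of the sample at which the counter has stayed
    unchanged for `flat_intervals` consecutive intervals, or None.

    A flat counter can also mean the link is simply idle, so this is
    only meaningful while a traffic generator is known to be running.
    """
    flat = 0
    for i in range(1, len(samples)):
        if samples[i] == samples[i - 1]:
            flat += 1
            if flat >= flat_intervals:
                return i
        else:
            flat = 0
    return None


if __name__ == "__main__":
    import time

    history = []
    while first_stall(history) is None:
        history.append(read_tx_packets("eth0"))
        time.sleep(1)
    print("tx counter stopped advancing at", time.ctime())
```

Sampling once per second with three flat intervals is an arbitrary starting point; a shorter interval narrows down how quickly the stall sets in once the load starts.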
* Re: ibm_newemac tx problem with jumbo frame enabled
2011-12-07 22:03 ` Benjamin Herrenschmidt
@ 2011-12-08 13:01 ` Prashant Bhole
2011-12-08 22:59 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 7+ messages in thread
From: Prashant Bhole @ 2011-12-08 13:01 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
On Thu, Dec 8, 2011 at 3:33 AM, Benjamin Herrenschmidt <
benh@kernel.crashing.org> wrote:
> On Wed, 2011-12-07 at 13:35 +0530, Prashant Bhole wrote:
> > Still couldn't find anything like fifo overflow...
> > I noticed one more thing, this problem happens only when mtu size on
> > the initiator (the other end) is set to 4088, regardless of any mtu
> > size set for EMAC.
>
> Did you check all the registers that may carry errors? Nothing showed
> up? Did you check that things like pause frames were properly
> negotiated on both sides? Have you tried playing with the pause and FIFO
> thresholds?
>
> Other than using the tx timeout to perform resets I don't see a good way
> to fix that problem.
>
> Cheers,
> Ben.
>
>
I checked the RX descriptor status, the TX descriptor status, and the ethtool output.
However, I don't know much about pause frames. How do I check whether pause
frames are properly negotiated on both sides?
I still need to try changing the pause and FIFO thresholds.
The ethtool output after the disconnection is as follows:
# ethtool -S eth0
NIC statistics:
rx_packets: 330939
rx_bytes: 804963241
tx_packets: 248554
tx_bytes: 798853638
rx_packets_csum: 330716
tx_packets_csum: 179526
tx_undo: 0
rx_dropped_stack: 0
rx_dropped_oom: 0
rx_dropped_error: 0
rx_dropped_resize: 0
rx_dropped_mtu: 0
rx_stopped: 0
rx_bd_errors: 0
rx_bd_overrun: 0
rx_bd_bad_packet: 0
rx_bd_runt_packet: 0
rx_bd_short_event: 0
rx_bd_alignment_error: 0
rx_bd_bad_fcs: 0
rx_bd_packet_too_long: 0
rx_bd_out_of_range: 0
rx_bd_in_range: 0
rx_parity: 0
rx_fifo_overrun: 0
rx_overrun: 0
rx_bad_packet: 0
rx_runt_packet: 0
rx_short_event: 0
rx_alignment_error: 0
rx_bad_fcs: 0
rx_packet_too_long: 0
rx_out_of_range: 0
rx_in_range: 0
tx_dropped: 0
tx_bd_errors: 0
tx_bd_bad_fcs: 0
tx_bd_carrier_loss: 0
tx_bd_excessive_deferral: 0
tx_bd_excessive_collisions: 0
tx_bd_late_collision: 0
tx_bd_multple_collisions: 0
tx_bd_single_collision: 0
tx_bd_underrun: 0
tx_bd_sqe: 0
tx_parity: 0
tx_underrun: 0
tx_sqe: 0
tx_errors: 0
Thanks,
Prashant
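With this many counters, scanning for a non-zero error value by eye is error-prone. A minimal Python sketch that parses `ethtool -S` text and reports only non-zero error-like counters is shown below. The list of name fragments treated as "error-like" is an assumption derived from the counter names in the output above, not something defined by the driver or by ethtool.

```python
import re

# Name fragments that mark a counter as error-like. This list is an
# assumption based on the counter names seen in the output above.
ERROR_HINTS = ("error", "dropped", "overrun", "underrun", "bad", "runt",
               "short", "too_long", "parity", "collision", "loss",
               "deferral", "sqe", "stopped", "range")

# A few lines copied from the output above, used as sample input.
SAMPLE = """\
NIC statistics:
rx_packets: 330939
tx_packets: 248554
rx_fifo_overrun: 0
tx_bd_underrun: 0
tx_errors: 0
"""


def parse_ethtool_stats(text):
    """Return {counter_name: value} for every 'name: number' line."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*([A-Za-z0-9_]+):\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def nonzero_error_counters(stats):
    """Keep only error-like counters whose value is not zero."""
    return {name: value for name, value in stats.items()
            if value and any(hint in name for hint in ERROR_HINTS)}


if __name__ == "__main__":
    print(nonzero_error_counters(parse_ethtool_stats(SAMPLE)))
```

Capturing the output once before the load starts and once after the stall, then comparing the two parsed dictionaries, would also show whether any counter moved at all during the failure.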
* Re: ibm_newemac tx problem with jumbo frame enabled
2011-12-08 13:01 ` Prashant Bhole
@ 2011-12-08 22:59 ` Benjamin Herrenschmidt
2011-12-08 23:11 ` Tirumala Marri
0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2011-12-08 22:59 UTC (permalink / raw)
To: Prashant Bhole; +Cc: Tirumala Marri, linuxppc-dev
On Thu, 2011-12-08 at 18:31 +0530, Prashant Bhole wrote:
>
> I checked RX descriptor status and TX descriptor status and ethtool
> output.
> However I don't know about pause packet/frame, how do I check if pause
> frames are properly negotiated on both sides?
> I need to try changing pause and FIFO thresholds.
>
> ethtool output after disconnection is as follows:
> # ethtool -S eth0
> NIC statistics:
> rx_packets: 330939
> rx_bytes: 804963241
> tx_packets: 248554
> tx_bytes: 798853638
> rx_packets_csum: 330716
> tx_packets_csum: 179526
> tx_undo: 0
.../...
OK, so none of the error counters seem to trip, which is odd. I have no idea
what's up; you may want to ask the folks at APM (CCed Tirumala).
I also wonder whether we are properly enabling the reporting of error
interrupts... if we got that wrong we may never detect FIFO overruns.
What you describe really looks like a FIFO overrun to me.
Additionally, look at emac_configure() and see how it configures the pause
packet thresholds; maybe you can tweak the watermark to be more
aggressive. Also check that pause is actually enabled (with ethtool) and
that the PHY negotiated it properly (that the link partner supports
pause frames).
Cheers,
Ben.
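The "check that pause is actually enabled (with ethtool)" step can be done with `ethtool -a <iface>`, which prints the pause parameters. A small Python sketch that parses that text is given below. The sample text is illustrative only and was not posted in this thread; the function names are hypothetical.

```python
# Illustrative `ethtool -a eth0` output; NOT taken from this thread.
SAMPLE = """\
Pause parameters for eth0:
Autonegotiate:  on
RX:             off
TX:             off
"""


def parse_pause_params(text):
    """Map each 'Name: on|off' line to a boolean, keyed by lowercase name."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip().lower()
        if sep and value in ("on", "off"):
            params[key.strip().lower()] = (value == "on")
    return params


def pause_fully_enabled(params):
    """True only if both RX and TX pause are reported as on."""
    return bool(params.get("rx")) and bool(params.get("tx"))


if __name__ == "__main__":
    import subprocess

    # Requires ethtool to be installed and an interface named eth0.
    out = subprocess.run(["ethtool", "-a", "eth0"],
                         capture_output=True, text=True, check=True).stdout
    params = parse_pause_params(out)
    print(params, "pause fully enabled:", pause_fully_enabled(params))
```

The same check is worth running on the link partner. `ethtool -A <iface> rx on tx on` is the usual way to request pause, but whether this driver honors that request is not stated in the thread.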
* RE: ibm_newemac tx problem with jumbo frame enabled
2011-12-08 22:59 ` Benjamin Herrenschmidt
@ 2011-12-08 23:11 ` Tirumala Marri
0 siblings, 0 replies; 7+ messages in thread
From: Tirumala Marri @ 2011-12-08 23:11 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Prashant Bhole; +Cc: linuxppc-dev
Hi Ben,
>-----Original Message-----
>From: Benjamin Herrenschmidt [mailto:benh@kernel.crashing.org]
>Sent: Thursday, December 08, 2011 2:59 PM
>To: Prashant Bhole
>Cc: linuxppc-dev@ozlabs.org; Tirumala Marri
>Subject: Re: ibm_newemac tx problem with jumbo frame enabled
>
>On Thu, 2011-12-08 at 18:31 +0530, Prashant Bhole wrote:
>
>>
>> I checked RX descriptor status and TX descriptor status and ethtool
>> output.
>> However I don't know about pause packet/frame, how do I check if pause
>> frames are properly negotiated on both sides?
>> I need to try changing pause and FIFO thresholds.
>>
>> ethtool output after disconnection is as follows:
>> # ethtool -S eth0
>> NIC statistics:
>> rx_packets: 330939
>> rx_bytes: 804963241
>> tx_packets: 248554
>> tx_bytes: 798853638
>> rx_packets_csum: 330716
>> tx_packets_csum: 179526
>> tx_undo: 0
>
> .../...
>
>Ok so none of the error counters seem to trip, odd. No idea what's up,
>you may want to ask the folks at APM (CCed Tirumala).
>
>I wonder also if we are properly enabling the reporting of error
>interrupts... if we got that wrong we may never detect FIFO overruns.
>What you describe really looks like a fifo overrun to me.
>
>Additionally, look at emac_configure() and see how it configures the pause
>packet thresholds; maybe you can tweak the watermark to be more
>aggressive. Also check that pause is actually enabled (with ethtool) and
>that the PHY negotiated it properly (that the link partner supports
>pause frames).
>
I will take a look.
Thx,
Marri