netdev.vger.kernel.org archive mirror
* Intel ixgb driver bug in linux-2.6.17-rc6-mm2
@ 2006-06-20 19:35 Linas Vepstas
  2006-06-20 21:13 ` Jesse Brandeburg
  0 siblings, 1 reply; 3+ messages in thread
From: Linas Vepstas @ 2006-06-20 19:35 UTC (permalink / raw)
  To: jeffrey.t.kirsher, ayyappan.veeraiyan, john.ronciak,
	jesse.brandeburg, auke-jan.h.kok
  Cc: linux-pci, netdev


Hi,

I sat down to do some testing of the ixgb driver a few days ago, and
got failures within seconds.  From what I can tell, I'm getting either a
DMA to a bad address or some other PCI bus error, not sure which.
The problem appears to happen only for the driver that's in
2.6.17-rc6-mm2. As a sanity check, I'm testing the SuSE SLES10 beta,
which is 2.6.16 based, and it doesn't seem to have any problems.

My test is dirt-simple: telnet to the chargen port.  After an eyeblink,
I get the PCI bus error, and that's that. An "eyeblink" is about 300 MBytes
transferred.  That was with a driver with NAPI enabled. I tried again
with NAPI disabled, and got to about 1.8 GB transferred in two eyeblinks.
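
The chargen test described above can be approximated without telnet. The following is a minimal sketch, not the script used in this thread: it assumes a chargen service is listening on TCP port 19 of the remote host, and the names `count_received` and `run_chargen_test` are made up for illustration.

```python
import socket


def count_received(sock, limit_bytes, bufsize=65536):
    """Read from sock until limit_bytes have arrived or the peer closes."""
    total = 0
    while total < limit_bytes:
        chunk = sock.recv(bufsize)
        if not chunk:  # peer closed the connection
            break
        total += len(chunk)
    return total


def run_chargen_test(host, limit_bytes, port=19):
    """Connect to a chargen service (assumed on port 19) and count bytes."""
    with socket.create_connection((host, port)) as sock:
        return count_received(sock, limit_bytes)
```

Calling `run_chargen_test("some-host", 2 * 10**9)` would pull roughly 2 GB through the receive path; a count that stops well short of the limit indicates where the transfer died. The host name is a placeholder.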

To make sure that I'm not dealing with faulty hardware, I tried the same
thing w/ SLES10 2.6.16.18-1.8  and have gotten to RX bytes:20889480686
(19921.7 Mb) so far, with no problems. I don't have easy access to a PCI
bus analyzer, otherwise, I'd tell you more. Ideas? Suggestions? 
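
For scale, the RX counter above can be checked against the figure in parentheses. This is a small sketch that assumes the "Mb" value is the byte count divided by 2**20 and truncated to one decimal place; the function name is illustrative only.

```python
def truncated_mib(byte_count):
    """Format a byte count as binary megabytes, truncated to one decimal."""
    tenths = byte_count * 10 // 2**20
    return f"{tenths // 10}.{tenths % 10}"


print(truncated_mib(20889480686))  # prints 19921.7
```

Under that assumption the two numbers agree, so the 2.6.16-based kernel had received about 20.9 GB (decimal), compared with roughly 300 MB and 1.8 GB before the failures on 2.6.17-rc6-mm2.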

I could try taking the diff between these two driver versions, and
seeing what change caused the problem, but thought I should email first,
before doing that.

--linas

* Re: Intel ixgb driver bug in linux-2.6.17-rc6-mm2
  2006-06-20 19:35 Intel ixgb driver bug in linux-2.6.17-rc6-mm2 Linas Vepstas
@ 2006-06-20 21:13 ` Jesse Brandeburg
  2006-06-21 20:18   ` Linas Vepstas
  0 siblings, 1 reply; 3+ messages in thread
From: Jesse Brandeburg @ 2006-06-20 21:13 UTC (permalink / raw)
  To: Linas Vepstas
  Cc: jeffrey.t.kirsher, ayyappan.veeraiyan, john.ronciak,
	jesse.brandeburg, auke-jan.h.kok, linux-pci, netdev

For some reason I didn't get your mail at Intel yet.  Anyway, please
try disabling TSO using ethtool and see if that helps any.
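
For reference, TSO is switched with `ethtool -K <interface> tso off`. The sketch below wraps that command; the interface name `eth0` is an assumption (the thread does not name the interface), and the helper names are hypothetical.

```python
import subprocess


def tso_command(iface, enable):
    """Build the ethtool argv that turns TSO on or off for iface."""
    return ["ethtool", "-K", iface, "tso", "on" if enable else "off"]


def set_tso(iface, enable):
    """Run the command; needs root and the ethtool binary installed."""
    subprocess.run(tso_command(iface, enable), check=True)
```

`set_tso("eth0", False)` disables TSO on the assumed interface, and `ethtool -k eth0` (lowercase k) lists the current offload settings to confirm the change.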

you're running 1.0.109, correct?
what does cat /proc/interrupts say (are you running MSI?)

I'd also like to know if LLTX support (recently added) is causing you
the issue.  What hardware platform? pSeries?  does it EEH? what does
the dump say?

* Re: Intel ixgb driver bug in linux-2.6.17-rc6-mm2
  2006-06-20 21:13 ` Jesse Brandeburg
@ 2006-06-21 20:18   ` Linas Vepstas
  0 siblings, 0 replies; 3+ messages in thread
From: Linas Vepstas @ 2006-06-21 20:18 UTC (permalink / raw)
  To: Jesse Brandeburg
  Cc: jeffrey.t.kirsher, ayyappan.veeraiyan, john.ronciak,
	jesse.brandeburg, auke-jan.h.kok, linux-pci, netdev, wenxiong

On Tue, Jun 20, 2006 at 02:13:45PM -0700, Jesse Brandeburg wrote:

> try disabling TSO using ethtool and see if that helps any.

Bing!  That appears to have fixed it!

> you're running 1.0.109, correct?

Yes.  DRV_VERSION     "1.0.109-k2"

> what does cat /proc/interrupts say (are you running MSI?)

No MSI, this is on older hardware.  
163:       3769     450130      14983     439113   XICS      Edge nic5
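
The line above can be read mechanically. The sketch below splits one /proc/interrupts row into its IRQ number, per-CPU counts, and trailing description, then guesses at MSI by looking for "MSI" in the description. That check is a heuristic assumption and the function name is illustrative.

```python
def parse_interrupt_line(line):
    """Split one /proc/interrupts row into IRQ, per-CPU counts, description."""
    irq, _, rest = line.partition(":")
    fields = rest.split()
    counts = []
    # Leading numeric fields are the per-CPU interrupt counts.
    while fields and fields[0].isdigit():
        counts.append(int(fields.pop(0)))
    description = " ".join(fields)
    return {
        "irq": irq.strip(),
        "counts": counts,
        "description": description,
        # Heuristic: MSI interrupts normally mention MSI in the controller name.
        "uses_msi": "MSI" in description.upper(),
    }


row = parse_interrupt_line(
    "163:       3769     450130      14983     439113   XICS      Edge nic5"
)
print(row["description"], sum(row["counts"]), row["uses_msi"])
```

For the row quoted here this reports four CPUs, the XICS controller, and no MSI, which matches the answer given above.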

> I'd also like to know if LLTX support (recently added) is causing you
> the issue.  What hardware platform? pSeries?  does it EEH? what does
> the dump say?

Yes, it's pSeries; yes, I see this as EEH errors. However, the EEH error
detection is asynchronous, and so the Linux stack trace is thoroughly
boring: the error is first noticed when the watchdog runs, typically.

--linas

p.s. version 1.0.100-k2 works great with NAPI on, and the default TSO.
I haven't yet tried 1.0.109 with NAPI on and TSO off.

--linas
