public inbox for linux-ia64@vger.kernel.org
* ethtool -d MCAs rx2600
@ 2004-01-23 23:12 Grant Grundler
  2004-01-23 23:56 ` Grant Grundler
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Grant Grundler @ 2004-01-23 23:12 UTC (permalink / raw)
  To: linux-ia64

Hi all.
This is just a warning: "Don't do this at home"
man ethtool says:

       -d     retrieves and prints a register dump for the specified
              ethernet device.

But it doesn't work so well on rx2600...I'll need to figure out why.
Or is it obvious to anyone?

This is with 2.6.1-rc1 on rx2600 talking to the built-in bcm5701.

hth,
grant

gsyprf3:~# ethtool eth1
Settings for eth1:
        Supported ports: [ MII ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Current message level: 0x000000ff (255)
        Link detected: yes
gsyprf3:~# ethtool -d eth1
+CPU 1: SAL log contains MCA error record
+Err Record ID: 1    SAL Rev:  0.02
+Time: 01/23/2004 05:56:58    Severity 0

In case someone wants to dig more now, I've dropped the "errdump mca"
output on
	ftp://gsyprf10.external.hp.com/kernels/rx2600/mca_ethtool

(Matching vmlinuz, System.map, .config is also there 2.6.1-rc1.tgz)

thanks,
grant


* Re: ethtool -d MCAs rx2600
  2004-01-23 23:12 ethtool -d MCAs rx2600 Grant Grundler
@ 2004-01-23 23:56 ` Grant Grundler
  2004-01-26 16:30 ` Jack Steiner
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Grant Grundler @ 2004-01-23 23:56 UTC (permalink / raw)
  To: linux-ia64

On Fri, Jan 23, 2004 at 03:12:54PM -0800, Grant Grundler wrote:
> In case someone wants to dig more now, I've dropped the "errdump mca"
> output on
> 	ftp://gsyprf10.external.hp.com/kernels/rx2600/mca_ethtool
> 
> (Matching vmlinuz, System.map, .config is also there 2.6.1-rc1.tgz)

Alex Williams tells me it's a PIO read timeout.
(confirms my guess given what man page said)

Offending address is likely 0x0000000090807000.
Matches nicely with what /proc/iomem thinks:
...
90000000-97ffffff : PCI Bus 0000:20
  90800000-9080ffff : tg3
...

Now just need to hunt down the code that pokes at +0x7000.

thanks,
grant
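
The offset arithmetic in this message can be checked with a few lines of C. This is only a sketch: the struct and function names (`mmio_range`, `range_contains`, `range_offset`) are hypothetical, and the only real data are the faulting address and the tg3 range quoted from /proc/iomem above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One /proc/iomem entry, e.g. "90800000-9080ffff : tg3" (end is inclusive). */
struct mmio_range {
	uint64_t start;
	uint64_t end;
};

/* True if addr lies inside the range. */
static bool range_contains(const struct mmio_range *r, uint64_t addr)
{
	return addr >= r->start && addr <= r->end;
}

/* Byte offset of addr from the start of the range it belongs to. */
static uint64_t range_offset(const struct mmio_range *r, uint64_t addr)
{
	assert(range_contains(r, addr));
	return addr - r->start;
}
```

With the range {0x90800000, 0x9080ffff} and the address 0x0000000090807000, `range_offset()` gives 0x7000, which is the "+0x7000" being hunted for.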


* Re: ethtool -d MCAs rx2600
  2004-01-23 23:12 ethtool -d MCAs rx2600 Grant Grundler
  2004-01-23 23:56 ` Grant Grundler
@ 2004-01-26 16:30 ` Jack Steiner
  2004-01-26 16:58 ` Matthew Wilcox
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Jack Steiner @ 2004-01-26 16:30 UTC (permalink / raw)
  To: linux-ia64

On Fri, Jan 23, 2004 at 03:56:50PM -0800, Grant Grundler wrote:
> On Fri, Jan 23, 2004 at 03:12:54PM -0800, Grant Grundler wrote:
> > In case someone wants to dig more now, I've dropped the "errdump mca"
> > output on
> > 	ftp://gsyprf10.external.hp.com/kernels/rx2600/mca_ethtool
> > 
> > (Matching vmlinuz, System.map, .config is also there 2.6.1-rc1.tgz)
> 
> Alex Williams tells me it's a PIO read timeout.
> (confirms my guess given what man page said)
> 
> Offending address is likely 0x0000000090807000.
> Matches nicely with what /proc/iomem thinks:
> ...
> 90000000-97ffffff : PCI Bus 0000:20
>   90800000-9080ffff : tg3
> ...
> 
> Now just need to hunt down the code that pokes at +0x7000.
> 
> thanks,
> grant

We see a similar problem with "ethtool -d" on the SGI SN systems. We haven't
isolated the cause, but it looks similar: a PIO read timeout.

FWIW, the failure occurs in the vicinity of tg3_get_regs+0xb60 called from
tg3_ethtool_ioctl+0xbb0.  (This is on 2.4.21+).  

Looks like it occurs here (but I don't put a lot of faith in this):
        GET_REG32_LOOP(BUFMGR_MODE, 0x58);
        GET_REG32_LOOP(RDMAC_MODE, 0x08);
   >>>> GET_REG32_LOOP(WDMAC_MODE, 0x08);
        GET_REG32_LOOP(RX_CPU_BASE, 0x280);
        GET_REG32_LOOP(TX_CPU_BASE, 0x280);


-- 
Thanks

Jack Steiner (steiner@sgi.com)          651-683-5302
Principal Engineer                      SGI - Silicon Graphics, Inc.




* Re: ethtool -d MCAs rx2600
  2004-01-23 23:12 ethtool -d MCAs rx2600 Grant Grundler
  2004-01-23 23:56 ` Grant Grundler
  2004-01-26 16:30 ` Jack Steiner
@ 2004-01-26 16:58 ` Matthew Wilcox
  2004-01-26 17:39 ` Jeff Garzik
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Matthew Wilcox @ 2004-01-26 16:58 UTC (permalink / raw)
  To: linux-ia64

On Mon, Jan 26, 2004 at 10:30:23AM -0600, Jack Steiner wrote:
> On Fri, Jan 23, 2004 at 03:56:50PM -0800, Grant Grundler wrote:
> > On Fri, Jan 23, 2004 at 03:12:54PM -0800, Grant Grundler wrote:
> > > In case someone wants to dig more now, I've dropped the "errdump mca"
> > > output on
> > > 	ftp://gsyprf10.external.hp.com/kernels/rx2600/mca_ethtool
> > > 
> > > (Matching vmlinuz, System.map, .config is also there 2.6.1-rc1.tgz)
> > 
> > Alex Williams tells me it's a PIO read timeout.
> > (confirms my guess given what man page said)
> > 
> > Offending address is likely 0x0000000090807000.
> > Matches nicely with what /proc/iomem thinks:
> > ...
> > 90000000-97ffffff : PCI Bus 0000:20
> >   90800000-9080ffff : tg3
> > ...
> > 
> > Now just need to hunt down the code that pokes at +0x7000.
> > 
> > thanks,
> > grant
> 
> We see a similar problem with "ethtool -d" on the SGI SN systems. We haven't
> isolated the cause, but it looks similar: a PIO read timeout.
> 
> FWIW, the failure occurs in the vicinity of tg3_get_regs+0xb60 called from
> tg3_ethtool_ioctl+0xbb0.  (This is on 2.4.21+).  
> 
> Looks like it occurs here (but I don't put a lot of faith in this):
>         GET_REG32_LOOP(BUFMGR_MODE, 0x58);
>         GET_REG32_LOOP(RDMAC_MODE, 0x08);
>    >>>> GET_REG32_LOOP(WDMAC_MODE, 0x08);
>         GET_REG32_LOOP(RX_CPU_BASE, 0x280);
>         GET_REG32_LOOP(TX_CPU_BASE, 0x280);

My suspicion is that some tg3 variants don't support this register, but
it's OK to read the register on x86 because it soft-fails.  Most ia64
chipsets hard-fail so we need to avoid this.  Jeff, Dave, can you comment?
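
The soft-fail versus hard-fail distinction can be modelled in userspace. This is a toy sketch, not driver code: `fake_dev`, `platform_policy`, and `mmio_read32` are hypothetical names, and the all-ones value stands in for what an unclaimed PCI read conventionally returns on chipsets that soft-fail. A `false` return stands in for the machine check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum platform_policy { SOFT_FAIL, HARD_FAIL };

struct fake_dev {
	const uint32_t *regs;        /* values of the implemented registers */
	size_t implemented;          /* bytes of register space the chip decodes */
	enum platform_policy policy; /* how the platform treats unclaimed reads */
};

/*
 * Read a 32-bit register at byte offset off.
 * Returns true if the read completes, false if it would raise an MCA.
 */
static bool mmio_read32(const struct fake_dev *d, uint32_t off, uint32_t *val)
{
	assert(d != NULL && val != NULL);

	if (d->implemented >= 4 && off % 4 == 0 && off <= d->implemented - 4) {
		*val = d->regs[off / 4];
		return true;
	}
	if (d->policy == SOFT_FAIL) {
		*val = 0xffffffffu;  /* read completes, data is all ones */
		return true;
	}
	return false;                /* hard fail: modelled as an error */
}
```

Under this model a register dump that strays past the implemented registers goes unnoticed with `SOFT_FAIL` and aborts with `HARD_FAIL`, which is the behaviour difference suspected here.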

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain


* Re: ethtool -d MCAs rx2600
  2004-01-23 23:12 ethtool -d MCAs rx2600 Grant Grundler
                   ` (2 preceding siblings ...)
  2004-01-26 16:58 ` Matthew Wilcox
@ 2004-01-26 17:39 ` Jeff Garzik
  2004-01-29 19:18 ` Grant Grundler
  2004-01-29 20:41 ` Jack Steiner
  5 siblings, 0 replies; 7+ messages in thread
From: Jeff Garzik @ 2004-01-26 17:39 UTC (permalink / raw)
  To: linux-ia64

Matthew Wilcox wrote:
> On Mon, Jan 26, 2004 at 10:30:23AM -0600, Jack Steiner wrote:
>>Looks like it occurs here (but I don't put a lot of faith in this):
>>        GET_REG32_LOOP(BUFMGR_MODE, 0x58);
>>        GET_REG32_LOOP(RDMAC_MODE, 0x08);
>>   >>>> GET_REG32_LOOP(WDMAC_MODE, 0x08);
>>        GET_REG32_LOOP(RX_CPU_BASE, 0x280);
>>        GET_REG32_LOOP(TX_CPU_BASE, 0x280);
> 
> 
> My suspicion is that some tg3 variants don't support this register, but
> it's OK to read the register on x86 because it soft-fails.  Most ia64
> chipsets hard-fail so we need to avoid this.  Jeff, Dave, can you comment?


I get lockups occasionally on x86 too, but have had higher priority 
things to look at.  Since regdump is mainly an engineer's tool, we felt 
it was a "use at your own risk" feature.

But if we can fix it, all the better.

Tangent -- I would love for somebody to take this output and prettyprint 
it in userland ethtool package (d/l and cvs at 
http://sf.net/projects/gkernel/).

	Jeff





* Re: ethtool -d MCAs rx2600
  2004-01-23 23:12 ethtool -d MCAs rx2600 Grant Grundler
                   ` (3 preceding siblings ...)
  2004-01-26 17:39 ` Jeff Garzik
@ 2004-01-29 19:18 ` Grant Grundler
  2004-01-29 20:41 ` Jack Steiner
  5 siblings, 0 replies; 7+ messages in thread
From: Grant Grundler @ 2004-01-29 19:18 UTC (permalink / raw)
  To: linux-ia64

On Mon, Jan 26, 2004 at 12:39:42PM -0500, Jeff Garzik wrote:
> I get lockups occasionally on x86 too, but have had higher priority 
> things to look at.  Since regdump is mainly an engineer's tool, we felt 
> it was a "use at your own risk" feature.
> 
> But if we can fix it, all the better.

tg3_get_regs() is reading registers that don't exist.
Neither HPUX nor tru64 drivers attempt to touch NVRAM on BCM5700/1 chips.
And tg3 in most other places doesn't either.
It just needs to check TG3_FLAG_NVRAM before reading NVRAM regs.

Jack, you also using the bcm5701 chip?

Jeff, please apply. Following patch is against 2.6.2-rc2.

thanks,
grant

=== drivers/net/tg3.c 1.81 vs edited ==
--- 1.81/drivers/net/tg3.c	Wed Dec 31 23:40:32 2003
+++ edited/drivers/net/tg3.c	Thu Jan 29 10:19:46 2004
@@ -5904,7 +5904,9 @@
 	GET_REG32_LOOP(MSGINT_MODE, 0x0c);
 	GET_REG32_1(DMAC_MODE);
 	GET_REG32_LOOP(GRC_MODE, 0x4c);
-	GET_REG32_LOOP(NVRAM_CMD, 0x24);
+	if (tp->tg3_flags & TG3_FLAG_NVRAM) {
+		GET_REG32_LOOP(NVRAM_CMD, 0x24);
+	}
 
 #undef __GET_REG32
 #undef GET_REG32_LOOP
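
The effect of the guard can be illustrated with a small userspace model of a block-wise register dump. This is a sketch under stated assumptions, not the tg3 code: `dump_block`, `dump_regs`, `counting_read`, `DUMP_LEN`, `DEMO_BLOCK_OFF`, and `FLAG_NVRAM` are hypothetical names, and placing the NVRAM block at offset 0x7000 is inferred from the +0x7000 fault reported earlier in this thread. Only the block lengths 0x4c and 0x24 come from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DUMP_LEN       0x8000u /* hypothetical dump buffer size in bytes */
#define DEMO_BLOCK_OFF 0x6800u /* hypothetical block present on every chip */
#define NVRAM_CMD_OFF  0x7000u /* assumption: inferred from the +0x7000 fault */
#define FLAG_NVRAM     0x1u    /* stand-in for TG3_FLAG_NVRAM */

typedef uint32_t (*read_reg_fn)(uint32_t off, void *ctx);

/* Demo reader: counts all reads and those that land in the NVRAM block. */
struct read_stats {
	unsigned reads;
	unsigned nvram_reads;
};

static uint32_t counting_read(uint32_t off, void *ctx)
{
	struct read_stats *s = ctx;

	s->reads++;
	if (off >= NVRAM_CMD_OFF && off < NVRAM_CMD_OFF + 0x24)
		s->nvram_reads++;
	return off; /* fake register contents: the offset itself */
}

/* Read len bytes of registers from base into the same offset of buf. */
static void dump_block(uint8_t *buf, uint32_t base, uint32_t len,
		       read_reg_fn rd, void *ctx)
{
	assert(base % 4 == 0 && len % 4 == 0 && base + len <= DUMP_LEN);
	for (uint32_t i = 0; i < len; i += 4) {
		uint32_t v = rd(base + i, ctx);
		memcpy(buf + base + i, &v, sizeof(v));
	}
}

/* Dump the always-present block, and the NVRAM block only if flagged. */
static void dump_regs(uint8_t *buf, uint32_t flags, read_reg_fn rd, void *ctx)
{
	memset(buf, 0, DUMP_LEN);
	dump_block(buf, DEMO_BLOCK_OFF, 0x4c, rd, ctx);
	if (flags & FLAG_NVRAM)
		dump_block(buf, NVRAM_CMD_OFF, 0x24, rd, ctx);
}
```

Calling `dump_regs()` with `flags == 0` never touches the 0x7000 block and leaves that part of the buffer zeroed, so a chip without the NVRAM interface is never asked for registers it does not decode.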


* Re: ethtool -d MCAs rx2600
  2004-01-23 23:12 ethtool -d MCAs rx2600 Grant Grundler
                   ` (4 preceding siblings ...)
  2004-01-29 19:18 ` Grant Grundler
@ 2004-01-29 20:41 ` Jack Steiner
  5 siblings, 0 replies; 7+ messages in thread
From: Jack Steiner @ 2004-01-29 20:41 UTC (permalink / raw)
  To: linux-ia64

On Thu, Jan 29, 2004 at 11:18:37AM -0800, Grant Grundler wrote:
> On Mon, Jan 26, 2004 at 12:39:42PM -0500, Jeff Garzik wrote:
> > I get lockups occasionally on x86 too, but have had higher priority 
> > things to look at.  Since regdump is mainly an engineer's tool, we felt 
> > it was a "use at your own risk" feature.
> > 
> > But if we can fix it, all the better.
> 
> tg3_get_regs() is reading registers that don't exist.
> Neither HPUX nor tru64 drivers attempt to touch NVRAM on BCM5700/1 chips.
> And tg3 in most other places doesn't either.
> It just needs to check TG3_FLAG_NVRAM before reading NVRAM regs.
> 
> Jack, you also using the bcm5701 chip?

Yes. Tigon3 [rev 0105 PHY(5701)] (PCI:66MHz:64-bit)


We'll apply the patch. 

Thanks....



> 
> Jeff, please apply. Following patch is against 2.6.2-rc2.
> 
> thanks,
> grant
> 
> === drivers/net/tg3.c 1.81 vs edited ==
> --- 1.81/drivers/net/tg3.c	Wed Dec 31 23:40:32 2003
> +++ edited/drivers/net/tg3.c	Thu Jan 29 10:19:46 2004
> @@ -5904,7 +5904,9 @@
>  	GET_REG32_LOOP(MSGINT_MODE, 0x0c);
>  	GET_REG32_1(DMAC_MODE);
>  	GET_REG32_LOOP(GRC_MODE, 0x4c);
> -	GET_REG32_LOOP(NVRAM_CMD, 0x24);
> +	if (tp->tg3_flags & TG3_FLAG_NVRAM) {
> +		GET_REG32_LOOP(NVRAM_CMD, 0x24);
> +	}
>  
>  #undef __GET_REG32
>  #undef GET_REG32_LOOP

-- 
Thanks

Jack Steiner (steiner@sgi.com)          651-683-5302
Principal Engineer                      SGI - Silicon Graphics, Inc.


