All of lore.kernel.org
 help / color / mirror / Atom feed
* RE: [Bug 9106] Sun Fire v100 dmfe driver bug
@ 2007-11-04 21:34 Richard Mortimer
  2007-11-04 21:49 ` David Miller
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Richard Mortimer @ 2007-11-04 21:34 UTC (permalink / raw)
  To: ultralinux

I don't think that the seprom is set on SPARC systems so there
is not MAC address in there to read.
You need to read it from the appropriate OBP properties.

Search for local-mac-address and CONFIG_SPARC in tulip_core.c

Regards

Richard 

> -----Original Message-----
> From: ultralinux-owner@vger.kernel.org 
> [mailto:ultralinux-owner@vger.kernel.org] On Behalf Of Grant Grundler
> Sent: 04 November 2007 19:03
> To: ultralinux@vger.kernel.org
> Cc: grundler@parisc-linux.org; kovedi@gmail.com
> Subject: fwd: [Bug 9106] Sun Fire v100 dmfe driver bug
> 
> http://bugzilla.kernel.org/show_bug.cgi?id‘06
> 
> I have two possible reasons why dmfe driver is reading zeros 
> from the seprom for the MAC address.
> 1) IO port space routing is fu-bar - ie we are only able to 
> talk to the chip's config space.
> 2) outl() is implemented as a postable MMIO write (semantics 
> demand outl be non-postable).
> 
> Pursueing (2) for a bit, I looked at 
> linux-2.6.22.9/include/asm-sparc64/io.h:
> ...
>  74 static __inline__ void _outl(u32 l, unsigned long addr)
>  75 {
>  76         __asm__ __volatile__("stwa\t%r0, [%1] %2\t/* pci_outl */"
>  77                              : /* no outputs */
>  78                              : "Jr" (l), "r" (addr), "i" 
> (ASI_PHYS_BYPASS_EC    _E_L));
>  79 }
> ...
>  86 #define outl(__l, __addr)       (_outl((u32)(__l), 
> (unsigned long)(__addr)))
> ...
> 
> 
> Is "stwa" a posted write?
> I'm also open to ideas of what else might be wrong in dmfe.c driver.
> 
> 
> The initial symptom seems related to 
> linux-2.6.22.9/drivers/net/tulip/dmfe.c:
> ...
> 1576 static u16 read_srom_word(long ioaddr, int offset)
> 1577 {
> 1578         int i;
> 1579         u16 srom_data = 0;
> 1580         long cr9_ioaddr = ioaddr + DCR9;
> 1581 
> 1582         outl(CR9_SROM_READ, cr9_ioaddr);
> 1583         outl(CR9_SROM_READ | CR9_SRCS, cr9_ioaddr);
> 1584 
> 1585         /* Send the Read Command 110b */
> 1586         SROM_CLK_WRITE(SROM_DATA_1, cr9_ioaddr);
> 1587         SROM_CLK_WRITE(SROM_DATA_1, cr9_ioaddr);
> 1588         SROM_CLK_WRITE(SROM_DATA_0, cr9_ioaddr);
> ...
> 
> where SROM_CLK_WRITE is defined as:
> ...
>  175 #define SROM_CLK_WRITE(data, ioaddr) \
>  176         outl(data|CR9_SROM_READ|CR9_SRCS,ioaddr); \
>  177         udelay(5); \
>  178         outl(data|CR9_SROM_READ|CR9_SRCS|CR9_SRCLK,ioaddr); \
>  179         udelay(5); \
>  180         outl(data|CR9_SROM_READ|CR9_SRCS,ioaddr); \
>  181         udelay(5);
> 
> obviously, if outl() is posted, the udelay() won't be respected
> and there are already two writes in flight when the CPU gets 
> to line 1586.
> 
> 
> thanks,
> grant
> 
> ----- Forwarded message from bugme-daemon@bugzilla.kernel.org -----
> 
> X-Spam-Checker-Version: SpamAssassin 3.1.4 (2006-07-26) on 
> colo.lackof.org
> X-Spam-Level: 
> X-Spam-Status: No, score=0.9 required=5.5 
> testsºYES_40,FORGED_RCVD_HELO,
> 	NO_REAL_NAME autolearn=no version=3.1.4
> Subject: [Bug 9106] Sun Fire v100 dmfe driver bug
> X-Bugzilla-Product: Drivers
> X-Bugzilla-Severity: high
> X-Bugzilla-Keywords: 
> X-Bugzilla-Reason: CC AssignedTo
> X-Bugzilla-Component: Network
> To: grundler@parisc-linux.org
> From: bugme-daemon@bugzilla.kernel.org
> X-MIMEDefang-Filter: lf$Revision: 1.188 $
> X-Scanned-By: MIMEDefang 2.53 on 207.189.120.14
> X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at lackof.org
> 
> http://bugzilla.kernel.org/show_bug.cgi?id‘06
> 
> 
> 
> 
> 
> ------- Comment #4 from kovedi@gmail.com  2007-10-08 00:29 -------
> I try to use the udelay(5) after line 1583 and 1596 but the error
> coming again after reboot.
> 
> the devices not have mac adresses
> 
> [   69.450854] eth0: Davicom DM9102 at pci0000:00:0c.0,
> 00:00:00:00:00:00, irq 9.
> [   69.477577] eth1: Davicom DM9102 at pci0000:00:05.0,
> 00:00:00:00:00:00, irq 10.
> 
> and always resetting the network devices.
> 
> [   84.923730] eth0: Tx timeout - resetting
> [   86.923705] eth0: Tx timeout - resetting
> [   88.923650] eth0: Tx timeout - resetting
> [   90.923636] eth0: Tx timeout - resetting
> [   92.923587] eth0: Tx timeout - resetting
> 
> 
> -- 
> Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
> You are the assignee for the bug, or are watching the assignee.
> 
> ----- End forwarded message -----
> -
> To unsubscribe from this list: send the line "unsubscribe 
> ultralinux" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [Bug 9106] Sun Fire v100 dmfe driver bug
@ 2008-01-22  7:27 David Miller
  2008-01-23  6:19 ` Grant Grundler
  2008-01-23  6:23 ` David Miller
  0 siblings, 2 replies; 8+ messages in thread
From: David Miller @ 2008-01-22  7:27 UTC (permalink / raw)
  To: sparclinux

From: Grant Grundler <grundler@parisc-linux.org>
Date: Sun, 6 Jan 2008 01:35:32 -0700

[ Andrew, one of your warning fixes from yesteryear broke
  SROM parsing in the DMFE tulip driver, see below.  ]

> Thanks! This was the case with the dmfe driver on SPARC.
> I've attached a patch as comment #6 on:
> 	http://bugzilla.kernel.org/show_bug.cgi?id‘06

If there is no MAC address in the SROM you'll also have to handle the
fact that there's almost guarenteed to not be any media mode data
there either.  Actually, there might be such information there and
this could be a clue.  See below.

This driver did work on sparc64 to some extent at some point in the
past.  So something did change over time to subtly break this thing.

The users log with the TX timeouts probably indicates that the chip
cannot DMA properly or send to the network for whatever reason.  Maybe
the media type is wrong.

BTW, I noticed this while reading the DMFE code:

	db->buf_pool_ptr = pci_alloc_consistent(pdev, TX_BUF_ALLOC *
			TX_DESC_CNT + 4, &db->buf_pool_dma_ptr);

That size looks strange, is this supposed to be:

	(TX_BUF_ALLOC * TX_DESC_CNT) + 4

or:

	TX_BUF_ALLOC * (TX_DESC_CNT + 4)

I think the latter is the intention, but the former is what
is actually happening.

Looking through the driver history I definitely see some bugs
introduced into the SROM code, for example:

	commit 16b110c3fd760620b4a787db6ed512fe531ab1b5
	Author: Andrew Morton <akpm@osdl.org>
	Date:   Mon Jun 20 15:32:59 2005 -0700

	[PATCH] dmfe warning fix
 ...
	This is basically a guess:
    
	Cc: Jeff Garzik <jgarzik@pobox.com>
	Signed-off-by: Andrew Morton <akpm@osdl.org>

--- a/drivers/net/tulip/dmfe.c
+++ b/drivers/net/tulip/dmfe.c
@@ -1802,7 +1802,7 @@ static void dmfe_parse_srom(struct dmfe_board_info * db)
 	if ( ( (int) srom[18] & 0xff) = SROM_V41_CODE) {
 		/* SROM V4.01 */
 		/* Get NIC support media mode */
-		db->NIC_capability = le16_to_cpup(srom + 34);
+		db->NIC_capability = le16_to_cpup((__le16 *)srom + 34/2);
 		db->PHY_reg4 = 0;
 		for (tmp_reg = 1; tmp_reg < 0x10; tmp_reg <<= 1) {
 			switch( db->NIC_capability & tmp_reg ) {
@@ -1814,7 +1814,8 @@ static void dmfe_parse_srom(struct dmfe_board_info * db)
 		}
 
 		/* Media Mode Force or not check */
-		dmfe_mode = le32_to_cpup(srom + 34) & le32_to_cpup(srom + 36);
+		dmfe_mode = le32_to_cpup((__le32 *)srom + 34/4) &
+				le32_to_cpup((__le32 *)srom + 36/4);
 		switch(dmfe_mode) {
 		case 0x4: dmfe_media_mode = DMFE_100MHF; break;	/* 100MHF */
 		case 0x2: dmfe_media_mode = DMFE_10MFD; break;	/* 10MFD */

Basically he is trying to deal with the fact that "srom" is
a "char *" but the typing requires converting to 16-bit
and 32-bit pointers so the offsets have to be adjusted.

But it is totally wrong in the cases where the offsets were not
a multiple or 2 or 4, respectively.  (hint: (34 % 4) = 2)

It would have been better to use expressions like:

	(__le16 *) (srom + 34)
	(__le32 *) (srom + 34)
	(__le32 *) (srom + 36)

So that change definitely subtly changed how the SROM is accessed
and likely is a good explanation for certain bugs, maybe even this
one we are discussing.

Thinking about this some more, these accesses are extremely queer.
Why would it use 32-bit accesses here to two 32-bit datums which
overlap, and "and" them together to figure out the 'dmfe_mode'?

This driver is in a bit of a state of disrepair.  Who is Tobias
Ringstrom (from MAINTAINERS) and when was the last time he did any
maintainence on dmfe?  He definitely hasn't done anything in
the GIT era as I look through the changelog.

Anyways, here is a fix for the above regression, if that DMA
sizing error above is a real one that will need a fix too:

[TULIP] DMFE: Fix SROM parsing regression.

Changeset 16b110c3fd760620b4a787db6ed512fe531ab1b5 (dmfe warning fix)
bothed up the offsets read from the SROM so that it doesn't read the
same datums it used to.

The change made transformations like turning:

	"srom + 34"

into

	"(__le32 *)srom + 34/4"

which doesn't work because 4 does not divide evenly
into 34 so we're using a different pointer offset
than in the original code.

I've changed theses cases in dmfe_parse_srom() to
consistently use "(type *)(srom + offset)" preserving
the offsets from the original code.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/drivers/net/tulip/dmfe.c b/drivers/net/tulip/dmfe.c
index b4891ca..6562004 100644
--- a/drivers/net/tulip/dmfe.c
+++ b/drivers/net/tulip/dmfe.c
@@ -1909,7 +1909,7 @@ static void dmfe_parse_srom(struct dmfe_board_info * db)
 	if ( ( (int) srom[18] & 0xff) = SROM_V41_CODE) {
 		/* SROM V4.01 */
 		/* Get NIC support media mode */
-		db->NIC_capability = le16_to_cpup((__le16 *)srom + 34/2);
+		db->NIC_capability = le16_to_cpup((__le16 *) (srom + 34));
 		db->PHY_reg4 = 0;
 		for (tmp_reg = 1; tmp_reg < 0x10; tmp_reg <<= 1) {
 			switch( db->NIC_capability & tmp_reg ) {
@@ -1921,8 +1921,8 @@ static void dmfe_parse_srom(struct dmfe_board_info * db)
 		}
 
 		/* Media Mode Force or not check */
-		dmfe_mode = le32_to_cpup((__le32 *)srom + 34/4) &
-				le32_to_cpup((__le32 *)srom + 36/4);
+		dmfe_mode = (le32_to_cpup((__le32 *) (srom + 34)) &
+			     le32_to_cpup((__le32 *) (srom + 36)));
 		switch(dmfe_mode) {
 		case 0x4: dmfe_media_mode = DMFE_100MHF; break;	/* 100MHF */
 		case 0x2: dmfe_media_mode = DMFE_10MFD; break;	/* 10MFD */

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2008-01-23  6:23 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-04 21:34 [Bug 9106] Sun Fire v100 dmfe driver bug Richard Mortimer
2007-11-04 21:49 ` David Miller
2007-11-06  6:53 ` Grant Grundler
2007-11-06  7:11 ` David Miller
2008-01-06  8:35 ` Grant Grundler
  -- strict thread matches above, loose matches on Subject: below --
2008-01-22  7:27 David Miller
2008-01-23  6:19 ` Grant Grundler
2008-01-23  6:23 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.