* Patch: Fix SMP hang on modem close
@ 2002-04-05 20:20 roger blofeld
2002-04-06 10:24 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 10+ messages in thread
From: roger blofeld @ 2002-04-05 20:20 UTC
To: benh; +Cc: linuxppc-dev
Ben,
This patch removes two dangling LOCK() statements for
core99/pangea. The core99 one hung my dual g4.
-roger
--- linux/arch/ppc/kernel/pmac_feature.c.orig   Tue Apr  2 08:17:31 2002
+++ linux/arch/ppc/kernel/pmac_feature.c        Fri Apr  5 14:02:13 2002
@@ -788,7 +788,7 @@
                UNLOCK(flags); mdelay(250); LOCK(flags);
                MACIO_OUT8(KL_GPIO_MODEM_RESET, gpio | KEYLARGO_GPIO_OUTOUT_DATA);
                (void)MACIO_IN8(KL_GPIO_MODEM_RESET);
-               UNLOCK(flags); mdelay(250); LOCK(flags);
+               UNLOCK(flags); mdelay(250);
        }
        return 0;
 }
@@ -1445,7 +1445,7 @@
                UNLOCK(flags); mdelay(250); LOCK(flags);
                MACIO_OUT8(KL_GPIO_MODEM_RESET, gpio | KEYLARGO_GPIO_OUTOUT_DATA);
                (void)MACIO_IN8(KL_GPIO_MODEM_RESET);
-               UNLOCK(flags); mdelay(250); LOCK(flags);
+               UNLOCK(flags); mdelay(250);
        }
        return 0;
 }
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
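Editorial illustration of why the dangling LOCK() in the patch above matters: a function that re-acquires a lock just before returning leaves it held, so the next caller (or another CPU on an SMP machine) spins forever. The sketch below is a user-space model only. The counter `lock_depth`, the argument-less LOCK()/UNLOCK() macros and the `fake_*` helpers are hypothetical stand-ins, not the real pmac_feature.c definitions.

```c
#include <assert.h>

/* Toy stand-in for the kernel lock: counts how many times it is held.
 * (Hypothetical; the real LOCK(flags)/UNLOCK(flags) take a spinlock.) */
int lock_depth = 0;

#define LOCK()   (lock_depth++)
#define UNLOCK() (lock_depth--)

/* Hypothetical placeholders for mdelay(250) and the GPIO write + read-back. */
void fake_delay(void) { }
void fake_gpio_write(void) { }

/* Shape of the code before the patch: the last LOCK() has no partner. */
int reset_pulse_unbalanced(void)
{
    LOCK();
    fake_gpio_write();
    UNLOCK(); fake_delay(); LOCK();   /* dangling re-acquire */
    return 0;                         /* returns with the lock still held */
}

/* Shape of the code after the patch: every LOCK() is matched. */
int reset_pulse_balanced(void)
{
    LOCK();
    fake_gpio_write();
    UNLOCK(); fake_delay();
    return 0;
}
```

Calling `reset_pulse_unbalanced()` from a clean state leaves `lock_depth` at 1, while `reset_pulse_balanced()` leaves it at 0. With a real spinlock, the leftover hold is what would block the next acquirer.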
* Re: Patch: Fix SMP hang on modem close
2002-04-05 20:20 Patch: Fix SMP hang on modem close roger blofeld
@ 2002-04-06 10:24 ` Benjamin Herrenschmidt
2002-06-06 19:25 ` Sungem bug or something else? roger blofeld
0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2002-04-06 10:24 UTC
To: roger blofeld; +Cc: linuxppc-dev

>Ben,
> This patch removes two dangling LOCK() statements for
>core99/pangea. The core99 one hung my dual g4.
>-roger

Good catch! That would indeed have caused SMP lockups when using
the modem.

Thanks,
Ben.

>--- linux/arch/ppc/kernel/pmac_feature.c.orig   Tue Apr  2 08:17:31 2002
>+++ linux/arch/ppc/kernel/pmac_feature.c        Fri Apr  5 14:02:13 2002
>@@ -788,7 +788,7 @@
>                UNLOCK(flags); mdelay(250); LOCK(flags);
>                MACIO_OUT8(KL_GPIO_MODEM_RESET, gpio | KEYLARGO_GPIO_OUTOUT_DATA);
>                (void)MACIO_IN8(KL_GPIO_MODEM_RESET);
>-               UNLOCK(flags); mdelay(250); LOCK(flags);
>+               UNLOCK(flags); mdelay(250);
>        }
>        return 0;
> }
>@@ -1445,7 +1445,7 @@
>                UNLOCK(flags); mdelay(250); LOCK(flags);
>                MACIO_OUT8(KL_GPIO_MODEM_RESET, gpio | KEYLARGO_GPIO_OUTOUT_DATA);
>                (void)MACIO_IN8(KL_GPIO_MODEM_RESET);
>-               UNLOCK(flags); mdelay(250); LOCK(flags);
>+               UNLOCK(flags); mdelay(250);
>        }
>        return 0;
> }
* Sungem bug or something else?
2002-04-06 10:24 ` Benjamin Herrenschmidt
@ 2002-06-06 19:25 ` roger blofeld
2002-06-06 19:30 ` Tom Rini
2002-06-06 19:45 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 10+ messages in thread
From: roger blofeld @ 2002-06-06 19:25 UTC
To: linuxppc-dev

I encounter an oops during boot bringing up a sungem
interface. (smp g4 450/gcc 3.1/glibc 2.2.5) If I defer
bringing up the network at boot, I can successfully
start eth0 (sungem) if I start eth1 (tulip) first, so
it may not be the sungem driver itself. This happens
on benh 2.4.19-Bpre10, and pre9.

The area which fails (according to ksymoops) is in
sungem.c <__phy_read+54/a4>

static u16 __phy_read(struct gem *gp, int reg, int phy_addr)
{
        u32 cmd;
        int limit = 10000;

        cmd = (1 << 30);
        cmd |= (2 << 28);
        cmd |= (phy_addr << 23) & MIF_FRAME_PHYAD;
        cmd |= (reg << 18) & MIF_FRAME_REGAD;
        cmd |= (MIF_FRAME_TAMSB);
        writel(cmd, gp->regs + MIF_FRAME);

        while (limit--) {
                cmd = readl(gp->regs + MIF_FRAME);      *** here ***
                if (cmd & MIF_FRAME_TALSB)
                        break;

                udelay(10);
        }

        if (!limit)
                cmd = 0xffff;

        return cmd & MIF_FRAME_DATA;
}

The actual faulting address is 0xe20d920c (the value
of gpr0; gpr31 is 0)

0xc00de2a4 <__phy_read+80>: lwbrx r31,r0,r0
0xc00de2a8 <__phy_read+84>: eieio

Any clues where I should look?
Thanks
-roger
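Editorial side note on the quoted polling loop, unrelated to the machine check itself. With `while (limit--)`, the counter ends at -1 when the loop runs out, and at 0 only when the very last read succeeds, so the `if (!limit)` test does not reliably detect a timeout. The sketch below shows one way to track completion explicitly. It is a user-space model under stated assumptions: `FRAME_DONE`, `FRAME_DATA`, `read_reg_fn` and the two helper readers are hypothetical names, not the driver's API.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_DONE 0x00010000u  /* hypothetical "transaction complete" bit */
#define FRAME_DATA 0x0000ffffu  /* hypothetical data mask */

/* Hypothetical register accessor so the loop can run outside the kernel. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Poll up to `limit` times; return 0xffff if the done bit never appears. */
uint16_t poll_frame(read_reg_fn read_reg, void *ctx, int limit)
{
    uint32_t cmd = 0;
    int done = 0;

    while (limit-- > 0) {
        cmd = read_reg(ctx);
        if (cmd & FRAME_DONE) {
            done = 1;           /* record success instead of inspecting limit */
            break;
        }
        /* a real driver would delay here, e.g. udelay(10) */
    }

    if (!done)
        return 0xffff;
    return (uint16_t)(cmd & FRAME_DATA);
}

/* Test reader: the done bit never sets. */
uint32_t never_ready(void *ctx)
{
    (void)ctx;
    return 0;
}

/* Test reader: becomes ready on the Nth call, N being the int behind ctx. */
uint32_t ready_after(void *ctx)
{
    int *remaining = ctx;
    return (--*remaining <= 0) ? (FRAME_DONE | 0x1234u) : 0;
}
```

The explicit `done` flag keeps the success and timeout cases distinct even when the done bit shows up on the final permitted read.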
* Re: Sungem bug or something else?
2002-06-06 19:25 ` Sungem bug or something else? roger blofeld
@ 2002-06-06 19:30 ` Tom Rini
2002-06-06 19:45 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 10+ messages in thread
From: Tom Rini @ 2002-06-06 19:30 UTC
To: roger blofeld; +Cc: linuxppc-dev

On Thu, Jun 06, 2002 at 12:25:10PM -0700, roger blofeld wrote:

> I encounter an oops during boot bringing up a sungem
> interface. (smp g4 450/gcc 3.1/glibc 2.2.5) If I defer
> bringing up the network at boot, I can successfully
> start eth0 (sungem) if I start eth1 (tulip) first, so
> it may not be the sungem driver itself. This happens
> on benh 2.4.19-Bpre10, and pre9.

Have you tried gcc-3.0 or gcc-2.95?

--
Tom Rini (TR1265)
http://gate.crashing.org/~trini/
* Re: Sungem bug or something else?
2002-06-06 19:25 ` Sungem bug or something else? roger blofeld
2002-06-06 19:30 ` Tom Rini
@ 2002-06-06 19:45 ` Benjamin Herrenschmidt
2002-06-06 20:35 ` roger blofeld
` (2 more replies)
1 sibling, 3 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2002-06-06 19:45 UTC
To: roger blofeld, linuxppc-dev

>I encounter an oops during boot bringing up a sungem
>interface. (smp g4 450/gcc 3.1/glibc 2.2.5) If I defer
>bringing up the network at boot, I can successfully
>start eth0 (sungem) if I start eth1 (tulip) first, so
>it may not be the sungem driver itself. This happens
>on benh 2.4.19-Bpre10, and pre9.

What kind of error is it? A Machine Check?

Looking at your backtrace, it looks like the driver is
trying to access the PHY chip. That can sometimes happen
if you have some tool like mii-tool or ethtool trying to
get at the link status while the chip isn't powered up.

The problem here is that sungem on Apple HW only powers
the chip when the interface is brought up, and powers it
down about 10 seconds after bringing the interface down.

This improves power management, but kills link monitoring
tools.

There may also be a bug in the driver causing it to try
to access the PHY registers when the chip is powered down
and it receives the ethtool ioctls.

Ben.
* Re: Sungem bug or something else?
2002-06-06 19:45 ` Benjamin Herrenschmidt
@ 2002-06-06 20:35 ` roger blofeld
2002-06-06 20:41 ` Kevin B. Hendricks
2002-06-07 0:51 ` roger blofeld
2 siblings, 0 replies; 10+ messages in thread
From: roger blofeld @ 2002-06-06 20:35 UTC
To: Benjamin Herrenschmidt, linuxppc-dev

--- Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> >I encounter an oops during boot bringing up a sungem
> >interface. (smp g4 450/gcc 3.1/glibc 2.2.5) If I defer
> >bringing up the network at boot, I can successfully
> >start eth0 (sungem) if I start eth1 (tulip) first, so
> >it may not be the sungem driver itself. This happens
> >on benh 2.4.19-Bpre10, and pre9.
>
> What kind of error is it ? A Machine Check ?
>
> Looking at your backtrace, it looks like the driver is
> trying to access the PHY chip. That can sometimes happen
> if you have some tool like miitool or ethtool trying to
> get at the link status while the chip isn't powered up.
>
> The problem here is that sungem on Apple HW only powers
> the chip when the interface is brought up, and powers it
> down about 10 seconds after bringing the interface down.
>
> This improve power management, but kills link monitoring
> tools.
>
> There may be also a bug in the driver causing it to try
> to access the PHY registers when the chip is in down
> mode & getting the ethtool ioctl's
>
> Ben.

Ben,
I suspect your last thought may be correct. From the
oops:

Machine check in kernel mode.
Oops: machine check, sig: 7
NIP: C00DE2A8 XER: 20000000 LR: C00E319C SP: DDBB5E10 REGS: ddbb5d60 TRAP: 0200 Not tainted
MSR: 00049030 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
TASK = ddbb4000[753] 'mii-tool' Last syscall: 54
last math 00000000 last altivec 00000000
CPU: 0

Note mii-tool is running.

The backtrace:

Trace; 02000000 Before first symbol
Trace; c00e319c <gem_ioctl+158/178>
Trace; c01785d0 <dev_ifsioc+414/484>
Trace; c0178850 <dev_ioctl+210/39c>
Trace; c01b6f08 <inet_ioctl+200/20c>
Trace; c016de8c <sock_ioctl+40/ac>
Trace; c005785c <sys_ioctl+13c/338>
Trace; c000601c <ret_from_syscall_1+0/b4>
Trace; 7ffff9e0 Before first symbol
Trace; 100012cc Before first symbol
Trace; 1000195c Before first symbol
Trace; 0fed8d94 Before first symbol
Trace; 00000000 Before first symbol

shows clearly that an ioctl is in progress.

-roger

=====
no microsoft products were used in the production of this email
* Re: Sungem bug or something else?
2002-06-06 19:45 ` Benjamin Herrenschmidt
2002-06-06 20:35 ` roger blofeld
@ 2002-06-06 20:41 ` Kevin B. Hendricks
2002-06-06 20:25 ` benh
2002-06-06 21:02 ` roger blofeld
2002-06-07 0:51 ` roger blofeld
2 siblings, 2 replies; 10+ messages in thread
From: Kevin B. Hendricks @ 2002-06-06 20:41 UTC
To: Benjamin Herrenschmidt; +Cc: roger blofeld, linuxppc-dev

Hi,

Does sungem use autonegotiation to determine its interface type and speed
(like some of the more advanced interface drivers), or does it look at
the ROM or use a table?

If it autonegotiates, does the driver actually wait long enough for the
autonegotiation to fully complete before returning the first time?

Under some tulip drivers, I noticed something very similar (but no oops,
just an inability to use the driver until I rmmod and then insmod it
once). I think it happens because the autonegotiation results were
handled asynchronously, and the main driver routine simply started it and
returned before the autonegotiation actually completed and the interface
and speed were properly determined. The problem was that right after
the network was brought up in the boot sequence, things tried to use it
(the appletalk drivers, etc.). So if I waited to insert the module for the
driver until after everything else was started (at the end of the
boot sequence), all was well.

This is all just a wag, but it is something to look at.

Kevin

On Thursday, June 6, 2002, at 03:45 PM, Benjamin Herrenschmidt wrote:

>> I encounter an oops during boot bringing up a sungem
>> interface. (smp g4 450/gcc 3.1/glibc 2.2.5) If I defer
>> bringing up the network at boot, I can successfully
>> start eth0 (sungem) if I start eth1 (tulip) first, so
>> it may not be the sungem driver itself. This happens
>> on benh 2.4.19-Bpre10, and pre9.
>
> What kind of error is it ? A Machine Check ?
>
> Looking at your backtrace, it looks like the driver is
> trying to access the PHY chip. That can sometimes happen
> if you have some tool like miitool or ethtool trying to
> get at the link status while the chip isn't powered up.
>
> The problem here is that sungem on Apple HW only powers
> the chip when the interface is brought up, and powers it
> down about 10 seconds after bringing the interface down.
>
> This improve power management, but kills link monitoring
> tools.
>
> There may be also a bug in the driver causing it to try
> to access the PHY registers when the chip is in down
> mode & getting the ethtool ioctl's
>
> Ben.
* Re: Sungem bug or something else?
2002-06-06 20:41 ` Kevin B. Hendricks
@ 2002-06-06 20:25 ` benh
2002-06-06 21:02 ` roger blofeld
1 sibling, 0 replies; 10+ messages in thread
From: benh @ 2002-06-06 20:25 UTC
To: Kevin B. Hendricks; +Cc: roger blofeld, linuxppc-dev

>Hi,
>
>Does sungem use autonegotiate to determine its interface type and speed
>(like some of the more advanced interface drivers) or does it look at
>the rom or use a table?
>
>If it autonegotiates, does the driver actually wait long enough for the
>autonegotiation to fully complete before returning the first time?

It autonegotiates first, then tries fixed speeds, etc.

>Under some tulip drivers, I noticed something very similar (but no oops,
>just a inability to use the driver until I rmmod and then insmod it
>once). I think it happens because the the autonegotiation results where
>handled asynchronously and the main driver routine simply started it and
>returned before the auonegotiation actually completed and the interface
>and speed were properly determined. The problem was right after
>bringing up the network in the boot sequence things tried to use it (the
>appletalk drivers, etc). So if I waited to insert the module for the
>driver until after everything else was started (at the end of the
>bootsequence) all was well.
>
>This is all just a wag, but it is something to look at.

Nah, it's clearly the chip being powered down. I know I have a bug
in the driver: it doesn't prevent HW access to the PHY via the
ethtool ioctls when the chip is down, and that will cause a
Machine Check. I just haven't taken the time to fix it yet.

Ben.
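Editorial illustration of the kind of guard Ben describes as missing: refuse PHY access from the ioctl path while the chip is powered down, instead of touching the hardware. This is a user-space sketch under stated assumptions, not the sungem driver's code. `struct fake_nic`, its `hw_running` flag, `ERR_PHY_DOWN` and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ERR_PHY_DOWN (-1)   /* hypothetical error code; a driver would pick an errno */

/* Hypothetical device state, standing in for the driver's private struct. */
struct fake_nic {
    int      hw_running;    /* nonzero while the chip is powered */
    uint16_t phy_status;    /* value a real driver would read from the PHY */
    int      phy_reads;     /* counts simulated hardware accesses */
};

/* Simulated PHY register read; on real hardware this is the access that
 * faults when the chip has no power. */
uint16_t fake_phy_read(struct fake_nic *nic)
{
    nic->phy_reads++;
    return nic->phy_status;
}

/* Link-status request as a monitoring tool would trigger it via ioctl. */
int fake_link_ioctl(struct fake_nic *nic, uint16_t *out)
{
    if (!nic->hw_running)
        return ERR_PHY_DOWN;    /* bail out before touching the hardware */

    *out = fake_phy_read(nic);
    return 0;
}
```

With `hw_running` clear, `fake_link_ioctl()` returns the error without calling `fake_phy_read()`. Once the flag is set, the same call reads the status normally. A real fix would also need to handle the power state changing concurrently, which this sketch ignores.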
* Re: Sungem bug or something else?
2002-06-06 20:41 ` Kevin B. Hendricks
2002-06-06 20:25 ` benh
@ 2002-06-06 21:02 ` roger blofeld
1 sibling, 0 replies; 10+ messages in thread
From: roger blofeld @ 2002-06-06 21:02 UTC
To: Kevin B. Hendricks, Benjamin Herrenschmidt; +Cc: roger blofeld, linuxppc-dev

Kevin,
That is a good thought. After getting everything
working, I get:

# mii-tool eth0
eth0: autonegotiation failed, link ok

so I require the maximum timeout.
-roger

--- "Kevin B. Hendricks" <khendricks@ivey.uwo.ca> wrote:
> Hi,
>
> Does sungem use autonegotiate to determine its interface type and speed
> (like some of the more advanced interface drivers) or does it look at
> the rom or use a table?
>
> If it autonegotiates, does the driver actually wait long enough for the
> autonegotiation to fully complete before returning the first time?
>
> Under some tulip drivers, I noticed something very similar (but no oops,
> just a inability to use the driver until I rmmod and then insmod it
> once). I think it happens because the the autonegotiation results where
> handled asynchronously and the main driver routine simply started it and
> returned before the auonegotiation actually completed and the interface
> and speed were properly determined. The problem was right after
> bringing up the network in the boot sequence things tried to use it (the
> appletalk drivers, etc). So if I waited to insert the module for the
> driver until after everything else was started (at the end of the
> bootsequence) all was well.
>
> This is all just a wag, but it is something to look at.
>
> Kevin
>
> On Thursday, June 6, 2002, at 03:45 PM, Benjamin Herrenschmidt wrote:
>
> >> I encounter an oops during boot bringing up a sungem
> >> interface. (smp g4 450/gcc 3.1/glibc 2.2.5) If I defer
> >> bringing up the network at boot, I can successfully
> >> start eth0 (sungem) if I start eth1 (tulip) first, so
> >> it may not be the sungem driver itself. This happens
> >> on benh 2.4.19-Bpre10, and pre9.
> >
> > What kind of error is it ? A Machine Check ?
> >
> > Looking at your backtrace, it looks like the driver is
> > trying to access the PHY chip. That can sometimes happen
> > if you have some tool like miitool or ethtool trying to
> > get at the link status while the chip isn't powered up.
> >
> > The problem here is that sungem on Apple HW only powers
> > the chip when the interface is brought up, and powers it
> > down about 10 seconds after bringing the interface down.
> >
> > This improve power management, but kills link monitoring
> > tools.
> >
> > There may be also a bug in the driver causing it to try
> > to access the PHY registers when the chip is in down
> > mode & getting the ethtool ioctl's
> >
> > Ben.
* Re: Sungem bug or something else?
2002-06-06 19:45 ` Benjamin Herrenschmidt
2002-06-06 20:35 ` roger blofeld
2002-06-06 20:41 ` Kevin B. Hendricks
@ 2002-06-07 0:51 ` roger blofeld
2 siblings, 0 replies; 10+ messages in thread
From: roger blofeld @ 2002-06-07 0:51 UTC
To: linuxppc-dev

The problem was triggered by upgrading to
initscripts-6.76. The new ifup script calls the
network-scripts function check_link_down, which in
turn calls 'ip link set up eth0', then mii-tool.
Apparently at boot time the PHY is not powered,
causing the oops.

Work-around: remove the link check in ifup.

-roger
end of thread

Thread overview: 10+ messages
2002-04-05 20:20 Patch: Fix SMP hang on modem close roger blofeld
2002-04-06 10:24 ` Benjamin Herrenschmidt
2002-06-06 19:25   ` Sungem bug or something else? roger blofeld
2002-06-06 19:30     ` Tom Rini
2002-06-06 19:45     ` Benjamin Herrenschmidt
2002-06-06 20:35       ` roger blofeld
2002-06-06 20:41       ` Kevin B. Hendricks
2002-06-06 20:25         ` benh
2002-06-06 21:02         ` roger blofeld
2002-06-07  0:51       ` roger blofeld