* Occasional crash reports
@ 2001-08-13 16:35 Benjamin Herrenschmidt
2001-08-14 2:59 ` Takashi Oe
0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2001-08-13 16:35 UTC (permalink / raw)
To: linuxppc-dev, paulus
I'm getting regular crash reports for which I'm having trouble figuring
out what's going on exactly. Those are with my tree or bk 2_4_devel, but
the problem may be present elsewhere.
So basically, the kernel tends to die in various locations where things
should be just fine, but I did notice one thing: In most of these cases,
I had softirq around in the backtrace (either running in softirqs, or
having do_softirq() in the backtrace). There is one case where I didn't
have it: it dies inside power_save(), so probably because an interrupt
that happened just before messed things up.
I'm wondering if we might be running into some stack overflow...
Any clue ?
Ben.
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
* Re: Occasional crash reports
2001-08-13 16:35 Occasional crash reports Benjamin Herrenschmidt
@ 2001-08-14 2:59 ` Takashi Oe
2001-08-14 18:15 ` Mike Fedyk
0 siblings, 1 reply; 12+ messages in thread
From: Takashi Oe @ 2001-08-14 2:59 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, paulus
On Mon, 13 Aug 2001, Benjamin Herrenschmidt wrote:
> I'm getting regular crash reports for which I'm having trouble figuring
> out what's going on exactly. Those are with my tree or bk 2_4_devel, but
> the problem may be present elsewhere.
>
> So basically, the kernel tends to die in various locations where things
> should be just fine, but I did notice one thing: In most of these cases,
> I had softirq around in the backtrace (either running in softirqs, or
> having do_softirq() in the backtrace). There is one case where I didn't
> have it: it dies inside power_save(), so probably because an interrupt
> that happened just before messed things up.
>
> I'm wondering if we might be running into some stack overflow...
I don't really know, but I can confirm that both 2.2.x and 2.4.x kernels
are unstable on a beige g3 and a bmac equipped b/w g3. I can crash them
rather easily with regular programs, given a day or two. I have not seen
similar instability on other machines (dual g4, 7600, all nubus).
Unfortunately, I don't have any time to track it down right now :(
Takashi Oe
* Re: Occasional crash reports
@ 2001-08-14 9:45 Iain Sandoe
2001-08-14 12:18 ` Takashi Oe
0 siblings, 1 reply; 12+ messages in thread
From: Iain Sandoe @ 2001-08-14 9:45 UTC (permalink / raw)
To: Takashi Oe, Benjamin Herrenschmidt; +Cc: linuxppc-dev
On Tue, Aug 14, 2001, Takashi Oe wrote:
> On Mon, 13 Aug 2001, Benjamin Herrenschmidt wrote:
>
>> I'm getting regular crash reports for which I'm having trouble figuring
>> out what's going on exactly. Those are with my tree or bk 2_4_devel, but
>> the problem may be present elsewhere.
>>
>> So basically, the kernel tends to die in various locations where things
>> should be just fine, but I did notice one thing: In most of these cases,
>> I had softirq around in the backtrace (either running in softirqs, or
>> having do_softirq() in the backtrace). There is one case where I didn't
>> have it: it dies inside power_save(), so probably because an interrupt
>> that happened just before messed things up.
>>
>> I'm wondering if we might be running into some stack overflow...
>
> I don't really know, but I can confirm that both 2.2.x and 2.4.x kernels
> are unstable on a beige g3 and a bmac equipped b/w g3. I can crash them
> rather easily with regular programs, given a day or two.
Can you point at which program reliably crashes beige/G3?
that's what I use as my dev/build machine and it seems to be stable (2.4.x)
except under the following circumstance:
If I boot using the BootX application - at which point something drops a
bomb which almost always ends up in the network stack and shows up as an
illegal instruction (usually an FP one).
If I hard-reboot this doesn't seem to happen.
I haven't had time to fiddle with making the kernel write-only or whatever
to find out who drops the bomb (although it may well be something that is
left over from MacOS that the BootX application hasn't managed to stop).
Iain.
* Re: Occasional crash reports
2001-08-14 9:45 Iain Sandoe
@ 2001-08-14 12:18 ` Takashi Oe
0 siblings, 0 replies; 12+ messages in thread
From: Takashi Oe @ 2001-08-14 12:18 UTC (permalink / raw)
To: Iain Sandoe; +Cc: Benjamin Herrenschmidt, linuxppc-dev
On Tue, 14 Aug 2001, Iain Sandoe wrote:
> > I don't really know, but I can confirm that both 2.2.x and 2.4.x kernels
> > are unstable on a beige g3 and a bmac equipped b/w g3. I can crash them
> > rather easily with regular programs, given a day or two.
>
> Can you point at which program reliably crashes beige/G3?
It's usually rsync.
> that's what I use as my dev/build machine and it seems to be stable (2.4.x)
> except under the following circumstance:
>
> If I boot using the BootX application - at which point something drops a
> bomb which almost always ends up in the network stack and shows up as an
> illegal instruction (usually an FP one).
Ah, yes, we use BootX app here, too. Version 1.2.2, I think.
Takashi Oe
* Re: Occasional crash reports
@ 2001-08-14 13:03 Iain Sandoe
0 siblings, 0 replies; 12+ messages in thread
From: Iain Sandoe @ 2001-08-14 13:03 UTC (permalink / raw)
To: Takashi Oe; +Cc: Benjamin Herrenschmidt, linuxppc-dev
On Tue, Aug 14, 2001, Takashi Oe wrote:
> On Tue, 14 Aug 2001, Iain Sandoe wrote:
>
>> > I don't really know, but I can confirm that both 2.2.x and 2.4.x kernels
>> > are unstable on a beige g3 and a bmac equipped b/w g3. I can crash them
>> > rather easily with regular programs, given a day or two.
>>
>> Can you point at which program reliably crashes beige/G3?
>
> It's usually rsync.
hmm. don't use that as much - but it's consistent with network activity -
does it show up as an illegal instruction?
>> that's what I use as my dev/build machine and it seems to be stable (2.4.x)
>> except under the following circumstance:
>>
>> If I boot using the BootX application - at which point something drops a
>> bomb which almost always ends up in the network stack and shows up as an
>> illegal instruction (usually an FP one).
>
> Ah, yes, we use BootX app here, too. Version 1.2.2, I think.
try a hard reboot and use the BootX init - that seems to cure the problem
for me. Finding out what the problem actually is might be more tricky ;-)
I'm on 1.2.2 as well - so that's the same.
ciao,
Iain.
* Re: Occasional crash reports
@ 2001-08-14 16:02 Jiri Masik
2001-08-14 16:28 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 12+ messages in thread
From: Jiri Masik @ 2001-08-14 16:02 UTC (permalink / raw)
To: linuxppc-dev
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> So basically, the kernel tends to die in various locations where things
> should be just fine, but I did notice one thing: In most of these cases,
> I had softirq around in the backtrace (either running in softirqs, or
> having do_softirq() in the backtrace). There is one case where I didn't
> have it: it dies inside power_save(), so probably because an interrupt
> that happened just before messed things up.
>
> I'm wondering if we might be running into some stack overflow...
>
> Any clue ?
>
> Ben.
Hi,
I haven't observed any problems with recent kernels from your tree when
running on the power supply. When running on battery, the kernel crashes
soon (usually within 15 minutes). This is on my Pismo (March 2000) with
2.4.7-ben0. On second thought, it might be related to USB as well, since my
use of the battery and of the USB mouse is correlated - IIRC one crash
occurred inside hc_interrupt. I'll rsync and see what the current status is.
cheers,
Jiri
* Re: Occasional crash reports
2001-08-14 16:02 Jiri Masik
@ 2001-08-14 16:28 ` Benjamin Herrenschmidt
2001-08-14 16:45 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2001-08-14 16:28 UTC (permalink / raw)
To: Jiri Masik, linuxppc-dev
>Hi,
>
>I haven't observed any problems with recent kernels from your tree when
>running on the power supply. When running on battery, the kernel crashes
>soon (usually within 15 minutes). This is on my Pismo (March 2000) with
>2.4.7-ben0. On second thought, it might be related to USB as well, since my
>use of the battery and of the USB mouse is correlated - IIRC one crash
>occurred inside hc_interrupt. I'll rsync and see what the current status is.
Make sure you have the latest yaboot (1.2.3), and add an initrd=<path>
line to your yaboot.conf in order to load the kernel's System.map along
with the kernel image. Also, enable xmon in your kernel. This will give
you some symbols lookup in xmon backtrace which is cool :)
If that crash ever happens again, send me the backtrace (or just copy me
PC, LR and the backtrace values and send along System.map)
Ben.
* Re: Occasional crash reports
2001-08-14 16:28 ` Benjamin Herrenschmidt
@ 2001-08-14 16:45 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2001-08-14 16:45 UTC (permalink / raw)
To: Jiri Masik, linuxppc-dev
>Make sure you have the latest yaboot (1.2.3), and add an initrd=<path>
>line to your yaboot.conf in order to load the kernel's System.map along
>with the kernel image. Also, enable xmon in your kernel. This will give
>you some symbols lookup in xmon backtrace which is cool :)
Hrm... I should sleep sometimes... read "sysmap=" and not "initrd=" !
Ben.
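For reference, the corrected advice above might look roughly like the following yaboot.conf fragment. This is only a sketch: the `sysmap=` keyword is the one named in the message, while the image path, label, root device and System.map path are hypothetical placeholders that need to match the actual installation.

```
# /etc/yaboot.conf (fragment; all paths and device names are hypothetical)
image=/boot/vmlinux
	label=linux
	root=/dev/hda11
	# Load the matching System.map so xmon can resolve symbols in backtraces
	sysmap=/boot/System.map
```

The System.map has to come from the same build as the kernel image, otherwise the symbol names shown in the xmon backtrace will be misleading.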
* Re: Occasional crash reports
2001-08-14 2:59 ` Takashi Oe
@ 2001-08-14 18:15 ` Mike Fedyk
0 siblings, 0 replies; 12+ messages in thread
From: Mike Fedyk @ 2001-08-14 18:15 UTC (permalink / raw)
To: linuxppc-dev
On Mon, Aug 13, 2001 at 09:59:05PM -0500, Takashi Oe wrote:
>
> I don't really know, but I can confirm that both 2.2.x and 2.4.x kernels
> are unstable on a beige g3 and a bmac equipped b/w g3. I can crash them
> rather easily with regular programs, given a day or two. I have not seen
> similar instability on other machines (dual g4, 7600, all nubus).
> Unfortunately, I don't have any time to track it down right now :(
I've been able to trace that to the mace and bmac ethernet drivers, which
are *very* unreliable in 2.2. Get the latest drivers from Donald Becker for
tulip, rtl8139, eepro100, etc. and you're set. I haven't tried 2.4 yet.
Mike
* Re: Occasional crash reports
@ 2001-08-18 1:28 Robert E Brose II
0 siblings, 0 replies; 12+ messages in thread
From: Robert E Brose II @ 2001-08-18 1:28 UTC (permalink / raw)
To: linuxppc-dev
-- forwarded message --
From: mfedyk@matchmail.com (Mike Fedyk)
Subject: Re: Occasional crash reports
On Mon, Aug 13, 2001 at 09:59:05PM -0500, Takashi Oe wrote:
>
> I don't really know, but I can confirm that both 2.2.x and 2.4.x kernels
> are unstable on a beige g3 and a bmac equipped b/w g3. I can crash them
> rather easily with regular programs, given a day or two. I have not seen
> similar instability on other machines (dual g4, 7600, all nubus).
> Unfortunately, I don't have any time to track it down right now :(
I've been able to trace that to the mace and bmac ethernet drivers, which
are *very* unreliable in 2.2. Get the latest drivers from Donald Becker for
tulip, rtl8139, eepro100, etc. and you're set. I haven't tried 2.4 yet.
Mike
-- end of forwarded message --
Mace is still unreliable in 2.4. It's always been just plain unreliable
for me on 7200, 7500 and 7600's. It's a bummer because it sure
would be nice to have another PCI slot free. Tulip works well as does
ne2k-pci.
Bmac has given me problems on a rev C iMac. I've almost given up on
the iMac because yaboot seems unable to handle booting partition 13, which
is about 30 gigs up into a 45 gig drive. I can only get at it by using
BootX on an 8.6 CD (the iMac OS is 9.1).
Bob
--
Robert E. Brose II N0QBJ
http://www.jriver.com/~bob/
bob@kunk.jriver.com
* Re: Occasional crash reports
@ 2001-08-18 11:16 Iain Sandoe
2001-08-20 2:51 ` Robert E Brose II
0 siblings, 1 reply; 12+ messages in thread
From: Iain Sandoe @ 2001-08-18 11:16 UTC (permalink / raw)
To: Robert E Brose II, linuxppc-dev
On Sat, Aug 18, 2001, Robert E Brose II wrote:
> On Mon, Aug 13, 2001 at 09:59:05PM -0500, Takashi Oe wrote:
>> I don't really know, but I can confirm that both 2.2.x and 2.4.x kernels
>> are unstable on a beige g3 and a bmac equipped b/w g3. I can crash them
>> rather easily with regular programs, given a day or two. I have not seen
>> similar instability on other machines (dual g4, 7600, all nubus).
>> Unfortunately, I don't have any time to track it down right now :(
>
> I've been able to trace that to the mace and bmac ethernet drivers, which
> are *very* unreliable in 2.2. Get the latest drivers from Donald Becker for
> tulip, rtl8139, eepro100, etc. and you're set. I haven't tried 2.4 yet.
apropos bmac - is there a dbdma problem?
I think Takashi did a dbdma fix ... did that go in?
> -- end of forwarded message --
>
> Mace is still unreliable in 2.4. It's always been just plain unreliable
> for me on 7200, 7500 and 7600's.
- are these clones or Apple originals?
we found other dbdma problems on clone 7x00 machines for PowerComputing (in
the sound side) ...
might be worth checking that there are no cases of "DEAD" status coming up
on the Mace driver...
as I say, booting with the BootX *init* (rather than the application) bmac,
at least, seems reliable on g3/beige (don't use Mace much).
ciao,
Iain.
* Re: Occasional crash reports
2001-08-18 11:16 Iain Sandoe
@ 2001-08-20 2:51 ` Robert E Brose II
0 siblings, 0 replies; 12+ messages in thread
From: Robert E Brose II @ 2001-08-20 2:51 UTC (permalink / raw)
To: Iain Sandoe; +Cc: linuxppc-dev
User Iain Sandoe says:
> >
> > Mace is still unreliable in 2.4. It's always been just plain unreliable
> > for me on 7200, 7500 and 7600's.
>
> - are these clones or Apple originals?
> we found other dbdma problems on clone 7x00 machines for PowerComputing (in
> the sound side) ...
All are Apple originals.
> might be worth checking that there are no cases of "DEAD" status coming up
> on the Mace driver...
I am setting up a test machine as I write this. I'm setting up YDL 2.0
on a 7200 using the internal mace. I'll get it to fail (not hard) and
put some printk's in the "DEAD" tests in mace.c
> as I say, booting with the BootX *init* (rather than the application) bmac,
> at least, seems reliable on g3/beige (don't use Mace much).
Will check that as well and let you know.
> ciao,
> Iain.
>
Bob
--
Robert E. Brose II N0QBJ
http://www.jriver.com/~bob/
bob@kunk.jriver.com
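The "DEAD" check discussed in this message can be illustrated with a small standalone sketch. It is not the code in mace.c: the bit values are the DBDMA channel status bits as recalled from the kernel's dbdma.h and should be verified against the tree in use, and the helper names `dbdma_status_is_dead` and `report_dbdma_status` are hypothetical. In the driver the report would be a printk placed next to the existing DEAD tests, and the status word would first be read from the little-endian channel register.

```c
#include <stdint.h>
#include <stdio.h>

/* DBDMA channel status bits. Assumption: values recalled from the
 * kernel's dbdma.h; check them against your own tree before use. */
#define DBDMA_RUN    0x8000u
#define DBDMA_DEAD   0x0800u
#define DBDMA_ACTIVE 0x0400u

/* Return nonzero if the channel has flagged itself DEAD.
 * 'status' is assumed to be already converted to host byte order. */
static int dbdma_status_is_dead(uint32_t status)
{
    return (status & DBDMA_DEAD) != 0;
}

/* Hypothetical helper: format a one-line report of the interesting bits.
 * In a driver this would be a printk() rather than snprintf(). */
static int report_dbdma_status(const char *chan, uint32_t status,
                               char *buf, size_t len)
{
    return snprintf(buf, len, "%s: status=0x%04x run=%d active=%d dead=%d",
                    chan, (unsigned)(status & 0xffffu),
                    (status & DBDMA_RUN) != 0,
                    (status & DBDMA_ACTIVE) != 0,
                    dbdma_status_is_dead(status));
}
```

Logging the whole status word rather than only the DEAD bit makes it easier to tell a channel that died from one that was simply stopped.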