public inbox for linux-ia64@vger.kernel.org
* [Linux-ia64] X4.1.0 reboots and log
@ 2001-10-03 18:26 Randolph Chung
  2001-10-03 20:32 ` David Mosberger
  0 siblings, 1 reply; 2+ messages in thread
From: Randolph Chung @ 2001-10-03 18:26 UTC
  To: linux-ia64

Like David mentioned in a previous email, I've also seen random reboots
when switching between a VT and X (4.1.0, as packaged in Debian's woody
distribution). I've also seen this sometimes when leaving X running
overnight (with xscreensaver kicking in). This is on an HP i2000 SMP
system with an NVIDIA Quadro Pro card (no GLX).

One thing I noticed is that the crashes seem to coincide with certain
messages in the event log. I've posted an excerpt at
http://gandalf.tausq.org/tmp/kern.log

Does this help anyone debug the problem? I was told that this:

Oct  2 12:16:52 pippin kernel: +Platform PCI Component Error Info Section
Oct  2 12:16:52 pippin kernel: + PCI Component Error Detail:  Error Status: 0x1000
Oct  2 12:16:52 pippin kernel:  Component Info: Vendor Id = 0x8086, Device Id = 0x84e0, Class Code = 0x0, Seg/Bus/Dev/Func = 4/0/0/6

corresponds to an "address above top of memory" error reported by the SAC,
but I don't know how to trace this down any further.
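
To compare entries like this across a long kern.log, the fields can be
pulled out with a short script. This is only a sketch: the function name
parse_component_info and the regular expression are made up for
illustration, and it assumes the log line has been rejoined onto one line
and that the Seg/Bus/Dev/Func values are printed in decimal.

```python
import re

# The "Component Info" line from the excerpt above, rejoined onto one line.
LINE = ("Oct  2 12:16:52 pippin kernel:  Component Info: Vendor Id = 0x8086, "
        "Device Id = 0x84e0, Class Code = 0x0, Seg/Bus/Dev/Func = 4/0/0/6")

# Hypothetical pattern matching the field layout seen in this one log line.
PATTERN = re.compile(
    r"Vendor Id = (0x[0-9a-fA-F]+), "
    r"Device Id = (0x[0-9a-fA-F]+), "
    r"Class Code = (0x[0-9a-fA-F]+), "
    r"Seg/Bus/Dev/Func = (\d+)/(\d+)/(\d+)/(\d+)"
)

def parse_component_info(line):
    """Return the component fields as a dict, or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    vendor, device, class_code = (int(x, 16) for x in m.group(1, 2, 3))
    # Assumption: segment/bus/device/function are decimal in this log format.
    seg, bus, dev, func = (int(x) for x in m.group(4, 5, 6, 7))
    return {
        "vendor": vendor, "device": device, "class_code": class_code,
        "seg": seg, "bus": bus, "dev": dev, "func": func,
    }

info = parse_component_info(LINE)
print(hex(info["vendor"]), hex(info["device"]), info["seg"], info["func"])
```

Lines that do not match the pattern return None, so the function can be
run over the whole log and the misses filtered out.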

help appreciated,
randolph
-- 
   @..@                                         http://www.TauSq.org/
  (----)
 ( >__< )
 ^^ ~~ ^^



* Re: [Linux-ia64] X4.1.0 reboots and log
  2001-10-03 18:26 [Linux-ia64] X4.1.0 reboots and log Randolph Chung
@ 2001-10-03 20:32 ` David Mosberger
  0 siblings, 0 replies; 2+ messages in thread
From: David Mosberger @ 2001-10-03 20:32 UTC
  To: linux-ia64

>>>>> On Wed, 3 Oct 2001 11:26:44 -0700, Randolph Chung <randolph@tausq.org> said:

  Randolph> One thing I noticed is that the crashes seem to coincide
  Randolph> with certain messages in the event log. I've posted an
  Randolph> excerpt at http://gandalf.tausq.org/tmp/kern.log

  Randolph> Does this help anyone debug the problem? I was told that
  Randolph> this:

It's definitely a useful observation; thanks for pointing it out.

  Randolph> Oct 2 12:16:52 pippin kernel: +Platform PCI Component
  Randolph> Error Info Section Oct 2 12:16:52 pippin kernel: + PCI
  Randolph> Component Error Detail: Error Status: 0x1000 Oct 2
  Randolph> 12:16:52 pippin kernel: Component Info: Vendor Id =
  Randolph> 0x8086, Device Id = 0x84e0, Class Code = 0x0,
  Randolph> Seg/Bus/Dev/Func = 4/0/0/6

  Randolph> corresponds to a "address above top of memory" error
  Randolph> reported by the SAC, but don't know how to trace this down
  Randolph> more.

Based on tables B-2/B-4 in the SAL spec, I'd interpret an Error status
of "0x1000" as:

	ERR_BUS Error detected in the bus.

That's not very telling... ;-(
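
For anyone who wants to repeat that reading on other status values, here
is a minimal sketch. It assumes the error type is encoded in bits 8..15 of
the status word, which is consistent with 0x1000 giving type 16, but it
has not been checked against the spec text here. The names error_type,
describe and KNOWN_TYPES are made up, and only the one type named in this
thread is listed.

```python
# Sketch: extract the error-type field from a SAL-style error status word.
# Assumption: the type lives in bits 8..15 of the status value.

# Only the value discussed in this thread; other types are left unnamed.
KNOWN_TYPES = {16: "ERR_BUS"}

def error_type(status):
    """Return bits 8..15 of the status word as an integer."""
    return (status >> 8) & 0xFF

def describe(status):
    """Return the name of the error type if known, else a placeholder."""
    t = error_type(status)
    return KNOWN_TYPES.get(t, "unknown type %d" % t)

print(describe(0x1000))
```

Any other type value is reported as unknown rather than guessed.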

I looked through your log file, but couldn't find any useful
addresses.  Could someone more familiar with the MCA reports
tell me what this means:

	+ BUS Check Info [0]
	+ Status Info: 0 ,Severity: 0 ,Transaction Type: 1 ,Transaction Size: 7 ,Error: External

My suspicion is that the machine crashes either because something is
attempting to access a memory hole or because something is attempting
to perform an I/O device access via a cachable translation.  Perhaps
the above line would tell us which one it is, but I'm not sure what a
transaction type of "1" means.

	--david


