From: Len Brown <len.brown@intel.com>
To: Richard Browning <richard@redline.org.uk>
Cc: Zwane Mwaikambo <zwane@linuxpower.ca>,
	linux-kernel@vger.kernel.org,
	Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Subject: Re: SMP + Hyperthreading / Asus PCDL Deluxe / Kernel 2.4.x 2.6.x / Crash/Freeze
Date: 12 Mar 2004 02:07:17 -0500
Message-ID: <1079075236.3885.52.camel@dhcppc4>
In-Reply-To: <1079072878.3885.33.camel@dhcppc4>

Hmm, read that note too fast...
Since the failure did not follow the package to the BSP socket
(CPU0/CPU1), but instead stayed with the AP (CPU2/CPU3) socket, that
suggests an issue with the MB rather than the processor itself.

-Len

On Fri, 2004-03-12 at 01:27, Len Brown wrote:
> On Thu, 2004-03-11 at 19:42, Richard Browning wrote:
> > On Friday 12 March 2004 00:36, Zwane Mwaikambo wrote:
> > > On Fri, 12 Mar 2004, Richard Browning wrote:
> > > > > For my own curiosity, does switching the processors around do anything?
> > > > > Those MCEs look confined to the non bootstrap processor package.
> > > >
> > > > Switched CPUs. This time I get the following:
> > > >
> > > > CPU3: Machine Check Exception: 000.0004
> > > > CPU2: Machine Check Exception: 000.0004
> > > > Bank 0: a20000008c010400
> > > > Kernel Panic: CPU context corrupt
> > > > In idle task - not syncing
> > > >
> > > > Note that the CPU# designations are swapped and that there's only one
> > > > Bank 0: message. Is this significant?
> > >
> > > Ok, but that's still on the same package so it's not moving with the
> > > processor, thanks. Could you also supply processor info from
> > > /proc/cpuinfo.
> > 
> > I suppose that's good (for me); indicates no hardware error?
> 
> MCE == hardware error.
> In this case un-recoverable.
> 
> I'll take a swing at decoding this, call the Coast Guard if I don't
> return in 30 minutes;-)
> 
> http://developer.intel.com/design/pentium4/manuals/25366813.pdf
> 
> > Machine Check Exception: 000.0004
> 
> fig 14-4 says this means that indeed, you have a valid MCE.
> 
> > Bank 0: a20000008c010400
> 
> fig 14-6 says:
> 63: valid register contents
> 61: UC -- processor did not correct the error
> 57: PCC -- Processor context corrupt (you're dead)
> 
> 0400 is the MCA error code
> 
> fig E2 says
> 10 - internal watchdog timeout.
> 26,27 -- TT -- Thread timeout indicator -- both threads timed out
> 
> > /proc/cpuinfo of course:
> > 
> > processor       : 0
> > vendor_id       : GenuineIntel
> > cpu family      : 15
> > model           : 2
> 
> I have no idea what causes this error, but it sure sounds specific to
> the processor, and specific to HT -- which matches your experiments. 
> I'd imagine that after you verify that you've got the latest BIOS for
> the board and the error persists that you should look into getting that
> specific processor replaced.
> 
> cheers,
> -Len
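
For anyone retracing the decode quoted above, here is a minimal Python sketch
of the same bit tests. It is not from the thread. The flag positions for the
global status word (RIPV, EIPV, MCIP) and the bank status word (VAL, OVER, UC,
EN, MISCV, ADDRV, PCC) follow the architectural machine-check register layout.
The meanings attached to bit 10 and bits 26-27 are taken from the mail as
written and are model-specific, so treat them as assumptions. The function
and table names are hypothetical, and "000.0004" is assumed to mean a value
of 0x4.

```python
# Illustrative decoder for the two values quoted in the mail.
# Names (MCG_FLAGS, MCI_FLAGS, set_flags, decode_bank_status) are made up here.

# Global machine-check status word: bit -> flag name.
MCG_FLAGS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}

# Per-bank status word: architectural flag bits in the top byte.
MCI_FLAGS = {
    63: "VAL",    # register contents valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register valid
    58: "ADDRV",  # ADDR register valid
    57: "PCC",    # processor context corrupt
}


def set_flags(value, table):
    """Return the names of the flags set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if (value >> bit) & 1]


def decode_bank_status(status):
    """Split a 64-bit bank status value into the fields discussed in the mail."""
    return {
        "flags": set_flags(status, MCI_FLAGS),
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        # Assumption from the mail: bit 10 = internal watchdog timeout.
        "internal_watchdog_timeout": bool(status & (1 << 10)),
        # Assumption from the mail: bits 27:26 = per-thread timeout indicator.
        "thread_timeout_bits": (status >> 26) & 0x3,
    }


if __name__ == "__main__":
    print(set_flags(0x4, MCG_FLAGS))
    for key, val in decode_bank_status(0xA20000008C010400).items():
        print(key, hex(val) if isinstance(val, int) and not isinstance(val, bool) else val)
```

With the values from the report, the global word has only MCIP set, and the
bank word has VAL, UC and PCC set with an error code of 0x0400 and both
thread-timeout bits on, which agrees with the reading in the quoted mail.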
> 
> 

