public inbox for linux-kernel@vger.kernel.org
From: Len Brown <len.brown@intel.com>
To: Richard Browning <richard@redline.org.uk>
Cc: Zwane Mwaikambo <zwane@linuxpower.ca>,
	linux-kernel@vger.kernel.org,
	Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Subject: Re: SMP + Hyperthreading / Asus PCDL Deluxe / Kernel 2.4.x 2.6.x / Crash/Freeze
Date: 12 Mar 2004 02:07:17 -0500
Message-ID: <1079075236.3885.52.camel@dhcppc4>
In-Reply-To: <1079072878.3885.33.camel@dhcppc4>

Hmm, read that note too fast...
Since the failure did not follow the package to the BSP socket
(CPU0/CPU1), but instead stayed with the AP (CPU2/CPU3) socket, that
suggests an issue with the motherboard (MB) rather than the processor itself.

-Len

On Fri, 2004-03-12 at 01:27, Len Brown wrote:
> On Thu, 2004-03-11 at 19:42, Richard Browning wrote:
> > On Friday 12 March 2004 00:36, Zwane Mwaikambo wrote:
> > > On Fri, 12 Mar 2004, Richard Browning wrote:
> > > > > For my own curiosity, does switching the processors around do anything?
> > > > > Those MCEs look confined to the non bootstrap processor package.
> > > >
> > > > Switched CPUs. This time I get the following:
> > > >
> > > > CPU3: Machine Check Exception: 000.0004
> > > > CPU2: Machine Check Exception: 000.0004
> > > > Bank 0: a20000008c010400
> > > > Kernel Panic: CPU context corrupt
> > > > In idle task - not syncing
> > > >
> > > > Note that the CPU# designations are swapped and that there's only one
> > > > Bank 0: message. Is this significant?
> > >
> > > Ok, but that's still on the same package so it's not moving with the
> > > processor, thanks. Could you also supply processor info from
> > > /proc/cpuinfo.
> > 
> > I suppose that's good (for me); indicates no hardware error?
> 
> MCE == hardware error.
> In this case un-recoverable.
> 
> I'll take a swing at decoding this, call the Coast Guard if I don't
> return in 30 minutes;-)
> 
> http://developer.intel.com/design/pentium4/manuals/25366813.pdf
> 
> > Machine Check Exception: 000.0004
> 
> fig 14-4 says this means that indeed, you have a valid MCE.
> 
> > Bank 0: a20000008c010400
> 
> fig 14-6 says:
> 63: valid register contents
> 61: UC -- processor did not correct the error
> 57: PCC -- Processor context corrupt (you're dead)
> 
> 0400 is the MCA error code
> 
> fig E2 says
> 10 - internal watchdog timeout.
> 26,27 -- TT -- Thread timeout indicator -- both threads timed out
> 
> > /proc/cpuinfo of course:
> > 
> > processor       : 0
> > vendor_id       : GenuineIntel
> > cpu family      : 15
> > model           : 2
> 
> I have no idea what causes this error, but it sure sounds specific to
> the processor, and specific to HT -- which matches your experiments. 
> I'd imagine that after you verify that you've got the latest BIOS for
> the board and the error persists that you should look into getting that
> specific processor replaced.
> 
> cheers,
> -Len
> 
> 
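
The bit-by-bit reading of the Bank 0 value quoted above can be reproduced with a small script. This is only a sketch: the function names are made up for illustration, the flag positions (63, 61, 57) and the low 16 bits as the MCA error code are taken from the message, and treating bits 16-31 as the model-specific error code is an assumption based on the usual layout of the machine-check status register. It does not interpret the codes themselves; that still needs the manual.

```python
def set_bits(value: int) -> list[int]:
    """Return the positions of all 1 bits in a 64-bit value, lowest first."""
    return [bit for bit in range(64) if (value >> bit) & 1]


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit machine-check bank status value into a few fields.

    Flag positions follow the decode in the message above; the 16/16 split
    of the low word is an assumption (see the lead-in).
    """
    return {
        "valid": bool((status >> 63) & 1),            # bit 63: register contents valid
        "uncorrected": bool((status >> 61) & 1),      # bit 61: UC
        "context_corrupt": bool((status >> 57) & 1),  # bit 57: PCC
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    bank0 = 0xA20000008C010400  # value from the panic message
    for name, value in decode_mci_status(bank0).items():
        shown = f"{value:#06x}" if name.endswith("_code") else value
        print(f"{name}: {shown}")
    print("bits set:", set_bits(bank0))
```

For the reported value, the set bits include 10, 26 and 27, which are the ones the message reads as the watchdog timeout and the thread timeout indicator.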

