From: Len Brown <len.brown@intel.com>
To: Richard Browning <richard@redline.org.uk>
Cc: Zwane Mwaikambo <zwane@linuxpower.ca>,
linux-kernel@vger.kernel.org,
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Subject: Re: SMP + Hyperthreading / Asus PCDL Deluxe / Kernel 2.4.x 2.6.x / Crash/Freeze
Date: 12 Mar 2004 02:07:17 -0500
Message-ID: <1079075236.3885.52.camel@dhcppc4>
In-Reply-To: <1079072878.3885.33.camel@dhcppc4>

Hmm, read that note too fast...

Since the failure did not follow the package to the BSP socket
(CPU0/CPU1), but instead stayed with the AP (CPU2/CPU3) socket, that
suggests an issue with the MB rather than the processor itself.

-Len

On Fri, 2004-03-12 at 01:27, Len Brown wrote:
> On Thu, 2004-03-11 at 19:42, Richard Browning wrote:
> > On Friday 12 March 2004 00:36, Zwane Mwaikambo wrote:
> > > On Fri, 12 Mar 2004, Richard Browning wrote:
> > > > > For my own curiosity, does switching the processors around do anything?
> > > > > Those MCEs look confined to the non bootstrap processor package.
> > > >
> > > > Switched CPUs. This time I get the following:
> > > >
> > > > CPU3: Machine Check Exception: 000.0004
> > > > CPU2: Machine Check Exception: 000.0004
> > > > Bank 0: a20000008c010400
> > > > Kernel Panic: CPU context corrupt
> > > > In idle task - not syncing
> > > >
> > > > Note that the CPU# designations are swapped and that there's only one
> > > > Bank 0: message. Is this significant?
> > >
> > > Ok, but that's still on the same package so it's not moving with the
> > > processor, thanks. Could you also supply processor info from
> > > /proc/cpuinfo.
> >
> > I suppose that's good (for me); indicates no hardware error?
>
> MCE == hardware error.
> In this case un-recoverable.
>
> I'll take a swing at decoding this, call the Coast Guard if I don't
> return in 30 minutes;-)
>
> http://developer.intel.com/design/pentium4/manuals/25366813.pdf
>
> > Machine Check Exception: 000.0004
>
> fig 14-4 says this means that indeed, you have a valid MCE.
>
> > Bank 0: a20000008c010400
>
> fig 14-6 says:
> 63: valid register contents
> 61: UC -- processor did not correct the error
> 57: PCC -- Processor context corrupt (you're dead)
>
> 0400 is the MCA error code
>
> fig E2 says
> 10 - internal watchdog timeout.
> 26,27 -- TT -- Thread timeout indicator -- both threads timed out
>
> > /proc/cpuinfo of course:
> >
> > processor : 0
> > vendor_id : GenuineIntel
> > cpu family : 15
> > model : 2
>
> I have no idea what causes this error, but it sure sounds specific to
> the processor, and specific to HT -- which matches your experiments.
> I'd imagine that after you verify that you've got the latest BIOS for
> the board and the error persists that you should look into getting that
> specific processor replaced.
>
> cheers,
> -Len
>
>
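
The hand decoding quoted above can be checked mechanically. What follows is only a minimal sketch in Python, not anything posted in the thread. It assumes the bank value is the 64-bit status register that fig 14-6 describes, and that the transcribed "000.0004" stands for a global status value of 0x4. The message names only bits 63, 61 and 57, bit 10 of the error code, and bits 26 and 27; the other flag names (OVER, EN, MISCV, ADDRV, RIPV, EIPV, MCIP) are added here from the standard layout of these registers in Intel's manual. The helper names set_flags and decode_bank_status are hypothetical.

```python
# Flag bits of the per-bank status register (IA32_MCi_STATUS).
# Bits 63, 61 and 57 are the ones named in the message; the rest are
# added here for completeness and are not mentioned in the thread.
MCI_STATUS_FLAGS = {
    63: "VAL",    # register contents valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # processor did not correct the error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds valid data
    57: "PCC",    # processor context corrupt
}

# Flag bits of the global status register (IA32_MCG_STATUS).
MCG_STATUS_FLAGS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}


def set_flags(value, names):
    """Return the names of the bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True)
            if (value >> bit) & 1]


def decode_bank_status(status):
    """Split a 64-bit bank status value into the fields read out by hand."""
    mca_error_code = status & 0xFFFF
    return {
        "flags": set_flags(status, MCI_STATUS_FLAGS),
        "mca_error_code": mca_error_code,
        "model_specific_code": (status >> 16) & 0xFFFF,
        # Bit 10 of the error code: read in the message as a watchdog timeout.
        "watchdog_timeout": bool((mca_error_code >> 10) & 1),
        # Bits 26 and 27: the TT field; 0b11 is read as both threads timing out.
        "thread_timeout": (status >> 26) & 0b11,
    }


bank0 = decode_bank_status(0xA20000008C010400)
print(bank0["flags"])                     # ['VAL', 'UC', 'PCC']
print(hex(bank0["mca_error_code"]))       # 0x400
print(bank0["watchdog_timeout"], bin(bank0["thread_timeout"]))
print(set_flags(0x4, MCG_STATUS_FLAGS))   # ['MCIP']
```

Under those assumptions the result agrees with the reading in the message: the bank value is valid, uncorrected and context-corrupting, the error code is 0400 with bit 10 set, and bits 26 and 27 are both set. A global status of 0x4 has only the machine-check-in-progress bit set.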