From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Roger Cruz"
Subject: How to generate a HW NMI
Date: Thu, 30 Sep 2010 12:59:25 -0500
Message-ID:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0651852577=="
Return-path:
Content-class: urn:content-classes:message
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.

--===============0651852577==
Content-class: urn:content-classes:message
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01CB60C9.38DC456A"

This is a multi-part message in MIME format.

------_=_NextPart_001_01CB60C9.38DC456A
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hi fellow Xen developers,

I continue to get system hangs where the watchdog NMI in Xen is not
doing its job. I am completely blind as to what is getting jammed.
Tried multiple experiments to force the hang and in each, the watchdog
has kicked in, so I know the mechanism works 99% of the time except in
my one hang.

So in the old days of PCI bus, I used to be able to generate a HW NMI by
asserting the SERR signal in the connector. With the advent of PCIe, I
believe that signal is no longer present, so I am looking for any other
way to cause a system error. I have examined the PCI express
mini-card specification looking for a signal I can use in the internal
WiFi connector, but alas, none of the signals I read about seem like
they would do what I need. I am not sure if there is anything I can
short in the PCIe signals that could have a similar effect as the SERR
signal. The platform is a Lenovo T500 laptop so the number of
connectors to play with is limited.

I also thought of causing a parity/ECC error but the GM45 chipset used
in this laptop does not support ECC memory.

So I'm basically looking for any other ideas on how to cause a fault by
probing somewhere in the motherboard. This MB has a docking station
connector but I have not been able to find the pinout list so I don't
know what is brought out there. At this point, I have no problem
cracking up the case and soldering something on to the motherboard.. I
just need to know what chips and signals to tap.

Thanks in advance.

Roger R. Cruz

------_=_NextPart_001_01CB60C9.38DC456A
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01CB60C9.38DC456A--

--===============0651852577==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0651852577==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: How to generate a HW NMI
Date: Fri, 1 Oct 2010 10:15:23 -0400
Message-ID: <20101001141523.GB28639@dumpdata.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Roger Cruz
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On Thu, Sep 30, 2010 at 12:59:25PM -0500, Roger Cruz wrote:
> Hi fellow Xen developers,
>
>
>
> I continue to get system hangs where the watchdog NMI in Xen is not
> doing its job. I am completely blind as to what is getting jammed.
> Tried multiple experiments to force the hang and in each, the watchdog
> has kicked in, so I know the mechanism works 99% of the time except in
> my one hang.
>
>
>
> So in the old days of PCI bus, I used to be able to generate a HW NMI by
> asserting the SERR signal in the connector. With the advent of PCIe, I

Nice.

> believe that signal is no longer present, so I am looking for any other
> way to cause a system error. I have examined the PCI express

What about the Mini PCI-e to PCI-e adapter:
http://www.hwtools.net/adapter/PM2C.html

And then plug in a PCI to PCI-e adapter:
http://www.newegg.com/Product/Product.aspx?Item=N82E16815158165&nm_mc=OTC-Froogle&cm_mmc=OTC-Froogle-_-Add-On+Cards-_-STARTECH-_-15158165

And then assert the SERR#?

> mini-card specification looking for a signal I can use in the internal
> WiFi connector, but alas, none of the signals I read about seem like
> they would do what I need. I am not sure if there is anything I can
> short in the PCIe signals that could have a similar effect as the SERR

Per this slide deck:
http://www.pcisig.com/developers/main/training_materials/get_document?doc_id=cdf593816ee20b90d8603d4aeb081a726ddc3091

it looks as if you can program the PCIe bridge to fall back to "legacy"
mode. And per some folks' post:
http://forums.gentoo.org/viewtopic-t-752165.html

it looks as if the SERR# signal is asserted on SMBus controller? Maybe
there is a way to do it via that?

> signal. The platform is a Lenovo T500 laptop so the number of
> connectors to play with is limited.
>

IBM on the server side used to have NMI buttons - it could be that
Lenovo hadn't completely gotten rid of them. Since you are open to
looking at the motherboard, maybe there is a spot marked #NMI ?

>
>
>
> I also thought of causing a parity/ECC error but the GM45 chipset used
> in this laptop does not support ECC memory.
>
>
> So I'm basically looking for any other ideas on how to cause a fault by
> probing somewhere in the motherboard. This MB has a docking station
> connector but I have not been able to find the pinout list so I don't
> know what is brought out there. At this point, I have no problem

How about just shorting the pins randomly :-)

> cracking up the case and soldering something on to the motherboard.. I
> just need to know what chips and signals to tap.
>
>
>
> Thanks in advance.
>
>
>
> Roger R. Cruz
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Roger Cruz"
Subject: RE: How to generate a HW NMI
Date: Fri, 1 Oct 2010 14:33:20 -0500
Message-ID:
References: <20101001141523.GB28639@dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Return-path:
Content-class: urn:content-classes:message
In-Reply-To: <20101001141523.GB28639@dumpdata.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Konrad Rzeszutek Wilk
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Great ideas Konrad. I have ordered these parts. It will probably take
a few days before they get here.
The goal of using the HW NMI is to rule out any incorrect SW settings of
the Performance Monitoring counters used in Xen to trigger the NMI.

Someone else mentioned that another possibility as to why an NMI may not
be triggered is that the system is stuck handling an SMI interrupt. I
haven't studied Xen code with respect to SMIs yet, but I assume that Xen
doesn't do much in that area right? I was under the impression that the
BIOS usually set this up and the OSs could not even modify the handlers
as they were in protected RAM.

R.

-----Original Message-----
From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
Sent: Friday, October 01, 2010 10:15 AM
To: Roger Cruz
Cc: xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] How to generate a HW NMI

On Thu, Sep 30, 2010 at 12:59:25PM -0500, Roger Cruz wrote:
> Hi fellow Xen developers,
>
>
>
> I continue to get system hangs where the watchdog NMI in Xen is not
> doing its job. I am completely blind as to what is getting jammed.
> Tried multiple experiments to force the hang and in each, the watchdog
> has kicked in, so I know the mechanism works 99% of the time except in
> my one hang.
>
>
>
> So in the old days of PCI bus, I used to be able to generate a HW NMI by
> asserting the SERR signal in the connector. With the advent of PCIe, I

Nice.

> believe that signal is no longer present, so I am looking for any other
> way to cause a system error. I have examined the PCI express

What about the Mini PCI-e to PCI-e adapter:
http://www.hwtools.net/adapter/PM2C.html

And then plug in a PCI to PCI-e adapter:
http://www.newegg.com/Product/Product.aspx?Item=N82E16815158165&nm_mc=OTC-Froogle&cm_mmc=OTC-Froogle-_-Add-On+Cards-_-STARTECH-_-15158165

And then assert the SERR#?

> mini-card specification looking for a signal I can use in the internal
> WiFi connector, but alas, none of the signals I read about seem like
> they would do what I need. I am not sure if there is anything I can
> short in the PCIe signals that could have a similar effect as the SERR

Per this slide deck:
http://www.pcisig.com/developers/main/training_materials/get_document?doc_id=cdf593816ee20b90d8603d4aeb081a726ddc3091

it looks as if you can program the PCIe bridge to fall back to "legacy"
mode. And per some folks' post:
http://forums.gentoo.org/viewtopic-t-752165.html

it looks as if the SERR# signal is asserted on SMBus controller? Maybe
there is a way to do it via that?

> signal. The platform is a Lenovo T500 laptop so the number of
> connectors to play with is limited.
>

IBM on the server side used to have NMI buttons - it could be that
Lenovo hadn't completely gotten rid of them. Since you are open to
looking at the motherboard, maybe there is a spot marked #NMI ?

>
>
>
> I also thought of causing a parity/ECC error but the GM45 chipset used
> in this laptop does not support ECC memory.
>
>
> So I'm basically looking for any other ideas on how to cause a fault by
> probing somewhere in the motherboard. This MB has a docking station
> connector but I have not been able to find the pinout list so I don't
> know what is brought out there. At this point, I have no problem

How about just shorting the pins randomly :-)

> cracking up the case and soldering something on to the motherboard.. I
> just need to know what chips and signals to tap.
>
>
>
> Thanks in advance.
>
>
>
> Roger R. Cruz
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

No virus found in this incoming message.
Checked by AVG - www.avg.com
Version: 9.0.856 / Virus Database: 271.1.1/3168 - Release Date: 10/01/10 02:34:00

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: How to generate a HW NMI
Date: Fri, 1 Oct 2010 16:01:23 -0400
Message-ID: <20101001200123.GA17776@dumpdata.com>
References: <20101001141523.GB28639@dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Roger Cruz
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On Fri, Oct 01, 2010 at 02:33:20PM -0500, Roger Cruz wrote:
> Great ideas Konrad. I have ordered these parts. It will probably take
> a few days before they get here.
> The goal of using the HW NMI is to rule out any incorrect SW settings of
> the Performance Monitoring counters used in Xen to trigger the NMI.

Right.

>
> Someone else mentioned that another possibility as to why an NMI may not
> be triggered is that the system is stuck handling an SMI interrupt. I
> haven't studied Xen code with respect to SMIs yet, but I assume that Xen
> doesn't do much in that area right? I was under the impression that the
> BIOS usually set this up and the OSs could not even modify the handlers
> as they were in protected RAM.

Ugh. That is true - we have no notion of when the SMIs run. Not that the
SMIs are actually working 100% all the time.

Another thought, and this might be a complete shot in the dark. Look in
the upstream (2.6.36-rc6) blacklist.c file. There is an entry for that
specific ThinkPad which activates the ACPI _OSI, maybe that needs to be
done?

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Huang2, Wei"
Subject: pciback doesn't take CardBus device
Date: Fri, 1 Oct 2010 15:36:45 -0500
Message-ID:
References: <20101001141523.GB28639@dumpdata.com> <20101001200123.GA17776@dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Return-path:
In-Reply-To: <20101001200123.GA17776@dumpdata.com>
Content-Language: en-US
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Konrad Rzeszutek Wilk
Cc: Xen-devel
List-Id: xen-devel@lists.xenproject.org

Hi Konrad,

I found that pciback doesn't accept CardBus device. It only handles
type-0 and type-1. Any specific reason to skip it? That caused some
trouble for firewire passthru on my laptop. I want to know the reason
before submitting a patch.

Thanks,
-Wei

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: pciback doesn't take CardBus device
Date: Fri, 1 Oct 2010 16:45:33 -0400
Message-ID: <20101001204533.GA18203@dumpdata.com>
References: <20101001141523.GB28639@dumpdata.com> <20101001200123.GA17776@dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Huang2, Wei"
Cc: Xen-devel
List-Id: xen-devel@lists.xenproject.org

On Fri, Oct 01, 2010 at 03:36:45PM -0500, Huang2, Wei wrote:
> Hi Konrad,
>
> I found that pciback doesn't accept CardBus device. It only handles type-0 and type-1. Any specific reason to skip it? That caused some trouble for firewire passthru on my laptop. I want to know the reason before submitting a patch.

No reason at all. Was this working in the past (2.6.18?). I will gladly
accept any patch.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Huang2, Wei"
Subject: RE: pciback doesn't take CardBus device
Date: Fri, 1 Oct 2010 16:04:05 -0500
Message-ID:
References: <20101001141523.GB28639@dumpdata.com> <20101001200123.GA17776@dumpdata.com> <20101001204533.GA18203@dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Return-path:
In-Reply-To: <20101001204533.GA18203@dumpdata.com>
Content-Language: en-US
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Konrad Rzeszutek Wilk
Cc: Xen-devel
List-Id: xen-devel@lists.xenproject.org

I haven't tested 2.6.18 yet; but will do. The issue I found is with the
following configuration. These devices are behind the same bridge. But
because 46:06.5 is a CardBus and can't be assigned, it blocks other
devices from being assigned to a guest VM. I will create a patch for it.

Thanks,
-Wei

===========
46:06.0 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller (rev 06)
46:06.1 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 25)
46:06.2 System peripheral: Ricoh Co Ltd R5C843 MMC Host Controller (rev 14)
46:06.3 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 14)
46:06.4 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev 14)
46:06.5 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev bb)
===========

-----Original Message-----
From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
Sent: Friday, October 01, 2010 3:46 PM
To: Huang2, Wei
Cc: Xen-devel
Subject: Re: pciback doesn't take CardBus device

On Fri, Oct 01, 2010 at 03:36:45PM -0500, Huang2, Wei wrote:
> Hi Konrad,
>
> I found that pciback doesn't accept CardBus device. It only handles type-0 and type-1. Any specific reason to skip it? That caused some trouble for firewire passthru on my laptop. I want to know the reason before submitting a patch.

No reason at all. Was this working in the past (2.6.18?). I will gladly
accept any patch.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Roger Cruz"
Subject: RE: How to generate a HW NMI
Date: Mon, 4 Oct 2010 08:56:03 -0500
Message-ID:
References: <20101001141523.GB28639@dumpdata.com> <4CA9AC25.6020707@siemens.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0506332333=="
Return-path:
Content-class: urn:content-classes:message
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jan Kiszka
Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.

--===============0506332333==
Content-class: urn:content-classes:message
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01CB63CC.2B4316AB"

This is a multi-part message in MIME format.

------_=_NextPart_001_01CB63CC.2B4316AB
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Jan,

I will try your suggestion of turning off SMIs. I am also interested in
you conducting an experiment for me. If you can, please tell your kernel
not to use any CPU power saving modes. In Xen I use max_cstate=0 in the
bootline. I have found that when I do this, the hangs appear to go away
(we had one customer report one since using this work-around, so it is
not 100% working).

Thanks
Roger

-----Original Message-----
From: Jan Kiszka [mailto:jan.kiszka@siemens.com]
Sent: Mon 10/4/2010 6:27 AM
To: Roger Cruz
Cc: Konrad Rzeszutek Wilk; xen-devel@lists.xensource.com
Subject: Re: How to generate a HW NMI

Am 01.10.2010 21:33, Roger Cruz wrote:
> Someone else mentioned that another possibility as to why an NMI may not
> be triggered is that the system is stuck handling an SMI interrupt. I
> haven't studied Xen code with respect to SMIs yet, but I assume that Xen
> doesn't do much in that area right? I was under the impression that the
> BIOS usually set this up and the OSs could not even modify the handlers
> as they were in protected RAM.

We happen to face strange freezes of KVM right now as well (CPU is
apparently stuck in guest mode), and turning off SMIs cures them here
[1]. However, it's too early to draw final conclusions, we are still
collecting test results & data on the systems.

It would therefore be interesting to see if your case is similar to ours.
If you feel brave enough to turn off your SMIs (there are rumors that
CPUs /could/ get fried as some thermal management /might/ be done via
SMIs), please check out [2], build it (requires libpci and a kernel
source tree), and run "smictrl -s 0" on your box. Should give something
like this:

SMI-enabled chipset found:
 PCI_VENDOR_ID_INTEL:PCI_DEVICE_ID_INTEL_PCH_LPC_MIN+7 (8086:3b07)
 SMI_EN register:       0006403b
 new value:             00000002

If the chipset is not detected, add the PCI device ID of your ISA bridge
to the list in smictrl.c. If the new value still has bit 0 set, you are
unlucky as your BIOS has locked some SMIs against disabling. Otherwise,
SMIs are off now, and your lock up /may/ disappear. Looking forward to
your results!

Jan

[1] http://thread.gmane.org/gmane.comp.emulators.kvm.devel/60326
[2] http://git.kiszka.org/?p=smictrl.git

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

------_=_NextPart_001_01CB63CC.2B4316AB
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
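
For readers following the smictrl output quoted in Jan's message, the before/after SMI_EN values can be compared mechanically. The following is only a minimal sketch, not part of smictrl: it relies solely on what the message states (bit 0 set means SMIs are still on), and the constant name `GBL_SMI_EN` and both helper functions are hypothetical names chosen for illustration.

```python
# Sketch only: interprets two SMI_EN values such as those printed in the
# sample output above. It does not touch any hardware.

GBL_SMI_EN = 1 << 0  # hypothetical name for bit 0, the bit the message says must be clear


def smi_globally_enabled(smi_en):
    """Return True if bit 0 of an SMI_EN value is still set."""
    return bool(smi_en & GBL_SMI_EN)


def cleared_bits(before, after):
    """List bit positions that were set in `before` and are clear in `after`."""
    changed = before & ~after
    return [bit for bit in range(32) if changed & (1 << bit)]


if __name__ == "__main__":
    before, after = 0x0006403B, 0x00000002  # values from the quoted sample output
    print("bit 0 set before:", smi_globally_enabled(before))
    print("bit 0 set after: ", smi_globally_enabled(after))
    print("bits cleared:", cleared_bits(before, after))
```

With the sample values, bit 0 is set before and clear afterwards, which is the "SMIs are off now" case described in the message; a new value that still has bit 0 set would correspond to the BIOS-locked case.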

------_=_NextPart_001_01CB63CC.2B4316AB--

--===============0506332333==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0506332333==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Roger Cruz"
Subject: RE: How to generate a HW NMI
Date: Mon, 4 Oct 2010 09:19:21 -0500
Message-ID:
References: <20101001141523.GB28639@dumpdata.com> <4CA9AC25.6020707@siemens.com> <4CA9E0FB.6000109@siemens.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Return-path:
Content-class: urn:content-classes:message
In-Reply-To: <4CA9E0FB.6000109@siemens.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jan Kiszka
Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

Until Friday, all hard hangs that we and our customers had experienced
were on Lenovo T500 and X200, even with their latest BIOSes. The Lenovo
T400 has never hung for me and I don't have any reports on them from the
field. On Friday, I had an HP i5 hard hang with similar footprint as
the Lenovos. When this hard hang happens, the Xen watchdog (which is
driven by the NMI handler) will not do its job and cause a crash/stack
trace. This is why we have started to suspect something with the BIOS
and SMIs as they are the only thing that can block an NMI. I am pretty
certain that this is somehow related to entering C3 power states and
possibly at the same time an SMI comes in. The time it takes to hang
varies from 30mins to 24 hrs.

Roger

-----Original Message-----
From: Jan Kiszka [mailto:jan.kiszka@siemens.com]
Sent: Monday, October 04, 2010 10:13 AM
To: Roger Cruz
Cc: Konrad Rzeszutek Wilk; xen-devel@lists.xensource.com
Subject: Re: How to generate a HW NMI

Am 04.10.2010 15:56, Roger Cruz wrote:
> Jan,
>
> I will try your suggestion of turning off SMIs. I am also interested in you
> conducting an experiment for me. If you can, please tell your kernel not to use
> any CPU power saving modes. In Xen I use max_cstate=0 in the bootline. I have
> found that when I do this, the hangs appear to go away (we had one customer
> report one since using this work-around, so it is not 100% working).

Will do. My customer reported that he was able to easily crash his i7
notebook by pulling and re-plugging the power cable. I bet all of these
events are trapped by the BIOS via power management SMIs...

BTW, do you see any correlation between crashable boxes and BIOS
vendors? We have no representative numbers yet, just one confirmed
unstable notebook that is Phoenix-based, while one AMI-based i7 server
that is rock-stable.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

No virus found in this incoming message.
Checked by AVG - www.avg.com
Version: 9.0.856 / Virus Database: 271.1.1/3168 - Release Date: 10/04/10 02:35:00

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dan Magenheimer
Subject: RE: RE: How to generate a HW NMI
Date: Mon, 4 Oct 2010 08:23:39 -0700 (PDT)
Message-ID: <859bc1d6-cb86-4f41-867c-c75e21a4bf95@default>
References: <20101001141523.GB28639@dumpdata.com> <4CA9AC25.6020707@siemens.com> <4CA9E0FB.6000109@siemens.com> <EACA7CA90354A849B1315959042A052C011B021A@BE24.exg4.exghost.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Roger Cruz , Jan Kiszka
Cc: xen-devel@lists.xensource.com, Konrad Wilk
List-Id: xen-devel@lists.xenproject.org

This is a long shot, but since my thoughts jumped to it after reading
this, I thought I'd post anyway.

Some systems support a special "C1E" power state that can be
enabled/disabled in the BIOS. My Dell Core2Duo laptop has this feature.
I remember running into some weirdness that went away when I turned it
off.

Perhaps the power management code is somehow entering the BIOS to see
if this is enabled and max_cstate isn't controlling it since the check
is done in the BIOS bypassing Xen?

Google for C1E to find lots of information about this weird power state.

> -----Original Message-----
> From: Roger Cruz [mailto:roger.cruz@virtualcomputer.com]
> Sent: Monday, October 04, 2010 8:19 AM
> To: Jan Kiszka
> Cc: xen-devel@lists.xensource.com; Konrad Rzeszutek Wilk
> Subject: [Xen-devel] RE: How to generate a HW NMI
>
> Until Friday, all hard hangs that we and our customers had experienced
> were on Lenovo T500 and X200, even with their latest BIOSes. The Lenovo
> T400 has never hung for me and I don't have any reports on them from the
> field. On Friday, I had an HP i5 hard hang with similar footprint as
> the Lenovos. When this hard hang happens, the Xen watchdog (which is
> driven by the NMI handler) will not do its job and cause a crash/stack
> trace. This is why we have started to suspect something with the BIOS
> and SMIs as they are the only thing that can block an NMI. I am pretty
> certain that this is somehow related to entering C3 power states and
> possibly at the same time an SMI comes in. The time it takes to hang
> varies from 30mins to 24 hrs.
>
> Roger
>
>
>
>
> -----Original Message-----
> From: Jan Kiszka [mailto:jan.kiszka@siemens.com]
> Sent: Monday, October 04, 2010 10:13 AM
> To: Roger Cruz
> Cc: Konrad Rzeszutek Wilk; xen-devel@lists.xensource.com
> Subject: Re: How to generate a HW NMI
>
> Am 04.10.2010 15:56, Roger Cruz wrote:
> > Jan,
> >
> > I will try your suggestion of turning off SMIs. I am also interested in you
> > conducting an experiment for me. If you can, please tell your kernel not to use
> > any CPU power saving modes. In Xen I use max_cstate=0 in the bootline. I have
> > found that when I do this, the hangs appear to go away (we had one customer
> > report one since using this work-around, so it is not 100% working).
>
> Will do. My customer reported that he was able to easily crash his i7
> notebook by pulling and re-plugging the power cable. I bet all of these
> events are trapped by the BIOS via power management SMIs...
>
> BTW, do you see any correlation between crashable boxes and BIOS
> vendors? We have no representative numbers yet, just one confirmed
> unstable notebook that is Phoenix-based, while one AMI-based i7 server
> that is rock-stable.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
>
> No virus found in this incoming message.
> Checked by AVG - www.avg.com > Version: 9.0.856 / Virus Database: 271.1.1/3168 - Release Date: > 10/04/10 > 02:35:00 >=20 > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Roger Cruz" Subject: RE: How to generate a HW NMI Date: Mon, 4 Oct 2010 14:03:05 -0500 Message-ID: References: <20101001141523.GB28639@dumpdata.com> <4CA9AC25.6020707@siemens.com> <4CA9E0FB.6000109@siemens.com> <4CA9F16C.905@siemens.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Return-path: Content-class: urn:content-classes:message In-Reply-To: <4CA9F16C.905@siemens.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Jan Kiszka Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk List-Id: xen-devel@lists.xenproject.org > BTW, "rmmod processor thermal" (should be equivalent to your Xen I am not familiar with the thermal module but my guess is that they are not the same as the C3 states which can be entered when the kernel becomes idle. I believe the thermal plays with other type of state (P?) where it alters the voltage and frequency of the CPU to keep the CPU still running but at a particular % of the top speed. The C3 state causes the CPU clocks to shutdown entirely and then it is awaken by an external event. R. -----Original Message----- From: Jan Kiszka [mailto:jan.kiszka@siemens.com]=20 Sent: Monday, October 04, 2010 11:23 AM To: Roger Cruz Cc: Konrad Rzeszutek Wilk; xen-devel@lists.xensource.com Subject: Re: How to generate a HW NMI Am 04.10.2010 16:19, Roger Cruz wrote: > Until Friday, all hard hangs that we and our customers had experienced > were on Lenovo T500 and X200, even with their latest BIOSes. Yeah, the T500 was reported as problematic here as well. 
My Fujitsu Celsius H700 also crashes. In contrast, we have positive
results from a Dell server with an Asus P6T Deluxe V2 board and a Core
i7 920.

> The Lenovo
> T400 has never hung for me and I don't have any reports on them from the
> field. On Friday, I had an HP i5 hard hang with similar footprint as

i5? Mmh, we only have reports from i7 so far. Which BIOS vendor?

> the Lenovos. When this hard hang happens, the Xen watchdog (which is
> driven by the NMI handler) will not do its job and cause a crash/stack
> trace.
> This is why we have started to suspect something with the BIOS
> and SMIs as they are the only thing that can block an NMI. I am pretty
> certain that this is somehow related to entering C3 power states and
> possibly at the same time an SMI comes in.

I tried various stuff under Linux as well: nmi_watchdog=1, tracing to
VGA buffer right before/after guest-host switch (it always hangs after
entry here), verified guest interruptibility before entry (though
hypervisors usually do not play with the critical bits), read-out of
host RAM (including kernel log buffer) via Firewire - it all points to a
crash outside the scope of the host OS.

> The time it takes to hang
> varies from 30mins to 24 hrs.

We are a bit more lucky, maybe due to our special guest (an old RTOS in
16-bit mode): I can reproduce the hang after a few minutes.

BTW, "rmmod processor thermal" (should be equivalent to your Xen
parameter) did not make a difference here.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

No virus found in this incoming message.
Checked by AVG - www.avg.com
Version: 9.0.856 / Virus Database: 271.1.1/3168 - Release Date: 10/04/10 02:35:00

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Roger Cruz"
Subject: RE: RE: How to generate a HW NMI
Date: Mon, 11 Oct 2010 16:20:22 -0500
Message-ID:
References: <20101001141523.GB28639@dumpdata.com><4CA9AC25.6020707@siemens.com><4CA9E0FB.6000109@siemens.com><4CA9F16C.905@siemens.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Return-path:
Content-class: urn:content-classes:message
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Roger Cruz , Jan Kiszka
Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

Here is some additional info from my experiments over the weekend.

I took the Lenovo T500 and removed its internal WiFi miniPCIe card. In
its place, I put in a miniPCIe to PCIe converter card with a PCIe
socket. Into that socket, I placed a PCIe dump card. This card has a
switch that when you press it, it creates an SERR error. Using the
utility provided by the vendor, I enabled all the bridges between the
card to carry the SERR signal to the CPU and cause the CPU to see it as
an NMI. I tested the set-up several times. Every single time I pressed
the switch, I got an NMI, followed by a kdump core. So I was sure the
HW setup was working correctly.

I left two Lenovo T500 running over the weekend and when I returned this
morning, both had hung. Completely frozen. I pressed the NMI switch in
both systems and nothing. No crashes, no coredumps. It looks as if the
SERR/NMI is getting ignored/blocked or the CPU is completely shut down
(STPCLK).

This experiment helps me prove that the software watchdog code in Xen
was not the problem and indeed the NMIs are getting blocked somehow.
This is what I now need to investigate.
Areas that I care to learn more about are the SMI handler and the
external chip's use of the STPCLK signal to the CPU.

As an additional bit of info, the only response we get when the systems
are hung is a beep when the power cord is unplugged/plugged from the
laptop. I don't know if the beep is done via a HW module or whether
ACPI/BIOS is involved.

Still looking for additional ideas.

Regards,
Roger R. Cruz

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Roger Cruz
Sent: Monday, October 04, 2010 3:03 PM
To: Jan Kiszka
Cc: xen-devel@lists.xensource.com; Konrad Rzeszutek Wilk
Subject: [Xen-devel] RE: How to generate a HW NMI

> BTW, "rmmod processor thermal" (should be equivalent to your Xen

I am not familiar with the thermal module but my guess is that they are
not the same as the C3 states which can be entered when the kernel
becomes idle. I believe the thermal plays with other type of state (P?)
where it alters the voltage and frequency of the CPU to keep the CPU
still running but at a particular % of the top speed. The C3 state
causes the CPU clocks to shut down entirely and then it is awakened by an
external event.

R.

-----Original Message-----
From: Jan Kiszka [mailto:jan.kiszka@siemens.com]
Sent: Monday, October 04, 2010 11:23 AM
To: Roger Cruz
Cc: Konrad Rzeszutek Wilk; xen-devel@lists.xensource.com
Subject: Re: How to generate a HW NMI

Am 04.10.2010 16:19, Roger Cruz wrote:
> Until Friday, all hard hangs that we and our customers had experienced
> were on Lenovo T500 and X200, even with their latest BIOSes.

Yeah, the T500 was reported as problematic here as well. My Fujitsu
Celsius H700 also crashes. In contrast, we have positive results from a
Dell server with an Asus P6T Deluxe V2 board and a Core i7 920.

> The Lenovo
> T400 has never hung for me and I don't have any reports on them from the
> field. On Friday, I had an HP i5 hard hang with similar footprint as

i5? Mmh, we only have reports from i7 so far. Which BIOS vendor?

> the Lenovos. When this hard hang happens, the Xen watchdog (which is
> driven by the NMI handler) will not do its job and cause a crash/stack
> trace.
> This is why we have started to suspect something with the BIOS
> and SMIs as they are the only thing that can block an NMI. I am pretty
> certain that this is somehow related to entering C3 power states and
> possibly at the same time an SMI comes in.

I tried various stuff under Linux as well: nmi_watchdog=1, tracing to
VGA buffer right before/after guest-host switch (it always hangs after
entry here), verified guest interruptibility before entry (though
hypervisors usually do not play with the critical bits), read-out of
host RAM (including kernel log buffer) via Firewire - it all points to a
crash outside the scope of the host OS.

> The time it takes to hang
> varies from 30mins to 24 hrs.

We are a bit more lucky, maybe due to our special guest (an old RTOS in
16-bit mode): I can reproduce the hang after a few minutes.

BTW, "rmmod processor thermal" (should be equivalent to your Xen
parameter) did not make a difference here.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

No virus found in this incoming message.
Checked by AVG - www.avg.com
Version: 9.0.856 / Virus Database: 271.1.1/3168 - Release Date: 10/04/10 02:35:00

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

No virus found in this incoming message.
Checked by AVG - www.avg.com
Version: 9.0.856 / Virus Database: 271.1.1/3168 - Release Date: 10/04/10 02:35:00

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Roger Cruz"
Subject: Re: RE: How to generate a HW NMI
Date: Tue, 12 Oct 2010 08:42:13 -0400
Message-ID: <17A8F715-1F95-4A4C-AC86-E6366673BEAC@virtualcomputer.com>
References: <20101001141523.GB28639@dumpdata.com><4CA9AC25.6020707@siemens.com><4CA9E0FB.6000109@siemens.com><4CA9F16C.905@siemens.com> <4CB420D4.2010507@siemens.com>
Mime-Version: 1.0 (iPhone Mail 7A341)
Content-Type: text/plain; format=flowed; delsp=yes; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <4CB420D4.2010507@siemens.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jan Kiszka
Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

Disabling SMIs is part of the experiments to be conducted today or
tomorrow. I will keep you posted.

On Oct 12, 2010, at 4:48 AM, "Jan Kiszka" wrote:

> Am 11.10.2010 23:20, Roger Cruz wrote:
>> Here is some additional info from my experiments over the weekend.
>>
>> I took the Lenovo T500 and removed its internal WiFi miniPCIe card. In
>> its place, I put in a miniPCIe to PCIe converter card with a PCIe
>> socket. Into that socket, I placed a PCIe dump card. This card has a
>> switch that when you press it, it creates an SERR error. Using the
>> utility provided by the vendor, I enabled all the bridges between the
>> card to carry the SERR signal to the CPU and cause the CPU to see it as
>> an NMI. I tested the set-up several times. Every single time I pressed
>> the switch, I got an NMI, followed by a kdump core. So I was sure the
>> HW setup was working correctly.
>>
>> I left two Lenovo T500 running over the weekend and when I returned this
>> morning, both had hung. Completely frozen. I pressed the NMI switch in
>> both systems and nothing. No crashes, no coredumps. It looks as if the
>> SERR/NMI is getting ignored/blocked or the CPU is completely shut down
>> (STPCLK).
>>
>> This experiment helps me prove that the software watchdog code in Xen
>> was not the problem and indeed the NMIs are getting blocked somehow.
>> This is what I now need to investigate. Areas that I care to learn more
>> about are the SMI handler and the external chip's use of the STPCLK
>> signal to the CPU.
>>
>> As an additional bit of info, the only response we get when the systems
>> are hung is a beep when the power cord is unplugged/plugged from the
>> laptop. I don't know if the beep is done via a HW module or whether
>> ACPI/BIOS is involved.
>>
>> Still looking for additional ideas.
>
> Already tried to disable SMIs?
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Roger Cruz"
Subject: RE: How to generate a HW NMI
Date: Tue, 12 Oct 2010 10:59:21 -0500
Message-ID:
References: <20101001141523.GB28639@dumpdata.com> <4CA9AC25.6020707@siemens.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1512112125=="
Return-path:
Content-class: urn:content-classes:message
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jan Kiszka
Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.

--===============1512112125==
Content-class: urn:content-classes:message
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01CB6A26.AD2ECED4"

This is a multi-part message in MIME format.
------_=_NextPart_001_01CB6A26.AD2ECED4
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Hi Jan,

Just letting you know that I am grateful for the help you have been
providing. I finally got around to doing the SMI test as you have
described here. It takes a day or two to know for sure the problem is
not going to happen so I will let the system stand still for a while.

This is the output of your tool. Bit 0 was cleared so SMIs should be
disabled at this point.

root@hedley-t500:~# ./smictrl -s 0
SMI-enabled chipset found:
 PCI_VENDOR_ID_INTEL:PCI_DEVICE_ID_INTEL_ICH9_1 (8086:2917)
 SMI_EN register:       00062033
 new value:             00000002

-----Original Message-----
From: Jan Kiszka [mailto:jan.kiszka@siemens.com]
Sent: Mon 10/4/2010 6:27 AM
To: Roger Cruz
Cc: Konrad Rzeszutek Wilk; xen-devel@lists.xensource.com
Subject: Re: How to generate a HW NMI

Am 01.10.2010 21:33, Roger Cruz wrote:
> Someone else mentioned that another possibility as to why an NMI may not
> be triggered is that the system is stuck handling an SMI interrupt. I
> haven't studied Xen code with respect to SMIs yet, but I assume that Xen
> doesn't do much in that area right? I was under the impression that the
> BIOS usually set this up and the OSs could not even modify the handlers
> as they were in protected RAM.

We happen to face strange freezes of KVM right now as well (CPU is
apparently stuck in guest mode), and turning off SMIs cures them here
[1]. However, it's too early to draw final conclusions, we are still
collecting test results & data on the systems.

It would therefore be interesting to see if your case is similar to ours.
If you feel brave enough to turn off your SMIs (there are rumors that
CPUs /could/ get fried as some thermal management /might/ be done via
SMIs), please check out [2], build it (requires libpci and a kernel
source tree), and run "smictrl -s 0" on your box.
Should give something like this:

SMI-enabled chipset found:
 PCI_VENDOR_ID_INTEL:PCI_DEVICE_ID_INTEL_PCH_LPC_MIN+7 (8086:3b07)
 SMI_EN register:       0006403b
 new value:             00000002

If the chipset is not detected, add the PCI device ID of your ISA bridge
to the list in smictrl.c. If the new value still has bit 0 set, you are
unlucky as your BIOS has locked some SMIs against disabling. Otherwise,
SMIs are off now, and your lock up /may/ disappear. Looking forward to
your results!

Jan

[1] http://thread.gmane.org/gmane.comp.emulators.kvm.devel/60326
[2] http://git.kiszka.org/?p=smictrl.git

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

No virus found in this incoming message.
Checked by AVG - www.avg.com
Version: 9.0.862 / Virus Database: 271.1.1/3168 - Release Date: 10/05/10 02:34:00

------_=_NextPart_001_01CB6A26.AD2ECED4
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

RE: How to generate a HW NMI

Hi Jan,

Just letting you know that I am grateful for the help you have been providing. I finally got around to doing the SMI test as you have described here. It takes a day or two to know for sure the problem is not going to happen so I will let the system stand still for a while.

This is the output of your tool. Bit 0 was cleared so SMIs should be disabled at this point.


root@hedley-t500:~# ./smictrl -s 0
SMI-enabled chipset found:
 PCI_VENDOR_ID_INTEL:PCI_DEVICE_ID_INTEL_ICH9_1 (8086:2917)
 SMI_EN register:       00062033
 new value:             00000002


-----Original Message-----
From: Jan Kiszka [mailto:jan.kiszka@siemens.com]
Sent: Mon 10/4/2010 6:27 AM
To: Roger Cruz
Cc: Konrad Rzeszutek Wilk; xen-devel@lists.xensource.com
Subject: Re: How to generate a HW NMI

Am 01.10.2010 21:33, Roger Cruz wrote:
> Someone else mentioned that another possibility as to why an NMI may not
> be triggered is that the system is stuck handling an SMI interrupt. I
> haven't studied Xen code with respect to SMIs yet, but I assume that Xen
> doesn't do much in that area right? I was under the impression that the
> BIOS usually set this up and the OSs could not even modify the handlers
> as they were in protected RAM.

We happen to face strange freezes of KVM right now as well (CPU is
apparently stuck in guest mode), and turning off SMIs cures them here
[1]. However, it's too early to draw final conclusions, we are still
collecting test results & data on the systems.

It would therefore be interesting to see if your case is similar to ours.
If you feel brave enough to turn off your SMIs (there are rumors that
CPUs /could/ get fried as some thermal management /might/ be done via
SMIs), please check out [2], build it (requires libpci and a kernel
source tree), and run "smictrl -s 0" on your box. Should give something
like this:

SMI-enabled chipset found:
 PCI_VENDOR_ID_INTEL:PCI_DEVICE_ID_INTEL_PCH_LPC_MIN+7 (8086:3b07)
 SMI_EN register:       0006403b
 new value:             00000002

If the chipset is not detected, add the PCI device ID of your ISA bridge
to the list in smictrl.c. If the new value still has bit 0 set, you are
unlucky as your BIOS has locked some SMIs against disabling. Otherwise,
SMIs are off now, and your lock up /may/ disappear. Looking forward to
your results!
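
The bit-0 check described above can also be done by hand on the register values the tool prints. What follows is only a minimal sketch, not part of smictrl: the helper names are made up for illustration, and the one thing it assumes is what the message itself says, namely that bit 0 of SMI_EN is the bit that decides whether SMIs are globally on.

```python
# Hypothetical helpers, not part of smictrl: decode SMI_EN values as
# printed by the tool. Assumption (taken from the message above):
# bit 0 of SMI_EN is the global SMI enable bit.
GLOBAL_SMI_ENABLE = 1 << 0


def smi_globally_enabled(smi_en):
    """Return True if bit 0 of the given SMI_EN value is set."""
    return bool(smi_en & GLOBAL_SMI_ENABLE)


def set_bits(value):
    """Return the positions of all set bits in a 32-bit value, lowest first."""
    return [bit for bit in range(32) if value & (1 << bit)]


if __name__ == "__main__":
    # Values copied from the two smictrl outputs quoted in this thread.
    samples = (
        ("T500 (ICH9) before", 0x00062033),
        ("PCH example before", 0x0006403B),
        ("after smictrl -s 0", 0x00000002),
    )
    for label, value in samples:
        state = "enabled" if smi_globally_enabled(value) else "disabled"
        print("%-20s %08x  SMIs %s, bits set: %s"
              % (label, value, state, set_bits(value)))
```

For the T500 output this reports bit 0 set before the write and clear afterwards, which matches the reading that SMIs were switched off. Bit 1 stays set in both "new value" lines; the sketch only reports that and makes no claim about what that bit means.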

Jan

[1] http://thread.gmane.org/gmane.comp.emulators.kvm.devel/60326
[2] http://git.kiszka.org/?p=smictrl.git

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

No virus found in this incoming message.
Checked by AVG - www.avg.com
Version: 9.0.862 / Virus Database: 271.1.1/3168 - Release Date: 10/05/10 02:34:00

------_=_NextPart_001_01CB6A26.AD2ECED4--

--===============1512112125==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============1512112125==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: RE: How to generate a HW NMI
Date: Mon, 25 Oct 2010 11:34:37 -0400
Message-ID: <20101025153437.GA4863@dumpdata.com>
References: <4CA9AC25.6020707@siemens.com> <4CA9E0FB.6000109@siemens.com> <4CA9F16C.905@siemens.com> <4CB420D4.2010507@siemens.com> <17A8F715-1F95-4A4C-AC86-E6366673BEAC@virtualcomputer.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <17A8F715-1F95-4A4C-AC86-E6366673BEAC@virtualcomputer.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Roger Cruz
Cc: Jan Kiszka , xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On Tue, Oct 12, 2010 at 08:42:13AM -0400, Roger Cruz wrote:
> Disabling SMIs is part of the experiments to be conducted today or
> tomorrow. I will keep you posted.

Soo, what happened? Machine melted down? It caught on fire?