From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tim Deegan Subject: Re: Woes of NMIs and MCEs, and possibly how to fix Date: Sat, 1 Dec 2012 09:09:16 +0000 Message-ID: <20121201090916.GA7987@ocelot.phlegethon.org> References: <50B8EE13.6070301@citrix.com> <50B9050B.7090709@citrix.com> <20121130202420.GC95877@ocelot.phlegethon.org> <50B93964.4040609@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <50B93964.4040609@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Andrew Cooper Cc: Mats Petersson , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org At 22:55 +0000 on 30 Nov (1354316132), Andrew Cooper wrote: > Any spinlocking whatsoever is out, until we completely fix the > re-entrant entry to do_{nmi,mce}(), at which point spinlocks which are > ok so long as they are exclusively used inside their respective > handlers. AFAICT spinlocks are bad news even if they're only in their respective handlers, as one CPU can get (NMI (MCE)) while the other gets (MCE (NMI)). > >>> 2) Faults on the MCE path will re-enable NMIs, as will the iret of the > >>> MCE itself if an MCE interrupts an NMI. > >> The same questions apply as to #1 (just replace NMI with MCE) > > Andrew pointed out that some MCE code uses rdmsr_safe(). > > > > FWIW, I think that constraining MCE and NMI code not to do anything that > > can fault is perfectly reasonable. The MCE code has grown a lot > > recently and probably needs an audit to check for spinlocks, faults &c. > > Yes. However, being able to deal gracefully with the case were we miss > things on code review which touches the NMI/MCE paths is certainly > better than crashing in subtle ways. Agreed. > At the moment, the clearing of the MCIP bit is quite early in a few of > the cpu family specific MCE handlers. As it is an architectural MSR, I > was considering moving it outside the family specific handlers, and make > one of the last things on the MCE path, to help reduce the race > condition window until we properly fix reentrant MCEs. Why not have a per-cpu mce-in-progress flag, and clear MCIP early? That way you get a panic instead of silently losing a CPU. > >>> [1] In an effort to prevent a flamewar with my comment, the situation we > >>> find outself in now is almost certainly the result of unforseen > >>> interactions of individual features, but we are left to pick up the many > >>> pieces in way which cant completely be solved. > >>> > >> Happy to have my comments completely shot down into little bits, but I'm > >> worrying that we're looking to solve a problem that doesn't actually > >> need solving - at least as long as the code in the respective handlers > >> are "doing the right thing", and if that happens to be broken, then we > >> should fix THAT, not build lots of extra code to recover from such a thing. > > I agree. The only things we can't fix by DTRT in do_nmi() and do_mce() > > are: > > - IRET in SMM mode re-enabling NMIs; and > > - detecting _every_ case where we get a nested NMI/MCE (all we can > > do is detect _most_ cases, but the detection is just so we can print > > a message before the crash). > > We would need to modify the asm stubs to detect nesting. We can detect _almost all_ nesting from C with an in-progress flag. We can probably expand that to cover all nesting by pushing the flag-setting/flag-clearing out to the asm but that'd still be only a couple of lines of change - a lot simpler than what we'd need to allow nested MCE/NMIs to continue without crashing. > I think it is > unreasonable to expect the C functions do_{nmi,mce}() to be reentrantly > safe. Good. :) I'm not suggesting that. Tim.