From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753003Ab1IUQWf (ORCPT ); Wed, 21 Sep 2011 12:22:35 -0400 Received: from mx1.redhat.com ([209.132.183.28]:18426 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752926Ab1IUQWd (ORCPT ); Wed, 21 Sep 2011 12:22:33 -0400 Date: Wed, 21 Sep 2011 12:13:52 -0400 From: Don Zickus To: Robert Richter Cc: "x86@kernel.org" , Andi Kleen , Peter Zijlstra , "ying.huang@intel.com" , LKML , "paulmck@linux.vnet.ibm.com" , "avi@redhat.com" , "jeremy@goop.org" Subject: Re: [V5][PATCH 4/6] x86, nmi: add in logic to handle multiple events and unknown NMIs Message-ID: <20110921161352.GU5795@redhat.com> References: <1316529792-6560-1-git-send-email-dzickus@redhat.com> <1316529792-6560-5-git-send-email-dzickus@redhat.com> <20110921100842.GA6063@erda.amd.com> <20110921140432.GP5795@redhat.com> <20110921151829.GC6063@erda.amd.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110921151829.GC6063@erda.amd.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Sep 21, 2011 at 05:18:30PM +0200, Robert Richter wrote: > On 21.09.11 10:04:32, Don Zickus wrote: > > On Wed, Sep 21, 2011 at 12:08:42PM +0200, Robert Richter wrote: > > > On 20.09.11 10:43:10, Don Zickus wrote: > > > > @@ -87,6 +87,16 @@ static int notrace __kprobes nmi_handle(unsigned int type, struct pt_regs *regs) > > > > > > > > handled += a->handler(type, regs); > > > > > > > > + /* > > > > + * Optimization: only loop once if this is not a > > > > + * back-to-back NMI. The idea is nothing is dropped > > > > + * on the first NMI, only on the second of a back-to-back > > > > + * NMI. No need to waste cycles going through all the > > > > + * handlers. > > > > + */ > > > > + if (!b2b && handled) > > > > + break; > > > > > > In rare cases we will lose nmis here. > > > > > > We see a back-to-back nmi in the case if a 2 nmi source triggers > > > *after* the nmi handler is entered. Depending on the internal cpu > > > timing influenced by microcode and SMM code execution, the nmi may not > > > entered immediately. So all sources that trigger *before* the nmi > > > handler is entered raise only one nmi with no subsequent nmi. > > > > Right, but that can only happen with the second NMI in the back-to-back > > NMI case. Here the optimization is only for the first NMI, with the > > assumption that you will always have a second NMI if multiple sources > > trigger, so you can process those in the second iteration (assuming we > > correctly detect the back-to-back NMI condition). Then when the second > > NMI comes in, we have no idea how many we dropped to get here so we > > process all the handlers based on the assumption we might not have another > > NMI behind us to make up for the dropped NMIs. > > > > Unless I misunderstood your point above? > > No, my point was that a second NMI might not be latched even if there > are two nmi sources pending. > > Your logic is correct but assumes you will always receive a second > nmi. This is not always the case depending on the cpu's internal > timing. Usually there is the following sequence for back-to-back nmis: > > 1. HW triggers first NMI, an NMI is pending. > 2. NMI handler is called, no NMI pending anymore. > 3. HW triggers a second NMI, an NMI is pending. > 4. Return from NMI handler. > 5. NMI handler is called again to serve the 2nd, no NMI pending anymore. > 6. Return from NMI handler. > > The above is what your algorithm covers. > > But in rare cases there is the following: > > 1. The cpu executes some microcode or SMM code. > 2. HW triggers the first NMI, an NMI is pending. > 3. HW triggers a second NMI, the NMI is still pending. > 4. The cpu finished microcode or SMM code. > 5. NMI handler is called, no NMI pending anymore. > 6. Return from NMI handler. > > In this case the handler is called only once and the second nmi > remains unhandled with you implementation. > > I don't see a way how this could be catched without serving all > handlers the first time. But as said, in favor of the optimization I > think we can live with losing some NMIs. Ah, I get it know. Crap. Well I think Avi was pushing it to make those ticket_spin_locks work in virt land. It seems like we should lean towards removing the optimization. Avi? Cheers, Don