From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756574Ab1CBSlJ (ORCPT ); Wed, 2 Mar 2011 13:41:09 -0500
Received: from mx1.redhat.com ([209.132.183.28]:40065 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752168Ab1CBSlH (ORCPT ); Wed, 2 Mar 2011 13:41:07 -0500
Date: Wed, 2 Mar 2011 13:40:53 -0500
From: Don Zickus
To: Cyrill Gorcunov
Cc: Ingo Molnar, "Huang, Ying", "Maciej W. Rozycki", lkml
Subject: Re: [PATCH -tip 2/2 resend] x86, traps: Drop nmi_reason_lock until it is really needed
Message-ID: <20110302184053.GW11359@redhat.com>
References: <4D6E631B.6040701@openvz.org> <20110302154645.GA11827@elte.hu>
	<4D6E6886.2060707@openvz.org> <20110302160315.GA12620@elte.hu>
	<4D6E6CB6.7000700@openvz.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4D6E6CB6.7000700@openvz.org>
User-Agent: Mutt/1.5.20 (2009-08-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 02, 2011 at 07:13:42PM +0300, Cyrill Gorcunov wrote:
> On 03/02/2011 07:03 PM, Ingo Molnar wrote:
> ...
> >
> > Well, the lock serializes the read-out of the 'NMI reason' port, the
> > handling of whatever known reason and then the reassertion of the NMI
> > (on 32-bit).
> >
> > EDAC has a callback in pci_serr_error() - and this lock serializes
> > that. So we cannot just remove a lock like that, if there's any chance
> > of parallel execution on multiple CPUs.
> >
> > Thanks,
> >
> > 	Ingo
>
> OK, probably we need some UV person CC'ed (not sure whom) just to explain
> the reason for such nmi-listening model. Meanwhile -- lets drop my patch.

It's for debugging reasons. When their huge machine deadlocks, they wanted
an easy mechanism to dump the cpu stacks. That mechanism was an nmi button.
The problem was the button would only dump the first cpu.
By opening up the other cpus to accept external nmis, they could dump all
the cpus.

Now this spinlock doesn't affect them, because they registered an nmi
handler to catch it and dump their stack (I modified the code to use
DIE_NMIUNKNOWN instead of DIE_NMI to avoid conflict with the nmi_watchdog).
But I don't know what the effect is if that spinlock is not there (I sent a
private email to SGI inquiring; their guy wasn't around this week).

Personally I am indifferent to this patch. I don't have any problems with
the code the way it is now, but I can understand what you mean about having
stuff lying around as 'dead code'. I had thought Intel would have pushed
more patches upstream to remove the BSP lock-in by now.

Cheers,
Don