From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752758AbbJAG1x (ORCPT ); Thu, 1 Oct 2015 02:27:53 -0400
Received: from bombadil.infradead.org ([198.137.202.9]:58310
	"EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1750767AbbJAG1w (ORCPT );
	Thu, 1 Oct 2015 02:27:52 -0400
Date: Thu, 1 Oct 2015 08:27:33 +0200
From: Peter Zijlstra
To: =?utf-8?B?5rKz5ZCI6Iux5a6PIC8gS0FXQUnvvIxISURFSElSTw==?=
Cc: Jonathan Corbet, Ingo Molnar, "Eric W. Biederman", "H. Peter Anvin",
	Andrew Morton, Thomas Gleixner, Vivek Goyal,
	"linux-doc@vger.kernel.org", "x86@kernel.org",
	"kexec@lists.infradead.org", "linux-kernel@vger.kernel.org",
	Michal Hocko, Ingo Molnar,
	=?utf-8?B?5bmz5p2+6ZuF5bezIC8gSElSQU1BVFXvvIxNQVNBTUk=?=
Subject: Re: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option
Message-ID: <20151001062733.GL2881@worktop.programming.kicks-ass.net>
References: <20150925112803.4258.94241.stgit@softrs>
	<20150925112811.4258.54494.stgit@softrs>
	<20150930115548.GI2881@worktop.programming.kicks-ass.net>
	<04EAB7311EE43145B2D3536183D1A8445499CD11@GSjpTKYDCembx31.service.hitachi.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <04EAB7311EE43145B2D3536183D1A8445499CD11@GSjpTKYDCembx31.service.hitachi.net>
User-Agent: Mutt/1.5.22.1 (2013-10-16)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 01, 2015 at 02:33:18AM +0000, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > On Fri, Sep 25, 2015 at 08:28:11PM +0900, Hidehiro Kawai wrote:
> > > This patch introduces new boot option "noextnmi" which disables
> > > external NMI. This option is useful for the dump capture kernel
> > > so that an HA application or administrator wouldn't mistakenly
> > > shoot down the kernel by NMI.
> >
> > So that they can get really stuck when the crash kernel crashes, right?
> > ;-)
>
> No, it is different from my intention.
>
> `mistakenly' in the above means; they issue NMI due to a misconception
> that the monitored host is stuck in the 1st kernel while it is actually
> taking a crash dump in the 2nd kernel. To avoid this kind of accident,
> there is a tool such as fence_kdump which notifies "I'm taking a crash
> dump, so don't send NMI" to the HA clustering software. However, there
> is a time window between kernel panic and the notification.
>
> "noextnmi" allows users to avoid this kind of accident all the time of
> 2nd kernel.

Yes yes, I understand. But if the crash kernel also gets stuck they have
no means of recovery, right? (other than power cycling the hardware)

Just playing devil's advocate here, I don't actually object to the patch.