From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756137AbYEZThv (ORCPT );
	Mon, 26 May 2008 15:37:51 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1754174AbYEZThn (ORCPT );
	Mon, 26 May 2008 15:37:43 -0400
Received: from main.gmane.org ([80.91.229.2]:52180 "EHLO ciao.gmane.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753942AbYEZThn (ORCPT );
	Mon, 26 May 2008 15:37:43 -0400
X-Injected-Via-Gmane: http://gmane.org/
To: linux-kernel@vger.kernel.org
From: Sitsofe Wheeler
Subject: Re: [REGRESSION][BISECTED][X86] next-20080526 hangs on boot
Date: Mon, 26 May 2008 20:36:54 +0100
Message-ID:
References: <20080526161623.GC7305@cvg>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7Bit
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: cpc1-cwma5-0-0-cust137.swan.cable.ntl.com
User-Agent: KNode/0.10.4
Cc: kernel-testers@vger.kernel.org
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Sitsofe Wheeler wrote:

> Cyrill Gorcunov wrote:
>
>> [Sitsofe Wheeler - Mon, May 26, 2008 at 03:04:54PM +0100]
>> | When using a 32 bit linux-next-20080526 the bootup process will hang at
>> | a random point (not even sysrq helps) with no additional output on the
>> | screen (whereas linux-next-20080523 did boot). Mysteriously, booting
>> | with nmi_watchdog=2 allows the boot to finish (booting with
>> | nmi_watchdog=1 still stalls). I have bisected it down to commit
>> | [d1b946b97d71423f365fa797d1428e1847c0bec1]:
>>
>> Hi, so it helps by reverting only that commit? I mean all further commits
>> are still applied?
>
> Ah that I hadn't tested. I believe I might need to revert
> 4b82b277707a39b97271439c475f186f63ec4692 too if later commits are applied
> (but I'm still testing)
>
>> and, btw, could you post your config, please?
>
> http://sucs.org/~sits/test/config-20080526.txt

OK, applying the following patch (which is more or less a revert of
[4b82b277707a39b97271439c475f186f63ec4692]) resolves the problem:

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index d99ee8a..c55519c 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -480,8 +480,12 @@ int proc_nmi_enabled(struct ctl_table *table, int write, struct file *file,
 			return -EIO;
 	}
 
-	/* if nmi_watchdog is not set yet, then set it */
-	nmi_watchdog_default();
+	if (nmi_watchdog == NMI_DEFAULT) {
+		if (lapic_watchdog_ok())
+			nmi_watchdog = NMI_LOCAL_APIC;
+		else
+			nmi_watchdog = NMI_IO_APIC;
+	}
 
 	if (nmi_watchdog == NMI_LOCAL_APIC) {
 		if (nmi_watchdog_enabled)
diff --git a/include/asm-x86/nmi.h b/include/asm-x86/nmi.h
index 1e8f34d..7cd5b6a 100644
--- a/include/asm-x86/nmi.h
+++ b/include/asm-x86/nmi.h
@@ -38,9 +38,11 @@ static inline void unset_nmi_pm_callback(struct pm_dev *dev)
 
 #ifdef CONFIG_X86_64
 extern void default_do_nmi(struct pt_regs *);
+extern void nmi_watchdog_default(void);
+#else
+#define nmi_watchdog_default() do {} while (0)
 #endif
 
-extern void nmi_watchdog_default(void);
 extern void die_nmi(char *str, struct pt_regs *regs, int do_panic);
 extern int check_nmi_watchdog(void);
 extern int nmi_watchdog_enabled;

The removal of extern void nmi_watchdog_default(void) and the inclusion of
#define nmi_watchdog_default() do {} while (0) look suspicious to me. Note
that do {} while (0) is not an infinite loop: the condition is false, so the
empty body runs once and nmi_watchdog_default() is simply a no-op on 32 bit
systems.

-- 
Sitsofe | http://sucs.org/~sits/
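
On the do {} while (0) question above, here is a minimal userspace sketch in plain C (not kernel code) of how that idiom behaves. The names stub_noop, counted_once, body_runs, count_body_runs, stub_in_if_else and demo are hypothetical and exist only for this illustration; only the do {} while (0) shape is taken from the nmi.h hunk.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter, used only to observe how often a macro body runs. */
static int body_runs;

/* Same shape as the stub in the nmi.h hunk: an empty do/while (0). */
#define stub_noop() do {} while (0)

/* Illustrative variant with a non-empty body so executions can be counted. */
#define counted_once() do { body_runs++; } while (0)

/* Returns how many times the do/while (0) body executed for one use. */
int count_body_runs(void)
{
	body_runs = 0;
	counted_once();
	return body_runs;
}

/*
 * The wrapper keeps the macro a single statement, so "stub_noop();"
 * can sit between if and else without breaking the else branch.
 */
int stub_in_if_else(int flag)
{
	if (flag)
		stub_noop();
	else
		return 0;
	return 1;
}

/* Small demo: the body runs exactly once and the loop then terminates. */
void demo(void)
{
	assert(count_body_runs() == 1);
	printf("body ran %d time(s)\n", count_body_runs());
}
```

Calling demo() from a test program should return straight away, because the loop condition is the constant 0. The idiom is commonly chosen over a bare empty macro so that the call site still needs its trailing semicolon and behaves like an ordinary function call in if/else chains.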