From: ebiederm@xmission.com (Eric W. Biederman)
To: Don Zickus
Cc: x86@kernel.org, kexec-list, LKML, vgoyal@redhat.com
Subject: Re: [PATCH] x86, kdump, ioapic: Fix kdump race with migrating irq
Date: Tue, 31 Jan 2012 14:08:29 -0800
References: <1328045114-4489-1-git-send-email-dzickus@redhat.com>
In-Reply-To: <1328045114-4489-1-git-send-email-dzickus@redhat.com> (Don Zickus's message of "Tue, 31 Jan 2012 16:25:14 -0500")

Don Zickus writes:

> A customer of ours noticed when their machine crashed, kdump did not
> work but hung instead.  Using their firmware dumping solution they
> grabbed a vmcore and decoded the stacks on the cpus.  What they
> noticed seemed to be a rare deadlock with the ioapic_lock.
>
> CPU4:
>   machine_crash_shutdown
>   -> machine_ops.crash_shutdown
>      -> native_machine_crash_shutdown
>         -> kdump_nmi_shootdown_cpus ------> Send NMI to other CPUs
>         -> disable_IO_APIC
>            -> clear_IO_APIC
>               -> clear_IO_APIC_pin
>                  -> ioapic_read_entry
>                     -> spin_lock_irqsave(&ioapic_lock, flags)
>                     ---Infinite loop here---
>
> CPU0:
>   do_IRQ
>   -> handle_irq
>      -> handle_edge_irq
>         -> ack_apic_edge
>            -> move_native_irq
>               -> mask_IO_APIC_irq
>                  -> mask_IO_APIC_irq_desc
>                     -> spin_lock_irqsave(&ioapic_lock, flags)
>                     ---Receive NMI here after getting spinlock---
>                        -> nmi
>                           -> do_nmi
>                              -> crash_nmi_callback
>                              ---Infinite loop here---
>
> The problem is that although kdump tries to shutdown minimal hardware,
> it still needs to disable the IO APIC.  This requires spinlocks which
> may be held by another cpu.  This other cpu is being held infinitely in
> an NMI context by kdump in order to serialize the crashing path.  Instant
> deadlock.

Can you test to see if kexec on panic still needs to disable the IO
APIC?  Last I looked we were close, if not all of the way there, to not
needing to boot the kernel in pic mode.

If we can skip the ioapic disable entirely we should be much more
robust.

Eric
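
The lock pattern in the quoted trace can be modelled in user space. The sketch below is only an illustration and is not kernel code or part of the patch under discussion: `toy_lock`, `toy_trylock`, `toy_lock_bounded` and `crash_path_gets_lock` are hypothetical names, and a C11 `atomic_flag` stands in for `ioapic_lock`. "CPU0" takes the lock and never releases it (as if parked in `crash_nmi_callback`), then the "CPU4" crash path tries to take it with a bounded spin so the failure is observable instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for ioapic_lock (hypothetical, not the kernel's spinlock_t). */
typedef struct {
    atomic_flag held;
} toy_lock;

/* Returns true if the caller acquired the lock, false if it was already held. */
static bool toy_trylock(toy_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spin at most max_spins times; a real spin_lock would spin without bound. */
static bool toy_lock_bounded(toy_lock *l, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (toy_trylock(l))
            return true;
    }
    return false;
}

/*
 * Replays the two stacks from the report on a single thread:
 *   CPU0: takes the lock in the irq-migration path, then is stopped by the
 *         crash NMI and never reaches its unlock.
 *   CPU4: crash path, wants the same lock to clear the IO APIC.
 * Returns whether the crash path managed to take the lock.
 */
bool crash_path_gets_lock(unsigned long max_spins)
{
    toy_lock ioapic = { ATOMIC_FLAG_INIT };

    /* CPU0: spin_lock_irqsave(&ioapic_lock, flags) succeeds... */
    bool cpu0_got_lock = toy_trylock(&ioapic);
    assert(cpu0_got_lock);
    /* ...and the NMI arrives here, so the matching unlock never runs. */

    /* CPU4: however long it spins, the holder can no longer make progress. */
    return toy_lock_bounded(&ioapic, max_spins);
}
```

Calling `crash_path_gets_lock()` with any spin budget returns false, which is the bounded equivalent of the "Infinite loop here" on CPU4. Skipping the IO APIC disable, as suggested above, would avoid taking the lock in the crash path at all.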