From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759286AbZBFQWY (ORCPT ); Fri, 6 Feb 2009 11:22:24 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753491AbZBFQWQ (ORCPT ); Fri, 6 Feb 2009 11:22:16 -0500 Received: from g4t0014.houston.hp.com ([15.201.24.17]:48036 "EHLO g4t0014.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753165AbZBFQWO (ORCPT ); Fri, 6 Feb 2009 11:22:14 -0500 Date: Fri, 6 Feb 2009 09:22:13 -0700 From: Alex Chiang To: tony.luck@intel.com Cc: paulmck@linux.vnet.ibm.com, stable@kernel.org, linux-ia64@vger.kernel.org, linux-kernel Subject: [PATCH] ia64: prevent irq migration race in __cpu_disable path Message-ID: <20090206162213.GD8208@ldl.fc.hp.com> Mail-Followup-To: Alex Chiang , tony.luck@intel.com, paulmck@linux.vnet.ibm.com, stable@kernel.org, linux-ia64@vger.kernel.org, linux-kernel MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.17+20080114 (2008-01-14) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Commit e7b14036 (prevent ia64 from invoking irq handlers on offline CPUs) introduced a bug, where we call fixup_irqs before removing the CPU from the cpu_online_map. This is wrong because fixup_irqs calls migrate_irqs, and in migrate_irqs, we use the cpu_online_map to: 1. look for interrupts on current CPU 2. if we find one, move it to the first available CPU in the cpu_online_map This means we can potentially migrate an interrupt off ourself back to... ourself. Uh oh. We hit a NULL deref later which causes us to oops (output trimmed): Unable to handle kernel NULL pointer dereference (address 0000000000000040) ip is at profile_tick+0xd0/0x1c0 Call Trace: [] die+0x1a0/0x2e0 sp=e0000009c922fbd0 bsp=e0000009c9221438 [] ia64_do_page_fault+0x950/0xa80 sp=e0000009c922fbd0 bsp=e0000009c92213d8 [] ia64_native_leave_kernel+0x0/0x270 sp=e0000009c922fc60 bsp=e0000009c92213d8 [] profile_tick+0xd0/0x1c0 sp=e0000009c922fe30 bsp=e0000009c9221398 [] timer_interrupt+0x170/0x3e0 sp=e0000009c922fe30 bsp=e0000009c9221330 [] handle_IRQ_event+0x80/0x120 sp=e0000009c922fe30 bsp=e0000009c92212f8 [] __do_IRQ+0x160/0x4a0 sp=e0000009c922fe30 bsp=e0000009c9221290 [] ia64_process_pending_intr+0x2b0/0x360 sp=e0000009c922fe30 bsp=e0000009c9221208 [] fixup_irqs+0xf0/0x2a0 sp=e0000009c922fe30 bsp=e0000009c92211a8 [] __cpu_disable+0x140/0x240 sp=e0000009c922fe30 bsp=e0000009c9221168 [] take_cpu_down+0x50/0xa0 sp=e0000009c922fe30 bsp=e0000009c9221148 [] stop_cpu+0xd0/0x200 sp=e0000009c922fe30 bsp=e0000009c92210f0 Reading through the original thread: http://lkml.org/lkml/2008/8/31/116 It looks like Paul fixed his original issue correctly, put in a new call to cpu_clear() in the wrong spot, and then was convinced to remove the _correct_ call to cpu_clear(). Cc: stable@kernel.org Cc: Paul E. McKenney Signed-off-by: Alex Chiang --- In my opinion, this is .29 material. Sorry for the huge changelog:patch ratio, but this area is tricky enough that more explanation is better than less, I think. Also, I'm still a little troubled by Paul's original patch. What happens if we're trying to offline the CPEI target? The code in migrate_platform_irqs() uses cpu_online_map to select the new CPEI target, and it seems like we can end up in the same situation as the problem I'm trying to fix now. Paul? My patch has held up for over 24 hours of stress testing, where we put the system under a heavy load and then randomly offline/online CPUs every 2 seconds. Without this patch, the machine would crash reliably within 15 minutes. --- diff --git a/arch/ia64/kernel/smpboot.c b/arch/ia64/kernel/smpboot.c index 1146399..2a17d1c 100644 --- a/arch/ia64/kernel/smpboot.c +++ b/arch/ia64/kernel/smpboot.c @@ -742,8 +742,8 @@ int __cpu_disable(void) } remove_siblinginfo(cpu); - fixup_irqs(); cpu_clear(cpu, cpu_online_map); + fixup_irqs(); local_flush_tlb_all(); cpu_clear(cpu, cpu_callin_map); return 0;