From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1761205AbZBMQdV (ORCPT ); Fri, 13 Feb 2009 11:33:21 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752702AbZBMQdL (ORCPT ); Fri, 13 Feb 2009 11:33:11 -0500 Received: from mx2.redhat.com ([66.187.237.31]:47690 "EHLO mx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752226AbZBMQdK (ORCPT ); Fri, 13 Feb 2009 11:33:10 -0500 Message-ID: <4995A0B7.1030705@redhat.com> Date: Fri, 13 Feb 2009 11:32:55 -0500 From: Masami Hiramatsu User-Agent: Thunderbird 2.0.0.19 (X11/20090105) MIME-Version: 1.0 To: Mathieu Desnoyers CC: Peter Zijlstra , Nick Piggin , akpm , linux-kernel , Ingo Molnar Subject: Re: irq-disabled vs vmap vs text_poke References: <1234529407.6519.28.camel@twins> <20090213141839.GA31922@Krystal> In-Reply-To: <20090213141839.GA31922@Krystal> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Mathieu Desnoyers wrote: > * Peter Zijlstra (peterz@infradead.org) wrote: >> Hi, >> >> Ingo got the following splat: >> >> [ 5.101748] ------------[ cut here ]------------ >> [ 5.104305] WARNING: at kernel/smp.c:329 smp_call_function_many+0x34/0x1ea() >> [ 5.104305] Hardware name: P4DC6 >> [ 5.104305] Modules linked in: >> [ 5.104305] Pid: 1, comm: swapper Not tainted 2.6.29-rc4-tip-01766-g1757c19-dirty #2 >> [ 5.104305] Call Trace: >> [ 5.104305] [] warn_slowpath+0x79/0x8f >> [ 5.104305] [] ? dump_trace+0x7d/0xac >> [ 5.104305] [] ? __lock_acquire+0x319/0x382 >> [ 5.104305] [] smp_call_function_many+0x34/0x1ea >> [ 5.104305] [] ? do_flush_tlb_all+0x0/0x48 >> [ 5.104305] [] ? do_flush_tlb_all+0x0/0x48 >> [ 5.104305] [] smp_call_function+0x21/0x28 >> [ 5.104305] [] on_each_cpu+0x14/0x23 >> [ 5.104305] [] flush_tlb_all+0x19/0x1b >> [ 5.104305] [] flush_tlb_kernel_range+0xd/0xf >> [ 5.104305] [] vmap_debug_free_range+0x1c/0x20 >> [ 5.104305] [] remove_vm_area+0x28/0x67 >> [ 5.104305] [] __vunmap+0x30/0xab >> [ 5.104305] [] vunmap+0x27/0x29 >> [ 5.104305] [] text_poke+0xd6/0x104 >> [ 5.104305] [] ? kprobe_target+0x0/0x15 >> [ 5.104305] [] arch_disarm_kprobe+0x13/0x15 >> [ 5.104305] [] __unregister_kprobe_top+0x68/0xe8 >> [ 5.104305] [] unregister_kretprobes+0x2c/0xb9 >> [ 5.104305] [] unregister_kretprobe+0x16/0x18 >> [ 5.104305] [] init_test_probes+0x2ed/0x40c >> [ 5.104305] [] init_kprobes+0x127/0x131 >> [ 5.104305] [] ? lock_release_holdtime+0x43/0x48 >> [ 5.104305] [] ? __slab_alloc+0x5d/0x27a >> [ 5.104305] [] ? lockdep_init_map+0x80/0xe7 >> [ 5.104305] [] ? clocksource_read+0xd/0xf >> [ 5.104305] [] ? getnstimeofday+0x5e/0xe9 >> [ 5.104305] [] ? timespec_to_ktime+0xe/0x11 >> [ 5.104305] [] ? init_kprobes+0x0/0x131 >> [ 5.104305] [] do_one_initcall+0x6a/0x169 >> [ 5.104305] [] ? number+0x10d/0x1cf >> [ 5.104305] [] ? register_lock_class+0x17/0x228 >> [ 5.104305] [] ? __lock_acquire+0x319/0x382 >> [ 5.104305] [] ? debug_check_no_obj_freed+0xda/0x126 >> [ 5.104305] [] ? __lock_acquire+0x319/0x382 >> [ 5.104305] [] ? lock_release_holdtime+0x43/0x48 >> [ 5.104305] [] ? _spin_unlock+0x22/0x25 >> [ 5.104305] [] ? proc_register+0x14b/0x15c >> [ 5.104305] [] ? create_proc_entry+0x76/0x8c >> [ 5.104305] [] ? default_affinity_write+0x3f/0x8a >> [ 5.104305] [] ? init_irq_proc+0x58/0x65 >> [ 5.104305] [] kernel_init+0x118/0x169 >> [ 5.104305] [] ? kernel_init+0x0/0x169 >> [ 5.104305] [] kernel_thread_helper+0x7/0x10 >> [ 5.104305] ---[ end trace e93713a9d40cd06c ]--- >> >> Which points to vunmap() being called with interrupts disabled. >> >> Which made me look at the vmap/vunmap calls, and they appear to not be >> irq-safe, therefore this would be a bug in text_poke(). >> >> [ that is, vmap() can end up calling get_vm_area_caller() which in turn >> calls __get_vm_area_node() with GFP_KERNEL, ergo, don't do this from >> an atomic context. ] >> >> Now text_poke() uses local_irq_save/restore(), which conveys that it can >> be called with IRQs disabled, which is exactly what happens in the trace >> above, however we just established that vmap/vunmap() are not irq-safe. >> > > text_poke should actually use local_irq_disable/enable rather than > local_irq_save/restore because it _must_ be called with interrupts on. Could you tell me why it must be called with irq on? > >> Anybody got an idea on how to fix this? > > There is probably something wrong with the caller, kprobes, which calls > text_poke with interrupts off. Hmm, kprobe's smoke test caused this problem. Since (un)register_kprobe() may sleep for waiting a mutex, it should not be called with interrupts off. So, it's not text_poke()'s issue nor vmap(). BTW, what about using map_vm_area() in text_poke() instead of vmap()? Since text_poke() just maps text pages to alias pages temporarily, I think we don't need to use delayed vunmap(). Thanks > > Mathieu > -- Masami Hiramatsu Software Engineer Hitachi Computer Products (America) Inc. Software Solutions Division e-mail: mhiramat@redhat.com