From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751132AbdBJCfP (ORCPT ); Thu, 9 Feb 2017 21:35:15 -0500 Received: from mail.kernel.org ([198.145.29.136]:38344 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750989AbdBJCfN (ORCPT ); Thu, 9 Feb 2017 21:35:13 -0500 Date: Fri, 10 Feb 2017 11:34:45 +0900 From: Masami Hiramatsu To: Russell King - ARM Linux Cc: Ingo Molnar , linux-kernel@vger.kernel.org, Peter Zijlstra , Ananth N Mavinakayanahalli , Thomas Gleixner , "H . Peter Anvin" , Jon Medhurst , Wang Nan , Catalin Marinas , Will Deacon , "David A . Long" , Sandeepa Prabhu Subject: Re: [BUGFIX PATCH tip/master V2 3/3] kprobes/arm: Fix a possible deadlock case in kretprobe Message-Id: <20170210113445.dc7bc683e77dd7ba021663b1@kernel.org> In-Reply-To: <20170209164859.GI27312@n2100.armlinux.org.uk> References: <148665771695.22817.16393459806489781531.stgit@devbox> <148665793195.22817.17284342421982011651.stgit@devbox> <20170209164859.GI27312@n2100.armlinux.org.uk> X-Mailer: Sylpheed 3.5.0 (GTK+ 2.24.31; x86_64-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 9 Feb 2017 16:49:00 +0000 Russell King - ARM Linux wrote: > On Fri, Feb 10, 2017 at 01:32:22AM +0900, Masami Hiramatsu wrote: > > Fix a possibility of deadlock case in kretprobe on arm > > implementation. There may be a chance that the kretprobe > > hash table lock can cause a dead lock. > > > > The senario is that a user puts 2 kretprobes, one on normal > > function and one on a function which can be called from > > somewhare which can interrupt in irq disabled critical > > section like FIQ. > > If we: > - hit a kernel tracing feature from FIQ context > - the tracing feature takes a lock > - the lock is also taken elsewhere on the same CPU with IRQs disabled > > we will quite simply deadlock. Correct. > In this case, kretprobe_hash_lock() takes the hlist_lock using > raw_spin_lock_irqsave(). > > Now, from what I can see in the kprobes code, this lock is taken in > other contexts (eg, kprobe_flush_task()), which means even with this > fix, it's still risky if a kprobe is placed on a FIQ-called function. Oops, right! I'll fix that too. Thanks for pointed out. > > > In this case, if the kernel hits the 1st kretprobe on a > > normal function return which calls trampoline_handler(), > > acquire a spinlock on the hash table in kretprobe_hash_lock() > > and disable irqs. After that, if the 2nd kretprobe is kicked > > from FIQ, it also calls trampoline_handler() and tries to > > acquire the same spinlock (since the hash is based on > > current task, same as the 1st kretprobe), it causes > > a deadlock. > > So my deadlock scenario is: > > - we're in the middle of kprobe_flush_task() > - FIQ happens, calls trampoline_handler() > - deadlock in kretprobe_hash_lock() > > From what I can see, kretprobes in FIQ are just unsafe. Yes, NMI on x86 too. > I suspect that avoiding these deadlocks means that we have to deny > kprobes from FIQ context - making trampoline_handler() return > immediately if in_nmi() is true. Ah, in_nmi() means FIQ on arm :) OK, but actually it is too late to check it in the enter of trampoline_handler() since we don't know where is the real return address at that point. So I'll check that in setup site - kretprobe_pre_handler(). Thank you! -- Masami Hiramatsu