From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932429AbdBIRUI (ORCPT ); Thu, 9 Feb 2017 12:20:08 -0500 Received: from pandora.armlinux.org.uk ([78.32.30.218]:40284 "EHLO pandora.armlinux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752084AbdBIRSL (ORCPT ); Thu, 9 Feb 2017 12:18:11 -0500 Date: Thu, 9 Feb 2017 16:49:00 +0000 From: Russell King - ARM Linux To: Masami Hiramatsu Cc: Ingo Molnar , linux-kernel@vger.kernel.org, Peter Zijlstra , Ananth N Mavinakayanahalli , Thomas Gleixner , "H . Peter Anvin" , Jon Medhurst , Wang Nan , Catalin Marinas , Will Deacon , "David A . Long" , Sandeepa Prabhu Subject: Re: [BUGFIX PATCH tip/master V2 3/3] kprobes/arm: Fix a possible deadlock case in kretprobe Message-ID: <20170209164859.GI27312@n2100.armlinux.org.uk> References: <148665771695.22817.16393459806489781531.stgit@devbox> <148665793195.22817.17284342421982011651.stgit@devbox> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <148665793195.22817.17284342421982011651.stgit@devbox> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Feb 10, 2017 at 01:32:22AM +0900, Masami Hiramatsu wrote: > Fix a possibility of deadlock case in kretprobe on arm > implementation. There may be a chance that the kretprobe > hash table lock can cause a dead lock. > > The senario is that a user puts 2 kretprobes, one on normal > function and one on a function which can be called from > somewhare which can interrupt in irq disabled critical > section like FIQ. If we: - hit a kernel tracing feature from FIQ context - the tracing feature takes a lock - the lock is also taken elsewhere on the same CPU with IRQs disabled we will quite simply deadlock. In this case, kretprobe_hash_lock() takes the hlist_lock using raw_spin_lock_irqsave(). Now, from what I can see in the kprobes code, this lock is taken in other contexts (eg, kprobe_flush_task()), which means even with this fix, it's still risky if a kprobe is placed on a FIQ-called function. > In this case, if the kernel hits the 1st kretprobe on a > normal function return which calls trampoline_handler(), > acquire a spinlock on the hash table in kretprobe_hash_lock() > and disable irqs. After that, if the 2nd kretprobe is kicked > from FIQ, it also calls trampoline_handler() and tries to > acquire the same spinlock (since the hash is based on > current task, same as the 1st kretprobe), it causes > a deadlock. So my deadlock scenario is: - we're in the middle of kprobe_flush_task() - FIQ happens, calls trampoline_handler() - deadlock in kretprobe_hash_lock() >>From what I can see, kretprobes in FIQ are just unsafe. I suspect that avoiding these deadlocks means that we have to deny kprobes from FIQ context - making trampoline_handler() return immediately if in_nmi() is true. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.