From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755537AbcA1Po6 (ORCPT ); Thu, 28 Jan 2016 10:44:58 -0500
Received: from mail-pa0-f67.google.com ([209.85.220.67]:32987 "EHLO
        mail-pa0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S934799AbcA1Pot (ORCPT );
        Thu, 28 Jan 2016 10:44:49 -0500
Date: Fri, 29 Jan 2016 00:42:57 +0900
From: Sergey Senozhatsky
To: Byungchul Park
Cc: akpm@linux-foundation.org, mingo@kernel.org,
        linux-kernel@vger.kernel.org, akinobu.mita@gmail.com, jack@suse.cz,
        torvalds@linux-foundation.org, peter@hurleysoftware.com,
        sergey.senozhatsky@gmail.com, Sergey Senozhatsky
Subject: Re: [PATCH v4] lib/spinlock_debug.c: prevent a recursive cycle in
        the debug code
Message-ID: <20160128154257.GA564@swordfish>
References: <1453896061-14102-1-git-send-email-byungchul.park@lge.com>
        <20160128014253.GC1538@X58A-UD3R>
        <20160128023750.GB1834@swordfish>
        <000301d15985$7f416690$7dc433b0$@lge.com>
        <20160128060530.GC1834@swordfish>
        <20160128081313.GB31266@X58A-UD3R>
        <20160128104137.GA610@swordfish>
        <20160128105342.GB610@swordfish>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160128105342.GB610@swordfish>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On (01/28/16 19:53), Sergey Senozhatsky wrote:
> > ah... silly me... you mean the first CPU that triggers the spin_dump() will
>       ^^^ this, of course, is true for
>           console_sem->lock and logbuf_lock
>           only.
>
> > deadlock itself, so the rest of CPUs will see endless recursive
> > spin_lock()->spin_dump()->spin_lock()->spin_dump() calls?
[..]
> > Can you please update your bug description in the commit message?
> > It's the deadlock that is causing the recursion on other CPUs in the
> > first place.

no, don't update anything. I was completely wrong.
it's not a deadlock that is the root cause here. even if at some level of
recursion (nested printk calls)
spin_dump()->__spin_lock_debug()->arch_spin_trylock() acquires the lock,
it returns back with the spin lock unlocked anyway.

vprintk_emit()
 console_trylock()
  spin_lock()
   spin_dump()
    vprintk_emit()
     console_trylock()
      spin_lock()
       spin_dump()
        vprintk_emit()
         console_trylock()
          spin_lock()            << OK, got the lock finally
           sem->count--
          spin_unlock()          << unlock, return
       arch_spin_lock()          << got the lock, return
       sem->count--
      spin_unlock()              << unlock, return
   arch_spin_lock()              << got the lock, return
   sem->count--
  spin_unlock()                  << unlock, return

...um

> But I found there's a possiblity in the debug code *itself* to cause a
> lockup.

please explain.

        -ss
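
The unwinding in the call chain above can be checked with a small user-space toy model. This is only a sketch, not kernel code: every toy_* name, the busy_attempts counter (which pretends another CPU holds the lock for the first two attempts) and the work_done counter (standing in for the sem->count-- step) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names here are hypothetical stand-ins. */
static bool lock_held;         /* models console_sem->lock */
static int busy_attempts = 2;  /* trylock fails this many times, as if
                                * another CPU were holding the lock */
static int work_done;          /* stands in for the sem->count-- step */
static int depth, max_depth;   /* printk nesting depth */

static void toy_vprintk_emit(void);

static bool toy_trylock(void)
{
	if (busy_attempts > 0) {
		busy_attempts--;
		return false;
	}
	if (lock_held)
		return false;
	lock_held = true;
	return true;
}

static void toy_spin_lock(void)
{
	if (toy_trylock())
		return;               /* got the lock right away */

	/* contended: the debug path reports it, which re-enters printk */
	toy_vprintk_emit();           /* stands in for spin_dump() */

	/* after the report, take the lock for real (arch_spin_lock()) */
	assert(!lock_held);           /* the nested call released it */
	lock_held = true;
}

static void toy_spin_unlock(void)
{
	assert(lock_held);
	lock_held = false;
}

static void toy_console_trylock(void)
{
	toy_spin_lock();
	work_done++;
	toy_spin_unlock();
}

static void toy_vprintk_emit(void)
{
	depth++;
	if (depth > max_depth)
		max_depth = depth;
	toy_console_trylock();
	depth--;
}
```

Calling toy_vprintk_emit() once nests three levels deep and then unwinds; every level leaves the lock released on its way out, so in this model the recursion alone does not leave anyone stuck on the lock.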