Date: Thu, 1 May 2014 22:09:01 +0200
From: Frederic Weisbecker
To: Don Zickus
Cc: Eric Paris, linux-kernel@vger.kernel.org, Andrew Morton, Michal Hocko, Ben Zhang
Subject: Re: [PATCH] watchdog: print all locks on a softlock
Message-ID: <20140501200858.GA27787@localhost.localdomain>
References: <1398970535-6880-1-git-send-email-eparis@redhat.com> <20140501191720.GA198341@redhat.com>
In-Reply-To: <20140501191720.GA198341@redhat.com>

On Thu, May 01, 2014 at 03:17:20PM -0400, Don Zickus wrote:
> On Thu, May 01, 2014 at 02:55:35PM -0400, Eric Paris wrote:
> > If the CPU hits a softlockup this patch will also have it print the
> > information about all locks being held on the system. This might help
> > determine if a lock is being held too long leading to this problem.
>
> I am not sure this helps you. A softlockup is the result of pre-emption
> disabled, ie the scheduler not being called after 60 seconds. Holding a
> lock does not disable pre-emption usually. So I don't think this is going
> to add anything.
>
> Are you trying to debug a hung task? The hung_task thread checks to
> see if a task hasn't scheduled in 2 minutes or so. That could be the
> result of long lock (but that output already dumps the lockdep stuff).

There may be some deadlocks that lockdep doesn't detect yet.

Two examples:

1) spinlock <-> IPI dependency

    CPU 0                                          CPU 1
    ----------------------------------------------------------------
    spin_lock_irq(A)
    smp_send_function_single_async(CPU 1, func)
                                                   //IPI
                                                   func {
                                                       spin_lock(1)
                                                   }

But this should be resolved with a virtual lock on the IPI functions.
I should try that.

2) rwlock <-> IPI

    CPU 0                                          CPU 1
    ----------------------------------------------------------------
    read_lock(A)
                                                   write_lock_irq(A)
    smp_send_function_single(CPU 1, func)
                                                   //IPI never happens

This one is much trickier.

Anyway, those are the only scenarios I know of, but there may be more.
When possible we want to extend lockdep to detect new deadlock scenarios,
but we have no guarantee that it can detect everything.

So, could be useful...
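
To make the circular wait in scenario 2 concrete, here is a minimal
userspace sketch, not kernel code and not how lockdep works. It models
"X cannot progress until Y does" as a wait-for graph and looks for a cycle.
All names (wait_graph, add_wait, has_deadlock, rwlock_ipi_deadlocks) are
made up for illustration, and it assumes the non-async IPI call makes
CPU 0 wait until CPU 1 has run func.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 8

/* waits_for[a][b] is true when node a cannot progress until node b does. */
struct wait_graph {
	int nr_nodes;
	bool waits_for[MAX_NODES][MAX_NODES];
};

static void graph_init(struct wait_graph *g, int nr_nodes)
{
	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);
	memset(g, 0, sizeof(*g));
	g->nr_nodes = nr_nodes;
}

static void add_wait(struct wait_graph *g, int waiter, int holder)
{
	assert(waiter >= 0 && waiter < g->nr_nodes);
	assert(holder >= 0 && holder < g->nr_nodes);
	g->waits_for[waiter][holder] = true;
}

/* Depth-first search. state: 0 = unvisited, 1 = on current path, 2 = done. */
static bool visit(const struct wait_graph *g, int node, int *state)
{
	state[node] = 1;
	for (int next = 0; next < g->nr_nodes; next++) {
		if (!g->waits_for[node][next])
			continue;
		if (state[next] == 1)
			return true;	/* back edge: circular wait */
		if (state[next] == 0 && visit(g, next, state))
			return true;
	}
	state[node] = 2;
	return false;
}

static bool has_deadlock(const struct wait_graph *g)
{
	int state[MAX_NODES] = { 0 };

	for (int node = 0; node < g->nr_nodes; node++)
		if (state[node] == 0 && visit(g, node, state))
			return true;
	return false;
}

/* Hand-built model of scenario 2 (rwlock <-> IPI). */
static bool rwlock_ipi_deadlocks(bool ipi_is_synchronous)
{
	enum { CPU0, CPU1, NR_MODEL_CPUS };
	struct wait_graph g;

	graph_init(&g, NR_MODEL_CPUS);

	/* CPU 1 spins in write_lock_irq(A), IRQs off, until CPU 0 drops read_lock(A). */
	add_wait(&g, CPU1, CPU0);

	/* Assumption: a synchronous IPI makes CPU 0 wait for CPU 1 to run func. */
	if (ipi_is_synchronous)
		add_wait(&g, CPU0, CPU1);

	return has_deadlock(&g);
}
```

With the synchronous edge in place the graph has the cycle
CPU 0 -> CPU 1 -> CPU 0, so rwlock_ipi_deadlocks(true) reports a deadlock.
Without that edge CPU 0 can go on to release A and the cycle disappears.
The hard part for lockdep is that neither edge is an ordinary lock-on-lock
dependency.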