From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matthew Wilcox
Subject: Re: per_cpu_counter_sum lockdep warning
Date: Tue, 3 Jun 2008 21:59:54 -0600
Message-ID: <20080604035954.GE3549@parisc-linux.org>
References: <48460B94.8050000@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-fsdevel@vger.kernel.org, linux kernel mailing list
To: Balbir Singh
Return-path:
Received: from palinux.external.hp.com ([192.25.206.14]:48856 "EHLO mail.parisc-linux.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750723AbYFDEAL (ORCPT ); Wed, 4 Jun 2008 00:00:11 -0400
Content-Disposition: inline
In-Reply-To: <48460B94.8050000@linux.vnet.ibm.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Wed, Jun 04, 2008 at 08:57:16AM +0530, Balbir Singh wrote:
> Saw this warning on an x86_64 box, while booting up 2.6.26-rc4. Has anybody else
> seen it? Working on it?

I've neither seen it, nor am I working on it, but I can decode it.

> inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.

Translation: "This lock was previously grabbed in hardirq context. Now
someone's taking it in process context without interrupts disabled.
That could lead to a deadlock."

> init/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
> (&fbc->lock){+...}, at: [] __percpu_counter_sum+0xf/0x5a

That's the name of the lock -- &fbc->lock and the function where it
happens.

> {in-hardirq-W} state was registered at:
> [] 0xffffffffffffffff

Drat, no backtrace for the guy who took the lock in hardirq context.

> Call Trace:
> [] print_usage_bug+0x15e/0x16f
> [] mark_lock+0x22f/0x416
> [] ? __percpu_counter_sum+0xf/0x5a
> [] __lock_acquire+0x4e7/0xc8a
> [] ? __percpu_counter_sum+0xf/0x5a
> [] lock_acquire+0x8e/0xb2
> [] ? __percpu_counter_sum+0xf/0x5a
> [] _spin_lock+0x26/0x53
> [] __percpu_counter_sum+0xf/0x5a
> [] ext3_statfs+0xd6/0x160

ext3_statfs was the one who asked for the lock to be taken without
disabling interrupts.
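
To make the rule behind that warning concrete, here is a minimal
user-space sketch of the kind of bookkeeping involved. It is only an
illustration: struct lock_class_state and note_acquire() are
hypothetical names made up for this example, not lockdep's real data
structures or API, and the real checker tracks many more states
(softirq, read vs. write, and so on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-lock-class usage bits (not lockdep's real layout). */
struct lock_class_state {
	const char *name;
	bool used_in_hardirq;	/* ever taken from hardirq context */
	bool used_hardirq_on;	/* ever taken with hardirqs enabled */
};

/*
 * Record one acquisition of the lock class.
 * Returns false once the class has been seen both in hardirq context
 * and with hardirqs enabled: an interrupt arriving while the
 * process-context holder has the lock could then spin on it forever.
 */
static bool note_acquire(struct lock_class_state *c,
			 bool in_hardirq, bool hardirqs_enabled)
{
	assert(c != NULL);

	if (in_hardirq)
		c->used_in_hardirq = true;
	else if (hardirqs_enabled)
		c->used_hardirq_on = true;

	return !(c->used_in_hardirq && c->used_hardirq_on);
}
```

In terms of the report above: the first note_acquire() on &fbc->lock
comes from hardirq context and is fine on its own; the second, from
ext3_statfs with interrupts enabled, is the one that trips the check.
Taking the lock in process context with interrupts disabled would not.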

Some percpu counters are supposed to be used from interrupt context.
These are created with percpu_counter_init_irq. Others are not and
should be created with percpu_counter_init. It seems like someone's
made a mess of that rule. This is likely to be a driver, IMO. Perhaps
you could work on tracking this down?

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."