Date: Tue, 4 Aug 2009 17:59:52 +0200
From: Oleg Nesterov
To: Peter Zijlstra
Cc: David Howells, Takashi Iwai, Ingo Molnar, Linux filesystem caching discussion list, LKML, Johannes Berg
Subject: Re: Incorrect circular locking dependency?
Message-ID: <20090804155952.GA5211@redhat.com>
References: <6950.1245661098@redhat.com> <24075.1248705430@redhat.com> <1249397486.7924.243.camel@twins>
In-Reply-To: <1249397486.7924.243.camel@twins>

On 08/04, Peter Zijlstra wrote:
>
> On Mon, 2009-07-27 at 15:37 +0100, David Howells wrote:
> > Takashi Iwai wrote:
> >
> > > =======================================================
> > > [ INFO: possible circular locking dependency detected ]
> > > 2.6.30-test #7
> > > -------------------------------------------------------
> > > swapper/0 is trying to acquire lock:
> > >  (&cwq->lock){-.-...}, at: [] __queue_work+0x1f/0x4e
> > >
> > > but task is already holding lock:
> > >  (&q->lock){-.-.-.}, at: [] __wake_up+0x26/0x5c
> > >
> > > which lock already depends on the new lock.
> >
> > Okay.  I think I understand this:
> >
> >  (1) cachefiles_read_waiter() intercepts wake up events, and, as such, is run
> >      inside the waitqueue spinlock for the page bit waitqueue.
> >
> >  (2) cachefiles_read_waiter() calls fscache_enqueue_retrieval() which calls
> >      fscache_enqueue_operation() which calls schedule_work() for fast
> >      operations, thus taking a per-CPU workqueue spinlock.
> >
> >  (3) queue_work(), which is called by many things, calls __queue_work(), which
> >      takes the per-CPU workqueue spinlock.
> >
> >  (4) __queue_work() then calls insert_work(), which calls wake_up(), which
> >      takes the waitqueue spinlock for the per-CPU workqueue waitqueue.
> >
> > Even though the two waitqueues are separate, I think lockdep sees them as
> > having the same lock.
>
> Yeah, it looks like cwq->lock is always in the same lock class.
>
> Creating a new class for your second workqueue might help, we'd have to
> pass a second key through __create_workqueue_key() and pass that into
> init_cpu_workqueue() and apply it to cwq->lock using lockdep_set_class()
> and co.

Agreed. But otoh, it would be nice to kill cwq->more_work and speed up
workqueues a bit. We don't actually need wait_queue_head_t, we have a single
thread cwq->thread which should be woken.

However this change is not completely trivial, we need cwq->please_wakeup_me
to avoid unnecessary wakeups inside run_workqueue().

Not sure this is worth the trouble.

Oleg.
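
For readers following the thread, the class-based reasoning in steps (1)-(4)
can be sketched as a small userspace toy model. This is not kernel code and
not how lockdep is implemented; it only illustrates that dependencies are
tracked between lock classes rather than lock instances, so two separate
waitqueue locks that share a class can close a cycle. The class ids and the
names record_dependency() and cycle_reported() are hypothetical, invented for
this sketch. Splitting the waitqueue class here is just one way to break the
cycle in the toy model; which lock would actually get a new class in the
kernel (Peter suggests cwq->lock) is not settled by this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical class ids for this toy model; not kernel identifiers. */
enum { CLASS_WAITQ, CLASS_CWQ, CLASS_WAITQ_WORKQUEUE, NR_CLASSES };

/* dep[a][b] is true once a lock of class b was taken while class a was held. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search over recorded edges: can 'to' be reached from 'from'? */
static bool reaches(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (dep[from][next] && !seen[next] && reaches(next, to, seen))
			return true;
	return false;
}

/* Record "held -> taken". Returns false if the new edge would close a
 * cycle in the class graph, which is what the report above complains about. */
static bool record_dependency(int held, int taken)
{
	bool seen[NR_CLASSES] = { false };

	if (reaches(taken, held, seen))
		return false;
	dep[held][taken] = true;
	return true;
}

/* Replay the two call chains from steps (1)-(4) on an empty graph.
 * Returns true if the toy checker reports a cycle. */
bool cycle_reported(bool split_waitqueue_class)
{
	int page_waitq = CLASS_WAITQ;
	int more_work_waitq = split_waitqueue_class ? CLASS_WAITQ_WORKQUEUE
						    : CLASS_WAITQ;
	bool ok;

	memset(dep, 0, sizeof(dep));

	/* Steps (3)-(4): cwq->lock is held, then wake_up() takes the
	 * workqueue's own waitqueue lock. */
	ok = record_dependency(CLASS_CWQ, more_work_waitq);

	/* Steps (1)-(2): the waiter runs under the page bit waitqueue lock,
	 * then schedule_work() takes cwq->lock. */
	ok = ok && record_dependency(page_waitq, CLASS_CWQ);

	return !ok;
}

#ifdef LOCKDEP_TOY_MAIN
int main(void)
{
	/* Both waitqueue locks in one class: the second edge closes a cycle. */
	assert(cycle_reported(false));
	/* Separate classes: the same call chains no longer form a cycle. */
	assert(!cycle_reported(true));
	printf("toy model behaves as described\n");
	return 0;
}
#endif
```

Building with -DLOCKDEP_TOY_MAIN gives a standalone program. With a shared
class the second dependency is rejected even though the two waitqueues are
different objects, which matches David's reading of the report.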