From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kara
Date: Tue, 30 Mar 2010 18:36:57 +0200
Subject: [Ocfs2-devel] lockdep warning in ocfs2 quota
In-Reply-To: <20100324220138.GA14310@mail.oracle.com>
References: <20100324220138.GA14310@mail.oracle.com>
Message-ID: <20100330163656.GD3424@quack.suse.cz>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ocfs2-devel@oss.oracle.com

  Hi Joel,

On Wed 24-03-10 15:01:39, Joel Becker wrote:
> I got this on an ocfs2 filesystem with quota features enabled
> (but quota enforcement not turned on). Non-clustered ocfs2. Fresh
> mkfs. Untarring a kernel tree.
  Thanks for the lockdep trace.

> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.34-rc1-kvm #179
> -------------------------------------------------------
> tar/2546 is trying to acquire lock:
>  (&s->s_dquot.dqio_mutex){+.+...}, at: []
> dquot_commit+0x26/0xc8
>
> but task is already holding lock:
>  (&s->s_dquot.dqptr_sem){++++..}, at: []
> dquot_alloc_inode+0x63/0x133
  This is the other way around than it should be - dqptr_sem ranks above
dqio_mutex. So lockdep must have somehow established a dependency chain in
an inverse order.
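
  To picture how such an inverse chain can build up, here is a rough toy
model of a lock-order graph. It is only a sketch, not how lockdep is actually
implemented: the class LockOrderGraph and its acquire() method are made up
for illustration. The edge dqio_mutex -> j_trans_barrier comes from the trace
quoted below; the edge j_trans_barrier -> dqptr_sem is an assumption added
only to close the loop, since that part of the chain is not shown here.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical; not lockdep's real algorithm)."""

    def __init__(self):
        # edges[a] holds every lock that was taken while a was held
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded "held -> taken" edges
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record held -> new. Return False if that would close a cycle."""
        if self._reachable(new, held):
            return False
        self.edges[held].add(new)
        return True


graph = LockOrderGraph()
# From the trace: a transaction is started with dqio_mutex already held
ok1 = graph.acquire("dqio_mutex", "j_trans_barrier")
# Assumed edge (not shown in the quoted excerpt)
ok2 = graph.acquire("j_trans_barrier", "dqptr_sem")
# What tar does: holds dqptr_sem, then wants dqio_mutex
ok3 = graph.acquire("dqptr_sem", "dqio_mutex")
print(ok1, ok2, ok3)  # True True False
```

  The last acquisition is the one that gets reported, even though taking
dqio_mutex under dqptr_sem is the intended order; the problematic edge is
whichever earlier one runs against the ranking.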
> -> #1 (&journal->j_trans_barrier){.+.+..}:
>        [] __lock_acquire+0x10ad/0x139e
>        [] lock_acquire+0x97/0xbb
>        [] down_read+0x31/0x45
>        [] ocfs2_start_trans+0x9b/0x178 [ocfs2]
>        [] ocfs2_global_read_dquot+0x163/0x265 [ocfs2]
>        [] ocfs2_local_read_dquot+0x73/0xb42 [ocfs2]
>        [] dquot_acquire+0x51/0xde
>        [] ocfs2_acquire_dquot+0x8c/0xee [ocfs2]
>        [] dqget+0x293/0x2cb
>        [] __dquot_initialize+0x7c/0x155
>        [] dquot_initialize+0x10/0x12
>        [] ocfs2_get_init_inode+0xdf/0xe9 [ocfs2]
>        [] ocfs2_mknod+0x358/0xd8c [ocfs2]
>        [] ocfs2_mkdir+0x77/0xcd [ocfs2]
>        [] vfs_mkdir+0x66/0xc9
>        [] sys_mkdirat+0x7f/0xba
>        [] sys_mkdir+0x15/0x17
>        [] syscall_call+0x7/0xb
  This is the culprit - we have to do some writes when reading dquot (to
increase dquot use count and possibly also allocate space for further
writes) but ocfs2_local_read_dquot is already called with dqio_mutex held
so when ocfs2_global_read_dquot tries to start a transaction it is a
violation of lock ordering. I guess I'll have to move all the code from
ocfs2_local_read_dquot to ocfs2_acquire_dquot and do all the handling there
with proper locking. I'll have a look at it.

								Honza
-- 
Jan Kara
SUSE Labs, CR