From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S932100AbWFKFcn (ORCPT ); Sun, 11 Jun 2006 01:32:43 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
    id S932104AbWFKFcn (ORCPT ); Sun, 11 Jun 2006 01:32:43 -0400
Received: from mx2.mail.elte.hu ([157.181.151.9]:53205 "EHLO mx2.mail.elte.hu")
    by vger.kernel.org with ESMTP id S932100AbWFKFcm (ORCPT );
    Sun, 11 Jun 2006 01:32:42 -0400
Date: Sun, 11 Jun 2006 07:31:54 +0200
From: Ingo Molnar
To: Anton Altaparmakov
Cc: Miles Lane, LKML, Andrew Morton, Arjan van de Ven
Subject: Re: 2.6.17-rc6-mm1 -- BUG: possible circular locking deadlock detected!
Message-ID: <20060611053154.GA8581@elte.hu>
References: <1149751953.10056.10.camel@imp.csi.cam.ac.uk>
    <20060608095522.GA30946@elte.hu>
    <1149764032.10056.82.camel@imp.csi.cam.ac.uk>
    <20060608112306.GA4234@elte.hu>
    <1149840563.3619.46.camel@imp.csi.cam.ac.uk>
    <20060610075954.GA30119@elte.hu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.4.2.1i
X-ELTE-SpamScore: -3.1
X-ELTE-SpamLevel:
X-ELTE-SpamCheck: no
X-ELTE-SpamVersion: ELTE 2.0
X-ELTE-SpamCheck-Details: score=-3.1 required=5.9 tests=ALL_TRUSTED,AWL,BAYES_50
    autolearn=no SpamAssassin version=3.0.3
    -3.3 ALL_TRUSTED   Did not pass through any untrusted hosts
     0.0 BAYES_50      BODY: Bayesian spam probability is 40 to 60%
                       [score: 0.5000]
     0.2 AWL           AWL: From: address is in the auto white-list
X-ELTE-VirusStatus: clean
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

* Anton Altaparmakov wrote:

> I think the lock validator has the problem of not knowing that there
> are two different types of runlist which is why it complains about it.

ah, ok!
What happened is that the init_rwsem() 'lock type keys' (inlined via
ntfs_init_runlist()) for the two runlists were 'merged':

        ntfs_init_runlist(&ni->runlist);
        ntfs_init_runlist(&ni->attr_list_rl);

i have annotated things by initializing the two locks separately (via a
simple oneliner change), and this has solved the problem. The two types
are now properly 'split', and the validator tracks them separately and
understands their separate roles. So there's no need to touch attribute
runlist locking in the NTFS code.

Some background: the validator uses lock initialization as a hint about
which locks share the same 'type' (or locking domain). Currently this is
done via:

 #define init_rwsem(sem)                                         \
 do {                                                            \
         static struct lockdep_type_key __key;                   \
                                                                 \
         __init_rwsem((sem), #sem, &__key);                      \
 } while (0)

But since ntfs_init_runlist() is a single inline function used for both
runlists, the __key inside it got shared. A better method might be to
use the return address in __init_rwsem() [and i used that in earlier
versions of the validator] - but even then there's no guarantee that
this code will always be inlined.

In any case, this is known to be a heuristic: it is totally valid to
initialize locks in an arbitrary manner, and the validator only tries to
guess it right in 99.9% of the cases. In cases where the validator
incorrectly merged (or split) lock types [such as in this case], the
problem can be found easily - and the annotation is easy as well.

The good news is that after this fix things went pretty well for
readonly stuff and i got no new complaints from the validator. Phew! :-)
It does not fully cover read-write mode yet.
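
The key-sharing effect described above can be reproduced in plain
userspace C. The following is only a minimal sketch, not validator code:
struct type_key, struct fake_rwsem, fake_init(), FAKE_INIT_RWSEM() and
init_runlist_lock() are all hypothetical stand-ins made up for
illustration. It assumes nothing beyond the fact that a static variable
declared inside a macro body exists once per expansion site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lockdep_type_key; only its address matters. */
struct type_key { char dummy; };

/* Hypothetical lock that just remembers which key it was initialized with. */
struct fake_rwsem {
	const char *name;
	const struct type_key *key;
};

static void fake_init(struct fake_rwsem *sem, const char *name,
		      const struct type_key *key)
{
	assert(sem != NULL && key != NULL);
	sem->name = name;
	sem->key = key;
}

/* Same shape as the init_rwsem() macro: one static key per expansion site. */
#define FAKE_INIT_RWSEM(sem)				\
do {							\
	static struct type_key site_key;		\
							\
	fake_init((sem), #sem, &site_key);		\
} while (0)

/* Plays the role of ntfs_init_runlist(): the macro is expanded once, here. */
static inline void init_runlist_lock(struct fake_rwsem *sem)
{
	FAKE_INIT_RWSEM(sem);
}

/* Returns 1 when two locks set up through the helper share one key. */
int keys_merged_via_helper(void)
{
	struct fake_rwsem a, b;

	init_runlist_lock(&a);
	init_runlist_lock(&b);
	return a.key == b.key;
}

/* Returns 1 when two separate expansion sites yield two distinct keys. */
int keys_split_via_macro(void)
{
	struct fake_rwsem a, b;

	FAKE_INIT_RWSEM(&a);
	FAKE_INIT_RWSEM(&b);
	return a.key != b.key;
}
```

The first function corresponds to the 'merged' case: both locks go
through one helper and end up with one key. The second is roughly in the
spirit of initializing the two locks separately; the actual oneliner
annotation is not shown in this mail.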
When extending an existing file the validator did not understand the
following locking construct:

=======================================================
[ INFO: possible circular locking dependency detected ]
-------------------------------------------------------
cat/2802 is trying to acquire lock:
 (&vol->lcnbmp_lock){----}, at: [] ntfs_cluster_alloc+0x10d/0x23a0

but task is already holding lock:
 (&ni->mrec_lock){--..}, at: [] map_mft_record+0x53/0x2c0

which lock already depends on the new lock,
which could lead to circular dependencies.

the existing dependency chain (in reverse order) is:

-> #2 (&ni->mrec_lock){--..}:
       [] lock_acquire+0x6f/0x90
       [] mutex_lock_nested+0x73/0x2a0
       [] map_mft_record+0x53/0x2c0
       [] ntfs_map_runlist_nolock+0x3d8/0x530
       [] ntfs_map_runlist+0x41/0x70
       [] ntfs_readpage+0x8c9/0x9b0
       [] read_cache_page+0xac/0x150
       [] ntfs_statfs+0x41d/0x660
       [] vfs_statfs+0x54/0x70
       [] vfs_statfs64+0x18/0x30
       [] sys_statfs64+0x64/0xa0
       [] sysenter_past_esp+0x56/0x8d

-> #1 (&rl->lock){----}:
       [] lock_acquire+0x6f/0x90
       [] down_read_nested+0x2a/0x40
       [] ntfs_readpage+0x844/0x9b0
       [] read_cache_page+0xac/0x150
       [] ntfs_statfs+0x41d/0x660
       [] vfs_statfs+0x54/0x70
       [] vfs_statfs64+0x18/0x30
       [] sys_statfs64+0x64/0xa0
       [] sysenter_past_esp+0x56/0x8d

-> #0 (&vol->lcnbmp_lock){----}:
       [] lock_acquire+0x6f/0x90
       [] down_write+0x2c/0x50
       [] ntfs_cluster_alloc+0x10d/0x23a0
       [] ntfs_attr_extend_allocation+0x5fd/0x14a0
       [] ntfs_file_buffered_write+0x188/0x3880
       [] ntfs_file_aio_write_nolock+0x178/0x210
       [] ntfs_file_writev+0xb1/0x150
       [] ntfs_file_write+0x1f/0x30
       [] vfs_write+0x99/0x160
       [] sys_write+0x3d/0x70
       [] sysenter_past_esp+0x56/0x8d

other info that might help us debug this:

3 locks held by cat/2802:
 #0:  (&inode->i_mutex){--..}, at: [] mutex_lock+0x8/0x10
 #1:  (&rl->lock){----}, at: [] ntfs_attr_extend_allocation+0x13e/0x14a0
 #2:  (&ni->mrec_lock){--..}, at: [] map_mft_record+0x53/0x2c0

stack backtrace:
 [] show_trace+0x12/0x20
 [] dump_stack+0x19/0x20
 [] print_circular_bug_tail+0x61/0x70
 [] __lock_acquire+0x74f/0xde0
 [] lock_acquire+0x6f/0x90
 [] down_write+0x2c/0x50
 [] ntfs_cluster_alloc+0x10d/0x23a0
 [] ntfs_attr_extend_allocation+0x5fd/0x14a0
 [] ntfs_file_buffered_write+0x188/0x3880
 [] ntfs_file_aio_write_nolock+0x178/0x210
 [] ntfs_file_writev+0xb1/0x150
 [] ntfs_file_write+0x1f/0x30
 [] vfs_write+0x99/0x160
 [] sys_write+0x3d/0x70
 [] sysenter_past_esp+0x56/0x8d

this seems to be a pretty complex 3-way dependency related to
&vol->lcnbmp_lock and &ni->mrec_lock. Should i send a full dependency
events trace perhaps?

	Ingo
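
For readers following the report above, here is a toy sketch of the
kind of check that produces such a message. It is not how the validator
is implemented; the enum names, dep[][], reachable(), add_dependency()
and replay_report() are all hypothetical. It assumes the chain in the
report reads as lcnbmp_lock -> rl->lock -> mrec_lock (the statfs path),
and that the write path then wants lcnbmp_lock while holding mrec_lock.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lock types, named after the locks in the report. */
enum { LCNBMP_LOCK, RL_LOCK, MREC_LOCK, NR_LOCKS };

/* dep[a][b] != 0 means "b was acquired while a was held" (a -> b). */
static int dep[NR_LOCKS][NR_LOCKS];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static int reachable(int from, int to, int *visited)
{
	int next;

	if (from == to)
		return 1;
	visited[from] = 1;
	for (next = 0; next < NR_LOCKS; next++)
		if (dep[from][next] && !visited[next] &&
		    reachable(next, to, visited))
			return 1;
	return 0;
}

/*
 * Try to record "held -> wanted". Returns 1 (and records nothing) if
 * the new edge would close a cycle, i.e. if 'held' is already
 * reachable from 'wanted'. Returns 0 after recording the edge.
 */
int add_dependency(int held, int wanted)
{
	int visited[NR_LOCKS] = { 0 };

	assert(held >= 0 && held < NR_LOCKS);
	assert(wanted >= 0 && wanted < NR_LOCKS);

	if (reachable(wanted, held, visited))
		return 1;
	dep[held][wanted] = 1;
	return 0;
}

/* Replays the assumed orderings; returns 1 if the last one closes a cycle. */
int replay_report(void)
{
	memset(dep, 0, sizeof(dep));

	/* statfs path: lcnbmp_lock, then rl->lock, then mrec_lock */
	if (add_dependency(LCNBMP_LOCK, RL_LOCK))
		return -1;
	if (add_dependency(RL_LOCK, MREC_LOCK))
		return -1;

	/* write path: mrec_lock held, lcnbmp_lock wanted */
	return add_dependency(MREC_LOCK, LCNBMP_LOCK);
}
```

In this sketch the third edge is rejected because mrec_lock is already
reachable from lcnbmp_lock via rl->lock, which is the '3-way' shape of
the chain. Whether the real code can actually deadlock is a separate
question that the sketch does not answer.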