From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755769Ab2CEWCn (ORCPT );
        Mon, 5 Mar 2012 17:02:43 -0500
Received: from li9-11.members.linode.com ([67.18.176.11]:56050 "EHLO
        test.thunk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1754473Ab2CEWCm (ORCPT );
        Mon, 5 Mar 2012 17:02:42 -0500
Date: Mon, 5 Mar 2012 17:02:36 -0500
From: "Ted Ts'o"
To: Miles Lane , LKML , Andreas Dilger , Alexander Viro ,
        Andrew Morton , ecryptfs@vger.kernel.org
Subject: Re: Linus GIT (3.3.0-rc6+) -- INFO: possible circular locking
        dependency detected
Message-ID: <20120305220236.GB8927@thunk.org>
Mail-Followup-To: Ted Ts'o , Miles Lane , LKML , Andreas Dilger ,
        Alexander Viro , Andrew Morton , ecryptfs@vger.kernel.org
References: <20120305214628.GA7717@thunk.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20120305214628.GA7717@thunk.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
X-SA-Exim-Connect-IP:
X-SA-Exim-Mail-From: tytso@thunk.org
X-SA-Exim-Scanned: No (on test.thunk.org); SAEximRunCond expanded to false
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Ah, I see Al Viro has beaten me to the punch. :-)

                                        - Ted

On Mon, Mar 05, 2012 at 04:46:28PM -0500, Ted Ts'o wrote:
> I've added ecryptfs to the list since this looks like it's caused by
> ecryptfs (i.e., it won't happen without ecryptfs).
>
> This seems to be caused by an munmap of an ecryptfs file, which has
> dirty pages; ecryptfs is then calling into ext4 while the mmap is
> still holding the mmap_sem, and then when ext4 calls the generic
> function generic_file_aio_write(), it tries to grab the inode's
> i_mutex, and that's what's causing the possible circular locking
> dependency.
>
> The other locking order is caused by vfs_readdir() grabbing i_mutex,
> and then filldir() writing to user memory, which means it
> calls might_fault(), and might_fault() calls
> might_lock_read(&current->mm->mmap_sem) since if the page needs to be
> faulted in, *that* will require taking a read lock of mmap_sem.
>
> In any case, all of the locks in question are being taken by generic
> code, and it's the fact that ecryptfs needs to try to initiate page
> writeout at munmap() time, which holds mmap_sem, which is causing the
> circular dependency.
>
> i.e., this particular problem can and will happen with any file system
> (which uses generic filemap infrastructure); ext4 just happens to
> appear in the stack trace because that's the underlying file system
> used by ecryptfs.
>
> Regards,
>
> - Ted
>
> On Mon, Mar 05, 2012 at 04:08:55PM -0500, Miles Lane wrote:
> > [ 107.839605] [ INFO: possible circular locking dependency detected ]
> > [ 107.839608] 3.3.0-rc6+ #14 Not tainted
> > [ 107.839609] -------------------------------------------------------
> > [ 107.839611] gvfsd-metadata/2314 is trying to acquire lock:
> > [ 107.839612] (&sb->s_type->i_mutex_key#13){+.+.+.}, at: [] generic_file_aio_write+0x45/0xbc
> > [ 107.839622]
> > [ 107.839623] but task is already holding lock:
> > [ 107.839624] (&mm->mmap_sem){++++++}, at: [] sys_munmap+0x36/0x5b
> > [ 107.839630]
> > [ 107.839630] which lock already depends on the new lock.
> > [ 107.839631]
> > [ 107.839632]
> > [ 107.839632] the existing dependency chain (in reverse order) is:
> > [ 107.839634]
> > [ 107.839634] -> #1 (&mm->mmap_sem){++++++}:
> > [ 107.839638] [] lock_acquire+0x8a/0xa7
> > [ 107.839642] [] might_fault+0x7b/0x9e
> > [ 107.839646] [] filldir+0x6a/0xc2
> > [ 107.839649] [] call_filldir+0x91/0xb8
> > [ 107.839653] [] ext4_readdir+0x1b2/0x519
> > [ 107.839656] [] vfs_readdir+0x76/0xac
> > [ 107.839658] [] sys_getdents+0x79/0xc9
> > [ 107.839661] [] system_call_fastpath+0x16/0x1b
> > [ 107.839665]
> > [ 107.839665] -> #0 (&sb->s_type->i_mutex_key#13){+.+.+.}:
> > [ 107.839669] [] __lock_acquire+0xa81/0xd75
> > [ 107.839672] [] lock_acquire+0x8a/0xa7
> > [ 107.839675] [] __mutex_lock_common+0x61/0x456
> > [ 107.839679] [] mutex_lock_nested+0x36/0x3b
> > [ 107.839681] [] generic_file_aio_write+0x45/0xbc
> > [ 107.839684] [] ext4_file_write+0x1e2/0x23a
> > [ 107.839687] [] do_sync_write+0xbd/0xfd
> > [ 107.839691] [] vfs_write+0xa7/0xee
> > [ 107.839694] [] ecryptfs_write_lower+0x4e/0x73 [ecryptfs]
> > [ 107.839700] [] ecryptfs_encrypt_page+0x11c/0x182 [ecryptfs]
> > [ 107.839704] [] ecryptfs_writepage+0x31/0x73 [ecryptfs]
> > [ 107.839708] [] __writepage+0x12/0x31
> > [ 107.839710] [] write_cache_pages+0x1e6/0x310
> > [ 107.839713] [] generic_writepages+0x3e/0x54
> > [ 107.839716] [] do_writepages+0x26/0x28
> > [ 107.839719] [] __filemap_fdatawrite_range+0x4e/0x50
> > [ 107.839722] [] filemap_fdatawrite+0x1a/0x1c
> > [ 107.839725] [] filemap_write_and_wait+0x1b/0x36
> > [ 107.839727] [] ecryptfs_vma_close+0x17/0x19 [ecryptfs]
> > [ 107.839731] [] remove_vma+0x3b/0x71
> > [ 107.839733] [] do_munmap+0x2ed/0x306
> > [ 107.839735] [] sys_munmap+0x44/0x5b
> > [ 107.839738] [] system_call_fastpath+0x16/0x1b
> > [ 107.839741]
> > [ 107.839741] other info that might help us debug this:
> > [ 107.839741]
> > [ 107.839743] Possible unsafe locking scenario:
> > [ 107.839743]
> > [ 107.839744]        CPU0                    CPU1
> > [ 107.839746]        ----                    ----
> > [ 107.839747]   lock(&mm->mmap_sem);
> > [ 107.839749]                                lock(&sb->s_type->i_mutex_key#13);
> > [ 107.839753]                                lock(&mm->mmap_sem);
> > [ 107.839755]   lock(&sb->s_type->i_mutex_key#13);
> > [ 107.839758]
> > [ 107.839758] *** DEADLOCK ***
> > [ 107.839759]
> > [ 107.839761] 1 lock held by gvfsd-metadata/2314:
> > [ 107.839762] #0: (&mm->mmap_sem){++++++}, at: [] sys_munmap+0x36/0x5b
> > [ 107.839767]
> > [ 107.839767] stack backtrace:
> > [ 107.839769] Pid: 2314, comm: gvfsd-metadata Not tainted 3.3.0-rc6+ #14
> > [ 107.839771] Call Trace:
> > [ 107.839775] [] print_circular_bug+0x1f8/0x209
> > [ 107.839778] [] __lock_acquire+0xa81/0xd75
> > [ 107.839781] [] ? __lock_acquire+0xd66/0xd75
> > [ 107.839784] [] lock_acquire+0x8a/0xa7
> > [ 107.839787] [] ? generic_file_aio_write+0x45/0xbc
> > [ 107.839790] [] __mutex_lock_common+0x61/0x456
> > [ 107.839792] [] ? generic_file_aio_write+0x45/0xbc
> > [ 107.839795] [] ? mark_lock+0x2d/0x258
> > [ 107.839798] [] ? generic_file_aio_write+0x45/0xbc
> > [ 107.839801] [] ? lock_is_held+0x92/0x9d
> > [ 107.839803] [] mutex_lock_nested+0x36/0x3b
> > [ 107.839806] [] generic_file_aio_write+0x45/0xbc
> > [ 107.839810] [] ? scatterwalk_map+0x2b/0x5d
> > [ 107.839813] [] ? get_parent_ip+0xe/0x3e
> > [ 107.839816] [] ext4_file_write+0x1e2/0x23a
> > [ 107.839818] [] ? mark_lock+0x2d/0x258
> > [ 107.839821] [] do_sync_write+0xbd/0xfd
> > [ 107.839824] [] ? __mutex_unlock_slowpath+0x11e/0x152
> > [ 107.839828] [] ? security_file_permission+0x29/0x2e
> > [ 107.839831] [] ? rw_verify_area+0xab/0xc8
> > [ 107.839834] [] vfs_write+0xa7/0xee
> > [ 107.839838] [] ecryptfs_write_lower+0x4e/0x73 [ecryptfs]
> > [ 107.839842] [] ecryptfs_encrypt_page+0x11c/0x182 [ecryptfs]
> > [ 107.839846] [] ecryptfs_writepage+0x31/0x73 [ecryptfs]
> > [ 107.839849] [] __writepage+0x12/0x31
> > [ 107.839851] [] write_cache_pages+0x1e6/0x310
> > [ 107.839854] [] ? bdi_set_max_ratio+0x6a/0x6a
> > [ 107.839857] [] ? sub_preempt_count+0x90/0xa3
> > [ 107.839860] [] generic_writepages+0x3e/0x54
> > [ 107.839863] [] do_writepages+0x26/0x28
> > [ 107.839866] [] __filemap_fdatawrite_range+0x4e/0x50
> > [ 107.839869] [] filemap_fdatawrite+0x1a/0x1c
> > [ 107.839871] [] filemap_write_and_wait+0x1b/0x36
> > [ 107.839875] [] ecryptfs_vma_close+0x17/0x19 [ecryptfs]
> > [ 107.839877] [] remove_vma+0x3b/0x71
> > [ 107.839879] [] do_munmap+0x2ed/0x306
> > [ 107.839882] [] sys_munmap+0x44/0x5b
> > [ 107.839884] [] system_call_fastpath+0x16/0x1b
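
For anyone following along, the two lock orderings described in the quoted analysis can be replayed in a toy model. This is only an illustrative sketch in Python: the class LockOrderGraph and its methods are made-up names, the lock classes are reduced to plain strings, and it is not how the kernel's lockdep is actually implemented. It records "B was taken while A was held" edges and refuses an edge that would close a cycle, which is roughly the condition the report above is complaining about.

```python
class LockOrderGraph:
    """Toy lock-order tracker (hypothetical; not the kernel's lockdep code)."""

    def __init__(self):
        # edges[a] is the set of lock classes seen acquired while a was held
        self.edges = {}

    def _reaches(self, src, dst, seen=None):
        """Return True if dst is reachable from src along recorded edges."""
        if src == dst:
            return True
        if seen is None:
            seen = set()
        seen.add(src)
        for nxt in self.edges.get(src, ()):
            if nxt not in seen and self._reaches(nxt, dst, seen):
                return True
        return False

    def acquire(self, held, new):
        """Record that `new` is taken while `held` is held.

        Returns False, without recording the edge, if an existing chain
        already leads from `new` back to `held` (a circular dependency).
        """
        if self._reaches(new, held):
            return False
        self.edges.setdefault(held, set()).add(new)
        return True


graph = LockOrderGraph()

# getdents path: vfs_readdir() holds i_mutex, then filldir() may fault on
# user memory, which needs mmap_sem.
readdir_ok = graph.acquire("i_mutex", "mmap_sem")

# munmap path: sys_munmap() holds mmap_sem, then the ecryptfs writeout ends
# up in generic_file_aio_write(), which takes i_mutex.
munmap_ok = graph.acquire("mmap_sem", "i_mutex")

print(readdir_ok, munmap_ok)
```

The first ordering is accepted because nothing conflicts with it yet; the second is rejected because i_mutex -> mmap_sem is already on record. Neither path is wrong in isolation, which matches the point made above that all the locks are taken by generic code and the cycle only appears once writeout is initiated with mmap_sem held.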