* Re: 2.6.26-rc1: possible circular locking dependency
[not found] <a4423d670805101005x113c4813w2b95c1fb535cf080@mail.gmail.com>
@ 2008-05-10 17:46 ` Kamalesh Babulal
2008-05-11 3:48 ` 2.6.26-rc1: possible circular locking dependency with xfs filesystem Kamalesh Babulal
0 siblings, 1 reply; 5+ messages in thread
From: Kamalesh Babulal @ 2008-05-10 17:46 UTC (permalink / raw)
To: Alexander Beregalov; +Cc: kernel-testers, kernel list, Ingo Molnar, peterz
Adding the cc to kernel-list, Ingo Molnar and Peter Zijlstra
Alexander Beregalov wrote:
> [ INFO: possible circular locking dependency detected ]
> 2.6.26-rc1-00279-g28a4acb #13
> -------------------------------------------------------
> nfsd/3087 is trying to acquire lock:
> (iprune_mutex){--..}, at: [<c016f947>] shrink_icache_memory+0x38/0x19b
>
> but task is already holding lock:
> (&(&ip->i_iolock)->mr_lock){----}, at: [<c0210b83>] xfs_ilock+0xa2/0xd6
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&(&ip->i_iolock)->mr_lock){----}:
> [<c01352e6>] __lock_acquire+0xa0c/0xbc6
> [<c013550a>] lock_acquire+0x6a/0x86
> [<c012c39a>] down_write_nested+0x33/0x6a
> [<c0210b5c>] xfs_ilock+0x7b/0xd6
> [<c0210cd5>] xfs_ireclaim+0x1d/0x59
> [<c022edfe>] xfs_finish_reclaim+0x173/0x195
> [<c0230fa3>] xfs_reclaim+0xb3/0x138
> [<c023b4cb>] xfs_fs_clear_inode+0x55/0x8e
> [<c016f60b>] clear_inode+0x83/0xd2
> [<c016f88a>] dispose_list+0x3c/0xc1
> [<c016fa82>] shrink_icache_memory+0x173/0x19b
> [<c014a68d>] shrink_slab+0xda/0x14e
> [<c014a8e5>] try_to_free_pages+0x1e4/0x2a2
> [<c0146997>] __alloc_pages_internal+0x23a/0x39d
> [<c0146b11>] __alloc_pages+0xa/0xc
> [<c01483b2>] __do_page_cache_readahead+0xaa/0x16a
> [<c01484bc>] force_page_cache_readahead+0x4a/0x74
> [<c014c9b0>] sys_madvise+0x308/0x400
> [<c0102b25>] sysenter_past_esp+0x6a/0xb1
> [<ffffffff>] 0xffffffff
>
> -> #0 (iprune_mutex){--..}:
> [<c0135203>] __lock_acquire+0x929/0xbc6
> [<c013550a>] lock_acquire+0x6a/0x86
> [<c0356a6f>] mutex_lock_nested+0xb4/0x226
> [<c016f947>] shrink_icache_memory+0x38/0x19b
> [<c014a68d>] shrink_slab+0xda/0x14e
> [<c014a8e5>] try_to_free_pages+0x1e4/0x2a2
> [<c0146997>] __alloc_pages_internal+0x23a/0x39d
> [<c0146b11>] __alloc_pages+0xa/0xc
> [<c01483b2>] __do_page_cache_readahead+0xaa/0x16a
> [<c014866c>] ondemand_readahead+0x119/0x127
> [<c01486cc>] page_cache_async_readahead+0x52/0x5d
> [<c0178e46>] generic_file_splice_read+0x290/0x4a8
> [<c0239f06>] xfs_splice_read+0x4b/0x78
> [<c0237713>] xfs_file_splice_read+0x24/0x29
> [<c0178182>] do_splice_to+0x45/0x63
> [<c01783f6>] splice_direct_to_actor+0xab/0x150
> [<c01ce8e1>] nfsd_vfs_read+0x1ed/0x2d0
> [<c01ced50>] nfsd_read+0x82/0x99
> [<c01d42bc>] nfsd3_proc_read+0xdf/0x12a
> [<c01cb40b>] nfsd_dispatch+0xcf/0x19e
> [<c033f484>] svc_process+0x3b3/0x68b
> [<c01cb939>] nfsd+0x168/0x26b
> [<c0103747>] kernel_thread_helper+0x7/0x10
> [<ffffffff>] 0xffffffff
>
> other info that might help us debug this:
>
> 3 locks held by nfsd/3087:
> #0: (hash_sem){..--}, at: [<c01d1538>] exp_readlock+0xd/0xf
> #1: (&(&ip->i_iolock)->mr_lock){----}, at: [<c0210b83>] xfs_ilock+0xa2/0xd6
> #2: (shrinker_rwsem){----}, at: [<c014a5d7>] shrink_slab+0x24/0x14e
>
> stack backtrace:
> Pid: 3087, comm: nfsd Not tainted 2.6.26-rc1-00279-g28a4acb #13
> [<c0133498>] print_circular_bug_tail+0x5a/0x65
> [<c0133d99>] ? print_circular_bug_header+0xa8/0xb3
> [<c0135203>] __lock_acquire+0x929/0xbc6
> [<c0106c1a>] ? native_sched_clock+0x8b/0x9f
> [<c013550a>] lock_acquire+0x6a/0x86
> [<c016f947>] ? shrink_icache_memory+0x38/0x19b
> [<c0356a6f>] mutex_lock_nested+0xb4/0x226
> [<c016f947>] ? shrink_icache_memory+0x38/0x19b
> [<c016f947>] ? shrink_icache_memory+0x38/0x19b
> [<c016f947>] shrink_icache_memory+0x38/0x19b
> [<c014a68d>] shrink_slab+0xda/0x14e
> [<c014a8e5>] try_to_free_pages+0x1e4/0x2a2
> [<c03583c8>] ? _spin_unlock_irqrestore+0x36/0x58
> [<c014982f>] ? isolate_pages_global+0x0/0x3e
> [<c0146997>] __alloc_pages_internal+0x23a/0x39d
> [<c0146b11>] __alloc_pages+0xa/0xc
> [<c01483b2>] __do_page_cache_readahead+0xaa/0x16a
> [<c014866c>] ondemand_readahead+0x119/0x127
> [<c01486cc>] page_cache_async_readahead+0x52/0x5d
> [<c0178e46>] generic_file_splice_read+0x290/0x4a8
> [<c0358305>] ? _spin_unlock+0x27/0x3c
> [<c0250e55>] ? _atomic_dec_and_lock+0x25/0x30
> [<c016ed3f>] ? iput+0x24/0x4e
> [<c0135484>] ? __lock_acquire+0xbaa/0xbc6
> [<c01cb12a>] ? exportfs_decode_fh+0x9b/0x1a1
> [<c0178245>] ? spd_release_page+0x0/0xf
> [<c0239f06>] xfs_splice_read+0x4b/0x78
> [<c0237713>] xfs_file_splice_read+0x24/0x29
> [<c0178182>] do_splice_to+0x45/0x63
> [<c01783f6>] splice_direct_to_actor+0xab/0x150
> [<c01ce9c4>] ? nfsd_direct_splice_actor+0x0/0xf
> [<c01ce8e1>] nfsd_vfs_read+0x1ed/0x2d0
> [<c01ced50>] nfsd_read+0x82/0x99
> [<c01d42bc>] nfsd3_proc_read+0xdf/0x12a
> [<c01cb40b>] nfsd_dispatch+0xcf/0x19e
> [<c033f484>] svc_process+0x3b3/0x68b
> [<c01cb939>] nfsd+0x168/0x26b
> [<c01cb7d1>] ? nfsd+0x0/0x26b
> [<c0103747>] kernel_thread_helper+0x7/0x10
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
* Re: 2.6.26-rc1: possible circular locking dependency with xfs filesystem
2008-05-10 17:46 ` 2.6.26-rc1: possible circular locking dependency Kamalesh Babulal
@ 2008-05-11 3:48 ` Kamalesh Babulal
2008-05-11 23:10 ` David Chinner
0 siblings, 1 reply; 5+ messages in thread
From: Kamalesh Babulal @ 2008-05-11 3:48 UTC (permalink / raw)
To: pvp-lsts
Cc: Alexander Beregalov, kernel-testers, kernel list, Ingo Molnar,
peterz, xfs, David Chinner
Kamalesh Babulal wrote:
> Adding the cc to kernel-list, Ingo Molnar and Peter Zijlstra
>
> Alexander Beregalov wrote:
>> [ INFO: possible circular locking dependency detected ]
>> 2.6.26-rc1-00279-g28a4acb #13
>> -------------------------------------------------------
>> nfsd/3087 is trying to acquire lock:
>> (iprune_mutex){--..}, at: [<c016f947>] shrink_icache_memory+0x38/0x19b
>>
>> but task is already holding lock:
>> (&(&ip->i_iolock)->mr_lock){----}, at: [<c0210b83>] xfs_ilock+0xa2/0xd6
>>
>> which lock already depends on the new lock.
>>
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (&(&ip->i_iolock)->mr_lock){----}:
>> [<c01352e6>] __lock_acquire+0xa0c/0xbc6
>> [<c013550a>] lock_acquire+0x6a/0x86
>> [<c012c39a>] down_write_nested+0x33/0x6a
>> [<c0210b5c>] xfs_ilock+0x7b/0xd6
>> [<c0210cd5>] xfs_ireclaim+0x1d/0x59
>> [<c022edfe>] xfs_finish_reclaim+0x173/0x195
>> [<c0230fa3>] xfs_reclaim+0xb3/0x138
>> [<c023b4cb>] xfs_fs_clear_inode+0x55/0x8e
>> [<c016f60b>] clear_inode+0x83/0xd2
>> [<c016f88a>] dispose_list+0x3c/0xc1
>> [<c016fa82>] shrink_icache_memory+0x173/0x19b
>> [<c014a68d>] shrink_slab+0xda/0x14e
>> [<c014a8e5>] try_to_free_pages+0x1e4/0x2a2
>> [<c0146997>] __alloc_pages_internal+0x23a/0x39d
>> [<c0146b11>] __alloc_pages+0xa/0xc
>> [<c01483b2>] __do_page_cache_readahead+0xaa/0x16a
>> [<c01484bc>] force_page_cache_readahead+0x4a/0x74
>> [<c014c9b0>] sys_madvise+0x308/0x400
>> [<c0102b25>] sysenter_past_esp+0x6a/0xb1
>> [<ffffffff>] 0xffffffff
>>
>> -> #0 (iprune_mutex){--..}:
>> [<c0135203>] __lock_acquire+0x929/0xbc6
>> [<c013550a>] lock_acquire+0x6a/0x86
>> [<c0356a6f>] mutex_lock_nested+0xb4/0x226
>> [<c016f947>] shrink_icache_memory+0x38/0x19b
>> [<c014a68d>] shrink_slab+0xda/0x14e
>> [<c014a8e5>] try_to_free_pages+0x1e4/0x2a2
>> [<c0146997>] __alloc_pages_internal+0x23a/0x39d
>> [<c0146b11>] __alloc_pages+0xa/0xc
>> [<c01483b2>] __do_page_cache_readahead+0xaa/0x16a
>> [<c014866c>] ondemand_readahead+0x119/0x127
>> [<c01486cc>] page_cache_async_readahead+0x52/0x5d
>> [<c0178e46>] generic_file_splice_read+0x290/0x4a8
>> [<c0239f06>] xfs_splice_read+0x4b/0x78
>> [<c0237713>] xfs_file_splice_read+0x24/0x29
>> [<c0178182>] do_splice_to+0x45/0x63
>> [<c01783f6>] splice_direct_to_actor+0xab/0x150
>> [<c01ce8e1>] nfsd_vfs_read+0x1ed/0x2d0
>> [<c01ced50>] nfsd_read+0x82/0x99
>> [<c01d42bc>] nfsd3_proc_read+0xdf/0x12a
>> [<c01cb40b>] nfsd_dispatch+0xcf/0x19e
>> [<c033f484>] svc_process+0x3b3/0x68b
>> [<c01cb939>] nfsd+0x168/0x26b
>> [<c0103747>] kernel_thread_helper+0x7/0x10
>> [<ffffffff>] 0xffffffff
>>
>> other info that might help us debug this:
>>
>> 3 locks held by nfsd/3087:
>> #0: (hash_sem){..--}, at: [<c01d1538>] exp_readlock+0xd/0xf
>> #1: (&(&ip->i_iolock)->mr_lock){----}, at: [<c0210b83>] xfs_ilock+0xa2/0xd6
>> #2: (shrinker_rwsem){----}, at: [<c014a5d7>] shrink_slab+0x24/0x14e
>>
>> stack backtrace:
>> Pid: 3087, comm: nfsd Not tainted 2.6.26-rc1-00279-g28a4acb #13
>> [<c0133498>] print_circular_bug_tail+0x5a/0x65
>> [<c0133d99>] ? print_circular_bug_header+0xa8/0xb3
>> [<c0135203>] __lock_acquire+0x929/0xbc6
>> [<c0106c1a>] ? native_sched_clock+0x8b/0x9f
>> [<c013550a>] lock_acquire+0x6a/0x86
>> [<c016f947>] ? shrink_icache_memory+0x38/0x19b
>> [<c0356a6f>] mutex_lock_nested+0xb4/0x226
>> [<c016f947>] ? shrink_icache_memory+0x38/0x19b
>> [<c016f947>] ? shrink_icache_memory+0x38/0x19b
>> [<c016f947>] shrink_icache_memory+0x38/0x19b
>> [<c014a68d>] shrink_slab+0xda/0x14e
>> [<c014a8e5>] try_to_free_pages+0x1e4/0x2a2
>> [<c03583c8>] ? _spin_unlock_irqrestore+0x36/0x58
>> [<c014982f>] ? isolate_pages_global+0x0/0x3e
>> [<c0146997>] __alloc_pages_internal+0x23a/0x39d
>> [<c0146b11>] __alloc_pages+0xa/0xc
>> [<c01483b2>] __do_page_cache_readahead+0xaa/0x16a
>> [<c014866c>] ondemand_readahead+0x119/0x127
>> [<c01486cc>] page_cache_async_readahead+0x52/0x5d
>> [<c0178e46>] generic_file_splice_read+0x290/0x4a8
>> [<c0358305>] ? _spin_unlock+0x27/0x3c
>> [<c0250e55>] ? _atomic_dec_and_lock+0x25/0x30
>> [<c016ed3f>] ? iput+0x24/0x4e
>> [<c0135484>] ? __lock_acquire+0xbaa/0xbc6
>> [<c01cb12a>] ? exportfs_decode_fh+0x9b/0x1a1
>> [<c0178245>] ? spd_release_page+0x0/0xf
>> [<c0239f06>] xfs_splice_read+0x4b/0x78
>> [<c0237713>] xfs_file_splice_read+0x24/0x29
>> [<c0178182>] do_splice_to+0x45/0x63
>> [<c01783f6>] splice_direct_to_actor+0xab/0x150
>> [<c01ce9c4>] ? nfsd_direct_splice_actor+0x0/0xf
>> [<c01ce8e1>] nfsd_vfs_read+0x1ed/0x2d0
>> [<c01ced50>] nfsd_read+0x82/0x99
>> [<c01d42bc>] nfsd3_proc_read+0xdf/0x12a
>> [<c01cb40b>] nfsd_dispatch+0xcf/0x19e
>> [<c033f484>] svc_process+0x3b3/0x68b
>> [<c01cb939>] nfsd+0x168/0x26b
>> [<c01cb7d1>] ? nfsd+0x0/0x26b
>> [<c0103747>] kernel_thread_helper+0x7/0x10
>> --
Adding the trimmed, forwarded syslog message from Plamen Petrov <pvp-lsts@fs.ru.acad.bg>:
May 9 02:16:46 nomad64 kernel: [42951853.992912]
May 9 02:16:46 nomad64 kernel: [42951853.992913] =======================================================
May 9 02:16:46 nomad64 kernel: [42951853.992920] [ INFO: possible circular locking dependency detected ]
May 9 02:16:46 nomad64 kernel: [42951853.992922] 2.6.26-rc1-00243-g46e4965 #1
May 9 02:16:46 nomad64 kernel: [42951853.992924] -------------------------------------------------------
May 9 02:16:46 nomad64 kernel: [42951853.992927] kio_http/3813 is trying to acquire lock:
May 9 02:16:46 nomad64 kernel: [42951853.992930] (&mm->mmap_sem){----}, at: [<ffffffff80222bbd>] do_page_fault+0xdd/0x890
May 9 02:16:46 nomad64 kernel: [42951853.992944]
May 9 02:16:46 nomad64 kernel: [42951853.992944] but task is already holding lock:
May 9 02:16:46 nomad64 kernel: [42951853.992947] (&(&ip->i_iolock)->mr_lock){----}, at: [<ffffffff80387f85>] xfs_ilock+0x65/0xa0
May 9 02:16:46 nomad64 kernel: [42951853.992960]
May 9 02:16:46 nomad64 kernel: [42951853.992960] which lock already depends on the new lock.
May 9 02:16:46 nomad64 kernel: [42951853.992961]
May 9 02:16:46 nomad64 kernel: [42951853.992964]
May 9 02:16:46 nomad64 kernel: [42951853.992965] the existing dependency chain (in reverse order) is:
May 9 02:16:46 nomad64 kernel: [42951853.992967]
May 9 02:16:46 nomad64 kernel: [42951853.992968] -> #1 (&(&ip->i_iolock)->mr_lock){----}:
May 9 02:16:46 nomad64 kernel: [42951853.992974] [<ffffffff80261d72>] __lock_acquire+0xf92/0x1080
May 9 02:16:46 nomad64 kernel: [42951853.992989] [<ffffffff80261f02>] lock_acquire+0xa2/0xd0
May 9 02:16:46 nomad64 kernel: [42951853.993002] [<ffffffff80255556>] down_write_nested+0x46/0x80
May 9 02:16:46 nomad64 kernel: [42951853.993018] [<ffffffff80387fb9>] xfs_ilock+0x99/0xa0
May 9 02:16:46 nomad64 kernel: [42951853.993034] [<ffffffff803a5117>] xfs_free_eofblocks+0x1c7/0x250
May 9 02:16:46 nomad64 kernel: [42951853.993049] [<ffffffff803a8a26>] xfs_release+0x186/0x1d0
May 9 02:16:46 nomad64 kernel: [42951853.993062] [<ffffffff803aeeb0>] xfs_file_release+0x10/0x20
May 9 02:16:46 nomad64 kernel: [42951853.993076] [<ffffffff802a01cc>] __fput+0xcc/0x1c0
May 9 02:16:46 nomad64 kernel: [42951853.993091] [<ffffffff802a05e6>] fput+0x16/0x20
May 9 02:16:46 nomad64 kernel: [42951853.993105] [<ffffffff8028865a>] remove_vma+0x4a/0x80
May 9 02:16:46 nomad64 kernel: [42951853.993120] [<ffffffff802894e1>] do_munmap+0x281/0x2e0
May 9 02:16:46 nomad64 kernel: [42951853.993134] [<ffffffff8028958b>] sys_munmap+0x4b/0x70
May 9 02:16:46 nomad64 kernel: [42951853.993148] [<ffffffff8020b62b>] system_call_after_swapgs+0x7b/0x80
May 9 02:16:46 nomad64 kernel: [42951853.993161] [<ffffffffffffffff>] 0xffffffffffffffff
May 9 02:16:46 nomad64 kernel: [42951853.993178]
May 9 02:16:46 nomad64 kernel: [42951853.993178] -> #0 (&mm->mmap_sem){----}:
May 9 02:16:46 nomad64 kernel: [42951853.993185] [<ffffffff80261b90>] __lock_acquire+0xdb0/0x1080
May 9 02:16:46 nomad64 kernel: [42951853.993197] [<ffffffff80261f02>] lock_acquire+0xa2/0xd0
May 9 02:16:46 nomad64 kernel: [42951853.993213] [<ffffffff806b887b>] down_read+0x3b/0x70
May 9 02:16:46 nomad64 kernel: [42951853.993228] [<ffffffff80222bbd>] do_page_fault+0xdd/0x890
May 9 02:16:46 nomad64 kernel: [42951853.993241] [<ffffffff806ba5dd>] error_exit+0x0/0xa9
May 9 02:16:46 nomad64 kernel: [42951853.993256] [<ffffffffffffffff>] 0xffffffffffffffff
May 9 02:16:46 nomad64 kernel: [42951853.993269]
May 9 02:16:46 nomad64 kernel: [42951853.993270] other info that might help us debug this:
May 9 02:16:46 nomad64 kernel: [42951853.993270]
May 9 02:16:46 nomad64 kernel: [42951853.993273] 1 lock held by kio_http/3813:
May 9 02:16:46 nomad64 kernel: [42951853.993275] #0: (&(&ip->i_iolock)->mr_lock){----}, at: [<ffffffff80387f85>] xfs_ilock+0x65/0xa0
May 9 02:16:46 nomad64 kernel: [42951853.993286]
May 9 02:16:46 nomad64 kernel: [42951853.993287] stack backtrace:
May 9 02:16:46 nomad64 kernel: [42951853.993290] Pid: 3813, comm: kio_http Not tainted 2.6.26-rc1-00243-g46e4965 #1
May 9 02:16:46 nomad64 kernel: [42951853.993292]
May 9 02:16:46 nomad64 kernel: [42951853.993293] Call Trace:
May 9 02:16:46 nomad64 kernel: [42951853.993297] [<ffffffff8025f2b3>] print_circular_bug_tail+0x83/0x90
May 9 02:16:46 nomad64 kernel: [42951853.993302] [<ffffffff80261b90>] __lock_acquire+0xdb0/0x1080
May 9 02:16:46 nomad64 kernel: [42951853.993306] [<ffffffff80222bbd>] ? do_page_fault+0xdd/0x890
May 9 02:16:46 nomad64 kernel: [42951853.993310] [<ffffffff80261f02>] lock_acquire+0xa2/0xd0
May 9 02:16:46 nomad64 kernel: [42951853.993313] [<ffffffff80222bbd>] ? do_page_fault+0xdd/0x890
May 9 02:16:46 nomad64 kernel: [42951853.993317] [<ffffffff806b887b>] down_read+0x3b/0x70
May 9 02:16:46 nomad64 kernel: [42951853.993320] [<ffffffff80222bbd>] do_page_fault+0xdd/0x890
May 9 02:16:46 nomad64 kernel: [42951853.993324] [<ffffffff806ba5dd>] error_exit+0x0/0xa9
May 9 02:16:46 nomad64 kernel: [42951853.993328] [<ffffffff802739b6>] ? file_read_actor+0x46/0x1b0
May 9 02:16:46 nomad64 kernel: [42951853.993331] [<ffffffff806ba3d6>] ? _read_unlock_irq+0x36/0x60
May 9 02:16:46 nomad64 kernel: [42951853.993335] [<ffffffff80275dbc>] ? generic_file_aio_read+0x2cc/0x5d0
May 9 02:16:46 nomad64 kernel: [42951853.993339] [<ffffffff8025ddb9>] ? get_lock_stats+0x19/0x70
May 9 02:16:46 nomad64 kernel: [42951853.993343] [<ffffffff803b2769>] ? xfs_read+0x139/0x220
May 9 02:16:46 nomad64 kernel: [42951853.993347] [<ffffffff803af06d>] ? xfs_file_aio_read+0x4d/0x60
May 9 02:16:46 nomad64 kernel: [42951853.993350] [<ffffffff8029eeb1>] ? do_sync_read+0xf1/0x130
May 9 02:16:46 nomad64 kernel: [42951853.993354] [<ffffffff802516e0>] ? autoremove_wake_function+0x0/0x40
May 9 02:16:46 nomad64 kernel: [42951853.993358] [<ffffffff8026089a>] ? trace_hardirqs_on+0xda/0x170
May 9 02:16:46 nomad64 kernel: [42951853.993361] [<ffffffff80272e45>] ? __rcu_read_unlock+0xb5/0xc0
May 9 02:16:46 nomad64 kernel: [42951853.993365] [<ffffffff8026089a>] ? trace_hardirqs_on+0xda/0x170
May 9 02:16:46 nomad64 kernel: [42951853.993369] [<ffffffff803c4381>] ? security_file_permission+0x11/0x20
May 9 02:16:46 nomad64 kernel: [42951853.993374] [<ffffffff8029f794>] ? vfs_read+0xc4/0x160
May 9 02:16:46 nomad64 kernel: [42951853.993377] [<ffffffff8029fc30>] ? sys_read+0x50/0x90
May 9 02:16:46 nomad64 kernel: [42951853.993380] [<ffffffff8020b62b>] ? system_call_after_swapgs+0x7b/0x80
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
* Re: 2.6.26-rc1: possible circular locking dependency with xfs filesystem
2008-05-11 3:48 ` 2.6.26-rc1: possible circular locking dependency with xfs filesystem Kamalesh Babulal
@ 2008-05-11 23:10 ` David Chinner
2008-05-15 17:45 ` Alexander Beregalov
0 siblings, 1 reply; 5+ messages in thread
From: David Chinner @ 2008-05-11 23:10 UTC (permalink / raw)
To: Kamalesh Babulal
Cc: pvp-lsts, Alexander Beregalov, kernel-testers, kernel list,
Ingo Molnar, peterz, xfs, David Chinner
On Sun, May 11, 2008 at 09:18:07AM +0530, Kamalesh Babulal wrote:
> Kamalesh Babulal wrote:
> > Adding the cc to kernel-list, Ingo Molnar and Peter Zijlstra
> >
> > Alexander Beregalov wrote:
> >> [ INFO: possible circular locking dependency detected ]
> >> 2.6.26-rc1-00279-g28a4acb #13
> >> -------------------------------------------------------
> >> nfsd/3087 is trying to acquire lock:
> >> (iprune_mutex){--..}, at: [<c016f947>] shrink_icache_memory+0x38/0x19b
> >>
> >> but task is already holding lock:
> >> (&(&ip->i_iolock)->mr_lock){----}, at: [<c0210b83>] xfs_ilock+0xa2/0xd6
> >>
> >> which lock already depends on the new lock.
> >>
> >>
> >> the existing dependency chain (in reverse order) is:
> >>
> >> -> #1 (&(&ip->i_iolock)->mr_lock){----}:
> >> [<c01352e6>] __lock_acquire+0xa0c/0xbc6
> >> [<c013550a>] lock_acquire+0x6a/0x86
> >> [<c012c39a>] down_write_nested+0x33/0x6a
> >> [<c0210b5c>] xfs_ilock+0x7b/0xd6
> >> [<c0210cd5>] xfs_ireclaim+0x1d/0x59
> >> [<c022edfe>] xfs_finish_reclaim+0x173/0x195
> >> [<c0230fa3>] xfs_reclaim+0xb3/0x138
> >> [<c023b4cb>] xfs_fs_clear_inode+0x55/0x8e
> >> [<c016f60b>] clear_inode+0x83/0xd2
> >> [<c016f88a>] dispose_list+0x3c/0xc1
> >> [<c016fa82>] shrink_icache_memory+0x173/0x19b
> >> [<c014a68d>] shrink_slab+0xda/0x14e
> >> [<c014a8e5>] try_to_free_pages+0x1e4/0x2a2
> >> [<c0146997>] __alloc_pages_internal+0x23a/0x39d
> >> [<c0146b11>] __alloc_pages+0xa/0xc
> >> [<c01483b2>] __do_page_cache_readahead+0xaa/0x16a
> >> [<c01484bc>] force_page_cache_readahead+0x4a/0x74
> >> [<c014c9b0>] sys_madvise+0x308/0x400
> >> [<c0102b25>] sysenter_past_esp+0x6a/0xb1
> >> [<ffffffff>] 0xffffffff
> >>
> >> -> #0 (iprune_mutex){--..}:
> >> [<c0135203>] __lock_acquire+0x929/0xbc6
> >> [<c013550a>] lock_acquire+0x6a/0x86
> >> [<c0356a6f>] mutex_lock_nested+0xb4/0x226
> >> [<c016f947>] shrink_icache_memory+0x38/0x19b
> >> [<c014a68d>] shrink_slab+0xda/0x14e
> >> [<c014a8e5>] try_to_free_pages+0x1e4/0x2a2
> >> [<c0146997>] __alloc_pages_internal+0x23a/0x39d
> >> [<c0146b11>] __alloc_pages+0xa/0xc
> >> [<c01483b2>] __do_page_cache_readahead+0xaa/0x16a
> >> [<c014866c>] ondemand_readahead+0x119/0x127
> >> [<c01486cc>] page_cache_async_readahead+0x52/0x5d
> >> [<c0178e46>] generic_file_splice_read+0x290/0x4a8
> >> [<c0239f06>] xfs_splice_read+0x4b/0x78
> >> [<c0237713>] xfs_file_splice_read+0x24/0x29
> >> [<c0178182>] do_splice_to+0x45/0x63
> >> [<c01783f6>] splice_direct_to_actor+0xab/0x150
> >> [<c01ce8e1>] nfsd_vfs_read+0x1ed/0x2d0
> >> [<c01ced50>] nfsd_read+0x82/0x99
> >> [<c01d42bc>] nfsd3_proc_read+0xdf/0x12a
> >> [<c01cb40b>] nfsd_dispatch+0xcf/0x19e
> >> [<c033f484>] svc_process+0x3b3/0x68b
> >> [<c01cb939>] nfsd+0x168/0x26b
> >> [<c0103747>] kernel_thread_helper+0x7/0x10
> >> [<ffffffff>] 0xffffffff
Oh, yeah, that. Direct inode reclaim through memory pressure.
Effectively memory reclaim inverts locking order w.r.t. iprune_mutex
when it recurses into the filesystem. False positive - can never
cause a deadlock on XFS. Can't be solved from the XFS side of things
without effectively turning off lockdep checking for xfs inode
locking.
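To make the inversion concrete, here is the acquisition order each path
establishes, condensed from the two chains in the report quoted above
(illustrative call-chain pseudocode only; intermediate frames and
offsets omitted):

    /* chain #1: madvise() -> direct reclaim -> inode reclaim */
    shrink_icache_memory()           /* mutex_lock(&iprune_mutex)           */
      dispose_list()
        clear_inode()
          xfs_fs_clear_inode()
            xfs_reclaim() -> xfs_ireclaim()
              xfs_ilock()            /* down_write_nested(...->mr_lock)     */

    /* chain #0: nfsd splice read -> readahead -> direct reclaim */
    xfs_ilock()                      /* already holds &ip->i_iolock         */
      xfs_file_splice_read()
        __do_page_cache_readahead()
          try_to_free_pages()
            shrink_slab()
              shrink_icache_memory() /* now wants iprune_mutex -> the cycle */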
The fix needs to go into lockdep, via iprune_mutex annotations here....
> May 9 02:16:46 nomad64 kernel: [42951853.992965] the existing dependency chain (in reverse order) is:
> May 9 02:16:46 nomad64 kernel: [42951853.992967]
> May 9 02:16:46 nomad64 kernel: [42951853.992968] -> #1 (&(&ip->i_iolock)->mr_lock){----}:
> May 9 02:16:46 nomad64 kernel: [42951853.992974] [<ffffffff80261d72>] __lock_acquire+0xf92/0x1080
> May 9 02:16:46 nomad64 kernel: [42951853.992989] [<ffffffff80261f02>] lock_acquire+0xa2/0xd0
> May 9 02:16:46 nomad64 kernel: [42951853.993002] [<ffffffff80255556>] down_write_nested+0x46/0x80
> May 9 02:16:46 nomad64 kernel: [42951853.993018] [<ffffffff80387fb9>] xfs_ilock+0x99/0xa0
> May 9 02:16:46 nomad64 kernel: [42951853.993034] [<ffffffff803a5117>] xfs_free_eofblocks+0x1c7/0x250
> May 9 02:16:46 nomad64 kernel: [42951853.993049] [<ffffffff803a8a26>] xfs_release+0x186/0x1d0
> May 9 02:16:46 nomad64 kernel: [42951853.993062] [<ffffffff803aeeb0>] xfs_file_release+0x10/0x20
> May 9 02:16:46 nomad64 kernel: [42951853.993076] [<ffffffff802a01cc>] __fput+0xcc/0x1c0
> May 9 02:16:46 nomad64 kernel: [42951853.993091] [<ffffffff802a05e6>] fput+0x16/0x20
> May 9 02:16:46 nomad64 kernel: [42951853.993105] [<ffffffff8028865a>] remove_vma+0x4a/0x80
> May 9 02:16:46 nomad64 kernel: [42951853.993120] [<ffffffff802894e1>] do_munmap+0x281/0x2e0
> May 9 02:16:46 nomad64 kernel: [42951853.993134] [<ffffffff8028958b>] sys_munmap+0x4b/0x70
> May 9 02:16:46 nomad64 kernel: [42951853.993148] [<ffffffff8020b62b>] system_call_after_swapgs+0x7b/0x80
> May 9 02:16:46 nomad64 kernel: [42951853.993161] [<ffffffffffffffff>] 0xffffffffffffffff
hmmmm. Sounds like:
fd = open()
addr = mmap(fd)
close(fd)
.....
munmap(addr);
But yes, XFS takes locks in ->release which means.....
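A minimal userspace sequence matching that sketch might look like the
following (hypothetical reproducer; the path is made up and error
handling is omitted). The point is only that the final fput() of the
file happens inside munmap(), i.e. under mm->mmap_sem, which is where
->release and hence xfs_ilock() are reached in the #1 chain above:

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
            int fd = open("/mnt/xfs/somefile", O_RDONLY);  /* path made up */
            void *addr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);

            close(fd);              /* the vma still pins the struct file */
            /* ... use the mapping ... */
            munmap(addr, 4096);     /* last fput() -> ->release -> xfs_ilock(),
                                       all under down_write(&mm->mmap_sem) */
            return 0;
    }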
> May 9 02:16:46 nomad64 kernel: [42951853.993293] Call Trace:
> May 9 02:16:46 nomad64 kernel: [42951853.993297] [<ffffffff8025f2b3>] print_circular_bug_tail+0x83/0x90
> May 9 02:16:46 nomad64 kernel: [42951853.993302] [<ffffffff80261b90>] __lock_acquire+0xdb0/0x1080
> May 9 02:16:46 nomad64 kernel: [42951853.993306] [<ffffffff80222bbd>] ? do_page_fault+0xdd/0x890
> May 9 02:16:46 nomad64 kernel: [42951853.993310] [<ffffffff80261f02>] lock_acquire+0xa2/0xd0
> May 9 02:16:46 nomad64 kernel: [42951853.993313] [<ffffffff80222bbd>] ? do_page_fault+0xdd/0x890
> May 9 02:16:46 nomad64 kernel: [42951853.993317] [<ffffffff806b887b>] down_read+0x3b/0x70
> May 9 02:16:46 nomad64 kernel: [42951853.993320] [<ffffffff80222bbd>] do_page_fault+0xdd/0x890
> May 9 02:16:46 nomad64 kernel: [42951853.993324] [<ffffffff806ba5dd>] error_exit+0x0/0xa9
> May 9 02:16:46 nomad64 kernel: [42951853.993328] [<ffffffff802739b6>] ? file_read_actor+0x46/0x1b0
> May 9 02:16:46 nomad64 kernel: [42951853.993331] [<ffffffff806ba3d6>] ? _read_unlock_irq+0x36/0x60
> May 9 02:16:46 nomad64 kernel: [42951853.993335] [<ffffffff80275dbc>] ? generic_file_aio_read+0x2cc/0x5d0
> May 9 02:16:46 nomad64 kernel: [42951853.993339] [<ffffffff8025ddb9>] ? get_lock_stats+0x19/0x70
> May 9 02:16:46 nomad64 kernel: [42951853.993343] [<ffffffff803b2769>] ? xfs_read+0x139/0x220
> May 9 02:16:46 nomad64 kernel: [42951853.993347] [<ffffffff803af06d>] ? xfs_file_aio_read+0x4d/0x60
> May 9 02:16:46 nomad64 kernel: [42951853.993350] [<ffffffff8029eeb1>] ? do_sync_read+0xf1/0x130
> May 9 02:16:46 nomad64 kernel: [42951853.993354] [<ffffffff802516e0>] ? autoremove_wake_function+0x0/0x40
> May 9 02:16:46 nomad64 kernel: [42951853.993358] [<ffffffff8026089a>] ? trace_hardirqs_on+0xda/0x170
> May 9 02:16:46 nomad64 kernel: [42951853.993361] [<ffffffff80272e45>] ? __rcu_read_unlock+0xb5/0xc0
> May 9 02:16:46 nomad64 kernel: [42951853.993365] [<ffffffff8026089a>] ? trace_hardirqs_on+0xda/0x170
> May 9 02:16:46 nomad64 kernel: [42951853.993369] [<ffffffff803c4381>] ? security_file_permission+0x11/0x20
> May 9 02:16:46 nomad64 kernel: [42951853.993374] [<ffffffff8029f794>] ? vfs_read+0xc4/0x160
> May 9 02:16:46 nomad64 kernel: [42951853.993377] [<ffffffff8029fc30>] ? sys_read+0x50/0x90
> May 9 02:16:46 nomad64 kernel: [42951853.993380] [<ffffffff8020b62b>] ? system_call_after_swapgs+0x7b/0x80
Oh, joy - a page fault during a read() call triggers lock order
inversions on the mmap->sem. I don't think this can deadlock
(can't be page faulting in a vma that is being torn down), but
it's clear from the last trace that the VM has a mmap->sem
inversion problem with ->release vs ->read and page faults...
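Condensed from the traces above, the two orderings that collide look
roughly like this (pseudocode only):

    /* chain #1: munmap() drops the last file reference under mmap_sem */
    sys_munmap()                      /* down_write(&mm->mmap_sem)          */
      do_munmap() -> remove_vma() -> fput() -> __fput()
        xfs_file_release() -> xfs_release() -> xfs_free_eofblocks()
          xfs_ilock()                 /* down_write_nested(...->mr_lock)    */

    /* chain #0: read() holds the iolock, then faults on the user buffer */
    xfs_read()                        /* holds &ip->i_iolock                */
      generic_file_aio_read() -> file_read_actor()
        do_page_fault()
          down_read(&mm->mmap_sem)    /* opposite order -> lockdep report   */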
Basically what we are seeing here in both cases is that the VM is
calling inode ->release or ->clear_inode methods with different high
level locks held. If the filesystem has to take the same locks in
these methods as it does in, say, ->read (like XFS does), then we
are guaranteed to get reports like this. AFAICT there's nothing we
can do from the filesystem perspective to prevent false positives like
this from being reported....
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
* Re: 2.6.26-rc1: possible circular locking dependency with xfs filesystem
2008-05-11 23:10 ` David Chinner
@ 2008-05-15 17:45 ` Alexander Beregalov
2008-05-15 22:27 ` David Chinner
0 siblings, 1 reply; 5+ messages in thread
From: Alexander Beregalov @ 2008-05-15 17:45 UTC (permalink / raw)
To: David Chinner
Cc: Kamalesh Babulal, pvp-lsts, kernel-testers, kernel list,
Ingo Molnar, peterz, xfs
2008/5/12 David Chinner <dgc@sgi.com>:
> On Sun, May 11, 2008 at 09:18:07AM +0530, Kamalesh Babulal wrote:
>> Kamalesh Babulal wrote:
>> > Adding the cc to kernel-list, Ingo Molnar and Peter Zijlstra
>> >
>> > Alexander Beregalov wrote:
>> >> [ INFO: possible circular locking dependency detected ]
>> >> 2.6.26-rc1-00279-g28a4acb #13
>> >> -------------------------------------------------------
>> >> nfsd/3087 is trying to acquire lock:
>> >> (iprune_mutex){--..}, at: [<c016f947>] shrink_icache_memory+0x38/0x19b
>> >>
>> >> but task is already holding lock:
>> >> (&(&ip->i_iolock)->mr_lock){----}, at: [<c0210b83>] xfs_ilock+0xa2/0xd6
>> >>
>> >> which lock already depends on the new lock.
>> >>
>> >>
>> >> the existing dependency chain (in reverse order) is:
>> >>
>> >> -> #1 (&(&ip->i_iolock)->mr_lock){----}:
>> >> [<c01352e6>] __lock_acquire+0xa0c/0xbc6
>> >> [<c013550a>] lock_acquire+0x6a/0x86
>> >> [<c012c39a>] down_write_nested+0x33/0x6a
>> >> [<c0210b5c>] xfs_ilock+0x7b/0xd6
>> >> [<c0210cd5>] xfs_ireclaim+0x1d/0x59
>> >> [<c022edfe>] xfs_finish_reclaim+0x173/0x195
>> >> [<c0230fa3>] xfs_reclaim+0xb3/0x138
>> >> [<c023b4cb>] xfs_fs_clear_inode+0x55/0x8e
>> >> [<c016f60b>] clear_inode+0x83/0xd2
>> >> [<c016f88a>] dispose_list+0x3c/0xc1
>> >> [<c016fa82>] shrink_icache_memory+0x173/0x19b
>> >> [<c014a68d>] shrink_slab+0xda/0x14e
>> >> [<c014a8e5>] try_to_free_pages+0x1e4/0x2a2
>> >> [<c0146997>] __alloc_pages_internal+0x23a/0x39d
>> >> [<c0146b11>] __alloc_pages+0xa/0xc
>> >> [<c01483b2>] __do_page_cache_readahead+0xaa/0x16a
>> >> [<c01484bc>] force_page_cache_readahead+0x4a/0x74
>> >> [<c014c9b0>] sys_madvise+0x308/0x400
>> >> [<c0102b25>] sysenter_past_esp+0x6a/0xb1
>> >> [<ffffffff>] 0xffffffff
>> >>
>> >> -> #0 (iprune_mutex){--..}:
>> >> [<c0135203>] __lock_acquire+0x929/0xbc6
>> >> [<c013550a>] lock_acquire+0x6a/0x86
>> >> [<c0356a6f>] mutex_lock_nested+0xb4/0x226
>> >> [<c016f947>] shrink_icache_memory+0x38/0x19b
>> >> [<c014a68d>] shrink_slab+0xda/0x14e
>> >> [<c014a8e5>] try_to_free_pages+0x1e4/0x2a2
>> >> [<c0146997>] __alloc_pages_internal+0x23a/0x39d
>> >> [<c0146b11>] __alloc_pages+0xa/0xc
>> >> [<c01483b2>] __do_page_cache_readahead+0xaa/0x16a
>> >> [<c014866c>] ondemand_readahead+0x119/0x127
>> >> [<c01486cc>] page_cache_async_readahead+0x52/0x5d
>> >> [<c0178e46>] generic_file_splice_read+0x290/0x4a8
>> >> [<c0239f06>] xfs_splice_read+0x4b/0x78
>> >> [<c0237713>] xfs_file_splice_read+0x24/0x29
>> >> [<c0178182>] do_splice_to+0x45/0x63
>> >> [<c01783f6>] splice_direct_to_actor+0xab/0x150
>> >> [<c01ce8e1>] nfsd_vfs_read+0x1ed/0x2d0
>> >> [<c01ced50>] nfsd_read+0x82/0x99
>> >> [<c01d42bc>] nfsd3_proc_read+0xdf/0x12a
>> >> [<c01cb40b>] nfsd_dispatch+0xcf/0x19e
>> >> [<c033f484>] svc_process+0x3b3/0x68b
>> >> [<c01cb939>] nfsd+0x168/0x26b
>> >> [<c0103747>] kernel_thread_helper+0x7/0x10
>> >> [<ffffffff>] 0xffffffff
>
> Oh, yeah, that. Direct inode reclaim through memory pressure.
>
> Effectively memory reclaim inverts locking order w.r.t. iprune_mutex
> when it recurses into the filesystem. False positive - can never
> cause a deadlock on XFS. Can't be solved from the XFS side of things
> without effectively turning off lockdep checking for xfs inode
> locking.
Yes, it is not a deadlock, but the machine hangs for a few seconds.
It still happens about once a day for me. Every kernel report looks
similar to the above.
I cannot reproduce it quickly, so bisect is not possible.
>
> The fix needs to go into lockdep, via iprune_mutex annotations here....
>
>> May 9 02:16:46 nomad64 kernel: [42951853.992965] the existing dependency chain (in reverse order) is:
>> May 9 02:16:46 nomad64 kernel: [42951853.992967]
>> May 9 02:16:46 nomad64 kernel: [42951853.992968] -> #1 (&(&ip->i_iolock)->mr_lock){----}:
>> May 9 02:16:46 nomad64 kernel: [42951853.992974] [<ffffffff80261d72>] __lock_acquire+0xf92/0x1080
>> May 9 02:16:46 nomad64 kernel: [42951853.992989] [<ffffffff80261f02>] lock_acquire+0xa2/0xd0
>> May 9 02:16:46 nomad64 kernel: [42951853.993002] [<ffffffff80255556>] down_write_nested+0x46/0x80
>> May 9 02:16:46 nomad64 kernel: [42951853.993018] [<ffffffff80387fb9>] xfs_ilock+0x99/0xa0
>> May 9 02:16:46 nomad64 kernel: [42951853.993034] [<ffffffff803a5117>] xfs_free_eofblocks+0x1c7/0x250
>> May 9 02:16:46 nomad64 kernel: [42951853.993049] [<ffffffff803a8a26>] xfs_release+0x186/0x1d0
>> May 9 02:16:46 nomad64 kernel: [42951853.993062] [<ffffffff803aeeb0>] xfs_file_release+0x10/0x20
>> May 9 02:16:46 nomad64 kernel: [42951853.993076] [<ffffffff802a01cc>] __fput+0xcc/0x1c0
>> May 9 02:16:46 nomad64 kernel: [42951853.993091] [<ffffffff802a05e6>] fput+0x16/0x20
>> May 9 02:16:46 nomad64 kernel: [42951853.993105] [<ffffffff8028865a>] remove_vma+0x4a/0x80
>> May 9 02:16:46 nomad64 kernel: [42951853.993120] [<ffffffff802894e1>] do_munmap+0x281/0x2e0
>> May 9 02:16:46 nomad64 kernel: [42951853.993134] [<ffffffff8028958b>] sys_munmap+0x4b/0x70
>> May 9 02:16:46 nomad64 kernel: [42951853.993148] [<ffffffff8020b62b>] system_call_after_swapgs+0x7b/0x80
>> May 9 02:16:46 nomad64 kernel: [42951853.993161] [<ffffffffffffffff>] 0xffffffffffffffff
>
> hmmmm. Sounds like:
>
> fd = open()
> addr = mmap(fd)
> close(fd)
> .....
> munmap(addr);
>
> But yes, XFS takes locks in ->release which means.....
>
>> May 9 02:16:46 nomad64 kernel: [42951853.993293] Call Trace:
>> May 9 02:16:46 nomad64 kernel: [42951853.993297] [<ffffffff8025f2b3>] print_circular_bug_tail+0x83/0x90
>> May 9 02:16:46 nomad64 kernel: [42951853.993302] [<ffffffff80261b90>] __lock_acquire+0xdb0/0x1080
>> May 9 02:16:46 nomad64 kernel: [42951853.993306] [<ffffffff80222bbd>] ? do_page_fault+0xdd/0x890
>> May 9 02:16:46 nomad64 kernel: [42951853.993310] [<ffffffff80261f02>] lock_acquire+0xa2/0xd0
>> May 9 02:16:46 nomad64 kernel: [42951853.993313] [<ffffffff80222bbd>] ? do_page_fault+0xdd/0x890
>> May 9 02:16:46 nomad64 kernel: [42951853.993317] [<ffffffff806b887b>] down_read+0x3b/0x70
>> May 9 02:16:46 nomad64 kernel: [42951853.993320] [<ffffffff80222bbd>] do_page_fault+0xdd/0x890
>> May 9 02:16:46 nomad64 kernel: [42951853.993324] [<ffffffff806ba5dd>] error_exit+0x0/0xa9
>> May 9 02:16:46 nomad64 kernel: [42951853.993328] [<ffffffff802739b6>] ? file_read_actor+0x46/0x1b0
>> May 9 02:16:46 nomad64 kernel: [42951853.993331] [<ffffffff806ba3d6>] ? _read_unlock_irq+0x36/0x60
>> May 9 02:16:46 nomad64 kernel: [42951853.993335] [<ffffffff80275dbc>] ? generic_file_aio_read+0x2cc/0x5d0
>> May 9 02:16:46 nomad64 kernel: [42951853.993339] [<ffffffff8025ddb9>] ? get_lock_stats+0x19/0x70
>> May 9 02:16:46 nomad64 kernel: [42951853.993343] [<ffffffff803b2769>] ? xfs_read+0x139/0x220
>> May 9 02:16:46 nomad64 kernel: [42951853.993347] [<ffffffff803af06d>] ? xfs_file_aio_read+0x4d/0x60
>> May 9 02:16:46 nomad64 kernel: [42951853.993350] [<ffffffff8029eeb1>] ? do_sync_read+0xf1/0x130
>> May 9 02:16:46 nomad64 kernel: [42951853.993354] [<ffffffff802516e0>] ? autoremove_wake_function+0x0/0x40
>> May 9 02:16:46 nomad64 kernel: [42951853.993358] [<ffffffff8026089a>] ? trace_hardirqs_on+0xda/0x170
>> May 9 02:16:46 nomad64 kernel: [42951853.993361] [<ffffffff80272e45>] ? __rcu_read_unlock+0xb5/0xc0
>> May 9 02:16:46 nomad64 kernel: [42951853.993365] [<ffffffff8026089a>] ? trace_hardirqs_on+0xda/0x170
>> May 9 02:16:46 nomad64 kernel: [42951853.993369] [<ffffffff803c4381>] ? security_file_permission+0x11/0x20
>> May 9 02:16:46 nomad64 kernel: [42951853.993374] [<ffffffff8029f794>] ? vfs_read+0xc4/0x160
>> May 9 02:16:46 nomad64 kernel: [42951853.993377] [<ffffffff8029fc30>] ? sys_read+0x50/0x90
>> May 9 02:16:46 nomad64 kernel: [42951853.993380] [<ffffffff8020b62b>] ? system_call_after_swapgs+0x7b/0x80
>
> Oh, joy - a page fault during a read() call triggers lock order
> inversions on the mmap->sem. I don't think this can deadlock
> (can't be page faulting in a vma that is being torn down), but
> it's clear from the last trace that the VM has a mmap->sem
> inversion problem with ->release vs ->read and page faults...
>
> Basically what we are seeing here in both cases is that the VM is
> calling inode ->release or ->clear_inode methods with different high
> level locks held. If the filesystem has to take the same locks in
> these methods as it does in, say, ->read (like XFS does), then we
> are guaranteed to get reports like this. AFAICT there's nothing we
> can do from the filesystem perspective to prevent false positives like
> this from being reported....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group
* Re: 2.6.26-rc1: possible circular locking dependency with xfs filesystem
2008-05-15 17:45 ` Alexander Beregalov
@ 2008-05-15 22:27 ` David Chinner
0 siblings, 0 replies; 5+ messages in thread
From: David Chinner @ 2008-05-15 22:27 UTC (permalink / raw)
To: Alexander Beregalov
Cc: David Chinner, Kamalesh Babulal, pvp-lsts, kernel-testers,
kernel list, Ingo Molnar, peterz, xfs
On Thu, May 15, 2008 at 09:45:55PM +0400, Alexander Beregalov wrote:
> 2008/5/12 David Chinner <dgc@sgi.com>:
> > On Sun, May 11, 2008 at 09:18:07AM +0530, Kamalesh Babulal wrote:
> >> Kamalesh Babulal wrote:
> >> > Adding the cc to kernel-list, Ingo Molnar and Peter Zijlstra
> >> >
> >> > Alexander Beregalov wrote:
> >> >> [ INFO: possible circular locking dependency detected ]
> >> >> 2.6.26-rc1-00279-g28a4acb #13
> >> >> -------------------------------------------------------
> >> >> nfsd/3087 is trying to acquire lock:
> >> >> (iprune_mutex){--..}, at: [<c016f947>] shrink_icache_memory+0x38/0x19b
> >> >>
> >> >> but task is already holding lock:
> >> >> (&(&ip->i_iolock)->mr_lock){----}, at: [<c0210b83>] xfs_ilock+0xa2/0xd6
[snip]
> > Oh, yeah, that. Direct inode reclaim through memory pressure.
> >
> > Effectively memory reclaim inverts locking order w.r.t. iprune_mutex
> > when it recurses into the filesystem. False positive - can never
> > cause a deadlock on XFS. Can't be solved from the XFS side of things
> > without effectively turning off lockdep checking for xfs inode
> > locking.
> Yes, it is not a deadlock, but the machine hangs for a few seconds.
> It still happens about once a day for me. Every kernel report looks
> similar to the above.
That hang is just memory reclaim running, I think you'll find.
It can take some time for reclaim to find pages to use, and meanwhile
everything in the machine will back up behind it....
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group