From: Miao Xie <miaox@cn.fujitsu.com>
To: Itaru Kitayama <kitayama@cl.bb4u.ne.jp>
Cc: Chris Mason <chris.mason@oracle.com>,
Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH V2] btrfs: fix possible deadlock by clearing __GFP_FS flag
Date: Tue, 29 Mar 2011 14:16:53 +0800
Message-ID: <4D917955.2060500@cn.fujitsu.com>
In-Reply-To: <20110329144805.507dfe30.kitayama@cl.bb4u.ne.jp>
On Tue, 29 Mar 2011 14:48:05 +0900, Itaru Kitayama wrote:
> Hi Miao,
>
> On Sun, 27 Mar 2011 20:27:30 +0800
> Miao Xie <miaox@cn.fujitsu.com> wrote:
>
>> Changelog V1 -> V2:
>> - modify the explanation of the deadlock.
>> - clear __GFP_FS flag in the free space's page cache.
>
> I think this is also needed on top of your V5 patch to avoid a recursion. Could you
> review it and give your Signed-off-by?
It looks good to me.
>
> Signed-off-by: Itaru Kitayama <kitayama@cl.bb4u.ne.jp>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 8862dda..03e5ab3 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2641,7 +2641,7 @@ int extent_readpages(struct extent_io_tree *tree,
>  		prefetchw(&page->flags);
>  		list_del(&page->lru);
>  		if (!add_to_page_cache_lru(page, mapping,
> -					page->index, GFP_KERNEL)) {
> +					page->index, GFP_NOFS)) {
>  			__extent_read_full_page(tree, page, get_extent,
>  						&bio, 0, &bio_flags);
>  		}
>
> After applying the patch above, I don't see the warning below during Chris' stress test.
>
> =========================================================
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.36-v5+ #10
> ---------------------------------------------------------
> kswapd0/49 just changed the state of lock:
> (&delayed_node->mutex){+.+.-.}, at: [<ffffffff81213283>] btrfs_remove_delayed_node+0x3e/0xd2
> but this lock took another, RECLAIM_FS-READ-unsafe lock in the past:
> (&found->groups_sem){++++.+}
>
> and interrupts could create inverse lock ordering between them.
>
>
> other info that might help us debug this:
> 2 locks held by kswapd0/49:
> #0: (shrinker_rwsem){++++..}, at: [<ffffffff810e242a>] shrink_slab+0x3d/0x164
> #1: (iprune_sem){++++.-}, at: [<ffffffff811316d0>] shrink_icache_memory+0x4d/0x213
>
> the shortest dependencies between 2nd lock and 1st lock:
> -> (&found->groups_sem){++++.+} ops: 3649 {
> HARDIRQ-ON-W at:
> [<ffffffff81075ec0>] __lock_acquire+0x346/0xda6
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6aba>] down_write+0x55/0x9b
> [<ffffffff811c352a>] __link_block_group+0x5a/0x83
> [<ffffffff811ca562>] btrfs_read_block_groups+0x2fb/0x56c
> [<ffffffff811d4974>] open_ctree+0xf8f/0x14c3
> [<ffffffff811bafdf>] btrfs_get_sb+0x236/0x467
> [<ffffffff8111f25e>] vfs_kern_mount+0xbd/0x1a7
> [<ffffffff8111f3b0>] do_kern_mount+0x4d/0xed
> [<ffffffff8113668d>] do_mount+0x74e/0x7c5
> [<ffffffff8113678c>] sys_mount+0x88/0xc2
> [<ffffffff81002ddb>] system_call_fastpath+0x16/0x1b
> HARDIRQ-ON-R at:
> [<ffffffff81075e98>] __lock_acquire+0x31e/0xda6
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6b4c>] down_read+0x4c/0x91
> [<ffffffff811cb5b2>] find_free_extent+0x3ec/0xa86
> [<ffffffff811cbd00>] btrfs_reserve_extent+0xb4/0x142
> [<ffffffff811cbef5>] btrfs_alloc_free_block+0x167/0x2b2
> [<ffffffff811be610>] __btrfs_cow_block+0x103/0x346
> [<ffffffff811bedb8>] btrfs_cow_block+0x101/0x110
> [<ffffffff811c05d8>] btrfs_search_slot+0x143/0x513
> [<ffffffff811dc0d9>] btrfs_truncate_inode_items+0x12a/0x61a
> [<ffffffff811defa7>] btrfs_evict_inode+0x154/0x1be
> [<ffffffff811311b0>] evict+0x27/0x97
> [<ffffffff81131615>] iput+0x1d0/0x23e
> [<ffffffff811e1143>] btrfs_orphan_cleanup+0x1c8/0x269
> [<ffffffff811d05e1>] btrfs_cleanup_fs_roots+0x6d/0x8c
> [<ffffffff811bac48>] btrfs_remount+0x9e/0xe9
> [<ffffffff8111e9b2>] do_remount_sb+0xbb/0x106
> [<ffffffff81136194>] do_mount+0x255/0x7c5
> [<ffffffff8113678c>] sys_mount+0x88/0xc2
> [<ffffffff81002ddb>] system_call_fastpath+0x16/0x1b
> SOFTIRQ-ON-W at:
> [<ffffffff81075ee1>] __lock_acquire+0x367/0xda6
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6aba>] down_write+0x55/0x9b
> [<ffffffff811c352a>] __link_block_group+0x5a/0x83
> [<ffffffff811ca562>] btrfs_read_block_groups+0x2fb/0x56c
> [<ffffffff811d4974>] open_ctree+0xf8f/0x14c3
> [<ffffffff811bafdf>] btrfs_get_sb+0x236/0x467
> [<ffffffff8111f25e>] vfs_kern_mount+0xbd/0x1a7
> [<ffffffff8111f3b0>] do_kern_mount+0x4d/0xed
> [<ffffffff8113668d>] do_mount+0x74e/0x7c5
> [<ffffffff8113678c>] sys_mount+0x88/0xc2
> [<ffffffff81002ddb>] system_call_fastpath+0x16/0x1b
> SOFTIRQ-ON-R at:
> [<ffffffff81075ee1>] __lock_acquire+0x367/0xda6
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6b4c>] down_read+0x4c/0x91
> [<ffffffff811cb5b2>] find_free_extent+0x3ec/0xa86
> [<ffffffff811cbd00>] btrfs_reserve_extent+0xb4/0x142
> [<ffffffff811cbef5>] btrfs_alloc_free_block+0x167/0x2b2
> [<ffffffff811be610>] __btrfs_cow_block+0x103/0x346
> [<ffffffff811bedb8>] btrfs_cow_block+0x101/0x110
> [<ffffffff811c05d8>] btrfs_search_slot+0x143/0x513
> [<ffffffff811dc0d9>] btrfs_truncate_inode_items+0x12a/0x61a
> [<ffffffff811defa7>] btrfs_evict_inode+0x154/0x1be
> [<ffffffff811311b0>] evict+0x27/0x97
> [<ffffffff81131615>] iput+0x1d0/0x23e
> [<ffffffff811e1143>] btrfs_orphan_cleanup+0x1c8/0x269
> [<ffffffff811d05e1>] btrfs_cleanup_fs_roots+0x6d/0x8c
> [<ffffffff811bac48>] btrfs_remount+0x9e/0xe9
> [<ffffffff8111e9b2>] do_remount_sb+0xbb/0x106
> [<ffffffff81136194>] do_mount+0x255/0x7c5
> [<ffffffff8113678c>] sys_mount+0x88/0xc2
> [<ffffffff81002ddb>] system_call_fastpath+0x16/0x1b
> RECLAIM_FS-ON-R at:
> [<ffffffff81074292>] mark_held_locks+0x52/0x70
> [<ffffffff81074354>] lockdep_trace_alloc+0xa4/0xc2
> [<ffffffff8110fcc9>] kmem_cache_alloc+0x32/0x186
> [<ffffffff81265caa>] radix_tree_preload+0x6f/0xd5
> [<ffffffff810d4df8>] add_to_page_cache_locked+0x60/0x147
> [<ffffffff810d4f0c>] add_to_page_cache_lru+0x2d/0x5b
> [<ffffffff811f348a>] extent_readpages+0x6c/0xcb
> [<ffffffff811da3b6>] btrfs_readpages+0x1f/0x21
> [<ffffffff810ddf68>] __do_page_cache_readahead+0x127/0x19d
> [<ffffffff810ddfff>] ra_submit+0x21/0x25
> [<ffffffff810de3b9>] ondemand_readahead+0x1b6/0x1c9
> [<ffffffff810de4b2>] page_cache_sync_readahead+0x3d/0x3f
> [<ffffffff81207a24>] load_free_space_cache+0x27e/0x682
> [<ffffffff811c886f>] cache_block_group+0x97/0x233
> [<ffffffff811cb63f>] find_free_extent+0x479/0xa86
> [<ffffffff811cbd00>] btrfs_reserve_extent+0xb4/0x142
> [<ffffffff811cbef5>] btrfs_alloc_free_block+0x167/0x2b2
> [<ffffffff811be610>] __btrfs_cow_block+0x103/0x346
> [<ffffffff811bedb8>] btrfs_cow_block+0x101/0x110
> [<ffffffff811c05d8>] btrfs_search_slot+0x143/0x513
> [<ffffffff811dc0d9>] btrfs_truncate_inode_items+0x12a/0x61a
> [<ffffffff811defa7>] btrfs_evict_inode+0x154/0x1be
> [<ffffffff811311b0>] evict+0x27/0x97
> [<ffffffff81131615>] iput+0x1d0/0x23e
> [<ffffffff811e1143>] btrfs_orphan_cleanup+0x1c8/0x269
> [<ffffffff811d05e1>] btrfs_cleanup_fs_roots+0x6d/0x8c
> [<ffffffff811bac48>] btrfs_remount+0x9e/0xe9
> [<ffffffff8111e9b2>] do_remount_sb+0xbb/0x106
> [<ffffffff81136194>] do_mount+0x255/0x7c5
> [<ffffffff8113678c>] sys_mount+0x88/0xc2
> [<ffffffff81002ddb>] system_call_fastpath+0x16/0x1b
> INITIAL USE at:
> [<ffffffff81075f37>] __lock_acquire+0x3bd/0xda6
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6aba>] down_write+0x55/0x9b
> [<ffffffff811c352a>] __link_block_group+0x5a/0x83
> [<ffffffff811ca562>] btrfs_read_block_groups+0x2fb/0x56c
> [<ffffffff811d4974>] open_ctree+0xf8f/0x14c3
> [<ffffffff811bafdf>] btrfs_get_sb+0x236/0x467
> [<ffffffff8111f25e>] vfs_kern_mount+0xbd/0x1a7
> [<ffffffff8111f3b0>] do_kern_mount+0x4d/0xed
> [<ffffffff8113668d>] do_mount+0x74e/0x7c5
> [<ffffffff8113678c>] sys_mount+0x88/0xc2
> [<ffffffff81002ddb>] system_call_fastpath+0x16/0x1b
> }
> ... key at: [<ffffffff82924fb8>] __key.40112+0x0/0x8
> ... acquired at:
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6b4c>] down_read+0x4c/0x91
> [<ffffffff811cb48a>] find_free_extent+0x2c4/0xa86
> [<ffffffff811cbd00>] btrfs_reserve_extent+0xb4/0x142
> [<ffffffff811cbef5>] btrfs_alloc_free_block+0x167/0x2b2
> [<ffffffff811be610>] __btrfs_cow_block+0x103/0x346
> [<ffffffff811bedb8>] btrfs_cow_block+0x101/0x110
> [<ffffffff811c05d8>] btrfs_search_slot+0x143/0x513
> [<ffffffff811cf58b>] btrfs_lookup_inode+0x2f/0x8f
> [<ffffffff81212471>] btrfs_update_delayed_inode+0x75/0x135
> [<ffffffff812130fa>] btrfs_async_run_delayed_node_done+0xd5/0x194
> [<ffffffff811fb4f6>] worker_loop+0x198/0x4dd
> [<ffffffff81061a60>] kthread+0x9d/0xa5
> [<ffffffff81003c14>] kernel_thread_helper+0x4/0x10
>
> -> (&delayed_node->mutex){+.+.-.} ops: 32488 {
> HARDIRQ-ON-W at:
> [<ffffffff81075ec0>] __lock_acquire+0x346/0xda6
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6321>] __mutex_lock_common+0x5a/0x444
> [<ffffffff814c67c0>] mutex_lock_nested+0x39/0x3e
> [<ffffffff81212040>] btrfs_delayed_update_inode+0x45/0x101
> [<ffffffff811dc5f7>] btrfs_update_inode+0x2e/0x129
> [<ffffffff811de8b0>] btrfs_dirty_inode+0x57/0x113
> [<ffffffff8113c2a5>] __mark_inode_dirty+0x33/0x1aa
> [<ffffffff81130939>] touch_atime+0x107/0x12a
> [<ffffffff810d63ea>] generic_file_aio_read+0x567/0x5bc
> [<ffffffff8111c717>] do_sync_read+0xcb/0x108
> [<ffffffff8111cd89>] vfs_read+0xab/0x107
> [<ffffffff8111cea8>] sys_read+0x4d/0x74
> [<ffffffff81002ddb>] system_call_fastpath+0x16/0x1b
> SOFTIRQ-ON-W at:
> [<ffffffff81075ee1>] __lock_acquire+0x367/0xda6
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6321>] __mutex_lock_common+0x5a/0x444
> [<ffffffff814c67c0>] mutex_lock_nested+0x39/0x3e
> [<ffffffff81212040>] btrfs_delayed_update_inode+0x45/0x101
> [<ffffffff811dc5f7>] btrfs_update_inode+0x2e/0x129
> [<ffffffff811de8b0>] btrfs_dirty_inode+0x57/0x113
> [<ffffffff8113c2a5>] __mark_inode_dirty+0x33/0x1aa
> [<ffffffff81130939>] touch_atime+0x107/0x12a
> [<ffffffff810d63ea>] generic_file_aio_read+0x567/0x5bc
> [<ffffffff8111c717>] do_sync_read+0xcb/0x108
> [<ffffffff8111cd89>] vfs_read+0xab/0x107
> [<ffffffff8111cea8>] sys_read+0x4d/0x74
> [<ffffffff81002ddb>] system_call_fastpath+0x16/0x1b
> IN-RECLAIM_FS-W at:
> [<ffffffff81075f1f>] __lock_acquire+0x3a5/0xda6
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6321>] __mutex_lock_common+0x5a/0x444
> [<ffffffff814c67c0>] mutex_lock_nested+0x39/0x3e
> [<ffffffff81213283>] btrfs_remove_delayed_node+0x3e/0xd2
> [<ffffffff811d77fe>] btrfs_destroy_inode+0x2ae/0x2d4
> [<ffffffff81130dc1>] destroy_inode+0x2f/0x45
> [<ffffffff811312ca>] dispose_list+0xaa/0xdf
> [<ffffffff81131866>] shrink_icache_memory+0x1e3/0x213
> [<ffffffff810e24cd>] shrink_slab+0xe0/0x164
> [<ffffffff810e4619>] balance_pgdat+0x2e8/0x50b
> [<ffffffff810e4bbc>] kswapd+0x380/0x3c0
> [<ffffffff81061a60>] kthread+0x9d/0xa5
> [<ffffffff81003c14>] kernel_thread_helper+0x4/0x10
> INITIAL USE at:
> [<ffffffff81075f37>] __lock_acquire+0x3bd/0xda6
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6321>] __mutex_lock_common+0x5a/0x444
> [<ffffffff814c67c0>] mutex_lock_nested+0x39/0x3e
> [<ffffffff81212040>] btrfs_delayed_update_inode+0x45/0x101
> [<ffffffff811dc5f7>] btrfs_update_inode+0x2e/0x129
> [<ffffffff811de8b0>] btrfs_dirty_inode+0x57/0x113
> [<ffffffff8113c2a5>] __mark_inode_dirty+0x33/0x1aa
> [<ffffffff81130939>] touch_atime+0x107/0x12a
> [<ffffffff810d63ea>] generic_file_aio_read+0x567/0x5bc
> [<ffffffff8111c717>] do_sync_read+0xcb/0x108
> [<ffffffff8111cd89>] vfs_read+0xab/0x107
> [<ffffffff8111cea8>] sys_read+0x4d/0x74
> [<ffffffff81002ddb>] system_call_fastpath+0x16/0x1b
> }
> ... key at: [<ffffffff82925450>] __key.31289+0x0/0x8
> ... acquired at:
> [<ffffffff810749bf>] check_usage_forwards+0x71/0x7e
> [<ffffffff81074162>] mark_lock+0x18c/0x26a
> [<ffffffff81075f1f>] __lock_acquire+0x3a5/0xda6
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff814c6321>] __mutex_lock_common+0x5a/0x444
> [<ffffffff814c67c0>] mutex_lock_nested+0x39/0x3e
> [<ffffffff81213283>] btrfs_remove_delayed_node+0x3e/0xd2
> [<ffffffff811d77fe>] btrfs_destroy_inode+0x2ae/0x2d4
> [<ffffffff81130dc1>] destroy_inode+0x2f/0x45
> [<ffffffff811312ca>] dispose_list+0xaa/0xdf
> [<ffffffff81131866>] shrink_icache_memory+0x1e3/0x213
> [<ffffffff810e24cd>] shrink_slab+0xe0/0x164
> [<ffffffff810e4619>] balance_pgdat+0x2e8/0x50b
> [<ffffffff810e4bbc>] kswapd+0x380/0x3c0
> [<ffffffff81061a60>] kthread+0x9d/0xa5
> [<ffffffff81003c14>] kernel_thread_helper+0x4/0x10
>
>
> stack backtrace:
> Pid: 49, comm: kswapd0 Not tainted 2.6.36-v5+ #10
> Call Trace:
> [<ffffffff8107493d>] print_irq_inversion_bug+0x124/0x135
> [<ffffffff810749bf>] check_usage_forwards+0x71/0x7e
> [<ffffffff8107494e>] ? check_usage_forwards+0x0/0x7e
> [<ffffffff81074162>] mark_lock+0x18c/0x26a
> [<ffffffff81075f1f>] __lock_acquire+0x3a5/0xda6
> [<ffffffff81076911>] ? __lock_acquire+0xd97/0xda6
> [<ffffffff81213283>] ? btrfs_remove_delayed_node+0x3e/0xd2
> [<ffffffff81076a3d>] lock_acquire+0x11d/0x143
> [<ffffffff81213283>] ? btrfs_remove_delayed_node+0x3e/0xd2
> [<ffffffff81213283>] ? btrfs_remove_delayed_node+0x3e/0xd2
> [<ffffffff814c6321>] __mutex_lock_common+0x5a/0x444
> [<ffffffff81213283>] ? btrfs_remove_delayed_node+0x3e/0xd2
> [<ffffffff81074604>] ? trace_hardirqs_on+0xd/0xf
> [<ffffffff814c67c0>] mutex_lock_nested+0x39/0x3e
> [<ffffffff81213283>] btrfs_remove_delayed_node+0x3e/0xd2
> [<ffffffff811d77fe>] btrfs_destroy_inode+0x2ae/0x2d4
> [<ffffffff81130dc1>] destroy_inode+0x2f/0x45
> [<ffffffff811312ca>] dispose_list+0xaa/0xdf
> [<ffffffff81131866>] shrink_icache_memory+0x1e3/0x213
> [<ffffffff810e24cd>] shrink_slab+0xe0/0x164
> [<ffffffff810e4619>] balance_pgdat+0x2e8/0x50b
> [<ffffffff810e4bbc>] kswapd+0x380/0x3c0
> [<ffffffff81062032>] ? autoremove_wake_function+0x0/0x39
> [<ffffffff810e483c>] ? kswapd+0x0/0x3c0
> [<ffffffff81061a60>] kthread+0x9d/0xa5
> [<ffffffff81003c14>] kernel_thread_helper+0x4/0x10
> [<ffffffff81038cd9>] ? finish_task_switch+0x70/0xb9
> [<ffffffff814c8940>] ? restore_args+0x0/0x30
> [<ffffffff810619c3>] ? kthread+0x0/0xa5
> [<ffffffff81003c10>] ? kernel_thread_helper+0x0/0x10
>
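For readers unfamiliar with the usage strings in the report above, such as {+.+.-.} and {++++.+}, here is a minimal userspace sketch of how they can be read. It assumes the six-character layout this 2.6.36-era trace appears to use (write/read pairs for HARDIRQ, SOFTIRQ and RECLAIM_FS, in that order); other kernel versions may use a different layout. The helper names (used_in, enabled_across, first_inversion) are hypothetical and not part of lockdep, and the check is a simplification of what lockdep really does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout: two characters (write, read) per state, three states. */
enum { N_STATES = 3, N_CHARS = 6 };
static const char *const state_names[N_STATES] = {
    "HARDIRQ", "SOFTIRQ", "RECLAIM_FS"
};

/* '-' or '?': the lock was taken from inside this context. */
static bool used_in(const char *usage, int state)
{
    for (int i = 2 * state; i <= 2 * state + 1; i++)
        if (usage[i] == '-' || usage[i] == '?')
            return true;
    return false;
}

/* '+' or '?': the lock was held while this context could still occur. */
static bool enabled_across(const char *usage, int state)
{
    for (int i = 2 * state; i <= 2 * state + 1; i++)
        if (usage[i] == '+' || usage[i] == '?')
            return true;
    return false;
}

/*
 * Simplified inversion check (hypothetical helper): "outer" is a lock that
 * is taken inside some context, "inner" is a lock that "outer" has taken
 * in the past. Returns the first state in which the pair looks unsafe,
 * or NULL if none does.
 */
static const char *first_inversion(const char *outer, const char *inner)
{
    if (strlen(outer) != N_CHARS || strlen(inner) != N_CHARS)
        return NULL;
    for (int s = 0; s < N_STATES; s++)
        if (used_in(outer, s) && enabled_across(inner, s))
            return state_names[s];
    return NULL;
}
```

Called with the two strings from the report (braces stripped), first_inversion("+.+.-.", "++++.+") returns "RECLAIM_FS": delayed_node->mutex is taken inside reclaim, while groups_sem has been held across an allocation that may enter filesystem reclaim.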
>
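To illustrate why the one-line GFP_KERNEL to GFP_NOFS change in the quoted patch avoids the recursion, here is a small userspace model, not kernel code. The MODEL_* bit values are made up for illustration (the real gfp bits depend on the kernel version), the single fs_lock_held flag stands in for the locks named in the report, and all function names are hypothetical. The only property carried over from the kernel is that GFP_NOFS is GFP_KERNEL without __GFP_FS.

```c
#include <stdbool.h>

/* Illustrative bit values only; not the kernel's real gfp bits. */
#define MODEL_GFP_WAIT   0x1u
#define MODEL_GFP_IO     0x2u
#define MODEL_GFP_FS     0x4u
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_WAIT | MODEL_GFP_IO)

/* Stands in for a filesystem lock held by the allocating task. */
static bool fs_lock_held;

/*
 * Model of reclaim calling back into the filesystem. Returns false when
 * the callback would need a lock the allocating task already holds.
 */
static bool model_fs_reclaim(void)
{
    return !fs_lock_held;
}

/*
 * Model of an allocation under memory pressure. Only allocations that
 * carry the FS bit allow reclaim to re-enter the filesystem.
 */
static bool model_alloc(unsigned int gfp)
{
    if (gfp & MODEL_GFP_FS)
        return model_fs_reclaim();
    return true; /* reclaim skips the filesystem callbacks */
}

/* Model of a read path that allocates while holding a filesystem lock. */
static bool readpages_under_lock(unsigned int gfp)
{
    bool ok;

    fs_lock_held = true;
    ok = model_alloc(gfp);
    fs_lock_held = false;
    return ok;
}
```

In this model readpages_under_lock(MODEL_GFP_KERNEL) returns false (the allocation could recurse into the filesystem and block on its own lock), while readpages_under_lock(MODEL_GFP_NOFS) returns true.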