linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Btrfs: fix fsync deadlock in log_new_dir_dentries
@ 2016-10-07 10:05 robbieko
  2016-10-07 10:23 ` Filipe Manana
  0 siblings, 1 reply; 7+ messages in thread
From: robbieko @ 2016-10-07 10:05 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Robbie Ko

From: Robbie Ko <robbieko@synology.com>

We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021,
in log_new_dir_dentries, because btrfs_search_forward get path lock, then
call btrfs_iget will get another extent_buffer lock, maybe occur deadlock.

we can release path before call btrfs_iget, avoid deadlock occur.

some process call trace like below:
[ 4077.478852] kworker/u24:10  D ffff88107fc90640     0 14431      2 0x00000000
[ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 4077.494346]  ffff880ffa56bad0 0000000000000046 0000000000009000 ffff880ffa56bfd8
[ 4077.502629]  ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 ffff88101a5d6138
[ 4077.510915]  ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 ffff881016ce21c0
[ 4077.519202] Call Trace:
[ 4077.528752]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
[ 4077.536049]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
[ 4077.542574]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
[ 4077.550171]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
[ 4077.558252]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
[ 4077.566140]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
[ 4077.573928]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
[ 4077.582399]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
[ 4077.589896]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
[ 4077.599632]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
[ 4077.607134]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
[ 4077.615329]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
[ 4077.622043]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
[ 4077.628459]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
[ 4077.635759]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
[ 4077.641404]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
[ 4077.648696]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
[ 4077.654926]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110

[ 4078.358087] kworker/u24:15  D ffff88107fcd0640     0 14436      2 0x00000000
[ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 4078.373574]  ffff880ffa57fad0 0000000000000046 0000000000009000 ffff880ffa57ffd8
[ 4078.381864]  ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 ffff88101a5d6138
[ 4078.390163]  ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 ffff88103004d0a0
[ 4078.398466] Call Trace:
[ 4078.408019]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
[ 4078.415322]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
[ 4078.421844]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
[ 4078.429438]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
[ 4078.437518]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
[ 4078.445404]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
[ 4078.453194]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
[ 4078.461663]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
[ 4078.469161]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
[ 4078.478893]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
[ 4078.486388]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
[ 4078.494561]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
[ 4078.501278]  [<ffffffff8104a507>] ? pwq_activate_delayed_work+0x27/0x40
[ 4078.508673]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
[ 4078.515098]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
[ 4078.522396]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
[ 4078.528032]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
[ 4078.535325]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
[ 4078.541552]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110

[ 4079.355824] user-space-program D ffff88107fd30640     0 32020      1 0x00000000
[ 4079.363716]  ffff880eae8eba10 0000000000000086 0000000000009000 ffff880eae8ebfd8
[ 4079.372003]  ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 ffff88101a5d6138
[ 4079.380294]  ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 ffff881016c162c0
[ 4079.388586] Call Trace:
[ 4079.398134]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
[ 4079.405431]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
[ 4079.411955]  [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 [btrfs]
[ 4079.419644]  [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 [btrfs]
[ 4079.427237]  [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70 [btrfs]
[ 4079.435041]  [<ffffffffa0689b60>] ? generic_bin_search.constprop.38+0x80/0x190 [btrfs]
[ 4079.443897]  [<ffffffffa068ea44>] ? btrfs_insert_empty_items+0x74/0xd0 [btrfs]
[ 4079.451975]  [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs]
[ 4079.458890]  [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 [btrfs]
[ 4079.466292]  [<ffffffffa06f34a1>] ? btrfs_log_inode_parent+0xc61/0xf30 [btrfs]
[ 4079.474373]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
[ 4079.482161]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
[ 4079.489558]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
[ 4079.495300]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
[ 4079.501422]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b

[ 4079.508334] user-space-program D ffff88107fc30640     0 32021      1 0x00000004
[ 4079.516226]  ffff880eae8efbf8 0000000000000086 0000000000009000 ffff880eae8effd8
[ 4079.524513]  ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 ffff88101a5d6138
[ 4079.532802]  ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 ffff881030279610
[ 4079.541092] Call Trace:
[ 4079.550642]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
[ 4079.557941]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
[ 4079.564463]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
[ 4079.572058]  [<ffffffffa06bb7d8>] ? btrfs_truncate_inode_items+0x168/0xb90 [btrfs]
[ 4079.580526]  [<ffffffffa06b04be>] ? join_transaction.isra.15+0x1e/0x3a0 [btrfs]
[ 4079.588701]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
[ 4079.596196]  [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 [btrfs]
[ 4079.603789]  [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs]
[ 4079.610994]  [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs]
[ 4079.618197]  [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680
[ 4079.624625]  [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160
[ 4079.630854]  [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90
[ 4079.636889]  [<ffffffff810f59fa>] ? do_sys_ftruncate.constprop.15+0x10a/0x160
[ 4079.644869]  [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570
[ 4079.650805]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b

[ 4080.410607] user-space-program D ffff88107fc70640     0 32028  12639 0x00000004
[ 4080.418489]  ffff880eaeccbbe0 0000000000000086 0000000000009000 ffff880eaeccbfd8
[ 4080.426778]  ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 ffff88101a5d6138
[ 4080.435067]  ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 ffff880f317ef1e0
[ 4080.443353] Call Trace:
[ 4080.452920]  [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190 [btrfs]
[ 4080.460703]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
[ 4080.467225]  [<ffffffffa06876bb>] ? btrfs_read_lock_root_node+0x2b/0x40 [btrfs]
[ 4080.475400]  [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 [btrfs]
[ 4080.482994]  [<ffffffffa06b2df0>] ? btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs]
[ 4080.491857]  [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 [btrfs]
[ 4080.499353]  [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0
[ 4080.505879]  [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs]
[ 4080.512696]  [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 [btrfs]
[ 4080.520387]  [<ffffffffa06f341f>] ? btrfs_log_inode_parent+0xbdf/0xf30 [btrfs]
[ 4080.528469]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
[ 4080.536258]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
[ 4080.543657]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
[ 4080.549399]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
[ 4080.555534]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b

Signed-off-by: Robbie Ko <robbieko@synology.com>
---
 fs/btrfs/tree-log.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index ef9c55b..9913084 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -5213,6 +5213,7 @@ process_leaf:
 			if (di_key.type == BTRFS_ROOT_ITEM_KEY)
 				continue;
 
+			btrfs_release_path(path);
 			di_inode = btrfs_iget(root->fs_info->sb, &di_key,
 					      root, NULL);
 			if (IS_ERR(di_inode)) {
@@ -5222,13 +5223,12 @@ process_leaf:
 
 			if (btrfs_inode_in_log(di_inode, trans->transid)) {
 				iput(di_inode);
-				continue;
+				break;
 			}
 
 			ctx->log_new_dentries = false;
 			if (type == BTRFS_FT_DIR || type == BTRFS_FT_SYMLINK)
 				log_mode = LOG_INODE_ALL;
-			btrfs_release_path(path);
 			ret = btrfs_log_inode(trans, root, di_inode,
 					      log_mode, 0, LLONG_MAX, ctx);
 			if (!ret &&
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] Btrfs: fix fsync deadlock in log_new_dir_dentries
  2016-10-07 10:05 [PATCH] Btrfs: fix fsync deadlock in log_new_dir_dentries robbieko
@ 2016-10-07 10:23 ` Filipe Manana
       [not found]   ` <e077183a-e7b4-43c1-bb8d-a1cf900072fe@Mail>
  0 siblings, 1 reply; 7+ messages in thread
From: Filipe Manana @ 2016-10-07 10:23 UTC (permalink / raw)
  To: robbieko; +Cc: linux-btrfs@vger.kernel.org

On Fri, Oct 7, 2016 at 11:05 AM, robbieko <robbieko@synology.com> wrote:
> From: Robbie Ko <robbieko@synology.com>
>
> We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021,
> in log_new_dir_dentries, because btrfs_search_forward get path lock, then
> call btrfs_iget will get another extent_buffer lock, maybe occur deadlock.

What are those numbers? Are they inode numbers?
If so you're suggesting a deadlock due to recursive logging of the same inode.
However the trace below, and the code change, has nothing to do with that.
It's just about btrfs_iget trying to do a search on a btree and
attempting to read lock some node/leaf that already has a write lock
acquired previously by the same task.

Please be more clear on your change logs.


>
> we can release path before call btrfs_iget, avoid deadlock occur.
>
> some process call trace like below:
> [ 4077.478852] kworker/u24:10  D ffff88107fc90640     0 14431      2 0x00000000
> [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
> [ 4077.494346]  ffff880ffa56bad0 0000000000000046 0000000000009000 ffff880ffa56bfd8
> [ 4077.502629]  ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4077.510915]  ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 ffff881016ce21c0
> [ 4077.519202] Call Trace:
> [ 4077.528752]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
> [ 4077.536049]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4077.542574]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
> [ 4077.550171]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
> [ 4077.558252]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
> [ 4077.566140]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
> [ 4077.573928]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
> [ 4077.582399]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
> [ 4077.589896]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
> [ 4077.599632]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
> [ 4077.607134]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
> [ 4077.615329]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
> [ 4077.622043]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
> [ 4077.628459]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
> [ 4077.635759]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
> [ 4077.641404]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
> [ 4077.648696]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
> [ 4077.654926]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
>
> [ 4078.358087] kworker/u24:15  D ffff88107fcd0640     0 14436      2 0x00000000
> [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
> [ 4078.373574]  ffff880ffa57fad0 0000000000000046 0000000000009000 ffff880ffa57ffd8
> [ 4078.381864]  ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4078.390163]  ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 ffff88103004d0a0
> [ 4078.398466] Call Trace:
> [ 4078.408019]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
> [ 4078.415322]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4078.421844]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
> [ 4078.429438]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
> [ 4078.437518]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
> [ 4078.445404]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
> [ 4078.453194]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
> [ 4078.461663]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
> [ 4078.469161]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
> [ 4078.478893]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
> [ 4078.486388]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
> [ 4078.494561]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
> [ 4078.501278]  [<ffffffff8104a507>] ? pwq_activate_delayed_work+0x27/0x40
> [ 4078.508673]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
> [ 4078.515098]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
> [ 4078.522396]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
> [ 4078.528032]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
> [ 4078.535325]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
> [ 4078.541552]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
>
> [ 4079.355824] user-space-program D ffff88107fd30640     0 32020      1 0x00000000
> [ 4079.363716]  ffff880eae8eba10 0000000000000086 0000000000009000 ffff880eae8ebfd8
> [ 4079.372003]  ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4079.380294]  ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 ffff881016c162c0
> [ 4079.388586] Call Trace:
> [ 4079.398134]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
> [ 4079.405431]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4079.411955]  [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 [btrfs]
> [ 4079.419644]  [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 [btrfs]
> [ 4079.427237]  [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70 [btrfs]
> [ 4079.435041]  [<ffffffffa0689b60>] ? generic_bin_search.constprop.38+0x80/0x190 [btrfs]
> [ 4079.443897]  [<ffffffffa068ea44>] ? btrfs_insert_empty_items+0x74/0xd0 [btrfs]
> [ 4079.451975]  [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs]
> [ 4079.458890]  [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 [btrfs]
> [ 4079.466292]  [<ffffffffa06f34a1>] ? btrfs_log_inode_parent+0xc61/0xf30 [btrfs]
> [ 4079.474373]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
> [ 4079.482161]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
> [ 4079.489558]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
> [ 4079.495300]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
> [ 4079.501422]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>
> [ 4079.508334] user-space-program D ffff88107fc30640     0 32021      1 0x00000004
> [ 4079.516226]  ffff880eae8efbf8 0000000000000086 0000000000009000 ffff880eae8effd8
> [ 4079.524513]  ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 ffff88101a5d6138
> [ 4079.532802]  ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 ffff881030279610
> [ 4079.541092] Call Trace:
> [ 4079.550642]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
> [ 4079.557941]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4079.564463]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
> [ 4079.572058]  [<ffffffffa06bb7d8>] ? btrfs_truncate_inode_items+0x168/0xb90 [btrfs]
> [ 4079.580526]  [<ffffffffa06b04be>] ? join_transaction.isra.15+0x1e/0x3a0 [btrfs]
> [ 4079.588701]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
> [ 4079.596196]  [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 [btrfs]
> [ 4079.603789]  [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs]
> [ 4079.610994]  [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs]
> [ 4079.618197]  [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680
> [ 4079.624625]  [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160
> [ 4079.630854]  [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90
> [ 4079.636889]  [<ffffffff810f59fa>] ? do_sys_ftruncate.constprop.15+0x10a/0x160
> [ 4079.644869]  [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570
> [ 4079.650805]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>
> [ 4080.410607] user-space-program D ffff88107fc70640     0 32028  12639 0x00000004
> [ 4080.418489]  ffff880eaeccbbe0 0000000000000086 0000000000009000 ffff880eaeccbfd8
> [ 4080.426778]  ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4080.435067]  ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 ffff880f317ef1e0
> [ 4080.443353] Call Trace:
> [ 4080.452920]  [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190 [btrfs]
> [ 4080.460703]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4080.467225]  [<ffffffffa06876bb>] ? btrfs_read_lock_root_node+0x2b/0x40 [btrfs]
> [ 4080.475400]  [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 [btrfs]
> [ 4080.482994]  [<ffffffffa06b2df0>] ? btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs]
> [ 4080.491857]  [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 [btrfs]
> [ 4080.499353]  [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0
> [ 4080.505879]  [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs]
> [ 4080.512696]  [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 [btrfs]
> [ 4080.520387]  [<ffffffffa06f341f>] ? btrfs_log_inode_parent+0xbdf/0xf30 [btrfs]
> [ 4080.528469]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
> [ 4080.536258]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
> [ 4080.543657]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
> [ 4080.549399]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
> [ 4080.555534]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>
> Signed-off-by: Robbie Ko <robbieko@synology.com>
> ---
>  fs/btrfs/tree-log.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index ef9c55b..9913084 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -5213,6 +5213,7 @@ process_leaf:
>                         if (di_key.type == BTRFS_ROOT_ITEM_KEY)
>                                 continue;
>
> +                       btrfs_release_path(path);
>                         di_inode = btrfs_iget(root->fs_info->sb, &di_key,
>                                               root, NULL);
>                         if (IS_ERR(di_inode)) {
> @@ -5222,13 +5223,12 @@ process_leaf:
>
>                         if (btrfs_inode_in_log(di_inode, trans->transid)) {
>                                 iput(di_inode);
> -                               continue;
> +                               break;

Why the break instead of continue? This seems totally unrelated to the
deadlock on a leaf/node and incorrect.

thanks

>                         }
>
>                         ctx->log_new_dentries = false;
>                         if (type == BTRFS_FT_DIR || type == BTRFS_FT_SYMLINK)
>                                 log_mode = LOG_INODE_ALL;
> -                       btrfs_release_path(path);
>                         ret = btrfs_log_inode(trans, root, di_inode,
>                                               log_mode, 0, LLONG_MAX, ctx);
>                         if (!ret &&
> --
> 1.9.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Filipe David Manana,

"People will forget what you said,
 people will forget what you did,
 but people will never forget how you made them feel."

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Btrfs: fix fsync deadlock in log_new_dir_dentries
       [not found]   ` <e077183a-e7b4-43c1-bb8d-a1cf900072fe@Mail>
@ 2016-10-07 10:46     ` Filipe Manana
  2016-10-11  7:47       ` robbieko
  0 siblings, 1 reply; 7+ messages in thread
From: Filipe Manana @ 2016-10-07 10:46 UTC (permalink / raw)
  To: robbieko; +Cc: linux-btrfs@vger.kernel.org

On Fri, Oct 7, 2016 at 11:43 AM, robbieko <robbieko@synology.com> wrote:
> Hi Filipe,
>
> I am sorry, I express not clear enough.
> This number is pid, and the above are their call trace respectively.

And why did you replace the continue statement with a break statement?

Also please avoid mixing inline replies with top posting, it just
breaks the thread.

thanks

>
> Thanks.
> robbieko
>
> Filipe Manana <fdmanana@gmail.com> 於 2016-10-07 18:23 寫道:
>
> On Fri, Oct 7, 2016 at 11:05 AM, robbieko <robbieko@synology.com> wrote:
>> From: Robbie Ko <robbieko@synology.com>
>>
>> We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021,
>> in log_new_dir_dentries, because btrfs_search_forward get path lock, then
>> call btrfs_iget will get another extent_buffer lock, maybe occur deadlock.
>
> What are those numbers? Are they inode numbers?
> If so you're suggesting a deadlock due to recursive logging of the same
> inode.
> However the trace below, and the code change, has nothing to do with that.
> It's just about btrfs_iget trying to do a search on a btree and
> attempting to read lock some node/leaf that already has a write lock
> acquired previously by the same task.
>
> Please be more clear on your change logs.
>
>
>>
>> we can release path before call btrfs_iget, avoid deadlock occur.
>>
>> some process call trace like below:
>> [ 4077.478852] kworker/u24:10  D ffff88107fc90640     0 14431      2
>> 0x00000000
>> [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper
>> [btrfs]
>> [ 4077.494346]  ffff880ffa56bad0 0000000000000046 0000000000009000
>> ffff880ffa56bfd8
>> [ 4077.502629]  ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26
>> ffff88101a5d6138
>> [ 4077.510915]  ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410
>> ffff881016ce21c0
>> [ 4077.519202] Call Trace:
>> [ 4077.528752]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
>> [ 4077.536049]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>> [ 4077.542574]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10
>> [btrfs]
>> [ 4077.550171]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40
>> [btrfs]
>> [ 4077.558252]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0
>> [btrfs]
>> [ 4077.566140]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150
>> [btrfs]
>> [ 4077.573928]  [<ffffffffa06fd629>] ?
>> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
>> [ 4077.582399]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0
>> [btrfs]
>> [ 4077.589896]  [<ffffffffa06b4a64>] ?
>> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
>> [ 4077.599632]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470
>> [btrfs]
>> [ 4077.607134]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600
>> [btrfs]
>> [ 4077.615329]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
>> [ 4077.622043]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
>> [ 4077.628459]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
>> [ 4077.635759]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
>> [ 4077.641404]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
>> [ 4077.648696]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
>> [ 4077.654926]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
>>
>> [ 4078.358087] kworker/u24:15  D ffff88107fcd0640     0 14436      2
>> 0x00000000
>> [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper
>> [btrfs]
>> [ 4078.373574]  ffff880ffa57fad0 0000000000000046 0000000000009000
>> ffff880ffa57ffd8
>> [ 4078.381864]  ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26
>> ffff88101a5d6138
>> [ 4078.390163]  ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8
>> ffff88103004d0a0
>> [ 4078.398466] Call Trace:
>> [ 4078.408019]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
>> [ 4078.415322]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>> [ 4078.421844]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10
>> [btrfs]
>> [ 4078.429438]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40
>> [btrfs]
>> [ 4078.437518]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0
>> [btrfs]
>> [ 4078.445404]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150
>> [btrfs]
>> [ 4078.453194]  [<ffffffffa06fd629>] ?
>> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
>> [ 4078.461663]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0
>> [btrfs]
>> [ 4078.469161]  [<ffffffffa06b4a64>] ?
>> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
>> [ 4078.478893]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470
>> [btrfs]
>> [ 4078.486388]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600
>> [btrfs]
>> [ 4078.494561]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
>> [ 4078.501278]  [<ffffffff8104a507>] ? pwq_activate_delayed_work+0x27/0x40
>> [ 4078.508673]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
>> [ 4078.515098]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
>> [ 4078.522396]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
>> [ 4078.528032]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
>> [ 4078.535325]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
>> [ 4078.541552]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
>>
>> [ 4079.355824] user-space-program D ffff88107fd30640     0 32020      1
>> 0x00000000
>> [ 4079.363716]  ffff880eae8eba10 0000000000000086 0000000000009000
>> ffff880eae8ebfd8
>> [ 4079.372003]  ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26
>> ffff88101a5d6138
>> [ 4079.380294]  ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528
>> ffff881016c162c0
>> [ 4079.388586] Call Trace:
>> [ 4079.398134]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
>> [ 4079.405431]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>> [ 4079.411955]  [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40
>> [btrfs]
>> [ 4079.419644]  [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10
>> [btrfs]
>> [ 4079.427237]  [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70
>> [btrfs]
>> [ 4079.435041]  [<ffffffffa0689b60>] ?
>> generic_bin_search.constprop.38+0x80/0x190 [btrfs]
>> [ 4079.443897]  [<ffffffffa068ea44>] ? btrfs_insert_empty_items+0x74/0xd0
>> [btrfs]
>> [ 4079.451975]  [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs]
>> [ 4079.458890]  [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 [btrfs]
>> [ 4079.466292]  [<ffffffffa06f34a1>] ? btrfs_log_inode_parent+0xc61/0xf30
>> [btrfs]
>> [ 4079.474373]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80
>> [btrfs]
>> [ 4079.482161]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
>> [ 4079.489558]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
>> [ 4079.495300]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
>> [ 4079.501422]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>>
>> [ 4079.508334] user-space-program D ffff88107fc30640     0 32021      1
>> 0x00000004
>> [ 4079.516226]  ffff880eae8efbf8 0000000000000086 0000000000009000
>> ffff880eae8effd8
>> [ 4079.524513]  ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26
>> ffff88101a5d6138
>> [ 4079.532802]  ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8
>> ffff881030279610
>> [ 4079.541092] Call Trace:
>> [ 4079.550642]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
>> [ 4079.557941]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>> [ 4079.564463]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10
>> [btrfs]
>> [ 4079.572058]  [<ffffffffa06bb7d8>] ?
>> btrfs_truncate_inode_items+0x168/0xb90 [btrfs]
>> [ 4079.580526]  [<ffffffffa06b04be>] ? join_transaction.isra.15+0x1e/0x3a0
>> [btrfs]
>> [ 4079.588701]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470
>> [btrfs]
>> [ 4079.596196]  [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50
>> [btrfs]
>> [ 4079.603789]  [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs]
>> [ 4079.610994]  [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs]
>> [ 4079.618197]  [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680
>> [ 4079.624625]  [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160
>> [ 4079.630854]  [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90
>> [ 4079.636889]  [<ffffffff810f59fa>] ?
>> do_sys_ftruncate.constprop.15+0x10a/0x160
>> [ 4079.644869]  [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570
>> [ 4079.650805]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>>
>> [ 4080.410607] user-space-program D ffff88107fc70640     0 32028  12639
>> 0x00000004
>> [ 4080.418489]  ffff880eaeccbbe0 0000000000000086 0000000000009000
>> ffff880eaeccbfd8
>> [ 4080.426778]  ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26
>> ffff88101a5d6138
>> [ 4080.435067]  ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08
>> ffff880f317ef1e0
>> [ 4080.443353] Call Trace:
>> [ 4080.452920]  [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190
>> [btrfs]
>> [ 4080.460703]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>> [ 4080.467225]  [<ffffffffa06876bb>] ? btrfs_read_lock_root_node+0x2b/0x40
>> [btrfs]
>> [ 4080.475400]  [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10
>> [btrfs]
>> [ 4080.482994]  [<ffffffffa06b2df0>] ?
>> btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs]
>> [ 4080.491857]  [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90
>> [btrfs]
>> [ 4080.499353]  [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0
>> [ 4080.505879]  [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs]
>> [ 4080.512696]  [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120
>> [btrfs]
>> [ 4080.520387]  [<ffffffffa06f341f>] ? btrfs_log_inode_parent+0xbdf/0xf30
>> [btrfs]
>> [ 4080.528469]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80
>> [btrfs]
>> [ 4080.536258]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
>> [ 4080.543657]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
>> [ 4080.549399]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
>> [ 4080.555534]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>>
>> Signed-off-by: Robbie Ko <robbieko@synology.com>
>> ---
>>  fs/btrfs/tree-log.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
>> index ef9c55b..9913084 100644
>> --- a/fs/btrfs/tree-log.c
>> +++ b/fs/btrfs/tree-log.c
>> @@ -5213,6 +5213,7 @@ process_leaf:
>>                         if (di_key.type == BTRFS_ROOT_ITEM_KEY)
>>                                 continue;
>>
>> +                       btrfs_release_path(path);
>>                         di_inode = btrfs_iget(root->fs_info->sb, &di_key,
>>                                               root, NULL);
>>                         if (IS_ERR(di_inode)) {
>> @@ -5222,13 +5223,12 @@ process_leaf:
>>
>>                         if (btrfs_inode_in_log(di_inode, trans->transid))
>> {
>>                                 iput(di_inode);
>> -                               continue;
>> +                               break;
>
> Why the break instead of continue? This seems totally unrelated to the
> deadlock on a leaf/node and incorrect.
>
> thanks
>
>>                         }
>>
>>                         ctx->log_new_dentries = false;
>>                         if (type == BTRFS_FT_DIR || type ==
>> BTRFS_FT_SYMLINK)
>>                                 log_mode = LOG_INODE_ALL;
>> -                       btrfs_release_path(path);
>>                         ret = btrfs_log_inode(trans, root, di_inode,
>>                                               log_mode, 0, LLONG_MAX,
>> ctx);
>>                         if (!ret &&
>> --
>> 1.9.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Filipe David Manana,
>
> "People will forget what you said,
>  people will forget what you did,
>  but people will never forget how you made them feel."



-- 
Filipe David Manana,

"People will forget what you said,
 people will forget what you did,
 but people will never forget how you made them feel."

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Btrfs: fix fsync deadlock in log_new_dir_dentries
  2016-10-07 10:46     ` Filipe Manana
@ 2016-10-11  7:47       ` robbieko
  2016-10-12  9:04         ` Filipe Manana
  0 siblings, 1 reply; 7+ messages in thread
From: robbieko @ 2016-10-11  7:47 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

Hi Filipe:

why did you replace the continue statement with a break statement:
because we released ahead of the path, it can not continue to use,
need to jump out, and then go to again.

supplement:
We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021, 
the number id pid.
extent_buffer: start:207060992, len:16384
     locker pid: 32020 read lock
     wait pid: 32021 write lock
extent_buffer: start:14730821632, len:16384
     locker pid: 32028 read lock
     wait pid: 32020 write lock
extent_buffer: start:446503813120, len:16384
     locker pid: 14431 write lock
     wait pid: 32028 read lock
extent_buffer: start:446503845888, len: 16384
     locker pid: 14436 write lock
     wait pid: 14431 write lock
extent_buffer: start: 446504386560, len: 16384
     locker pid: 32021 write lock
     wait pid: 14436 write lock

Thanks.
Robbie Ko

Filipe Manana 於 2016-10-07 18:46 寫到:
> On Fri, Oct 7, 2016 at 11:43 AM, robbieko <robbieko@synology.com> 
> wrote:
>> Hi Filipe,
>> 
>> I am sorry, I express not clear enough.
>> This number is pid, and the above are their call trace respectively.
> 
> And why did you replace the continue statement with a break statement?
> 
> Also please avoid mixing inline replies with top posting, it just
> breaks the thread.
> 
> thanks
> 
>> 
>> Thanks.
>> robbieko
>> 
>> Filipe Manana <fdmanana@gmail.com> 於 2016-10-07 18:23 寫道:
>> 
>> On Fri, Oct 7, 2016 at 11:05 AM, robbieko <robbieko@synology.com> 
>> wrote:
>>> From: Robbie Ko <robbieko@synology.com>
>>> 
>>> We found a fsync deadlock ie. 
>>> 32021->32020->32028->14431->14436->32021,
>>> in log_new_dir_dentries, because btrfs_search_forward get path lock, 
>>> then
>>> call btrfs_iget will get another extent_buffer lock, maybe occur 
>>> deadlock.
>> 
>> What are those numbers? Are they inode numbers?
>> If so you're suggesting a deadlock due to recursive logging of the 
>> same
>> inode.
>> However the trace below, and the code change, has nothing to do with 
>> that.
>> It's just about btrfs_iget trying to do a search on a btree and
>> attempting to read lock some node/leaf that already has a write lock
>> acquired previously by the same task.
>> 
>> Please be more clear on your change logs.
>> 
>> 
>>> 
>>> we can release path before call btrfs_iget, avoid deadlock occur.
>>> 
>>> some process call trace like below:
>>> [ 4077.478852] kworker/u24:10  D ffff88107fc90640     0 14431      2
>>> 0x00000000
>>> [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper
>>> [btrfs]
>>> [ 4077.494346]  ffff880ffa56bad0 0000000000000046 0000000000009000
>>> ffff880ffa56bfd8
>>> [ 4077.502629]  ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26
>>> ffff88101a5d6138
>>> [ 4077.510915]  ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410
>>> ffff881016ce21c0
>>> [ 4077.519202] Call Trace:
>>> [ 4077.528752]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 
>>> [btrfs]
>>> [ 4077.536049]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>>> [ 4077.542574]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10
>>> [btrfs]
>>> [ 4077.550171]  [<ffffffffa06a5073>] ? 
>>> btrfs_lookup_file_extent+0x33/0x40
>>> [btrfs]
>>> [ 4077.558252]  [<ffffffffa06c600b>] ? 
>>> __btrfs_drop_extents+0x13b/0xdf0
>>> [btrfs]
>>> [ 4077.566140]  [<ffffffffa06fc9e2>] ? 
>>> add_delayed_data_ref+0xe2/0x150
>>> [btrfs]
>>> [ 4077.573928]  [<ffffffffa06fd629>] ?
>>> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
>>> [ 4077.582399]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0
>>> [btrfs]
>>> [ 4077.589896]  [<ffffffffa06b4a64>] ?
>>> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
>>> [ 4077.599632]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470
>>> [btrfs]
>>> [ 4077.607134]  [<ffffffffa06bab57>] ? 
>>> btrfs_finish_ordered_io+0x2e7/0x600
>>> [btrfs]
>>> [ 4077.615329]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
>>> [ 4077.622043]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
>>> [ 4077.628459]  [<ffffffff8104d620>] ? 
>>> manage_workers.isra.26+0x270/0x270
>>> [ 4077.635759]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
>>> [ 4077.641404]  [<ffffffff81052a60>] ? 
>>> kthread_create_on_node+0x110/0x110
>>> [ 4077.648696]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
>>> [ 4077.654926]  [<ffffffff81052a60>] ? 
>>> kthread_create_on_node+0x110/0x110
>>> 
>>> [ 4078.358087] kworker/u24:15  D ffff88107fcd0640     0 14436      2
>>> 0x00000000
>>> [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper
>>> [btrfs]
>>> [ 4078.373574]  ffff880ffa57fad0 0000000000000046 0000000000009000
>>> ffff880ffa57ffd8
>>> [ 4078.381864]  ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26
>>> ffff88101a5d6138
>>> [ 4078.390163]  ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8
>>> ffff88103004d0a0
>>> [ 4078.398466] Call Trace:
>>> [ 4078.408019]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 
>>> [btrfs]
>>> [ 4078.415322]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>>> [ 4078.421844]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10
>>> [btrfs]
>>> [ 4078.429438]  [<ffffffffa06a5073>] ? 
>>> btrfs_lookup_file_extent+0x33/0x40
>>> [btrfs]
>>> [ 4078.437518]  [<ffffffffa06c600b>] ? 
>>> __btrfs_drop_extents+0x13b/0xdf0
>>> [btrfs]
>>> [ 4078.445404]  [<ffffffffa06fc9e2>] ? 
>>> add_delayed_data_ref+0xe2/0x150
>>> [btrfs]
>>> [ 4078.453194]  [<ffffffffa06fd629>] ?
>>> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
>>> [ 4078.461663]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0
>>> [btrfs]
>>> [ 4078.469161]  [<ffffffffa06b4a64>] ?
>>> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
>>> [ 4078.478893]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470
>>> [btrfs]
>>> [ 4078.486388]  [<ffffffffa06bab57>] ? 
>>> btrfs_finish_ordered_io+0x2e7/0x600
>>> [btrfs]
>>> [ 4078.494561]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
>>> [ 4078.501278]  [<ffffffff8104a507>] ? 
>>> pwq_activate_delayed_work+0x27/0x40
>>> [ 4078.508673]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
>>> [ 4078.515098]  [<ffffffff8104d620>] ? 
>>> manage_workers.isra.26+0x270/0x270
>>> [ 4078.522396]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
>>> [ 4078.528032]  [<ffffffff81052a60>] ? 
>>> kthread_create_on_node+0x110/0x110
>>> [ 4078.535325]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
>>> [ 4078.541552]  [<ffffffff81052a60>] ? 
>>> kthread_create_on_node+0x110/0x110
>>> 
>>> [ 4079.355824] user-space-program D ffff88107fd30640     0 32020      
>>> 1
>>> 0x00000000
>>> [ 4079.363716]  ffff880eae8eba10 0000000000000086 0000000000009000
>>> ffff880eae8ebfd8
>>> [ 4079.372003]  ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26
>>> ffff88101a5d6138
>>> [ 4079.380294]  ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528
>>> ffff881016c162c0
>>> [ 4079.388586] Call Trace:
>>> [ 4079.398134]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 
>>> [btrfs]
>>> [ 4079.405431]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>>> [ 4079.411955]  [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40
>>> [btrfs]
>>> [ 4079.419644]  [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10
>>> [btrfs]
>>> [ 4079.427237]  [<ffffffffa06aba52>] ? 
>>> btrfs_buffer_uptodate+0x52/0x70
>>> [btrfs]
>>> [ 4079.435041]  [<ffffffffa0689b60>] ?
>>> generic_bin_search.constprop.38+0x80/0x190 [btrfs]
>>> [ 4079.443897]  [<ffffffffa068ea44>] ? 
>>> btrfs_insert_empty_items+0x74/0xd0
>>> [btrfs]
>>> [ 4079.451975]  [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs]
>>> [ 4079.458890]  [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 
>>> [btrfs]
>>> [ 4079.466292]  [<ffffffffa06f34a1>] ? 
>>> btrfs_log_inode_parent+0xc61/0xf30
>>> [btrfs]
>>> [ 4079.474373]  [<ffffffffa06f45a9>] ? 
>>> btrfs_log_dentry_safe+0x59/0x80
>>> [btrfs]
>>> [ 4079.482161]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 
>>> [btrfs]
>>> [ 4079.489558]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
>>> [ 4079.495300]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
>>> [ 4079.501422]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>>> 
>>> [ 4079.508334] user-space-program D ffff88107fc30640     0 32021      
>>> 1
>>> 0x00000004
>>> [ 4079.516226]  ffff880eae8efbf8 0000000000000086 0000000000009000
>>> ffff880eae8effd8
>>> [ 4079.524513]  ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26
>>> ffff88101a5d6138
>>> [ 4079.532802]  ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8
>>> ffff881030279610
>>> [ 4079.541092] Call Trace:
>>> [ 4079.550642]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 
>>> [btrfs]
>>> [ 4079.557941]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>>> [ 4079.564463]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10
>>> [btrfs]
>>> [ 4079.572058]  [<ffffffffa06bb7d8>] ?
>>> btrfs_truncate_inode_items+0x168/0xb90 [btrfs]
>>> [ 4079.580526]  [<ffffffffa06b04be>] ? 
>>> join_transaction.isra.15+0x1e/0x3a0
>>> [btrfs]
>>> [ 4079.588701]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470
>>> [btrfs]
>>> [ 4079.596196]  [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50
>>> [btrfs]
>>> [ 4079.603789]  [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 
>>> [btrfs]
>>> [ 4079.610994]  [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 
>>> [btrfs]
>>> [ 4079.618197]  [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680
>>> [ 4079.624625]  [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160
>>> [ 4079.630854]  [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90
>>> [ 4079.636889]  [<ffffffff810f59fa>] ?
>>> do_sys_ftruncate.constprop.15+0x10a/0x160
>>> [ 4079.644869]  [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570
>>> [ 4079.650805]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>>> 
>>> [ 4080.410607] user-space-program D ffff88107fc70640     0 32028  
>>> 12639
>>> 0x00000004
>>> [ 4080.418489]  ffff880eaeccbbe0 0000000000000086 0000000000009000
>>> ffff880eaeccbfd8
>>> [ 4080.426778]  ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26
>>> ffff88101a5d6138
>>> [ 4080.435067]  ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08
>>> ffff880f317ef1e0
>>> [ 4080.443353] Call Trace:
>>> [ 4080.452920]  [<ffffffffa06ed15d>] ? 
>>> btrfs_tree_read_lock+0xdd/0x190
>>> [btrfs]
>>> [ 4080.460703]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>>> [ 4080.467225]  [<ffffffffa06876bb>] ? 
>>> btrfs_read_lock_root_node+0x2b/0x40
>>> [btrfs]
>>> [ 4080.475400]  [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10
>>> [btrfs]
>>> [ 4080.482994]  [<ffffffffa06b2df0>] ?
>>> btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs]
>>> [ 4080.491857]  [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90
>>> [btrfs]
>>> [ 4080.499353]  [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0
>>> [ 4080.505879]  [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs]
>>> [ 4080.512696]  [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120
>>> [btrfs]
>>> [ 4080.520387]  [<ffffffffa06f341f>] ? 
>>> btrfs_log_inode_parent+0xbdf/0xf30
>>> [btrfs]
>>> [ 4080.528469]  [<ffffffffa06f45a9>] ? 
>>> btrfs_log_dentry_safe+0x59/0x80
>>> [btrfs]
>>> [ 4080.536258]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 
>>> [btrfs]
>>> [ 4080.543657]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
>>> [ 4080.549399]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
>>> [ 4080.555534]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>>> 
>>> Signed-off-by: Robbie Ko <robbieko@synology.com>
>>> ---
>>>  fs/btrfs/tree-log.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
>>> index ef9c55b..9913084 100644
>>> --- a/fs/btrfs/tree-log.c
>>> +++ b/fs/btrfs/tree-log.c
>>> @@ -5213,6 +5213,7 @@ process_leaf:
>>>                         if (di_key.type == BTRFS_ROOT_ITEM_KEY)
>>>                                 continue;
>>> 
>>> +                       btrfs_release_path(path);
>>>                         di_inode = btrfs_iget(root->fs_info->sb, 
>>> &di_key,
>>>                                               root, NULL);
>>>                         if (IS_ERR(di_inode)) {
>>> @@ -5222,13 +5223,12 @@ process_leaf:
>>> 
>>>                         if (btrfs_inode_in_log(di_inode, 
>>> trans->transid))
>>> {
>>>                                 iput(di_inode);
>>> -                               continue;
>>> +                               break;
>> 
>> Why the break instead of continue? This seems totally unrelated to the
>> deadlock on a leaf/node and incorrect.
>> 
>> thanks
>> 
>>>                         }
>>> 
>>>                         ctx->log_new_dentries = false;
>>>                         if (type == BTRFS_FT_DIR || type ==
>>> BTRFS_FT_SYMLINK)
>>>                                 log_mode = LOG_INODE_ALL;
>>> -                       btrfs_release_path(path);
>>>                         ret = btrfs_log_inode(trans, root, di_inode,
>>>                                               log_mode, 0, LLONG_MAX,
>>> ctx);
>>>                         if (!ret &&
>>> --
>>> 1.9.1
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe 
>>> linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> 
>> 
>> --
>> Filipe David Manana,
>> 
>> "People will forget what you said,
>>  people will forget what you did,
>>  but people will never forget how you made them feel."


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Btrfs: fix fsync deadlock in log_new_dir_dentries
  2016-10-11  7:47       ` robbieko
@ 2016-10-12  9:04         ` Filipe Manana
  0 siblings, 0 replies; 7+ messages in thread
From: Filipe Manana @ 2016-10-12  9:04 UTC (permalink / raw)
  To: robbieko; +Cc: linux-btrfs@vger.kernel.org

On Tue, Oct 11, 2016 at 8:47 AM, robbieko <robbieko@synology.com> wrote:
> Hi Filipe:
>
> why did you replace the continue statement with a break statement:
> because we released ahead of the path, it can not continue to use,
> need to jump out, and then go to again.

Seeing the code and not only the patch's diff, I can see it now, thanks.
Btw, please avoid top posting, it's not how you should reply to a
thread, see [1].

[1] http://linux.sgms-centre.com/misc/netiquette.php

>
> supplement:
> We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021, the
> number id pid.
> extent_buffer: start:207060992, len:16384
>     locker pid: 32020 read lock
>     wait pid: 32021 write lock
> extent_buffer: start:14730821632, len:16384
>     locker pid: 32028 read lock
>     wait pid: 32020 write lock
> extent_buffer: start:446503813120, len:16384
>     locker pid: 14431 write lock
>     wait pid: 32028 read lock
> extent_buffer: start:446503845888, len: 16384
>     locker pid: 14436 write lock
>     wait pid: 14431 write lock
> extent_buffer: start: 446504386560, len: 16384
>     locker pid: 32021 write lock
>     wait pid: 14436 write lock
>
> Thanks.
> Robbie Ko
>
> Filipe Manana 於 2016-10-07 18:46 寫到:
>
>> On Fri, Oct 7, 2016 at 11:43 AM, robbieko <robbieko@synology.com> wrote:
>>>
>>> Hi Filipe,
>>>
>>> I am sorry, I express not clear enough.
>>> This number is pid, and the above are their call trace respectively.
>>
>>
>> And why did you replace the continue statement with a break statement?
>>
>> Also please avoid mixing inline replies with top posting, it just
>> breaks the thread.
>>
>> thanks
>>
>>>
>>> Thanks.
>>> robbieko
>>>
>>> Filipe Manana <fdmanana@gmail.com> 於 2016-10-07 18:23 寫道:
>>>
>>> On Fri, Oct 7, 2016 at 11:05 AM, robbieko <robbieko@synology.com> wrote:
>>>>
>>>> From: Robbie Ko <robbieko@synology.com>
>>>>
>>>> We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021,
>>>> in log_new_dir_dentries, because btrfs_search_forward get path lock,
>>>> then
>>>> call btrfs_iget will get another extent_buffer lock, maybe occur
>>>> deadlock.
>>>
>>>
>>> What are those numbers? Are they inode numbers?
>>> If so you're suggesting a deadlock due to recursive logging of the same
>>> inode.
>>> However the trace below, and the code change, has nothing to do with
>>> that.
>>> It's just about btrfs_iget trying to do a search on a btree and
>>> attempting to read lock some node/leaf that already has a write lock
>>> acquired previously by the same task.
>>>
>>> Please be more clear on your change logs.
>>>
>>>
>>>>
>>>> we can release path before call btrfs_iget, avoid deadlock occur.
>>>>
>>>> some process call trace like below:
>>>> [ 4077.478852] kworker/u24:10  D ffff88107fc90640     0 14431      2
>>>> 0x00000000
>>>> [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper
>>>> [btrfs]
>>>> [ 4077.494346]  ffff880ffa56bad0 0000000000000046 0000000000009000
>>>> ffff880ffa56bfd8
>>>> [ 4077.502629]  ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26
>>>> ffff88101a5d6138
>>>> [ 4077.510915]  ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410
>>>> ffff881016ce21c0
>>>> [ 4077.519202] Call Trace:
>>>> [ 4077.528752]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0
>>>> [btrfs]
>>>> [ 4077.536049]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>>>> [ 4077.542574]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10
>>>> [btrfs]
>>>> [ 4077.550171]  [<ffffffffa06a5073>] ?
>>>> btrfs_lookup_file_extent+0x33/0x40
>>>> [btrfs]
>>>> [ 4077.558252]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0
>>>> [btrfs]
>>>> [ 4077.566140]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150
>>>> [btrfs]
>>>> [ 4077.573928]  [<ffffffffa06fd629>] ?
>>>> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
>>>> [ 4077.582399]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0
>>>> [btrfs]
>>>> [ 4077.589896]  [<ffffffffa06b4a64>] ?
>>>> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
>>>> [ 4077.599632]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470
>>>> [btrfs]
>>>> [ 4077.607134]  [<ffffffffa06bab57>] ?
>>>> btrfs_finish_ordered_io+0x2e7/0x600
>>>> [btrfs]
>>>> [ 4077.615329]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
>>>> [ 4077.622043]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
>>>> [ 4077.628459]  [<ffffffff8104d620>] ?
>>>> manage_workers.isra.26+0x270/0x270
>>>> [ 4077.635759]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
>>>> [ 4077.641404]  [<ffffffff81052a60>] ?
>>>> kthread_create_on_node+0x110/0x110
>>>> [ 4077.648696]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
>>>> [ 4077.654926]  [<ffffffff81052a60>] ?
>>>> kthread_create_on_node+0x110/0x110
>>>>
>>>> [ 4078.358087] kworker/u24:15  D ffff88107fcd0640     0 14436      2
>>>> 0x00000000
>>>> [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper
>>>> [btrfs]
>>>> [ 4078.373574]  ffff880ffa57fad0 0000000000000046 0000000000009000
>>>> ffff880ffa57ffd8
>>>> [ 4078.381864]  ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26
>>>> ffff88101a5d6138
>>>> [ 4078.390163]  ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8
>>>> ffff88103004d0a0
>>>> [ 4078.398466] Call Trace:
>>>> [ 4078.408019]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0
>>>> [btrfs]
>>>> [ 4078.415322]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>>>> [ 4078.421844]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10
>>>> [btrfs]
>>>> [ 4078.429438]  [<ffffffffa06a5073>] ?
>>>> btrfs_lookup_file_extent+0x33/0x40
>>>> [btrfs]
>>>> [ 4078.437518]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0
>>>> [btrfs]
>>>> [ 4078.445404]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150
>>>> [btrfs]
>>>> [ 4078.453194]  [<ffffffffa06fd629>] ?
>>>> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
>>>> [ 4078.461663]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0
>>>> [btrfs]
>>>> [ 4078.469161]  [<ffffffffa06b4a64>] ?
>>>> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
>>>> [ 4078.478893]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470
>>>> [btrfs]
>>>> [ 4078.486388]  [<ffffffffa06bab57>] ?
>>>> btrfs_finish_ordered_io+0x2e7/0x600
>>>> [btrfs]
>>>> [ 4078.494561]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
>>>> [ 4078.501278]  [<ffffffff8104a507>] ?
>>>> pwq_activate_delayed_work+0x27/0x40
>>>> [ 4078.508673]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
>>>> [ 4078.515098]  [<ffffffff8104d620>] ?
>>>> manage_workers.isra.26+0x270/0x270
>>>> [ 4078.522396]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
>>>> [ 4078.528032]  [<ffffffff81052a60>] ?
>>>> kthread_create_on_node+0x110/0x110
>>>> [ 4078.535325]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
>>>> [ 4078.541552]  [<ffffffff81052a60>] ?
>>>> kthread_create_on_node+0x110/0x110
>>>>
>>>> [ 4079.355824] user-space-program D ffff88107fd30640     0 32020      1
>>>> 0x00000000
>>>> [ 4079.363716]  ffff880eae8eba10 0000000000000086 0000000000009000
>>>> ffff880eae8ebfd8
>>>> [ 4079.372003]  ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26
>>>> ffff88101a5d6138
>>>> [ 4079.380294]  ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528
>>>> ffff881016c162c0
>>>> [ 4079.388586] Call Trace:
>>>> [ 4079.398134]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0
>>>> [btrfs]
>>>> [ 4079.405431]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>>>> [ 4079.411955]  [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40
>>>> [btrfs]
>>>> [ 4079.419644]  [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10
>>>> [btrfs]
>>>> [ 4079.427237]  [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70
>>>> [btrfs]
>>>> [ 4079.435041]  [<ffffffffa0689b60>] ?
>>>> generic_bin_search.constprop.38+0x80/0x190 [btrfs]
>>>> [ 4079.443897]  [<ffffffffa068ea44>] ?
>>>> btrfs_insert_empty_items+0x74/0xd0
>>>> [btrfs]
>>>> [ 4079.451975]  [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs]
>>>> [ 4079.458890]  [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3
>>>> [btrfs]
>>>> [ 4079.466292]  [<ffffffffa06f34a1>] ?
>>>> btrfs_log_inode_parent+0xc61/0xf30
>>>> [btrfs]
>>>> [ 4079.474373]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80
>>>> [btrfs]
>>>> [ 4079.482161]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330
>>>> [btrfs]
>>>> [ 4079.489558]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
>>>> [ 4079.495300]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
>>>> [ 4079.501422]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>>>>
>>>> [ 4079.508334] user-space-program D ffff88107fc30640     0 32021      1
>>>> 0x00000004
>>>> [ 4079.516226]  ffff880eae8efbf8 0000000000000086 0000000000009000
>>>> ffff880eae8effd8
>>>> [ 4079.524513]  ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26
>>>> ffff88101a5d6138
>>>> [ 4079.532802]  ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8
>>>> ffff881030279610
>>>> [ 4079.541092] Call Trace:
>>>> [ 4079.550642]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0
>>>> [btrfs]
>>>> [ 4079.557941]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>>>> [ 4079.564463]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10
>>>> [btrfs]
>>>> [ 4079.572058]  [<ffffffffa06bb7d8>] ?
>>>> btrfs_truncate_inode_items+0x168/0xb90 [btrfs]
>>>> [ 4079.580526]  [<ffffffffa06b04be>] ?
>>>> join_transaction.isra.15+0x1e/0x3a0
>>>> [btrfs]
>>>> [ 4079.588701]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470
>>>> [btrfs]
>>>> [ 4079.596196]  [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50
>>>> [btrfs]
>>>> [ 4079.603789]  [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs]
>>>> [ 4079.610994]  [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs]
>>>> [ 4079.618197]  [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680
>>>> [ 4079.624625]  [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160
>>>> [ 4079.630854]  [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90
>>>> [ 4079.636889]  [<ffffffff810f59fa>] ?
>>>> do_sys_ftruncate.constprop.15+0x10a/0x160
>>>> [ 4079.644869]  [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570
>>>> [ 4079.650805]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>>>>
>>>> [ 4080.410607] user-space-program D ffff88107fc70640     0 32028  12639
>>>> 0x00000004
>>>> [ 4080.418489]  ffff880eaeccbbe0 0000000000000086 0000000000009000
>>>> ffff880eaeccbfd8
>>>> [ 4080.426778]  ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26
>>>> ffff88101a5d6138
>>>> [ 4080.435067]  ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08
>>>> ffff880f317ef1e0
>>>> [ 4080.443353] Call Trace:
>>>> [ 4080.452920]  [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190
>>>> [btrfs]
>>>> [ 4080.460703]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
>>>> [ 4080.467225]  [<ffffffffa06876bb>] ?
>>>> btrfs_read_lock_root_node+0x2b/0x40
>>>> [btrfs]
>>>> [ 4080.475400]  [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10
>>>> [btrfs]
>>>> [ 4080.482994]  [<ffffffffa06b2df0>] ?
>>>> btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs]
>>>> [ 4080.491857]  [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90
>>>> [btrfs]
>>>> [ 4080.499353]  [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0
>>>> [ 4080.505879]  [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs]
>>>> [ 4080.512696]  [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120
>>>> [btrfs]
>>>> [ 4080.520387]  [<ffffffffa06f341f>] ?
>>>> btrfs_log_inode_parent+0xbdf/0xf30
>>>> [btrfs]
>>>> [ 4080.528469]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80
>>>> [btrfs]
>>>> [ 4080.536258]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330
>>>> [btrfs]
>>>> [ 4080.543657]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
>>>> [ 4080.549399]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
>>>> [ 4080.555534]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>>>>
>>>> Signed-off-by: Robbie Ko <robbieko@synology.com>
>>>> ---
>>>>  fs/btrfs/tree-log.c | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
>>>> index ef9c55b..9913084 100644
>>>> --- a/fs/btrfs/tree-log.c
>>>> +++ b/fs/btrfs/tree-log.c
>>>> @@ -5213,6 +5213,7 @@ process_leaf:
>>>>                         if (di_key.type == BTRFS_ROOT_ITEM_KEY)
>>>>                                 continue;
>>>>
>>>> +                       btrfs_release_path(path);
>>>>                         di_inode = btrfs_iget(root->fs_info->sb,
>>>> &di_key,
>>>>                                               root, NULL);
>>>>                         if (IS_ERR(di_inode)) {
>>>> @@ -5222,13 +5223,12 @@ process_leaf:
>>>>
>>>>                         if (btrfs_inode_in_log(di_inode,
>>>> trans->transid))
>>>> {
>>>>                                 iput(di_inode);
>>>> -                               continue;
>>>> +                               break;
>>>
>>>
>>> Why the break instead of continue? This seems totally unrelated to the
>>> deadlock on a leaf/node and incorrect.
>>>
>>> thanks
>>>
>>>>                         }
>>>>
>>>>                         ctx->log_new_dentries = false;
>>>>                         if (type == BTRFS_FT_DIR || type ==
>>>> BTRFS_FT_SYMLINK)
>>>>                                 log_mode = LOG_INODE_ALL;
>>>> -                       btrfs_release_path(path);
>>>>                         ret = btrfs_log_inode(trans, root, di_inode,
>>>>                                               log_mode, 0, LLONG_MAX,
>>>> ctx);
>>>>                         if (!ret &&
>>>> --
>>>> 1.9.1
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
>>>> in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>>
>>>
>>> --
>>> Filipe David Manana,
>>>
>>> "People will forget what you said,
>>>  people will forget what you did,
>>>  but people will never forget how you made them feel."
>
>



-- 
Filipe David Manana,

"People will forget what you said,
 people will forget what you did,
 but people will never forget how you made them feel."

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] Btrfs: fix fsync deadlock in log_new_dir_dentries
@ 2016-10-28  2:48 robbieko
  2016-11-30 16:25 ` Filipe Manana
  0 siblings, 1 reply; 7+ messages in thread
From: robbieko @ 2016-10-28  2:48 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Robbie Ko

From: Robbie Ko <robbieko@synology.com>

We found a fsync deadlock in log_new_dir_dentries, because
btrfs_search_forward get path lock, then call btrfs_iget will
get another extent_buffer lock, maybe occur deadlock.

Fix this by release path before call btrfs_iget, avoid deadlock occur.

Example:
    Pid waiting: 32021->32020->32028->14431->14436->32021

The following are their extent_buffer locked/waiting respectively:
extent_buffer: start:207060992, len:16384
    locker pid: 32020 read lock
    wait pid: 32021 write lock
extent_buffer: start:14730821632, len:16384
    locker pid: 32028 read lock
    wait pid: 32020 write lock
extent_buffer: start:446503813120, len:16384
    locker pid: 14431 write lock
    wait pid: 32028 read lock
extent_buffer: start:446503845888, len: 16384
    locker pid: 14436 write lock
    wait pid: 14431 write lock
extent_buffer: start: 446504386560, len: 16384
    locker pid: 32021 write lock
    wait pid: 14436 write lock

The following are their call trace respectively.
[ 4077.478852] kworker/u24:10  D ffff88107fc90640     0 14431      2 0x00000000
[ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 4077.494346]  ffff880ffa56bad0 0000000000000046 0000000000009000 ffff880ffa56bfd8
[ 4077.502629]  ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 ffff88101a5d6138
[ 4077.510915]  ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 ffff881016ce21c0
[ 4077.519202] Call Trace:
[ 4077.528752]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
[ 4077.536049]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
[ 4077.542574]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
[ 4077.550171]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
[ 4077.558252]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
[ 4077.566140]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
[ 4077.573928]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
[ 4077.582399]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
[ 4077.589896]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
[ 4077.599632]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
[ 4077.607134]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
[ 4077.615329]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
[ 4077.622043]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
[ 4077.628459]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
[ 4077.635759]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
[ 4077.641404]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
[ 4077.648696]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
[ 4077.654926]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110

[ 4078.358087] kworker/u24:15  D ffff88107fcd0640     0 14436      2 0x00000000
[ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 4078.373574]  ffff880ffa57fad0 0000000000000046 0000000000009000 ffff880ffa57ffd8
[ 4078.381864]  ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 ffff88101a5d6138
[ 4078.390163]  ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 ffff88103004d0a0
[ 4078.398466] Call Trace:
[ 4078.408019]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
[ 4078.415322]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
[ 4078.421844]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
[ 4078.429438]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
[ 4078.437518]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
[ 4078.445404]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
[ 4078.453194]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
[ 4078.461663]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
[ 4078.469161]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
[ 4078.478893]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
[ 4078.486388]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
[ 4078.494561]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
[ 4078.501278]  [<ffffffff8104a507>] ? pwq_activate_delayed_work+0x27/0x40
[ 4078.508673]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
[ 4078.515098]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
[ 4078.522396]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
[ 4078.528032]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
[ 4078.535325]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
[ 4078.541552]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110

[ 4079.355824] user-space-program D ffff88107fd30640     0 32020      1 0x00000000
[ 4079.363716]  ffff880eae8eba10 0000000000000086 0000000000009000 ffff880eae8ebfd8
[ 4079.372003]  ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 ffff88101a5d6138
[ 4079.380294]  ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 ffff881016c162c0
[ 4079.388586] Call Trace:
[ 4079.398134]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
[ 4079.405431]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
[ 4079.411955]  [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 [btrfs]
[ 4079.419644]  [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 [btrfs]
[ 4079.427237]  [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70 [btrfs]
[ 4079.435041]  [<ffffffffa0689b60>] ? generic_bin_search.constprop.38+0x80/0x190 [btrfs]
[ 4079.443897]  [<ffffffffa068ea44>] ? btrfs_insert_empty_items+0x74/0xd0 [btrfs]
[ 4079.451975]  [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs]
[ 4079.458890]  [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 [btrfs]
[ 4079.466292]  [<ffffffffa06f34a1>] ? btrfs_log_inode_parent+0xc61/0xf30 [btrfs]
[ 4079.474373]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
[ 4079.482161]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
[ 4079.489558]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
[ 4079.495300]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
[ 4079.501422]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b

[ 4079.508334] user-space-program D ffff88107fc30640     0 32021      1 0x00000004
[ 4079.516226]  ffff880eae8efbf8 0000000000000086 0000000000009000 ffff880eae8effd8
[ 4079.524513]  ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 ffff88101a5d6138
[ 4079.532802]  ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 ffff881030279610
[ 4079.541092] Call Trace:
[ 4079.550642]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
[ 4079.557941]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
[ 4079.564463]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
[ 4079.572058]  [<ffffffffa06bb7d8>] ? btrfs_truncate_inode_items+0x168/0xb90 [btrfs]
[ 4079.580526]  [<ffffffffa06b04be>] ? join_transaction.isra.15+0x1e/0x3a0 [btrfs]
[ 4079.588701]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
[ 4079.596196]  [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 [btrfs]
[ 4079.603789]  [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs]
[ 4079.610994]  [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs]
[ 4079.618197]  [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680
[ 4079.624625]  [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160
[ 4079.630854]  [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90
[ 4079.636889]  [<ffffffff810f59fa>] ? do_sys_ftruncate.constprop.15+0x10a/0x160
[ 4079.644869]  [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570
[ 4079.650805]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b

[ 4080.410607] user-space-program D ffff88107fc70640     0 32028  12639 0x00000004
[ 4080.418489]  ffff880eaeccbbe0 0000000000000086 0000000000009000 ffff880eaeccbfd8
[ 4080.426778]  ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 ffff88101a5d6138
[ 4080.435067]  ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 ffff880f317ef1e0
[ 4080.443353] Call Trace:
[ 4080.452920]  [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190 [btrfs]
[ 4080.460703]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
[ 4080.467225]  [<ffffffffa06876bb>] ? btrfs_read_lock_root_node+0x2b/0x40 [btrfs]
[ 4080.475400]  [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 [btrfs]
[ 4080.482994]  [<ffffffffa06b2df0>] ? btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs]
[ 4080.491857]  [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 [btrfs]
[ 4080.499353]  [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0
[ 4080.505879]  [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs]
[ 4080.512696]  [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 [btrfs]
[ 4080.520387]  [<ffffffffa06f341f>] ? btrfs_log_inode_parent+0xbdf/0xf30 [btrfs]
[ 4080.528469]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
[ 4080.536258]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
[ 4080.543657]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
[ 4080.549399]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
[ 4080.555534]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b

v2:
* Improve the change log

Signed-off-by: Robbie Ko <robbieko@synology.com>
---
 fs/btrfs/tree-log.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index ef9c55b..9913084 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -5213,6 +5213,7 @@ process_leaf:
 			if (di_key.type == BTRFS_ROOT_ITEM_KEY)
 				continue;
 
+			btrfs_release_path(path);
 			di_inode = btrfs_iget(root->fs_info->sb, &di_key,
 					      root, NULL);
 			if (IS_ERR(di_inode)) {
@@ -5222,13 +5223,12 @@ process_leaf:
 
 			if (btrfs_inode_in_log(di_inode, trans->transid)) {
 				iput(di_inode);
-				continue;
+				break;
 			}
 
 			ctx->log_new_dentries = false;
 			if (type == BTRFS_FT_DIR || type == BTRFS_FT_SYMLINK)
 				log_mode = LOG_INODE_ALL;
-			btrfs_release_path(path);
 			ret = btrfs_log_inode(trans, root, di_inode,
 					      log_mode, 0, LLONG_MAX, ctx);
 			if (!ret &&
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] Btrfs: fix fsync deadlock in log_new_dir_dentries
  2016-10-28  2:48 robbieko
@ 2016-11-30 16:25 ` Filipe Manana
  0 siblings, 0 replies; 7+ messages in thread
From: Filipe Manana @ 2016-11-30 16:25 UTC (permalink / raw)
  To: robbieko; +Cc: linux-btrfs@vger.kernel.org

On Fri, Oct 28, 2016 at 3:48 AM, robbieko <robbieko@synology.com> wrote:
> From: Robbie Ko <robbieko@synology.com>
>
> We found a fsync deadlock in log_new_dir_dentries, because
> btrfs_search_forward get path lock, then call btrfs_iget will
> get another extent_buffer lock, maybe occur deadlock.

This still doesn't explain how the deadlock happens.
For it to happen it's necessary that before btrfs_iget() does a tree
search, some other task gets write locks on nodes and blocks waiting
for the leaf locked by btrfs_search_forward() to be unlocked, and that
btrfs_iget() tries to read lock those same nodes write locked by that
other task.

It's important to have detailed information in the change logs. I
understand that english isn't your native tongue (it's not mine
either, and I'm far from mastering it), but that's not an excuse to
not express all the important information in detail (we can all live
with grammar errors and typos).

>
> Fix this by release path before call btrfs_iget, avoid deadlock occur.
>
> Example:
>     Pid waiting: 32021->32020->32028->14431->14436->32021
>
> The following are their extent_buffer locked/waiting respectively:
> extent_buffer: start:207060992, len:16384
>     locker pid: 32020 read lock
>     wait pid: 32021 write lock
> extent_buffer: start:14730821632, len:16384
>     locker pid: 32028 read lock
>     wait pid: 32020 write lock
> extent_buffer: start:446503813120, len:16384
>     locker pid: 14431 write lock
>     wait pid: 32028 read lock
> extent_buffer: start:446503845888, len: 16384
>     locker pid: 14436 write lock
>     wait pid: 14431 write lock
> extent_buffer: start: 446504386560, len: 16384
>     locker pid: 32021 write lock
>     wait pid: 14436 write lock
>
> The following are their call trace respectively.
> [ 4077.478852] kworker/u24:10  D ffff88107fc90640     0 14431      2 0x00000000
> [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
> [ 4077.494346]  ffff880ffa56bad0 0000000000000046 0000000000009000 ffff880ffa56bfd8
> [ 4077.502629]  ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4077.510915]  ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 ffff881016ce21c0
> [ 4077.519202] Call Trace:
> [ 4077.528752]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
> [ 4077.536049]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4077.542574]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
> [ 4077.550171]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
> [ 4077.558252]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
> [ 4077.566140]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
> [ 4077.573928]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
> [ 4077.582399]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
> [ 4077.589896]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
> [ 4077.599632]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
> [ 4077.607134]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
> [ 4077.615329]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
> [ 4077.622043]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
> [ 4077.628459]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
> [ 4077.635759]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
> [ 4077.641404]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
> [ 4077.648696]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
> [ 4077.654926]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
>
> [ 4078.358087] kworker/u24:15  D ffff88107fcd0640     0 14436      2 0x00000000
> [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
> [ 4078.373574]  ffff880ffa57fad0 0000000000000046 0000000000009000 ffff880ffa57ffd8
> [ 4078.381864]  ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4078.390163]  ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 ffff88103004d0a0
> [ 4078.398466] Call Trace:
> [ 4078.408019]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
> [ 4078.415322]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4078.421844]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
> [ 4078.429438]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
> [ 4078.437518]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
> [ 4078.445404]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
> [ 4078.453194]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
> [ 4078.461663]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
> [ 4078.469161]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
> [ 4078.478893]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
> [ 4078.486388]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
> [ 4078.494561]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
> [ 4078.501278]  [<ffffffff8104a507>] ? pwq_activate_delayed_work+0x27/0x40
> [ 4078.508673]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
> [ 4078.515098]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
> [ 4078.522396]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
> [ 4078.528032]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
> [ 4078.535325]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
> [ 4078.541552]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
>
> [ 4079.355824] user-space-program D ffff88107fd30640     0 32020      1 0x00000000
> [ 4079.363716]  ffff880eae8eba10 0000000000000086 0000000000009000 ffff880eae8ebfd8
> [ 4079.372003]  ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4079.380294]  ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 ffff881016c162c0
> [ 4079.388586] Call Trace:
> [ 4079.398134]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
> [ 4079.405431]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4079.411955]  [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 [btrfs]
> [ 4079.419644]  [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 [btrfs]
> [ 4079.427237]  [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70 [btrfs]
> [ 4079.435041]  [<ffffffffa0689b60>] ? generic_bin_search.constprop.38+0x80/0x190 [btrfs]
> [ 4079.443897]  [<ffffffffa068ea44>] ? btrfs_insert_empty_items+0x74/0xd0 [btrfs]
> [ 4079.451975]  [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs]
> [ 4079.458890]  [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 [btrfs]
> [ 4079.466292]  [<ffffffffa06f34a1>] ? btrfs_log_inode_parent+0xc61/0xf30 [btrfs]
> [ 4079.474373]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
> [ 4079.482161]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
> [ 4079.489558]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
> [ 4079.495300]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
> [ 4079.501422]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>
> [ 4079.508334] user-space-program D ffff88107fc30640     0 32021      1 0x00000004
> [ 4079.516226]  ffff880eae8efbf8 0000000000000086 0000000000009000 ffff880eae8effd8
> [ 4079.524513]  ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 ffff88101a5d6138
> [ 4079.532802]  ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 ffff881030279610
> [ 4079.541092] Call Trace:
> [ 4079.550642]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
> [ 4079.557941]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4079.564463]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
> [ 4079.572058]  [<ffffffffa06bb7d8>] ? btrfs_truncate_inode_items+0x168/0xb90 [btrfs]
> [ 4079.580526]  [<ffffffffa06b04be>] ? join_transaction.isra.15+0x1e/0x3a0 [btrfs]
> [ 4079.588701]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
> [ 4079.596196]  [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 [btrfs]
> [ 4079.603789]  [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs]
> [ 4079.610994]  [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs]
> [ 4079.618197]  [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680
> [ 4079.624625]  [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160
> [ 4079.630854]  [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90
> [ 4079.636889]  [<ffffffff810f59fa>] ? do_sys_ftruncate.constprop.15+0x10a/0x160
> [ 4079.644869]  [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570
> [ 4079.650805]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>
> [ 4080.410607] user-space-program D ffff88107fc70640     0 32028  12639 0x00000004
> [ 4080.418489]  ffff880eaeccbbe0 0000000000000086 0000000000009000 ffff880eaeccbfd8
> [ 4080.426778]  ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4080.435067]  ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 ffff880f317ef1e0
> [ 4080.443353] Call Trace:
> [ 4080.452920]  [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190 [btrfs]
> [ 4080.460703]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4080.467225]  [<ffffffffa06876bb>] ? btrfs_read_lock_root_node+0x2b/0x40 [btrfs]
> [ 4080.475400]  [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 [btrfs]
> [ 4080.482994]  [<ffffffffa06b2df0>] ? btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs]
> [ 4080.491857]  [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 [btrfs]
> [ 4080.499353]  [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0
> [ 4080.505879]  [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs]
> [ 4080.512696]  [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 [btrfs]
> [ 4080.520387]  [<ffffffffa06f341f>] ? btrfs_log_inode_parent+0xbdf/0xf30 [btrfs]
> [ 4080.528469]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
> [ 4080.536258]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
> [ 4080.543657]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
> [ 4080.549399]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
> [ 4080.555534]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>
> v2:
> * Improve the change log

Same as I mentioned before, versioning information does not belong in
the changelog, but rather after the "---" below.
See https://btrfs.wiki.kernel.org/index.php/Developer's_FAQ#Repeated_submissions
and examples from others that submit patches to this list.

I've picked the patch to my branch at
https://git.kernel.org/cgit/linux/kernel/git/fdmanana/linux.git/log/?h=for-chris-4.10
and reworded the changelog for clarity and completeness.

Thanks

>
> Signed-off-by: Robbie Ko <robbieko@synology.com>
> ---
>  fs/btrfs/tree-log.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index ef9c55b..9913084 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -5213,6 +5213,7 @@ process_leaf:
>                         if (di_key.type == BTRFS_ROOT_ITEM_KEY)
>                                 continue;
>
> +                       btrfs_release_path(path);
>                         di_inode = btrfs_iget(root->fs_info->sb, &di_key,
>                                               root, NULL);
>                         if (IS_ERR(di_inode)) {
> @@ -5222,13 +5223,12 @@ process_leaf:
>
>                         if (btrfs_inode_in_log(di_inode, trans->transid)) {
>                                 iput(di_inode);
> -                               continue;
> +                               break;
>                         }
>
>                         ctx->log_new_dentries = false;
>                         if (type == BTRFS_FT_DIR || type == BTRFS_FT_SYMLINK)
>                                 log_mode = LOG_INODE_ALL;
> -                       btrfs_release_path(path);
>                         ret = btrfs_log_inode(trans, root, di_inode,
>                                               log_mode, 0, LLONG_MAX, ctx);
>                         if (!ret &&
> --
> 1.9.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Filipe David Manana,

"People will forget what you said,
 people will forget what you did,
 but people will never forget how you made them feel."

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-11-30 16:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-07 10:05 [PATCH] Btrfs: fix fsync deadlock in log_new_dir_dentries robbieko
2016-10-07 10:23 ` Filipe Manana
     [not found]   ` <e077183a-e7b4-43c1-bb8d-a1cf900072fe@Mail>
2016-10-07 10:46     ` Filipe Manana
2016-10-11  7:47       ` robbieko
2016-10-12  9:04         ` Filipe Manana
  -- strict thread matches above, loose matches on Subject: below --
2016-10-28  2:48 robbieko
2016-11-30 16:25 ` Filipe Manana

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).