linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 2/4] Btrfs: fix deadlock on sb->s_umount when doing umount
@ 2012-04-26  2:58 Miao Xie
  2012-04-26 11:44 ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Miao Xie @ 2012-04-26  2:58 UTC (permalink / raw)
  To: Chris Mason; +Cc: Tsutomu Itoh, miaox, Linux Btrfs, Linux FSDevel

The reason the deadlock is that:
  Task					Btrfs-cleaner
  umount()
    down_write(&s->s_umount)
    sync_filesystem()
					do auto-defragment and produce
					lots of dirty pages
    close_ctree()
      wait for the end of
      btrfs-cleaner
					start_transaction
					  reserve space
					    shrink_delalloc()
					      writeback_inodes_sb_nr_if_idle()
						down_read(&sb->s_umount)
So, the deadlock has happened.

We fix it by using try_to_writeback_inodes_sb_nr(), this function will try to
get sb->s_umount, if fail, it won't do any writeback. At this time, In order to
get enough space to reserve successfully, we will use the sync function of
btrfs to sync all the delalloc file. This operation is safe, we needn't worry
about the case that the filesystem goes from r/w to r/o. because the filesystem
should guarantee all the dirty pages have been written into the disk after it
becomes readonly, so the sync operation will do nothing if the filesystem is
already readonly. Though it may waste lots of time, as a corner case, we
needn't care.

Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/disk-io.c     |    3 +++
 fs/btrfs/extent-tree.c |   25 +++++++++++++++++++++++--
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 20196f4..59b566d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3056,6 +3056,9 @@ int close_ctree(struct btrfs_root *root)
 	/* clear out the rbtree of defraggable inodes */
 	btrfs_run_defrag_inodes(fs_info);
 
+	/* sync all the dirty pages which are made by auto defrag */
+	sync_filesystem(fs_info->sb);
+
 	/*
 	 * Here come 2 situations when btrfs is broken to flip readonly:
 	 *
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 2b35f8d..e533ab3 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3580,6 +3580,28 @@ out:
 	return ret;
 }
 
+void btrfs_writeback_inodes_sb_nr(struct btrfs_root *root,
+				  unsigned long nr_pages)
+{
+	struct super_block *sb = root->fs_info->sb;
+	int started;
+
+	/* If we can not start writeback, just sync all the delalloc file. */
+	started = try_to_writeback_inodes_sb_nr(sb, nr_pages,
+						WB_REASON_FS_FREE_SPACE);
+	if (!started) {
+		/*
+		 * We needn't worry the filesystem going from r/w to r/o though
+		 * we don't acquire ->s_umount mutex, because the filesystem
+		 * should guarantee the delalloc inodes list be empty after
+		 * the filesystem is readonly(all dirty pages are written to
+		 * the disk).
+		 */
+		btrfs_start_delalloc_inodes(root, 0);
+		btrfs_wait_ordered_extents(root, 0, 0);
+	}
+}
+
 /*
  * shrink metadata reservation for delalloc
  */
@@ -3624,8 +3646,7 @@ static int shrink_delalloc(struct btrfs_root *root, u64 to_reclaim,
 		smp_mb();
 		nr_pages = min_t(unsigned long, nr_pages,
 		       root->fs_info->delalloc_bytes >> PAGE_CACHE_SHIFT);
-		writeback_inodes_sb_nr_if_idle(root->fs_info->sb, nr_pages,
-						WB_REASON_FS_FREE_SPACE);
+		btrfs_writeback_inodes_sb_nr(root, nr_pages);
 
 		spin_lock(&space_info->lock);
 		if (reserved > space_info->bytes_may_use)
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/4] Btrfs: fix deadlock on sb->s_umount when doing umount
  2012-04-26  2:58 [PATCH 2/4] Btrfs: fix deadlock on sb->s_umount when doing umount Miao Xie
@ 2012-04-26 11:44 ` David Sterba
  2012-04-27 10:55   ` Miao Xie
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2012-04-26 11:44 UTC (permalink / raw)
  To: Miao Xie; +Cc: Chris Mason, Tsutomu Itoh, miaox, Linux Btrfs, Linux FSDevel

On Thu, Apr 26, 2012 at 10:58:04AM +0800, Miao Xie wrote:
> The reason the deadlock is that:
>   Task					Btrfs-cleaner
>   umount()
>     down_write(&s->s_umount)
>     sync_filesystem()
> 					do auto-defragment and produce
> 					lots of dirty pages
>     close_ctree()
>       wait for the end of
>       btrfs-cleaner

why it's needed to wait for cleaner during close_ctree? I got bitten
yesterday by (a non-deadlock) scenario, when there were tons of deleted
snapshots, filesystem almost full, so getting and managing free space
was slow (btrfs-cache thread was more active than btrfs-cleaner), and
umount just did not end. interruptible by reboot only.

avoiding this particular case of waiting for cleaner:

--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3064,7 +3066,13 @@ int btrfs_commit_super(struct btrfs_root *root)

        mutex_lock(&root->fs_info->cleaner_mutex);
        btrfs_run_delayed_iputs(root);
-       btrfs_clean_old_snapshots(root);
+       if (!btrfs_fs_closing(root->fs_info)) {
+               /* btrfs_clean_old_snapshots(root); */
+               wake_up_process(root->fs_info->cleaner_kthread);
+               printk(KERN_DEBUG "btrfs: wake cleaner from commit_super\n");
+       } else {
+               printk(KERN_DEBUG "btrfs: skip cleaning when going down\n");
+       }
        mutex_unlock(&root->fs_info->cleaner_mutex);

        /* wait until ongoing cleanup work done */


plus not even trying to call the cleaner directly, rather waking the cleaner
thread. (attached whole work-in-progress patch).

david
---

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 58a232d..0651f6f 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1617,13 +1617,14 @@ static int cleaner_kthread(void *arg)
 	struct btrfs_root *root = arg;
 
 	do {
+		int again = 0;
 		vfs_check_frozen(root->fs_info->sb, SB_FREEZE_WRITE);
 
 		if (!(root->fs_info->sb->s_flags & MS_RDONLY) &&
 			down_read_trylock(&root->fs_info->sb->s_umount) &&
 		    mutex_trylock(&root->fs_info->cleaner_mutex)) {
 			btrfs_run_delayed_iputs(root);
-			btrfs_clean_old_snapshots(root);
+			again = btrfs_clean_one_old_snapshot(root);
 			mutex_unlock(&root->fs_info->cleaner_mutex);
 			btrfs_run_defrag_inodes(root->fs_info);
 			up_read(&root->fs_info->sb->s_umount);
@@ -1631,7 +1632,7 @@ static int cleaner_kthread(void *arg)
 
 		if (freezing(current)) {
 			refrigerator();
-		} else {
+		} else if (!again) {//FIXME: check again now directly from dead_roots?
 			set_current_state(TASK_INTERRUPTIBLE);
 			if (!kthread_should_stop())
 				schedule();
@@ -3064,7 +3066,13 @@ int btrfs_commit_super(struct btrfs_root *root)
 
 	mutex_lock(&root->fs_info->cleaner_mutex);
 	btrfs_run_delayed_iputs(root);
-	btrfs_clean_old_snapshots(root);
+	if (!btrfs_fs_closing(root->fs_info)) {
+		/* btrfs_clean_old_snapshots(root); */
+		wake_up_process(root->fs_info->cleaner_kthread);
+		printk(KERN_DEBUG "btrfs: wake cleaner from commit_super\n");
+	} else {
+		printk(KERN_DEBUG "btrfs: skip cleaning when going down\n");
+	}
 	mutex_unlock(&root->fs_info->cleaner_mutex);
 
 	/* wait until ongoing cleanup work done */
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ec1e0c6..3aba911 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6995,6 +6995,11 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 	wc->reada_count = BTRFS_NODEPTRS_PER_BLOCK(root);
 
 	while (1) {
+		if (btrfs_fs_closing(root->fs_info)) {
+			printk(KERN_DEBUG "btrfs: drop early exit\n");
+			err = -EAGAIN;
+			goto out_end_trans;
+		}
 		ret = walk_down_tree(trans, root, path, wc);
 		if (ret < 0) {
 			err = ret;
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 922e6ec..c9dc857 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4060,6 +4060,7 @@ int btrfs_relocate_block_group(struct btrfs_root *extent_root, u64 group_start)
 	while (1) {
 		mutex_lock(&fs_info->cleaner_mutex);
 
+		// FIXME: wake cleaner
 		btrfs_clean_old_snapshots(fs_info->tree_root);
 		ret = relocate_block_group(rc);
 
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index de2942f..3d83f6b 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -783,7 +783,7 @@ static noinline int commit_cowonly_roots(struct btrfs_trans_handle *trans,
 int btrfs_add_dead_root(struct btrfs_root *root)
 {
 	spin_lock(&root->fs_info->trans_lock);
-	list_add(&root->root_list, &root->fs_info->dead_roots);
+	list_add_tail(&root->root_list, &root->fs_info->dead_roots);
 	spin_unlock(&root->fs_info->trans_lock);
 	return 0;
 }
@@ -1533,7 +1533,45 @@ int btrfs_clean_old_snapshots(struct btrfs_root *root)
 			ret = btrfs_drop_snapshot(root, NULL, 0, 0);
 		else
 			ret =btrfs_drop_snapshot(root, NULL, 1, 0);
-		BUG_ON(ret < 0);
+		BUG_ON(ret < 0 && ret != -EAGAIN);
 	}
 	return 0;
 }
+/*
+ * return < 0 if error
+ * 0 if there are no more dead_roots at the time of call
+ * 1 there are more to be processed, call me again
+ *
+ * (racy)
+ */
+int btrfs_clean_one_old_snapshot(struct btrfs_root *root)
+{
+	int ret;
+	int run_again = 1;
+	struct btrfs_fs_info *fs_info = root->fs_info;
+
+	if (root->fs_info->sb->s_flags & MS_RDONLY) {
+		printk(KERN_WARNING "btrfs: cleaner called for RO fs!\n");
+	}
+
+	spin_lock(&fs_info->trans_lock);
+	root = list_first_entry(&fs_info->dead_roots,
+			struct btrfs_root, root_list);
+	list_del(&root->root_list);
+	if (list_empty(&fs_info->dead_roots))
+		run_again = 0;
+	spin_unlock(&fs_info->trans_lock);
+
+	printk(KERN_DEBUG "btrfs: cleaner removing %llu\n",
+			(unsigned long long)root->objectid);
+
+	btrfs_kill_all_delayed_nodes(root);
+
+	if (btrfs_header_backref_rev(root->node) <
+			BTRFS_MIXED_BACKREF_REV)
+		ret = btrfs_drop_snapshot(root, NULL, 0, 0);
+	else
+		ret = btrfs_drop_snapshot(root, NULL, 1, 0);
+	BUG_ON(ret < 0 && ret != -EAGAIN);
+	return run_again || ret == -EAGAIN;
+}
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index fe27379..7071ca5 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -94,6 +94,7 @@ int btrfs_write_and_wait_transaction(struct btrfs_trans_handle *trans,
 int btrfs_add_dead_root(struct btrfs_root *root);
 int btrfs_defrag_root(struct btrfs_root *root, int cacheonly);
 int btrfs_clean_old_snapshots(struct btrfs_root *root);
+int btrfs_clean_one_old_snapshot(struct btrfs_root *root);
 int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 			     struct btrfs_root *root);
 int btrfs_commit_transaction_async(struct btrfs_trans_handle *trans,
---

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/4] Btrfs: fix deadlock on sb->s_umount when doing umount
  2012-04-26 11:44 ` David Sterba
@ 2012-04-27 10:55   ` Miao Xie
  2012-04-30 16:41     ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Miao Xie @ 2012-04-27 10:55 UTC (permalink / raw)
  To: Miao Xie, Chris Mason, Tsutomu Itoh, miaox, Linux Btrfs,
	Linux FSDevel


>> The reason the deadlock is that:
>>   Task					Btrfs-cleaner
>>   umount()
>>     down_write(&s->s_umount)
>>     sync_filesystem()
>> 					do auto-defragment and produce
>> 					lots of dirty pages
>>     close_ctree()
>>       wait for the end of
>>       btrfs-cleaner
> 
> why it's needed to wait for cleaner during close_ctree? I got bitten

> yesterday by (a non-deadlock) scenario, when there were tons of deleted

> snapshots, filesystem almost full, so getting and managing free space
> was slow (btrfs-cache thread was more active than btrfs-cleaner), and
> umount just did not end. interruptible by reboot only.

> 

> avoiding this particular case of waiting for cleaner:


Your patch doesn't fix the problem.
  Task					Btrfs-cleaner
  umount()
    down_write(&s->s_umount)
    sync_filesystem()
					do auto-defragment for some files
					and produce lots of dirty pages.
					do auto-defragment for the left files
					  start_transaction
					    reserve space
					      shrink_delalloc()
						writeback_inodes_sb_nr_if_idle()
						  down_read(&sb->s_umount)
    close_ctree()
      stop_kthread()
	wait for the end of
	btrfs-cleaner

the deadlock still happens.

But I found you add a trylock for ->s_umount in cleaner_kthread(), this method
can fix the deadlock problem, I think. It may be introduced by the other patch,
could you send that patch to me? I found if we fail to trylock ->cleaner_mutex,
we will forget to unlock ->s_umount.

Thanks
Miao

> 
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3064,7 +3066,13 @@ int btrfs_commit_super(struct btrfs_root *root)
> 
>         mutex_lock(&root->fs_info->cleaner_mutex);
>         btrfs_run_delayed_iputs(root);
> -       btrfs_clean_old_snapshots(root);
> +       if (!btrfs_fs_closing(root->fs_info)) {
> +               /* btrfs_clean_old_snapshots(root); */
> +               wake_up_process(root->fs_info->cleaner_kthread);
> +               printk(KERN_DEBUG "btrfs: wake cleaner from commit_super\n");
> +       } else {
> +               printk(KERN_DEBUG "btrfs: skip cleaning when going down\n");
> +       }
>         mutex_unlock(&root->fs_info->cleaner_mutex);
> 
>         /* wait until ongoing cleanup work done */
> 
> 
> plus not even trying to call the cleaner directly, rather waking the cleaner
> thread. (attached whole work-in-progress patch).
> 
> david
> ---
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 58a232d..0651f6f 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1617,13 +1617,14 @@ static int cleaner_kthread(void *arg)
>  	struct btrfs_root *root = arg;
>  
>  	do {
> +		int again = 0;
>  		vfs_check_frozen(root->fs_info->sb, SB_FREEZE_WRITE);
>  
>  		if (!(root->fs_info->sb->s_flags & MS_RDONLY) &&
>  			down_read_trylock(&root->fs_info->sb->s_umount) &&
>  		    mutex_trylock(&root->fs_info->cleaner_mutex)) {
>  			btrfs_run_delayed_iputs(root);
> -			btrfs_clean_old_snapshots(root);
> +			again = btrfs_clean_one_old_snapshot(root);
>  			mutex_unlock(&root->fs_info->cleaner_mutex);
>  			btrfs_run_defrag_inodes(root->fs_info);
>  			up_read(&root->fs_info->sb->s_umount);
> @@ -1631,7 +1632,7 @@ static int cleaner_kthread(void *arg)
>  
>  		if (freezing(current)) {
>  			refrigerator();
> -		} else {
> +		} else if (!again) {//FIXME: check again now directly from dead_roots?
>  			set_current_state(TASK_INTERRUPTIBLE);
>  			if (!kthread_should_stop())
>  				schedule();
> @@ -3064,7 +3066,13 @@ int btrfs_commit_super(struct btrfs_root *root)
>  
>  	mutex_lock(&root->fs_info->cleaner_mutex);
>  	btrfs_run_delayed_iputs(root);
> -	btrfs_clean_old_snapshots(root);
> +	if (!btrfs_fs_closing(root->fs_info)) {
> +		/* btrfs_clean_old_snapshots(root); */
> +		wake_up_process(root->fs_info->cleaner_kthread);
> +		printk(KERN_DEBUG "btrfs: wake cleaner from commit_super\n");
> +	} else {
> +		printk(KERN_DEBUG "btrfs: skip cleaning when going down\n");
> +	}
>  	mutex_unlock(&root->fs_info->cleaner_mutex);
>  
>  	/* wait until ongoing cleanup work done */
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index ec1e0c6..3aba911 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -6995,6 +6995,11 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>  	wc->reada_count = BTRFS_NODEPTRS_PER_BLOCK(root);
>  
>  	while (1) {
> +		if (btrfs_fs_closing(root->fs_info)) {
> +			printk(KERN_DEBUG "btrfs: drop early exit\n");
> +			err = -EAGAIN;
> +			goto out_end_trans;
> +		}
>  		ret = walk_down_tree(trans, root, path, wc);
>  		if (ret < 0) {
>  			err = ret;
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 922e6ec..c9dc857 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -4060,6 +4060,7 @@ int btrfs_relocate_block_group(struct btrfs_root *extent_root, u64 group_start)
>  	while (1) {
>  		mutex_lock(&fs_info->cleaner_mutex);
>  
> +		// FIXME: wake cleaner
>  		btrfs_clean_old_snapshots(fs_info->tree_root);
>  		ret = relocate_block_group(rc);
>  
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index de2942f..3d83f6b 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -783,7 +783,7 @@ static noinline int commit_cowonly_roots(struct btrfs_trans_handle *trans,
>  int btrfs_add_dead_root(struct btrfs_root *root)
>  {
>  	spin_lock(&root->fs_info->trans_lock);
> -	list_add(&root->root_list, &root->fs_info->dead_roots);
> +	list_add_tail(&root->root_list, &root->fs_info->dead_roots);
>  	spin_unlock(&root->fs_info->trans_lock);
>  	return 0;
>  }
> @@ -1533,7 +1533,45 @@ int btrfs_clean_old_snapshots(struct btrfs_root *root)
>  			ret = btrfs_drop_snapshot(root, NULL, 0, 0);
>  		else
>  			ret =btrfs_drop_snapshot(root, NULL, 1, 0);
> -		BUG_ON(ret < 0);
> +		BUG_ON(ret < 0 && ret != -EAGAIN);
>  	}
>  	return 0;
>  }
> +/*
> + * return < 0 if error
> + * 0 if there are no more dead_roots at the time of call
> + * 1 there are more to be processed, call me again
> + *
> + * (racy)
> + */
> +int btrfs_clean_one_old_snapshot(struct btrfs_root *root)
> +{
> +	int ret;
> +	int run_again = 1;
> +	struct btrfs_fs_info *fs_info = root->fs_info;
> +
> +	if (root->fs_info->sb->s_flags & MS_RDONLY) {
> +		printk(KERN_WARNING "btrfs: cleaner called for RO fs!\n");
> +	}
> +
> +	spin_lock(&fs_info->trans_lock);
> +	root = list_first_entry(&fs_info->dead_roots,
> +			struct btrfs_root, root_list);
> +	list_del(&root->root_list);
> +	if (list_empty(&fs_info->dead_roots))
> +		run_again = 0;
> +	spin_unlock(&fs_info->trans_lock);
> +
> +	printk(KERN_DEBUG "btrfs: cleaner removing %llu\n",
> +			(unsigned long long)root->objectid);
> +
> +	btrfs_kill_all_delayed_nodes(root);
> +
> +	if (btrfs_header_backref_rev(root->node) <
> +			BTRFS_MIXED_BACKREF_REV)
> +		ret = btrfs_drop_snapshot(root, NULL, 0, 0);
> +	else
> +		ret = btrfs_drop_snapshot(root, NULL, 1, 0);
> +	BUG_ON(ret < 0 && ret != -EAGAIN);
> +	return run_again || ret == -EAGAIN;
> +}
> diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
> index fe27379..7071ca5 100644
> --- a/fs/btrfs/transaction.h
> +++ b/fs/btrfs/transaction.h
> @@ -94,6 +94,7 @@ int btrfs_write_and_wait_transaction(struct btrfs_trans_handle *trans,
>  int btrfs_add_dead_root(struct btrfs_root *root);
>  int btrfs_defrag_root(struct btrfs_root *root, int cacheonly);
>  int btrfs_clean_old_snapshots(struct btrfs_root *root);
> +int btrfs_clean_one_old_snapshot(struct btrfs_root *root);
>  int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
>  			     struct btrfs_root *root);
>  int btrfs_commit_transaction_async(struct btrfs_trans_handle *trans,
> ---
> 



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/4] Btrfs: fix deadlock on sb->s_umount when doing umount
  2012-04-27 10:55   ` Miao Xie
@ 2012-04-30 16:41     ` David Sterba
  2012-05-08 10:38       ` Miao Xie
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2012-04-30 16:41 UTC (permalink / raw)
  To: Miao Xie; +Cc: Chris Mason, Tsutomu Itoh, miaox, Linux Btrfs, Linux FSDevel

On Fri, Apr 27, 2012 at 06:55:13PM +0800, Miao Xie wrote:
> But I found you add a trylock for ->s_umount in cleaner_kthread(), this method
> can fix the deadlock problem, I think. It may be introduced by the other patch,
> could you send that patch to me? I found if we fail to trylock ->cleaner_mutex,
> we will forget to unlock ->s_umount.

the unlock was not visible within the diff context, the full patch is:

--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1578,11 +1578,13 @@ static int cleaner_kthread(void *arg)
                vfs_check_frozen(root->fs_info->sb, SB_FREEZE_WRITE);

                if (!(root->fs_info->sb->s_flags & MS_RDONLY) &&
+                       down_read_trylock(&root->fs_info->sb->s_umount) &&
                    mutex_trylock(&root->fs_info->cleaner_mutex)) {
                        btrfs_run_delayed_iputs(root);
                        btrfs_clean_old_snapshots(root);
                        mutex_unlock(&root->fs_info->cleaner_mutex);
                        btrfs_run_defrag_inodes(root->fs_info);
+                       up_read(&root->fs_info->sb->s_umount);
                }

                if (freezing(current)) {
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3413,7 +3413,9 @@ static int shrink_delalloc(struct btrfs_
                         max_reclaim >> PAGE_CACHE_SHIFT);
        while (loops < 1024) {
                /* have the flusher threads jump in and do some IO */
-               smp_mb();
+               if (btrfs_fs_closing(root->fs_info))
+                       return -EAGAIN;
+
                nr_pages = min_t(unsigned long, nr_pages,
                       root->fs_info->delalloc_bytes >> PAGE_CACHE_SHIFT);
                writeback_inodes_sb_nr_if_idle(root->fs_info->sb, nr_pages);
---

after close_tree starts the "fs_closing" check will prevent further
calls to writeback. I don't remember from who the patch acutally comes
from (via irc), it was noted as a workaround for the cleaner deadlock
until it gets fully solved (re your patches
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg13897.html
and reference to balance and scrub:
http://www.spinics.net/lists/linux-btrfs/msg14056.html
)


david

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/4] Btrfs: fix deadlock on sb->s_umount when doing umount
  2012-04-30 16:41     ` David Sterba
@ 2012-05-08 10:38       ` Miao Xie
  2012-05-08 15:33         ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Miao Xie @ 2012-05-08 10:38 UTC (permalink / raw)
  To: Miao Xie, Chris Mason, Tsutomu Itoh, Linux Btrfs, Linux FSDevel

On Mon, 30 Apr 2012 18:41:39 +0200, David Sterba wrote:
> On Fri, Apr 27, 2012 at 06:55:13PM +0800, Miao Xie wrote:
>> But I found you add a trylock for ->s_umount in cleaner_kthread(), this method
>> can fix the deadlock problem, I think. It may be introduced by the other patch,
>> could you send that patch to me? I found if we fail to trylock ->cleaner_mutex,
>> we will forget to unlock ->s_umount.
> 
> the unlock was not visible within the diff context, the full patch is:
> 
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1578,11 +1578,13 @@ static int cleaner_kthread(void *arg)
>                 vfs_check_frozen(root->fs_info->sb, SB_FREEZE_WRITE);
> 
>                 if (!(root->fs_info->sb->s_flags & MS_RDONLY) &&
> +                       down_read_trylock(&root->fs_info->sb->s_umount) &&
>                     mutex_trylock(&root->fs_info->cleaner_mutex)) {
>                         btrfs_run_delayed_iputs(root);
>                         btrfs_clean_old_snapshots(root);
>                         mutex_unlock(&root->fs_info->cleaner_mutex);
>                         btrfs_run_defrag_inodes(root->fs_info);
> +                       up_read(&root->fs_info->sb->s_umount);
>                 }
> 
>                 if (freezing(current)) {
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3413,7 +3413,9 @@ static int shrink_delalloc(struct btrfs_
>                          max_reclaim >> PAGE_CACHE_SHIFT);
>         while (loops < 1024) {
>                 /* have the flusher threads jump in and do some IO */
> -               smp_mb();
> +               if (btrfs_fs_closing(root->fs_info))
> +                       return -EAGAIN;
> +
>                 nr_pages = min_t(unsigned long, nr_pages,
>                        root->fs_info->delalloc_bytes >> PAGE_CACHE_SHIFT);
>                 writeback_inodes_sb_nr_if_idle(root->fs_info->sb, nr_pages);
> ---
> 
> after close_tree starts the "fs_closing" check will prevent further
> calls to writeback. I don't remember from who the patch acutally comes
> from (via irc), it was noted as a workaround for the cleaner deadlock
> until it gets fully solved (re your patches
> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg13897.html
> and reference to balance and scrub:
> http://www.spinics.net/lists/linux-btrfs/msg14056.html
> )

Sorry, I forget to reply this mail.

I think this method can not fix the problem safely because if the other
background threads(not the cleaner) call shrink_delalloc(), the problem
can still occur.

Though trylock for the cleaner can not fix this problem, I think the cleaner
still need trylock ->s_umount, in this way, we can stop the cleaner making
lots of dirty pages when we do readonly remount or umount.

Thanks
Miao

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/4] Btrfs: fix deadlock on sb->s_umount when doing umount
  2012-05-08 10:38       ` Miao Xie
@ 2012-05-08 15:33         ` David Sterba
  2012-05-09  3:24           ` Miao Xie
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2012-05-08 15:33 UTC (permalink / raw)
  To: Miao Xie; +Cc: Miao Xie, Chris Mason, Tsutomu Itoh, Linux Btrfs, Linux FSDevel

On Tue, May 08, 2012 at 06:38:01PM +0800, Miao Xie wrote:
> I think this method can not fix the problem safely because if the other
> background threads(not the cleaner) call shrink_delalloc(), the problem
> can still occur.

And it does not indeed fix the problem completely, I found xfstests/269
hung inside cleaner and defrag (mounted with autodefrag).

david

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/4] Btrfs: fix deadlock on sb->s_umount when doing umount
  2012-05-08 15:33         ` David Sterba
@ 2012-05-09  3:24           ` Miao Xie
  2012-05-22 15:05             ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Miao Xie @ 2012-05-09  3:24 UTC (permalink / raw)
  To: Miao Xie, Chris Mason, Tsutomu Itoh, Linux Btrfs, Linux FSDevel

On Tue, 8 May 2012 17:33:26 +0200, David Sterba wrote:
> On Tue, May 08, 2012 at 06:38:01PM +0800, Miao Xie wrote:
>> I think this method can not fix the problem safely because if the other
>> background threads(not the cleaner) call shrink_delalloc(), the problem
>> can still occur.
> 
> And it does not indeed fix the problem completely, I found xfstests/269
> hung inside cleaner and defrag (mounted with autodefrag).

Did you apply the trylock patchs I sent before?

Thanks
Miao

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/4] Btrfs: fix deadlock on sb->s_umount when doing umount
  2012-05-09  3:24           ` Miao Xie
@ 2012-05-22 15:05             ` David Sterba
  0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2012-05-22 15:05 UTC (permalink / raw)
  To: Miao Xie; +Cc: Miao Xie, Chris Mason, Tsutomu Itoh, Linux Btrfs, Linux FSDevel

On Wed, May 09, 2012 at 11:24:28AM +0800, Miao Xie wrote:
> Did you apply the trylock patchs I sent before?

20120429   [PATCH 1/2] vfs: re-implement writeback_inodes_sb(_nr)_if_idle() and rename them
20120429   [PATCH 2/2] Btrfs: flush all the dirty pages if try_to_writeback_inodes_sb_nr() fails

on top of 3.4, no deadlocks occured with looped 269, 264, 254, 276. Mounted with
space_cache,autodefrag .

david

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-05-22 15:05 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-04-26  2:58 [PATCH 2/4] Btrfs: fix deadlock on sb->s_umount when doing umount Miao Xie
2012-04-26 11:44 ` David Sterba
2012-04-27 10:55   ` Miao Xie
2012-04-30 16:41     ` David Sterba
2012-05-08 10:38       ` Miao Xie
2012-05-08 15:33         ` David Sterba
2012-05-09  3:24           ` Miao Xie
2012-05-22 15:05             ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).