[PATCH V2 0/4] Btrfs: improve the performance fluctuating of the fsync


* [PATCH V2 0/4] Btrfs: improve the performance fluctuating of the fsync
@ 2014-01-14 12:31 Miao Xie
  2014-01-14 12:31 ` [PATCH V2 1/4] Btrfs: filter the ordered extents that has been logged Miao Xie
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Miao Xie @ 2014-01-14 12:31 UTC (permalink / raw)
  To: linux-btrfs

We ran into a performance fluctuation problem when synchronizing file data.
The main cause was that we might flush only part of the dirty pages in an
ordered extent, then get that ordered extent and wait for its csum
calculation. If no other task flushed the remaining pages, we had to wait
until the flusher flushed them, which sometimes took several seconds and made
the performance drop suddenly.

This patchset addresses the above problem. The 1st patch filters out the
ordered extents that have already been logged, the 2nd one removes an
unnecessary lock operation, and the 3rd one stops inserting the ordered
extents into a global list, which reduces lock contention and list traversal
time. The 4th is the key patch: it fixes the problem by flushing the
remaining dirty pages aggressively.

Miao Xie (4):
  Btrfs: filter the ordered extents that has been logged
  Btrfs: don't get the lock when adding a csum into a ordered extent
  Btrfs: don't mix the ordered extents of all files together during logging the inodes
  Btrfs: flush the dirty pages of the ordered extent aggressively during logging csum

 fs/btrfs/ordered-data.c |   46 +++++++++++++++++++++++++++++-----------
 fs/btrfs/ordered-data.h |    6 ++++-
 fs/btrfs/tree-log.c     |   53 +++++++++++++++++-----------------------------
 3 files changed, 58 insertions(+), 47 deletions(-)


* [PATCH V2 1/4] Btrfs: filter the ordered extents that has been logged
  2014-01-14 12:31 [PATCH V2 0/4] Btrfs: improve the performance fluctuating of the fsync Miao Xie
@ 2014-01-14 12:31 ` Miao Xie
  2014-01-28 14:50   ` Josef Bacik
  2014-01-14 12:31 ` [PATCH V2 2/4] Btrfs: don't get the lock when adding a csum into a ordered extent Miao Xie
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Miao Xie @ 2014-01-14 12:31 UTC (permalink / raw)
  To: linux-btrfs

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/ordered-data.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 69582d5..e4c3d56 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -433,6 +433,8 @@ void btrfs_get_logged_extents(struct btrfs_root *log, struct inode *inode)
 	spin_lock_irq(&tree->lock);
 	for (n = rb_first(&tree->tree); n; n = rb_next(n)) {
 		ordered = rb_entry(n, struct btrfs_ordered_extent, rb_node);
+		if (test_bit(BTRFS_ORDERED_LOGGED_CSUM, &ordered->flags))
+			continue;
 		spin_lock(&log->log_extents_lock[index]);
 		if (list_empty(&ordered->log_list)) {
 			list_add_tail(&ordered->log_list, &log->logged_list[index]);
-- 
1.7.1



* [PATCH V2 2/4] Btrfs: don't get the lock when adding a csum into a ordered extent
  2014-01-14 12:31 [PATCH V2 0/4] Btrfs: improve the performance fluctuating of the fsync Miao Xie
  2014-01-14 12:31 ` [PATCH V2 1/4] Btrfs: filter the ordered extents that has been logged Miao Xie
@ 2014-01-14 12:31 ` Miao Xie
  2014-01-28 16:00   ` Josef Bacik
  2014-01-14 12:31 ` [PATCH V2 3/4] Btrfs: don't mix the ordered extents of all files together during logging the inodes Miao Xie
  2014-01-14 12:31 ` [PATCH V2 4/4] Btrfs: flush the dirty pages of the ordered extent aggressively during logging csum Miao Xie
  3 siblings, 1 reply; 13+ messages in thread
From: Miao Xie @ 2014-01-14 12:31 UTC (permalink / raw)
  To: linux-btrfs

We are sure that:
- one ordered extent has just one csum calculation worker, and no one
  except that worker accesses the csum list during the csum calculation.
- we don't change the list or free the csums until no one holds a
  reference to the ordered extent.
So it is safe to add a csum without the lock.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/ordered-data.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index e4c3d56..396c6d1 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -280,16 +280,11 @@ void btrfs_add_ordered_sum(struct inode *inode,
 			   struct btrfs_ordered_extent *entry,
 			   struct btrfs_ordered_sum *sum)
 {
-	struct btrfs_ordered_inode_tree *tree;
-
-	tree = &BTRFS_I(inode)->ordered_tree;
-	spin_lock_irq(&tree->lock);
 	list_add_tail(&sum->list, &entry->list);
 	WARN_ON(entry->csum_bytes_left < sum->len);
 	entry->csum_bytes_left -= sum->len;
 	if (entry->csum_bytes_left == 0)
 		wake_up(&entry->wait);
-	spin_unlock_irq(&tree->lock);
 }
 
 /*
-- 
1.7.1



* [PATCH V2 3/4] Btrfs: don't mix the ordered extents of all files together during logging the inodes
  2014-01-14 12:31 [PATCH V2 0/4] Btrfs: improve the performance fluctuating of the fsync Miao Xie
  2014-01-14 12:31 ` [PATCH V2 1/4] Btrfs: filter the ordered extents that has been logged Miao Xie
  2014-01-14 12:31 ` [PATCH V2 2/4] Btrfs: don't get the lock when adding a csum into a ordered extent Miao Xie
@ 2014-01-14 12:31 ` Miao Xie
  2014-01-28 14:55   ` Josef Bacik
  2014-01-14 12:31 ` [PATCH V2 4/4] Btrfs: flush the dirty pages of the ordered extent aggressively during logging csum Miao Xie
  3 siblings, 1 reply; 13+ messages in thread
From: Miao Xie @ 2014-01-14 12:31 UTC (permalink / raw)
  To: linux-btrfs

There was a problem in the old code:
If we failed to log the csum, we would free all the ordered extents in the log
list, including those that were logged successfully, so the log committer
would not wait for the completion of those ordered extents.

This patch doesn't insert the ordered extents that are about to be logged into
a global list; instead, we insert them into a local list. If we log the ordered
extents successfully, we splice them onto the global list; otherwise we throw
them away and then do a full sync. This also reduces lock contention and list
traversal time.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/ordered-data.c |   39 +++++++++++++++++++++++++++++++--------
 fs/btrfs/ordered-data.h |    6 +++++-
 fs/btrfs/tree-log.c     |   47 +++++++++++++++--------------------------------
 3 files changed, 51 insertions(+), 41 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 396c6d1..739de05 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -417,12 +417,12 @@ out:
 }
 
 /* Needs to either be called under a log transaction or the log_mutex */
-void btrfs_get_logged_extents(struct btrfs_root *log, struct inode *inode)
+void btrfs_get_logged_extents(struct inode *inode,
+			      struct list_head *logged_list)
 {
 	struct btrfs_ordered_inode_tree *tree;
 	struct btrfs_ordered_extent *ordered;
 	struct rb_node *n;
-	int index = log->log_transid % 2;
 
 	tree = &BTRFS_I(inode)->ordered_tree;
 	spin_lock_irq(&tree->lock);
@@ -430,16 +430,39 @@ void btrfs_get_logged_extents(struct btrfs_root *log, struct inode *inode)
 		ordered = rb_entry(n, struct btrfs_ordered_extent, rb_node);
 		if (test_bit(BTRFS_ORDERED_LOGGED_CSUM, &ordered->flags))
 			continue;
-		spin_lock(&log->log_extents_lock[index]);
-		if (list_empty(&ordered->log_list)) {
-			list_add_tail(&ordered->log_list, &log->logged_list[index]);
-			atomic_inc(&ordered->refs);
-		}
-		spin_unlock(&log->log_extents_lock[index]);
+
+		if (!list_empty(&ordered->log_list))
+			continue;
+
+		list_add_tail(&ordered->log_list, logged_list);
+		atomic_inc(&ordered->refs);
 	}
 	spin_unlock_irq(&tree->lock);
 }
 
+void btrfs_put_logged_extents(struct list_head *logged_list)
+{
+	struct btrfs_ordered_extent *ordered;
+
+	while (!list_empty(logged_list)) {
+		ordered = list_first_entry(logged_list,
+					   struct btrfs_ordered_extent,
+					   log_list);
+		list_del_init(&ordered->log_list);
+		btrfs_put_ordered_extent(ordered);
+	}
+}
+
+void btrfs_submit_logged_extents(struct list_head *logged_list,
+				 struct btrfs_root *log)
+{
+	int index = log->log_transid % 2;
+
+	spin_lock_irq(&log->log_extents_lock[index]);
+	list_splice_tail(logged_list, &log->logged_list[index]);
+	spin_unlock_irq(&log->log_extents_lock[index]);
+}
+
 void btrfs_wait_logged_extents(struct btrfs_root *log, u64 transid)
 {
 	struct btrfs_ordered_extent *ordered;
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index 9b0450f..2468970 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -197,7 +197,11 @@ void btrfs_add_ordered_operation(struct btrfs_trans_handle *trans,
 				 struct inode *inode);
 int btrfs_wait_ordered_extents(struct btrfs_root *root, int nr);
 void btrfs_wait_ordered_roots(struct btrfs_fs_info *fs_info, int nr);
-void btrfs_get_logged_extents(struct btrfs_root *log, struct inode *inode);
+void btrfs_get_logged_extents(struct inode *inode,
+			      struct list_head *logged_list);
+void btrfs_put_logged_extents(struct list_head *logged_list);
+void btrfs_submit_logged_extents(struct list_head *logged_list,
+				 struct btrfs_root *log);
 void btrfs_wait_logged_extents(struct btrfs_root *log, u64 transid);
 void btrfs_free_logged_extents(struct btrfs_root *log, u64 transid);
 int __init ordered_data_init(void);
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 9f7fc51..251e6f4 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3331,7 +3331,8 @@ static int extent_cmp(void *priv, struct list_head *a, struct list_head *b)
 
 static int log_one_extent(struct btrfs_trans_handle *trans,
 			  struct inode *inode, struct btrfs_root *root,
-			  struct extent_map *em, struct btrfs_path *path)
+			  struct extent_map *em, struct btrfs_path *path,
+			  struct list_head *logged_list)
 {
 	struct btrfs_root *log = root->log_root;
 	struct btrfs_file_extent_item *fi;
@@ -3347,7 +3348,6 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
 	u64 extent_offset = em->start - em->orig_start;
 	u64 block_len;
 	int ret;
-	int index = log->log_transid % 2;
 	bool skip_csum = BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM;
 
 	ret = __btrfs_drop_extents(trans, log, inode, path, em->start,
@@ -3425,17 +3425,12 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
 	 * First check and see if our csums are on our outstanding ordered
 	 * extents.
 	 */
-again:
-	spin_lock_irq(&log->log_extents_lock[index]);
-	list_for_each_entry(ordered, &log->logged_list[index], log_list) {
+	list_for_each_entry(ordered, logged_list, log_list) {
 		struct btrfs_ordered_sum *sum;
 
 		if (!mod_len)
 			break;
 
-		if (ordered->inode != inode)
-			continue;
-
 		if (ordered->file_offset + ordered->len <= mod_start ||
 		    mod_start + mod_len <= ordered->file_offset)
 			continue;
@@ -3471,36 +3466,19 @@ again:
 			}
 		}
 
-		/*
-		 * To keep us from looping for the above case of an ordered
-		 * extent that falls inside of the logged extent.
-		 */
 		if (test_and_set_bit(BTRFS_ORDERED_LOGGED_CSUM,
 				     &ordered->flags))
 			continue;
-		atomic_inc(&ordered->refs);
-		spin_unlock_irq(&log->log_extents_lock[index]);
-		/*
-		 * we've dropped the lock, we must either break or
-		 * start over after this.
-		 */
 
 		wait_event(ordered->wait, ordered->csum_bytes_left == 0);
 
 		list_for_each_entry(sum, &ordered->list, list) {
 			ret = btrfs_csum_file_blocks(trans, log, sum);
-			if (ret) {
-				btrfs_put_ordered_extent(ordered);
+			if (ret)
 				goto unlocked;
-			}
 		}
-		btrfs_put_ordered_extent(ordered);
-		goto again;
-
 	}
-	spin_unlock_irq(&log->log_extents_lock[index]);
 unlocked:
-
 	if (!mod_len || ret)
 		return ret;
 
@@ -3536,7 +3514,8 @@ unlocked:
 static int btrfs_log_changed_extents(struct btrfs_trans_handle *trans,
 				     struct btrfs_root *root,
 				     struct inode *inode,
-				     struct btrfs_path *path)
+				     struct btrfs_path *path,
+				     struct list_head *logged_list)
 {
 	struct extent_map *em, *n;
 	struct list_head extents;
@@ -3594,7 +3573,7 @@ process:
 
 		write_unlock(&tree->lock);
 
-		ret = log_one_extent(trans, inode, root, em, path);
+		ret = log_one_extent(trans, inode, root, em, path, logged_list);
 		write_lock(&tree->lock);
 		clear_em_logging(tree, em);
 		free_extent_map(em);
@@ -3630,6 +3609,7 @@ static int btrfs_log_inode(struct btrfs_trans_handle *trans,
 	struct btrfs_key max_key;
 	struct btrfs_root *log = root->log_root;
 	struct extent_buffer *src = NULL;
+	LIST_HEAD(logged_list);
 	int err = 0;
 	int ret;
 	int nritems;
@@ -3677,7 +3657,7 @@ static int btrfs_log_inode(struct btrfs_trans_handle *trans,
 
 	mutex_lock(&BTRFS_I(inode)->log_mutex);
 
-	btrfs_get_logged_extents(log, inode);
+	btrfs_get_logged_extents(inode, &logged_list);
 
 	/*
 	 * a brute force approach to making sure we get the most uptodate
@@ -3797,7 +3777,8 @@ log_extents:
 	btrfs_release_path(path);
 	btrfs_release_path(dst_path);
 	if (fast_search) {
-		ret = btrfs_log_changed_extents(trans, root, inode, dst_path);
+		ret = btrfs_log_changed_extents(trans, root, inode, dst_path,
+						&logged_list);
 		if (ret) {
 			err = ret;
 			goto out_unlock;
@@ -3822,8 +3803,10 @@ log_extents:
 	BTRFS_I(inode)->logged_trans = trans->transid;
 	BTRFS_I(inode)->last_log_commit = BTRFS_I(inode)->last_sub_trans;
 out_unlock:
-	if (err)
-		btrfs_free_logged_extents(log, log->log_transid);
+	if (unlikely(err))
+		btrfs_put_logged_extents(&logged_list);
+	else
+		btrfs_submit_logged_extents(&logged_list, log);
 	mutex_unlock(&BTRFS_I(inode)->log_mutex);
 
 	btrfs_free_path(path);
-- 
1.7.1
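
The local-list scheme described in the commit message above can be sketched in userspace as follows. This is a minimal illustration, not the kernel code: the list helpers and every `toy_` name are hypothetical stand-ins for `list_head`, `struct btrfs_ordered_extent` and the three functions the patch touches, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head;
	head->next = head;
}

static int list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_add_to_tail(struct list_node *item, struct list_node *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

static void list_remove(struct list_node *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
	list_init(item);
}

/* Move every entry of 'from' to the tail of 'to', leaving 'from' empty. */
static void list_move_all_to_tail(struct list_node *from, struct list_node *to)
{
	if (list_is_empty(from))
		return;
	from->next->prev = to->prev;
	to->prev->next = from->next;
	from->prev->next = to;
	to->prev = from->prev;
	list_init(from);
}

/* Hypothetical stand-in for struct btrfs_ordered_extent. */
struct toy_extent {
	struct list_node log_list;	/* must stay the first member */
	int refs;
};

/* Take a reference and queue the extent on the caller's private list. */
static void toy_get_logged_extent(struct toy_extent *e, struct list_node *local)
{
	if (!list_is_empty(&e->log_list))
		return;			/* already queued somewhere */
	list_add_to_tail(&e->log_list, local);
	e->refs++;
}

/* Failure path: drop only the references taken through 'local'. */
static void toy_put_logged_extents(struct list_node *local)
{
	while (!list_is_empty(local)) {
		struct toy_extent *e = (struct toy_extent *)local->next;

		list_remove(&e->log_list);
		e->refs--;
	}
}

/* Success path: publish the private list on the shared per-log list. */
static void toy_submit_logged_extents(struct list_node *local,
				      struct list_node *shared)
{
	list_move_all_to_tail(local, shared);
}
```

The point of the design is that a task which fails can only undo its own work: its private list is the only thing the failure path walks, so extents queued by another task are never dropped behind that task's back.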



* [PATCH V2 4/4] Btrfs: flush the dirty pages of the ordered extent aggressively during logging csum
  2014-01-14 12:31 [PATCH V2 0/4] Btrfs: improve the performance fluctuating of the fsync Miao Xie
                   ` (2 preceding siblings ...)
  2014-01-14 12:31 ` [PATCH V2 3/4] Btrfs: don't mix the ordered extents of all files together during logging the inodes Miao Xie
@ 2014-01-14 12:31 ` Miao Xie
  3 siblings, 0 replies; 13+ messages in thread
From: Miao Xie @ 2014-01-14 12:31 UTC (permalink / raw)
  To: linux-btrfs

The performance of fsync sometimes dropped suddenly. The main cause was that
we might flush only part of the dirty pages in an ordered extent, then get
that ordered extent and wait for its csum calculation. If no other task
flushed the remaining pages, we had to wait until the flusher flushed them,
which sometimes took several seconds and made the performance drop suddenly.
(On my box, it dropped from 56MB/s to 4-10MB/s.)

This patch addresses the problem by flushing the remaining dirty pages
aggressively.

Test Environment:
CPU:		2CPU * 2Cores
Memory:		4GB
Partition:	20GB(HDD)

Test Command:
 # sysbench --num-threads=8 --test=fileio --file-num=1 \
 > --file-total-size=8G --file-block-size=32768 \
 > --file-io-mode=sync --file-fsync-freq=100 \
 > --file-fsync-end=no --max-requests=10000 \
 > --file-test-mode=rndwr run

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/tree-log.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 251e6f4..944701b 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3470,7 +3470,11 @@ static int log_one_extent(struct btrfs_trans_handle *trans,
 				     &ordered->flags))
 			continue;
 
-		wait_event(ordered->wait, ordered->csum_bytes_left == 0);
+		if (ordered->csum_bytes_left) {
+			btrfs_start_ordered_extent(inode, ordered, 0);
+			wait_event(ordered->wait,
+				   ordered->csum_bytes_left == 0);
+		}
 
 		list_for_each_entry(sum, &ordered->list, list) {
 			ret = btrfs_csum_file_blocks(trans, log, sum);
-- 
1.7.1
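
The effect of kicking off the flush before waiting can be illustrated with a small single-threaded model. It is only a sketch under strong assumptions: `toy_ordered`, `toy_flush` and `toy_wait_for_csums` are hypothetical names, checksums are assumed to complete as soon as the pages are submitted, and the background flusher is reduced to "runs once per period".

```c
#include <assert.h>

/* Hypothetical model of an ordered extent whose pages are partly written. */
struct toy_ordered {
	unsigned long dirty_bytes;	/* not yet submitted for writeback */
	unsigned long csum_bytes_left;	/* bytes whose checksum is pending */
};

/*
 * Submit the remaining dirty bytes. In this model the checksums of
 * submitted bytes are computed immediately (an assumption), and
 * dirty_bytes never exceeds csum_bytes_left.
 */
static void toy_flush(struct toy_ordered *o)
{
	o->csum_bytes_left -= o->dirty_bytes;
	o->dirty_bytes = 0;
}

/*
 * Wait until all checksums are available and return how many flusher
 * periods the caller had to sleep. With flush_first set, the caller
 * submits the remaining pages itself, as the patch does with
 * btrfs_start_ordered_extent(); otherwise it depends on the periodic
 * background flusher.
 */
static unsigned int toy_wait_for_csums(struct toy_ordered *o, int flush_first)
{
	unsigned int periods = 0;

	if (o->csum_bytes_left && flush_first)
		toy_flush(o);

	while (o->csum_bytes_left) {
		periods++;	/* sleep until the background flusher runs */
		toy_flush(o);
	}
	return periods;
}
```

In this model a waiter that does not flush first always sleeps at least one flusher period whenever part of the extent is still dirty, which corresponds to the multi-second stalls described above; a waiter that flushes first does not sleep at all.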



* Re: [PATCH V2 1/4] Btrfs: filter the ordered extents that has been logged
  2014-01-14 12:31 ` [PATCH V2 1/4] Btrfs: filter the ordered extents that has been logged Miao Xie
@ 2014-01-28 14:50   ` Josef Bacik
  2014-01-29  2:08     ` Miao Xie
  0 siblings, 1 reply; 13+ messages in thread
From: Josef Bacik @ 2014-01-28 14:50 UTC (permalink / raw)
  To: Miao Xie, linux-btrfs


On 01/14/2014 07:31 AM, Miao Xie wrote:
> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> ---
>   fs/btrfs/ordered-data.c |    2 ++
>   1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
> index 69582d5..e4c3d56 100644
> --- a/fs/btrfs/ordered-data.c
> +++ b/fs/btrfs/ordered-data.c
> @@ -433,6 +433,8 @@ void btrfs_get_logged_extents(struct btrfs_root *log, struct inode *inode)
>   	spin_lock_irq(&tree->lock);
>   	for (n = rb_first(&tree->tree); n; n = rb_next(n)) {
>   		ordered = rb_entry(n, struct btrfs_ordered_extent, rb_node);
> +		if (test_bit(BTRFS_ORDERED_LOGGED_CSUM, &ordered->flags))
> +			continue;
>   		spin_lock(&log->log_extents_lock[index]);
>   		if (list_empty(&ordered->log_list)) {
>   			list_add_tail(&ordered->log_list, &log->logged_list[index]);
This isn't right, the logged extents are the extents we need to wait on 
before we exit fsync, logged_csum tells us that we've already copied its 
csums into the log.  Thanks,

Josef


* Re: [PATCH V2 3/4] Btrfs: don't mix the ordered extents of all files together during logging the inodes
  2014-01-14 12:31 ` [PATCH V2 3/4] Btrfs: don't mix the ordered extents of all files together during logging the inodes Miao Xie
@ 2014-01-28 14:55   ` Josef Bacik
  2014-01-29  2:22     ` Miao Xie
  0 siblings, 1 reply; 13+ messages in thread
From: Josef Bacik @ 2014-01-28 14:55 UTC (permalink / raw)
  To: Miao Xie, linux-btrfs


On 01/14/2014 07:31 AM, Miao Xie wrote:
> There was a problem in the old code:
> If we failed to log the csum, we would free all the ordered extents in the log list
> including those ordered extents that were logged successfully, it would make the
> log committer not to wait for the completion of the ordered extents.
>
> This patch doesn't insert the ordered extents that is about to be logged into
> a global list, instead, we insert them into a local list. If we log the ordered
> extents successfully, we splice them with the global list, or we will throw them
> away, then do full sync. It can also reduce the lock contention and the traverse
> time of list.
>
If we fail to log the csums we'll return an error and commit the 
transaction so it doesn't matter that we free all the ordered extents, 
everything will be waited on properly.  Thanks,

Josef


* Re: [PATCH V2 2/4] Btrfs: don't get the lock when adding a csum into a ordered extent
  2014-01-14 12:31 ` [PATCH V2 2/4] Btrfs: don't get the lock when adding a csum into a ordered extent Miao Xie
@ 2014-01-28 16:00   ` Josef Bacik
  2014-01-29  2:25     ` Miao Xie
  0 siblings, 1 reply; 13+ messages in thread
From: Josef Bacik @ 2014-01-28 16:00 UTC (permalink / raw)
  To: Miao Xie, linux-btrfs


On 01/14/2014 07:31 AM, Miao Xie wrote:
> We are sure that:
> - one ordered extent just has one csum calculation worker, and no one
>    access the csum list during the csum calculation except the worker.
> - we don't change the list and free the csum until no one reference
>    to the ordered extent
> So it is safe to add csum without the lock.
>
> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> ---
>   fs/btrfs/ordered-data.c |    5 -----
>   1 files changed, 0 insertions(+), 5 deletions(-)
>
> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
> index e4c3d56..396c6d1 100644
> --- a/fs/btrfs/ordered-data.c
> +++ b/fs/btrfs/ordered-data.c
> @@ -280,16 +280,11 @@ void btrfs_add_ordered_sum(struct inode *inode,
>   			   struct btrfs_ordered_extent *entry,
>   			   struct btrfs_ordered_sum *sum)
>   {
> -	struct btrfs_ordered_inode_tree *tree;
> -
> -	tree = &BTRFS_I(inode)->ordered_tree;
> -	spin_lock_irq(&tree->lock);
>   	list_add_tail(&sum->list, &entry->list);
>   	WARN_ON(entry->csum_bytes_left < sum->len);
>   	entry->csum_bytes_left -= sum->len;
>   	if (entry->csum_bytes_left == 0)
>   		wake_up(&entry->wait);
> -	spin_unlock_irq(&tree->lock);
>   }
>   
>   /*
This blew up in xfstests so one of your assumptions is incorrect. Thanks,

Josef


* Re: [PATCH V2 1/4] Btrfs: filter the ordered extents that has been logged
  2014-01-28 14:50   ` Josef Bacik
@ 2014-01-29  2:08     ` Miao Xie
  0 siblings, 0 replies; 13+ messages in thread
From: Miao Xie @ 2014-01-29  2:08 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs

On Tue, 28 Jan 2014 09:50:38 -0500, Josef Bacik wrote:
> 
> On 01/14/2014 07:31 AM, Miao Xie wrote:
>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>> ---
>>   fs/btrfs/ordered-data.c |    2 ++
>>   1 files changed, 2 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
>> index 69582d5..e4c3d56 100644
>> --- a/fs/btrfs/ordered-data.c
>> +++ b/fs/btrfs/ordered-data.c
>> @@ -433,6 +433,8 @@ void btrfs_get_logged_extents(struct btrfs_root *log, struct inode *inode)
>>       spin_lock_irq(&tree->lock);
>>       for (n = rb_first(&tree->tree); n; n = rb_next(n)) {
>>           ordered = rb_entry(n, struct btrfs_ordered_extent, rb_node);
>> +        if (test_bit(BTRFS_ORDERED_LOGGED_CSUM, &ordered->flags))
>> +            continue;
>>           spin_lock(&log->log_extents_lock[index]);
>>           if (list_empty(&ordered->log_list)) {
>>               list_add_tail(&ordered->log_list, &log->logged_list[index]);
> This isn't right, the logged extents are the extents we need to wait on before we exit fsync, logged_csum tells us that we've already copied its csums into the log.  Thanks,

My mistake. We need another flag to filter the ordered extents.

Thanks
Miao



* Re: [PATCH V2 3/4] Btrfs: don't mix the ordered extents of all files together during logging the inodes
  2014-01-28 14:55   ` Josef Bacik
@ 2014-01-29  2:22     ` Miao Xie
  2014-01-29 18:48       ` Josef Bacik
  0 siblings, 1 reply; 13+ messages in thread
From: Miao Xie @ 2014-01-29  2:22 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs

On Tue, 28 Jan 2014 09:55:58 -0500, Josef Bacik wrote:
> 
> On 01/14/2014 07:31 AM, Miao Xie wrote:
>> There was a problem in the old code:
>> If we failed to log the csum, we would free all the ordered extents in the log list
>> including those ordered extents that were logged successfully, it would make the
>> log committer not to wait for the completion of the ordered extents.
>>
>> This patch doesn't insert the ordered extents that is about to be logged into
>> a global list, instead, we insert them into a local list. If we log the ordered
>> extents successfully, we splice them with the global list, or we will throw them
>> away, then do full sync. It can also reduce the lock contention and the traverse
>> time of list.
>>
> If we fail to log the csums we'll return an error and commit the transaction so it doesn't matter that we free all the ordered extents, everything will be waited on properly.  Thanks,


No. Maybe my explanation was not clear. Please consider the following case:

Task1 (sync file1)		Task2 (sync file2)
btrfs_sync_file()
				btrfs_sync_file()
  start_log_trans()
				  start_log_trans()
  btrfs_get_logged_extents()
  btrfs_log_changed_extents()
				  btrfs_get_logged_extents()
				  btrfs_log_changed_extents() failed
				  btrfs_free_logged_extents()
				    this function will free all logged extents
				    including the ones that are inserted by task1
  btrfs_end_log_trans()
  btrfs_sync_log()
    btrfs_wait_logged_extents()
      but no extents to be
      waited
  sync end
				  btrfs_wait_ordered_extents()
				    but this operation only waits for file2's
				    extents that are in the specified range.
				  btrfs_commit_transaction()
				    the committer also doesn't wait for the extents
				    of file1

Thanks
Miao
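
The interleaving above can be replayed with a very small model of the old behaviour, where both tasks queue their extents on one shared per-log list. This is an illustrative sketch only: `toy_log` and the `toy_` functions are hypothetical, the list is a plain array, and extents are represented by names.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 8

/* Hypothetical model of the shared per-log list used by the old code. */
struct toy_log {
	const char *logged[TOY_MAX];	/* names of the queued extents */
	size_t nr;
};

/* Old btrfs_get_logged_extents(): queue an extent on the shared list. */
static void toy_get_logged_extent(struct toy_log *log, const char *extent)
{
	if (log->nr < TOY_MAX)
		log->logged[log->nr++] = extent;
}

/* Old error path: drops everything on the list, whoever queued it. */
static void toy_free_logged_extents(struct toy_log *log)
{
	log->nr = 0;
}

/* Log commit: returns how many extents were actually waited on. */
static size_t toy_wait_logged_extents(struct toy_log *log)
{
	size_t waited = log->nr;

	log->nr = 0;
	return waited;
}
```

Replaying the scenario (task1 queues an extent of file1, task2 queues an extent of file2 and then fails) leaves the list empty, so task1's log commit waits on nothing even though its own logging succeeded.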
  


* Re: [PATCH V2 2/4] Btrfs: don't get the lock when adding a csum into a ordered extent
  2014-01-28 16:00   ` Josef Bacik
@ 2014-01-29  2:25     ` Miao Xie
  2014-01-29 18:43       ` Josef Bacik
  0 siblings, 1 reply; 13+ messages in thread
From: Miao Xie @ 2014-01-29  2:25 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs

On Tue, 28 Jan 2014 11:00:54 -0500, Josef Bacik wrote:
> 
> On 01/14/2014 07:31 AM, Miao Xie wrote:
>> We are sure that:
>> - one ordered extent just has one csum calculation worker, and no one
>>    access the csum list during the csum calculation except the worker.
>> - we don't change the list and free the csum until no one reference
>>    to the ordered extent
>> So it is safe to add csum without the lock.
>>
>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>> ---
>>   fs/btrfs/ordered-data.c |    5 -----
>>   1 files changed, 0 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
>> index e4c3d56..396c6d1 100644
>> --- a/fs/btrfs/ordered-data.c
>> +++ b/fs/btrfs/ordered-data.c
>> @@ -280,16 +280,11 @@ void btrfs_add_ordered_sum(struct inode *inode,
>>                  struct btrfs_ordered_extent *entry,
>>                  struct btrfs_ordered_sum *sum)
>>   {
>> -    struct btrfs_ordered_inode_tree *tree;
>> -
>> -    tree = &BTRFS_I(inode)->ordered_tree;
>> -    spin_lock_irq(&tree->lock);
>>       list_add_tail(&sum->list, &entry->list);
>>       WARN_ON(entry->csum_bytes_left < sum->len);
>>       entry->csum_bytes_left -= sum->len;
>>       if (entry->csum_bytes_left == 0)
>>           wake_up(&entry->wait);
>> -    spin_unlock_irq(&tree->lock);
>>   }
>>     /*
> This blew up in xfstests so one of your assumptions is incorrect. Thanks,

I think that is because there is a bug somewhere else. Please tell me
which test case reproduces the problem.

Thanks
Miao

> 
> Josef



* Re: [PATCH V2 2/4] Btrfs: don't get the lock when adding a csum into a ordered extent
  2014-01-29  2:25     ` Miao Xie
@ 2014-01-29 18:43       ` Josef Bacik
  0 siblings, 0 replies; 13+ messages in thread
From: Josef Bacik @ 2014-01-29 18:43 UTC (permalink / raw)
  To: miaox, linux-btrfs


On 01/28/2014 09:25 PM, Miao Xie wrote:
> On Tue, 28 Jan 2014 11:00:54 -0500, Josef Bacik wrote:
>> On 01/14/2014 07:31 AM, Miao Xie wrote:
>>> We are sure that:
>>> - one ordered extent just has one csum calculation worker, and no one
>>>     access the csum list during the csum calculation except the worker.
>>> - we don't change the list and free the csum until no one reference
>>>     to the ordered extent
>>> So it is safe to add csum without the lock.
>>>
>>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>>> ---
>>>    fs/btrfs/ordered-data.c |    5 -----
>>>    1 files changed, 0 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
>>> index e4c3d56..396c6d1 100644
>>> --- a/fs/btrfs/ordered-data.c
>>> +++ b/fs/btrfs/ordered-data.c
>>> @@ -280,16 +280,11 @@ void btrfs_add_ordered_sum(struct inode *inode,
>>>                   struct btrfs_ordered_extent *entry,
>>>                   struct btrfs_ordered_sum *sum)
>>>    {
>>> -    struct btrfs_ordered_inode_tree *tree;
>>> -
>>> -    tree = &BTRFS_I(inode)->ordered_tree;
>>> -    spin_lock_irq(&tree->lock);
>>>        list_add_tail(&sum->list, &entry->list);
>>>        WARN_ON(entry->csum_bytes_left < sum->len);
>>>        entry->csum_bytes_left -= sum->len;
>>>        if (entry->csum_bytes_left == 0)
>>>            wake_up(&entry->wait);
>>> -    spin_unlock_irq(&tree->lock);
>>>    }
>>>      /*
>> This blew up in xfstests so one of your assumptions is incorrect. Thanks,
> I think it is because there is a bug in the other place. Please tell me
> the case that can reproduce the problem.
>
It was one of the btrfs tests, I think btrfs/005 maybe?  Turn on slab 
debugging, that seemed to make a difference.  Thanks,

Josef


* Re: [PATCH V2 3/4] Btrfs: don't mix the ordered extents of all files together during logging the inodes
  2014-01-29  2:22     ` Miao Xie
@ 2014-01-29 18:48       ` Josef Bacik
  0 siblings, 0 replies; 13+ messages in thread
From: Josef Bacik @ 2014-01-29 18:48 UTC (permalink / raw)
  To: miaox, linux-btrfs


On 01/28/2014 09:22 PM, Miao Xie wrote:
> On Tue, 28 Jan 2014 09:55:58 -0500, Josef Bacik wrote:
>> On 01/14/2014 07:31 AM, Miao Xie wrote:
>>> There was a problem in the old code:
>>> If we failed to log the csum, we would free all the ordered extents in the log list
>>> including those ordered extents that were logged successfully, it would make the
>>> log committer not to wait for the completion of the ordered extents.
>>>
>>> This patch doesn't insert the ordered extents that is about to be logged into
>>> a global list, instead, we insert them into a local list. If we log the ordered
>>> extents successfully, we splice them with the global list, or we will throw them
>>> away, then do full sync. It can also reduce the lock contention and the traverse
>>> time of list.
>>>
>> If we fail to log the csums we'll return an error and commit the transaction so it doesn't matter that we free all the ordered extents, everything will be waited on properly.  Thanks,
>
> No. Maybe my explanation is not clear. Please consider the following case:
>
> Task1 (sync file1)		Task2 (sync file2)
> btrfs_sync_file()
> 				btrfs_sync_file()
>    start_log_trans()
> 				  start_log_trans()
>    btrfs_get_logged_extents()
>    btrfs_log_changed_extents()
> 				  btrfs_get_logged_extents()
> 				  btrfs_log_changed_extents() failed
> 				  btrfs_free_logged_extents()
> 				    this function will free all logged extents
> 				    including the ones that are inserted by task1
>    btrfs_end_log_trans()
>    btrfs_sync_log()
>      btrfs_wait_logged_extents()
>        but no extents to be
>        waited
>    sync end
> 				  btrfs_wait_ordered_extents()
> 				    but this operation just wait for file2's
> 				    extents that is in the specified range.
> 				  btrfs_commit_transaction()
> 				    the committer also doesn't wait for the extents
> 				    of file1
>
Ah, I see. Then yes, that's correct.  I'll pull it in, thanks,

Josef

