* [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup
@ 2015-04-21  6:21 Qu Wenruo
  2015-04-21  6:21 ` [PATCH 01/18] btrfs: backref: Don't merge refs which are not for same block Qu Wenruo
                   ` (18 more replies)
  0 siblings, 19 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

[BRIEF]
This patchset mainly introduces a new qgroup mechanism, originally
developed to fix a bug exposed by fstests btrfs/057.

[WORKFLOW]
The new mechanism works as follows (a rough code sketch of the
commit-time flow follows the list):
0) Previous transaction is done.
   The new mechanism depends heavily on commit_transaction(), so
   without a transaction commit, qgroup counters are not updated.

1) Record dirty extents.
   A hook in add_delayed_ref_head() records each possibly dirty
   extent into trans->transaction->delayed_refs.dirty_extent_root.

2) Record old_roots for dirty extents.
   old_roots is fetched from the commit root, so it is completely
   accurate.

3) Record new_roots for dirty extents.
   new_roots is fetched from the current root, so we must ensure all
   delayed refs have been run. That is easy to guarantee inside
   commit_transaction().

4) Update qgroup counters with old/new_roots.

5) Transaction is committed.
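
A rough sketch of where these steps land in btrfs_commit_transaction(),
using the helper names introduced later in this series (simplified; the
exact call sites may differ):

	/*
	 * Step 1) already happened while the transaction was running,
	 * via the hook in add_delayed_ref_head().
	 */

	/* Run all remaining delayed refs so the current roots are final */
	btrfs_run_delayed_refs(trans, root, (unsigned long)-1);

	/* 2) old_roots: the commit roots still show the last transaction */
	btrfs_qgroup_prepare_account_extents(trans, fs_info);

	/* 3) + 4) new_roots from the current roots, then update counters */
	btrfs_qgroup_account_extents(trans, fs_info);

	/* 5) switch commit roots and write out the transaction */
	switch_commit_roots(cur_trans, fs_info);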

[ADVANTAGES]
The advantages are obvious:
1) No more heavy/confusing ref_node operations.
   The old ref_node based qgroup code made every qgroup-enabled
   ref_node call btrfs_find_all_roots(), which is quite expensive
   considering the number of ref_nodes.
   Also, we no longer need to decide whether to pass no_quota as 0 or
   1: dirty extents are handled automatically, so the no_quota option
   has no meaning any more.

2) Accuracy by design.
   Now only a few things can break the accuracy of the new qgroup
   code.(*)
   File clone? Leaf COW/splitting? As long as the btrfs tree roots are
   in a consistent state, qgroup counters won't be wrong.
   No more annoying and hard-to-debug btrfs/057 warnings.

   *: Snapshot creation/deletion and qgroup reassign do break the
   mechanism, but creation can be handled with a small exception.
   Snapshot deletion and reassign? They never worked as designed anyway.

[PATCHES]
Patches 1~3 clean up the delayed_refs code, making the logic and code
for merging ref_nodes clearer.

Patches 4~5 are qgroup cleanups: replace some open-coded logic with
helpers and reduce the number of parameters.

Patches 6~11 implement the new mechanism. Some functions, like
qgroup_update_refcnt() and qgroup_update_counters(), look much like the
old ones, but for a cleaner patch split I had to copy some functions
under new names and switch them to the new mechanism.
Nobody wants to review a single patch with 700+ insertions and 1400+
deletions, right?

Patches 12~15 switch to the new mechanism and remove all the unneeded
code. Oh yeah! Just hold down 'd'!

Patches 16~18 fix a btrfs/022 failure. In theory, the best way for the
new mechanism to handle snapshot creation/deletion and qgroup reassign
is to mark all related extents dirty and let the mechanism handle them.
But that is too expensive, so add a small exception for snapshot
creation instead.
Snapshot deletion and qgroup reassign don't work now anyway.

Qu Wenruo (18):
  btrfs: backref: Don't merge refs which are not for same block.
  btrfs: delayed-ref: Use list to replace the ref_root in ref_head.
  btrfs: delayed-ref: Cleanup the unneeded functions.
  btrfs: qgroup: Cleanup open-coded old/new_refcnt update and read.
  btrfs: extent-tree: Use ref_node to replace unneeded parameters in
    __inc_extent_ref() and __free_extent()
  btrfs: qgroup: Add function qgroup_update_refcnt().
  btrfs: qgroup: Add function qgroup_update_counters().
  btrfs: qgroup: Record possible quota-related extent for qgroup.
  btrfs: qgroup: Add new function to record old_roots.
  btrfs: backref: Add special time_seq == (u64)-1 case for    
    btrfs_find_all_roots().
  btrfs: qgroup: Add new qgroup calculation function    
    btrfs_qgroup_account_extents().
  btrfs: qgroup: Switch rescan to new mechanism.
  btrfs: qgroup: Switch to new extent-oriented qgroup mechanism.
  btrfs: qgroup: Switch self test to extent-oriented qgroup mechanism.
  btrfs: qgroup: Cleanup the old ref_node-oriented mechanism.
  btrfs: ulist: Add ulist_del() function.
  btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots.
  btrfs: qgroup: Make snapshot accounting work with new extent-oriented 
    qgroup.

 fs/btrfs/backref.c            |   50 +-
 fs/btrfs/ctree.h              |    2 +-
 fs/btrfs/delayed-ref.c        |  372 +++++---------
 fs/btrfs/delayed-ref.h        |   29 +-
 fs/btrfs/disk-io.c            |    8 +-
 fs/btrfs/extent-tree.c        |  203 ++------
 fs/btrfs/extent-tree.h        |    0
 fs/btrfs/qgroup.c             | 1090 ++++++++++-------------------------------
 fs/btrfs/qgroup.h             |   61 +--
 fs/btrfs/tests/qgroup-tests.c |  109 ++++-
 fs/btrfs/transaction.c        |   73 ++-
 fs/btrfs/transaction.h        |   23 +
 fs/btrfs/ulist.c              |   47 +-
 fs/btrfs/ulist.h              |    1 +
 include/trace/events/btrfs.h  |   55 ---
 15 files changed, 711 insertions(+), 1412 deletions(-)
 create mode 100644 fs/btrfs/extent-tree.h

-- 
2.3.5



* [PATCH 01/18] btrfs: backref: Don't merge refs which are not for same block.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 02/18] btrfs: delayed-ref: Use list to replace the ref_root in ref_head Qu Wenruo
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

The old __merge_refs() in backref.c would even merge refs whose
root_ids differ, which makes qgroup give wrong results.

Fix it by checking ref_for_same_block() before any mode-specific work.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/backref.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index f55721f..7f14275 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -507,7 +507,7 @@ static int __add_missing_keys(struct btrfs_fs_info *fs_info,
 }
 
 /*
- * merge two lists of backrefs and adjust counts accordingly
+ * merge backrefs and adjust counts accordingly
  *
  * mode = 1: merge identical keys, if key is set
  *    FIXME: if we add more keys in __add_prelim_ref, we can merge more here.
@@ -535,9 +535,9 @@ static void __merge_refs(struct list_head *head, int mode)
 
 			ref2 = list_entry(pos2, struct __prelim_ref, list);
 
+			if (!ref_for_same_block(ref1, ref2))
+				continue;
 			if (mode == 1) {
-				if (!ref_for_same_block(ref1, ref2))
-					continue;
 				if (!ref1->parent && ref2->parent) {
 					xchg = ref1;
 					ref1 = ref2;
-- 
2.3.5



* [PATCH 02/18] btrfs: delayed-ref: Use list to replace the ref_root in ref_head.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
  2015-04-21  6:21 ` [PATCH 01/18] btrfs: backref: Don't merge refs which are not for same block Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 03/18] btrfs: delayed-ref: Cleanup the unneeded functions Qu Wenruo
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

This patch replaces the rbtree used in ref_head with a list.
This has the following advantage:
1) Easier merge logic.
With the new list implementation, we only need to care about merging
the tail ref_node with the newly inserted ref_node.
This can be done quite easily at insert time, so there is no need to do
a dedicated merge pass in run_delayed_refs().
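
To make the insert-time merge concrete, here is a hand-traced example
(values invented) of what the new add_delayed_ref_tail_merge() below
does when the list tail and the newly inserted ref differ only in
action:

	/*
	 * Tail:    exist->action == BTRFS_ADD_DELAYED_REF,  ref_mod == 1
	 * New ref: ref->action   == BTRFS_DROP_DELAYED_REF, ref_mod == 1
	 * (same bytenr, type, no_quota and seq, so they are mergeable)
	 *
	 * The actions differ and exist->ref_mod is not smaller, so:
	 *   mod = -ref->ref_mod = -1
	 *   exist->ref_mod += mod  ->  0
	 * The tail is removed via drop_delayed_ref(), the function
	 * returns 1, and the caller frees the merged-away new ref.
	 */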

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/backref.c     |   9 +--
 fs/btrfs/delayed-ref.c | 156 ++++++++++++++++++++++++++-----------------------
 fs/btrfs/delayed-ref.h |  18 +++++-
 fs/btrfs/disk-io.c     |   8 +--
 fs/btrfs/extent-tree.c |  46 +++------------
 5 files changed, 114 insertions(+), 123 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 7f14275..a4c31dc 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -572,8 +572,8 @@ static int __add_delayed_refs(struct btrfs_delayed_ref_head *head, u64 seq,
 			      struct list_head *prefs, u64 *total_refs,
 			      u64 inum)
 {
+	struct btrfs_delayed_ref_node *node;
 	struct btrfs_delayed_extent_op *extent_op = head->extent_op;
-	struct rb_node *n = &head->node.rb_node;
 	struct btrfs_key key;
 	struct btrfs_key op_key = {0};
 	int sgn;
@@ -583,12 +583,7 @@ static int __add_delayed_refs(struct btrfs_delayed_ref_head *head, u64 seq,
 		btrfs_disk_key_to_cpu(&op_key, &extent_op->key);
 
 	spin_lock(&head->lock);
-	n = rb_first(&head->ref_root);
-	while (n) {
-		struct btrfs_delayed_ref_node *node;
-		node = rb_entry(n, struct btrfs_delayed_ref_node,
-				rb_node);
-		n = rb_next(n);
+	list_for_each_entry(node, &head->ref_list, list) {
 		if (node->seq > seq)
 			continue;
 
diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index 6d16bea..f5f706f 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -268,7 +268,7 @@ static inline void drop_delayed_ref(struct btrfs_trans_handle *trans,
 		rb_erase(&head->href_node, &delayed_refs->href_root);
 	} else {
 		assert_spin_locked(&head->lock);
-		rb_erase(&ref->rb_node, &head->ref_root);
+		list_del(&ref->list);
 	}
 	ref->in_tree = 0;
 	btrfs_put_delayed_ref(ref);
@@ -328,48 +328,6 @@ static int merge_ref(struct btrfs_trans_handle *trans,
 	return done;
 }
 
-void btrfs_merge_delayed_refs(struct btrfs_trans_handle *trans,
-			      struct btrfs_fs_info *fs_info,
-			      struct btrfs_delayed_ref_root *delayed_refs,
-			      struct btrfs_delayed_ref_head *head)
-{
-	struct rb_node *node;
-	u64 seq = 0;
-
-	assert_spin_locked(&head->lock);
-	/*
-	 * We don't have too much refs to merge in the case of delayed data
-	 * refs.
-	 */
-	if (head->is_data)
-		return;
-
-	spin_lock(&fs_info->tree_mod_seq_lock);
-	if (!list_empty(&fs_info->tree_mod_seq_list)) {
-		struct seq_list *elem;
-
-		elem = list_first_entry(&fs_info->tree_mod_seq_list,
-					struct seq_list, list);
-		seq = elem->seq;
-	}
-	spin_unlock(&fs_info->tree_mod_seq_lock);
-
-	node = rb_first(&head->ref_root);
-	while (node) {
-		struct btrfs_delayed_ref_node *ref;
-
-		ref = rb_entry(node, struct btrfs_delayed_ref_node,
-			       rb_node);
-		/* We can't merge refs that are outside of our seq count */
-		if (seq && ref->seq >= seq)
-			break;
-		if (merge_ref(trans, delayed_refs, head, ref, seq))
-			node = rb_first(&head->ref_root);
-		else
-			node = rb_next(&ref->rb_node);
-	}
-}
-
 int btrfs_check_delayed_seq(struct btrfs_fs_info *fs_info,
 			    struct btrfs_delayed_ref_root *delayed_refs,
 			    u64 seq)
@@ -485,6 +443,74 @@ update_existing_ref(struct btrfs_trans_handle *trans,
 }
 
 /*
+ * Helper to insert the ref_node to the tail or merge with tail.
+ *
+ * Return 0 for insert.
+ * Return >0 for merge.
+ */
+static int
+add_delayed_ref_tail_merge(struct btrfs_trans_handle *trans,
+			   struct btrfs_delayed_ref_root *root,
+			   struct btrfs_delayed_ref_head *href,
+			   struct btrfs_delayed_ref_node *ref)
+{
+	struct btrfs_delayed_ref_node *exist;
+	int mod;
+	int ret = 0;
+
+	spin_lock(&href->lock);
+	/* Check whether we can merge the tail node with ref */
+	if (list_empty(&href->ref_list))
+		goto add_tail;
+	exist = list_entry(href->ref_list.prev, struct btrfs_delayed_ref_node,
+			   list);
+	/* No need to compare bytenr nor is_head */
+	if (exist->type != ref->type || exist->no_quota != ref->no_quota ||
+	    exist->seq != ref->seq)
+		goto add_tail;
+
+	if ((exist->type == BTRFS_TREE_BLOCK_REF_KEY ||
+	     exist->type == BTRFS_SHARED_BLOCK_REF_KEY) &&
+	    comp_tree_refs(btrfs_delayed_node_to_tree_ref(exist),
+			   btrfs_delayed_node_to_tree_ref(ref),
+			   ref->type))
+		goto add_tail;
+	if ((exist->type == BTRFS_EXTENT_DATA_REF_KEY ||
+	     exist->type == BTRFS_SHARED_DATA_REF_KEY) &&
+	    comp_data_refs(btrfs_delayed_node_to_data_ref(exist),
+			   btrfs_delayed_node_to_data_ref(ref)))
+		goto add_tail;
+
+	/* Now we are sure we can merge */
+	ret = 1;
+	if (exist->action == ref->action) {
+		mod = ref->ref_mod;
+	} else {
+		/* Need to change action */
+		if (exist->ref_mod < ref->ref_mod) {
+			exist->action = ref->action;
+			mod = -exist->ref_mod;
+			exist->ref_mod = ref->ref_mod;
+		} else
+			mod = -ref->ref_mod;
+	}
+	exist->ref_mod += mod;
+
+	/* remove existing tail if its ref_mod is zero */
+	if (exist->ref_mod == 0)
+		drop_delayed_ref(trans, root, href, exist);
+	spin_unlock(&href->lock);
+	return ret;
+
+add_tail:
+	list_add_tail(&ref->list, &href->ref_list);
+	atomic_inc(&root->num_entries);
+	trans->delayed_ref_updates++;
+	spin_unlock(&href->lock);
+	return ret;
+}
+
+/*
  * helper function to update the accounting in the head ref
  * existing and update must have the same bytenr
  */
@@ -603,7 +629,7 @@ add_delayed_ref_head(struct btrfs_fs_info *fs_info,
 	head_ref = btrfs_delayed_node_to_head(ref);
 	head_ref->must_insert_reserved = must_insert_reserved;
 	head_ref->is_data = is_data;
-	head_ref->ref_root = RB_ROOT;
+	INIT_LIST_HEAD(&head_ref->ref_list);
 	head_ref->processing = 0;
 
 	spin_lock_init(&head_ref->lock);
@@ -641,10 +667,10 @@ add_delayed_tree_ref(struct btrfs_fs_info *fs_info,
 		     u64 num_bytes, u64 parent, u64 ref_root, int level,
 		     int action, int no_quota)
 {
-	struct btrfs_delayed_ref_node *existing;
 	struct btrfs_delayed_tree_ref *full_ref;
 	struct btrfs_delayed_ref_root *delayed_refs;
 	u64 seq = 0;
+	int ret;
 
 	if (action == BTRFS_ADD_DELAYED_EXTENT)
 		action = BTRFS_ADD_DELAYED_REF;
@@ -675,21 +701,14 @@ add_delayed_tree_ref(struct btrfs_fs_info *fs_info,
 
 	trace_add_delayed_tree_ref(ref, full_ref, action);
 
-	spin_lock(&head_ref->lock);
-	existing = tree_insert(&head_ref->ref_root, &ref->rb_node);
-	if (existing) {
-		update_existing_ref(trans, delayed_refs, head_ref, existing,
-				    ref);
-		/*
-		 * we've updated the existing ref, free the newly
-		 * allocated ref
-		 */
+	ret = add_delayed_ref_tail_merge(trans, delayed_refs, head_ref, ref);
+
+	/*
+	 * XXX: memory should be freed at the same level it was allocated.
+	 * But bad practice is everywhere... follow it for now. Needs cleanup.
+	 */
+	if (ret > 0)
 		kmem_cache_free(btrfs_delayed_tree_ref_cachep, full_ref);
-	} else {
-		atomic_inc(&delayed_refs->num_entries);
-		trans->delayed_ref_updates++;
-	}
-	spin_unlock(&head_ref->lock);
 }
 
 /*
@@ -703,10 +722,10 @@ add_delayed_data_ref(struct btrfs_fs_info *fs_info,
 		     u64 num_bytes, u64 parent, u64 ref_root, u64 owner,
 		     u64 offset, int action, int no_quota)
 {
-	struct btrfs_delayed_ref_node *existing;
 	struct btrfs_delayed_data_ref *full_ref;
 	struct btrfs_delayed_ref_root *delayed_refs;
 	u64 seq = 0;
+	int ret;
 
 	if (action == BTRFS_ADD_DELAYED_EXTENT)
 		action = BTRFS_ADD_DELAYED_REF;
@@ -740,21 +759,10 @@ add_delayed_data_ref(struct btrfs_fs_info *fs_info,
 
 	trace_add_delayed_data_ref(ref, full_ref, action);
 
-	spin_lock(&head_ref->lock);
-	existing = tree_insert(&head_ref->ref_root, &ref->rb_node);
-	if (existing) {
-		update_existing_ref(trans, delayed_refs, head_ref, existing,
-				    ref);
-		/*
-		 * we've updated the existing ref, free the newly
-		 * allocated ref
-		 */
+	ret = add_delayed_ref_tail_merge(trans, delayed_refs, head_ref, ref);
+
+	if (ret > 0)
 		kmem_cache_free(btrfs_delayed_data_ref_cachep, full_ref);
-	} else {
-		atomic_inc(&delayed_refs->num_entries);
-		trans->delayed_ref_updates++;
-	}
-	spin_unlock(&head_ref->lock);
 }
 
 /*
diff --git a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h
index a764e23..a5f6a66 100644
--- a/fs/btrfs/delayed-ref.h
+++ b/fs/btrfs/delayed-ref.h
@@ -24,9 +24,25 @@
 #define BTRFS_ADD_DELAYED_EXTENT 3 /* record a full extent allocation */
 #define BTRFS_UPDATE_DELAYED_HEAD 4 /* not changing ref count on head ref */
 
+/*
+ * XXX: Qu: I really hate the design that ref_head and tree/data ref share
+ * the same ref_node structure.
+ * Ref_head is at a higher logical level than tree/data refs, and the
+ * duplicated bytenr/num_bytes in ref_node is really a waste of memory; they
+ * should be referred to from ref_head.
+ * This gets more disgusting after we use a list to store tree/data refs in
+ * ref_head. Must clean this mess up later.
+ */
 struct btrfs_delayed_ref_node {
+	/*
+	 * ref_head uses an rb tree, stored in ref_root->href,
+	 * indexed by bytenr.
+	 */
 	struct rb_node rb_node;
 
+	/* data/tree refs use a list, stored in ref_head->ref_list. */
+	struct list_head list;
+
 	/* the starting bytenr of the extent */
 	u64 bytenr;
 
@@ -83,7 +99,7 @@ struct btrfs_delayed_ref_head {
 	struct mutex mutex;
 
 	spinlock_t lock;
-	struct rb_root ref_root;
+	struct list_head ref_list;
 
 	struct rb_node href_node;
 
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 639f266..eaeab39 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4016,6 +4016,7 @@ static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
 
 	while ((node = rb_first(&delayed_refs->href_root)) != NULL) {
 		struct btrfs_delayed_ref_head *head;
+		struct btrfs_delayed_ref_node *tmp;
 		bool pin_bytes = false;
 
 		head = rb_entry(node, struct btrfs_delayed_ref_head,
@@ -4031,11 +4032,10 @@ static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
 			continue;
 		}
 		spin_lock(&head->lock);
-		while ((node = rb_first(&head->ref_root)) != NULL) {
-			ref = rb_entry(node, struct btrfs_delayed_ref_node,
-				       rb_node);
+		list_for_each_entry_safe_reverse(ref, tmp, &head->ref_list,
+						 list) {
 			ref->in_tree = 0;
-			rb_erase(&ref->rb_node, &head->ref_root);
+			list_del(&ref->list);
 			atomic_dec(&delayed_refs->num_entries);
 			btrfs_put_delayed_ref(ref);
 		}
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 8b353ad..e0d3fcc 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2323,28 +2323,14 @@ static int run_one_delayed_ref(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
-static noinline struct btrfs_delayed_ref_node *
+static inline struct btrfs_delayed_ref_node *
 select_delayed_ref(struct btrfs_delayed_ref_head *head)
 {
-	struct rb_node *node;
-	struct btrfs_delayed_ref_node *ref, *last = NULL;;
+	if (list_empty(&head->ref_list))
+		return NULL;
 
-	/*
-	 * select delayed ref of type BTRFS_ADD_DELAYED_REF first.
-	 * this prevents ref count from going down to zero when
-	 * there still are pending delayed ref.
-	 */
-	node = rb_first(&head->ref_root);
-	while (node) {
-		ref = rb_entry(node, struct btrfs_delayed_ref_node,
-				rb_node);
-		if (ref->action == BTRFS_ADD_DELAYED_REF)
-			return ref;
-		else if (last == NULL)
-			last = ref;
-		node = rb_next(node);
-	}
-	return last;
+	return list_entry(head->ref_list.next, struct btrfs_delayed_ref_node,
+			  list);
 }
 
 /*
@@ -2396,16 +2382,7 @@ static noinline int __btrfs_run_delayed_refs(struct btrfs_trans_handle *trans,
 			}
 		}
 
-		/*
-		 * We need to try and merge add/drops of the same ref since we
-		 * can run into issues with relocate dropping the implicit ref
-		 * and then it being added back again before the drop can
-		 * finish.  If we merged anything we need to re-loop so we can
-		 * get a good ref.
-		 */
 		spin_lock(&locked_ref->lock);
-		btrfs_merge_delayed_refs(trans, fs_info, delayed_refs,
-					 locked_ref);
 
 		/*
 		 * locked_ref is the head node, so we have to go one
@@ -2482,7 +2459,7 @@ static noinline int __btrfs_run_delayed_refs(struct btrfs_trans_handle *trans,
 			spin_unlock(&locked_ref->lock);
 			spin_lock(&delayed_refs->lock);
 			spin_lock(&locked_ref->lock);
-			if (rb_first(&locked_ref->ref_root) ||
+			if (!list_empty(&locked_ref->ref_list) ||
 			    locked_ref->extent_op) {
 				spin_unlock(&locked_ref->lock);
 				spin_unlock(&delayed_refs->lock);
@@ -2496,7 +2473,7 @@ static noinline int __btrfs_run_delayed_refs(struct btrfs_trans_handle *trans,
 		} else {
 			actual_count++;
 			ref->in_tree = 0;
-			rb_erase(&ref->rb_node, &locked_ref->ref_root);
+			list_del(&ref->list);
 		}
 		atomic_dec(&delayed_refs->num_entries);
 
@@ -2874,7 +2851,6 @@ static noinline int check_delayed_ref(struct btrfs_trans_handle *trans,
 	struct btrfs_delayed_ref_node *ref;
 	struct btrfs_delayed_data_ref *data_ref;
 	struct btrfs_delayed_ref_root *delayed_refs;
-	struct rb_node *node;
 	int ret = 0;
 
 	delayed_refs = &trans->transaction->delayed_refs;
@@ -2903,11 +2879,7 @@ static noinline int check_delayed_ref(struct btrfs_trans_handle *trans,
 	spin_unlock(&delayed_refs->lock);
 
 	spin_lock(&head->lock);
-	node = rb_first(&head->ref_root);
-	while (node) {
-		ref = rb_entry(node, struct btrfs_delayed_ref_node, rb_node);
-		node = rb_next(node);
-
+	list_for_each_entry(ref, &head->ref_list, list) {
 		/* If it's a shared ref we know a cross reference exists */
 		if (ref->type != BTRFS_EXTENT_DATA_REF_KEY) {
 			ret = 1;
@@ -6135,7 +6107,7 @@ static noinline int check_ref_cleanup(struct btrfs_trans_handle *trans,
 		goto out_delayed_unlock;
 
 	spin_lock(&head->lock);
-	if (rb_first(&head->ref_root))
+	if (!list_empty(&head->ref_list))
 		goto out;
 
 	if (head->extent_op) {
-- 
2.3.5



* [PATCH 03/18] btrfs: delayed-ref: Cleanup the unneeded functions.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
  2015-04-21  6:21 ` [PATCH 01/18] btrfs: backref: Don't merge refs which are not for same block Qu Wenruo
  2015-04-21  6:21 ` [PATCH 02/18] btrfs: delayed-ref: Use list to replace the ref_root in ref_head Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 04/18] btrfs: qgroup: Cleanup open-coded old/new_refcnt update and read Qu Wenruo
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Clean up the rb_tree merge/insert/update functions, since we now use a
list instead of an rb_tree.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/delayed-ref.c | 174 -------------------------------------------------
 1 file changed, 174 deletions(-)

diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index f5f706f..197f45b 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -84,87 +84,6 @@ static int comp_data_refs(struct btrfs_delayed_data_ref *ref2,
 	return 0;
 }
 
-/*
- * entries in the rb tree are ordered by the byte number of the extent,
- * type of the delayed backrefs and content of delayed backrefs.
- */
-static int comp_entry(struct btrfs_delayed_ref_node *ref2,
-		      struct btrfs_delayed_ref_node *ref1,
-		      bool compare_seq)
-{
-	if (ref1->bytenr < ref2->bytenr)
-		return -1;
-	if (ref1->bytenr > ref2->bytenr)
-		return 1;
-	if (ref1->is_head && ref2->is_head)
-		return 0;
-	if (ref2->is_head)
-		return -1;
-	if (ref1->is_head)
-		return 1;
-	if (ref1->type < ref2->type)
-		return -1;
-	if (ref1->type > ref2->type)
-		return 1;
-	if (ref1->no_quota > ref2->no_quota)
-		return 1;
-	if (ref1->no_quota < ref2->no_quota)
-		return -1;
-	/* merging of sequenced refs is not allowed */
-	if (compare_seq) {
-		if (ref1->seq < ref2->seq)
-			return -1;
-		if (ref1->seq > ref2->seq)
-			return 1;
-	}
-	if (ref1->type == BTRFS_TREE_BLOCK_REF_KEY ||
-	    ref1->type == BTRFS_SHARED_BLOCK_REF_KEY) {
-		return comp_tree_refs(btrfs_delayed_node_to_tree_ref(ref2),
-				      btrfs_delayed_node_to_tree_ref(ref1),
-				      ref1->type);
-	} else if (ref1->type == BTRFS_EXTENT_DATA_REF_KEY ||
-		   ref1->type == BTRFS_SHARED_DATA_REF_KEY) {
-		return comp_data_refs(btrfs_delayed_node_to_data_ref(ref2),
-				      btrfs_delayed_node_to_data_ref(ref1));
-	}
-	BUG();
-	return 0;
-}
-
-/*
- * insert a new ref into the rbtree.  This returns any existing refs
- * for the same (bytenr,parent) tuple, or NULL if the new node was properly
- * inserted.
- */
-static struct btrfs_delayed_ref_node *tree_insert(struct rb_root *root,
-						  struct rb_node *node)
-{
-	struct rb_node **p = &root->rb_node;
-	struct rb_node *parent_node = NULL;
-	struct btrfs_delayed_ref_node *entry;
-	struct btrfs_delayed_ref_node *ins;
-	int cmp;
-
-	ins = rb_entry(node, struct btrfs_delayed_ref_node, rb_node);
-	while (*p) {
-		parent_node = *p;
-		entry = rb_entry(parent_node, struct btrfs_delayed_ref_node,
-				 rb_node);
-
-		cmp = comp_entry(entry, ins, 1);
-		if (cmp < 0)
-			p = &(*p)->rb_left;
-		else if (cmp > 0)
-			p = &(*p)->rb_right;
-		else
-			return entry;
-	}
-
-	rb_link_node(node, parent_node, p);
-	rb_insert_color(node, root);
-	return NULL;
-}
-
 /* insert a new ref to head ref rbtree */
 static struct btrfs_delayed_ref_head *htree_insert(struct rb_root *root,
 						   struct rb_node *node)
@@ -277,57 +196,6 @@ static inline void drop_delayed_ref(struct btrfs_trans_handle *trans,
 		trans->delayed_ref_updates--;
 }
 
-static int merge_ref(struct btrfs_trans_handle *trans,
-		     struct btrfs_delayed_ref_root *delayed_refs,
-		     struct btrfs_delayed_ref_head *head,
-		     struct btrfs_delayed_ref_node *ref, u64 seq)
-{
-	struct rb_node *node;
-	int mod = 0;
-	int done = 0;
-
-	node = rb_next(&ref->rb_node);
-	while (!done && node) {
-		struct btrfs_delayed_ref_node *next;
-
-		next = rb_entry(node, struct btrfs_delayed_ref_node, rb_node);
-		node = rb_next(node);
-		if (seq && next->seq >= seq)
-			break;
-		if (comp_entry(ref, next, 0))
-			continue;
-
-		if (ref->action == next->action) {
-			mod = next->ref_mod;
-		} else {
-			if (ref->ref_mod < next->ref_mod) {
-				struct btrfs_delayed_ref_node *tmp;
-
-				tmp = ref;
-				ref = next;
-				next = tmp;
-				done = 1;
-			}
-			mod = -next->ref_mod;
-		}
-
-		drop_delayed_ref(trans, delayed_refs, head, next);
-		ref->ref_mod += mod;
-		if (ref->ref_mod == 0) {
-			drop_delayed_ref(trans, delayed_refs, head, ref);
-			done = 1;
-		} else {
-			/*
-			 * You can't have multiples of the same ref on a tree
-			 * block.
-			 */
-			WARN_ON(ref->type == BTRFS_TREE_BLOCK_REF_KEY ||
-				ref->type == BTRFS_SHARED_BLOCK_REF_KEY);
-		}
-	}
-	return done;
-}
-
 int btrfs_check_delayed_seq(struct btrfs_fs_info *fs_info,
 			    struct btrfs_delayed_ref_root *delayed_refs,
 			    u64 seq)
@@ -401,48 +269,6 @@ again:
 }
 
 /*
- * helper function to update an extent delayed ref in the
- * rbtree.  existing and update must both have the same
- * bytenr and parent
- *
- * This may free existing if the update cancels out whatever
- * operation it was doing.
- */
-static noinline void
-update_existing_ref(struct btrfs_trans_handle *trans,
-		    struct btrfs_delayed_ref_root *delayed_refs,
-		    struct btrfs_delayed_ref_head *head,
-		    struct btrfs_delayed_ref_node *existing,
-		    struct btrfs_delayed_ref_node *update)
-{
-	if (update->action != existing->action) {
-		/*
-		 * this is effectively undoing either an add or a
-		 * drop.  We decrement the ref_mod, and if it goes
-		 * down to zero we just delete the entry without
-		 * every changing the extent allocation tree.
-		 */
-		existing->ref_mod--;
-		if (existing->ref_mod == 0)
-			drop_delayed_ref(trans, delayed_refs, head, existing);
-		else
-			WARN_ON(existing->type == BTRFS_TREE_BLOCK_REF_KEY ||
-				existing->type == BTRFS_SHARED_BLOCK_REF_KEY);
-	} else {
-		WARN_ON(existing->type == BTRFS_TREE_BLOCK_REF_KEY ||
-			existing->type == BTRFS_SHARED_BLOCK_REF_KEY);
-		/*
-		 * the action on the existing ref matches
-		 * the action on the ref we're trying to add.
-		 * Bump the ref_mod by one so the backref that
-		 * is eventually added/removed has the correct
-		 * reference count
-		 */
-		existing->ref_mod += update->ref_mod;
-	}
-}
-
-/*
  * Helper to insert the ref_node to the tail or merge with tail.
  *
  * Return 0 for insert.
-- 
2.3.5



* [PATCH 04/18] btrfs: qgroup: Cleanup open-coded old/new_refcnt update and read.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (2 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 03/18] btrfs: delayed-ref: Cleanup the unneeded functions Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 05/18] btrfs: extent-tree: Use ref_node to replace unneeded parameters in __inc_extent_ref() and __free_extent() Qu Wenruo
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Use inline helper functions for the open-coded old/new_refcnt updates
and reads, to improve readability.
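
To illustrate the seq-based encoding these helpers implement (the
values below are invented for the example):

	u64 seq = 100;        /* seq at the start of this accounting round */
	qg->old_refcnt = 42;  /* stale value left over from an older round */

	btrfs_qgroup_update_old_refcnt(qg, seq, 1);
	/* 42 < 100, so old_refcnt is first reset to seq, then bumped: 101 */

	btrfs_qgroup_get_old_refcnt(qg, seq);
	/* returns 101 - 100 = 1: only the refcnt gained in this round */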

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Acked-by: David Sterba <dsterba@suse.cz>
---
 fs/btrfs/qgroup.c | 95 +++++++++++++++++++++++++++++++------------------------
 1 file changed, 54 insertions(+), 41 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 058c79e..9625644 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -84,11 +84,42 @@ struct btrfs_qgroup {
 
 	/*
 	 * temp variables for accounting operations
+	 * Refer to qgroup_shared_accounting() for details.
 	 */
 	u64 old_refcnt;
 	u64 new_refcnt;
 };
 
+static void btrfs_qgroup_update_old_refcnt(struct btrfs_qgroup *qg, u64 seq,
+					   int mod)
+{
+	if (qg->old_refcnt < seq)
+		qg->old_refcnt = seq;
+	qg->old_refcnt += mod;
+}
+
+static void btrfs_qgroup_update_new_refcnt(struct btrfs_qgroup *qg, u64 seq,
+					   int mod)
+{
+	if (qg->new_refcnt < seq)
+		qg->new_refcnt = seq;
+	qg->new_refcnt += mod;
+}
+
+static inline u64 btrfs_qgroup_get_old_refcnt(struct btrfs_qgroup *qg, u64 seq)
+{
+	if (qg->old_refcnt < seq)
+		return 0;
+	return qg->old_refcnt - seq;
+}
+
+static inline u64 btrfs_qgroup_get_new_refcnt(struct btrfs_qgroup *qg, u64 seq)
+{
+	if (qg->new_refcnt < seq)
+		return 0;
+	return qg->new_refcnt - seq;
+}
+
 /*
  * glue structure to represent the relations between qgroups.
  */
@@ -1497,6 +1528,7 @@ static int qgroup_calc_old_refcnt(struct btrfs_fs_info *fs_info,
 		ULIST_ITER_INIT(&tmp_uiter);
 		while ((tmp_unode = ulist_next(tmp, &tmp_uiter))) {
 			struct btrfs_qgroup_list *glist;
+			int mod;
 
 			qg = u64_to_ptr(tmp_unode->aux);
 			/*
@@ -1508,20 +1540,15 @@ static int qgroup_calc_old_refcnt(struct btrfs_fs_info *fs_info,
 			 * upper level qgroups in order to determine exclusive
 			 * counts.
 			 *
-			 * For rescan we want to set old_refcnt to seq so our
-			 * exclusive calculations end up correct.
+			 * For rescan, none of the extents were recorded
+			 * before, so we just don't bump old_refcnt.
 			 */
 			if (rescan)
-				qg->old_refcnt = seq;
-			else if (qg->old_refcnt < seq)
-				qg->old_refcnt = seq + 1;
-			else
-				qg->old_refcnt++;
-
-			if (qg->new_refcnt < seq)
-				qg->new_refcnt = seq + 1;
+				mod = 0;
 			else
-				qg->new_refcnt++;
+				mod = 1;
+			btrfs_qgroup_update_old_refcnt(qg, seq, mod);
+			btrfs_qgroup_update_new_refcnt(qg, seq, 1);
 			list_for_each_entry(glist, &qg->groups, next_group) {
 				ret = ulist_add(qgroups, glist->group->qgroupid,
 						ptr_to_u64(glist->group),
@@ -1615,14 +1642,8 @@ next:
 		struct btrfs_qgroup_list *glist;
 
 		qg = u64_to_ptr(unode->aux);
-		if (qg->old_refcnt < seq)
-			qg->old_refcnt = seq + 1;
-		else
-			qg->old_refcnt++;
-		if (qg->new_refcnt < seq)
-			qg->new_refcnt = seq + 1;
-		else
-			qg->new_refcnt++;
+		btrfs_qgroup_update_old_refcnt(qg, seq, 1);
+		btrfs_qgroup_update_new_refcnt(qg, seq, 1);
 		list_for_each_entry(glist, &qg->groups, next_group) {
 			ret = ulist_add(qgroups, glist->group->qgroupid,
 					ptr_to_u64(glist->group), GFP_ATOMIC);
@@ -1663,17 +1684,10 @@ static int qgroup_calc_new_refcnt(struct btrfs_fs_info *fs_info,
 		struct btrfs_qgroup_list *glist;
 
 		qg = u64_to_ptr(unode->aux);
-		if (oper->type == BTRFS_QGROUP_OPER_ADD_SHARED) {
-			if (qg->new_refcnt < seq)
-				qg->new_refcnt = seq + 1;
-			else
-				qg->new_refcnt++;
-		} else {
-			if (qg->old_refcnt < seq)
-				qg->old_refcnt = seq + 1;
-			else
-				qg->old_refcnt++;
-		}
+		if (oper->type == BTRFS_QGROUP_OPER_ADD_SHARED)
+			btrfs_qgroup_update_new_refcnt(qg, seq, 1);
+		else
+			btrfs_qgroup_update_old_refcnt(qg, seq, 1);
 		list_for_each_entry(glist, &qg->groups, next_group) {
 			ret = ulist_add(tmp, glist->group->qgroupid,
 					ptr_to_u64(glist->group), GFP_ATOMIC);
@@ -1706,11 +1720,14 @@ static int qgroup_adjust_counters(struct btrfs_fs_info *fs_info,
 		bool dirty = false;
 
 		qg = u64_to_ptr(unode->aux);
+		cur_old_count = btrfs_qgroup_get_old_refcnt(qg, seq);
+		cur_new_count = btrfs_qgroup_get_new_refcnt(qg, seq);
+
 		/*
 		 * Wasn't referenced before but is now, add to the reference
 		 * counters.
 		 */
-		if (qg->old_refcnt <= seq && qg->new_refcnt > seq) {
+		if (cur_old_count == 0 && cur_new_count > 0) {
 			qg->rfer += num_bytes;
 			qg->rfer_cmpr += num_bytes;
 			dirty = true;
@@ -1720,21 +1737,12 @@ static int qgroup_adjust_counters(struct btrfs_fs_info *fs_info,
 		 * Was referenced before but isn't now, subtract from the
 		 * reference counters.
 		 */
-		if (qg->old_refcnt > seq && qg->new_refcnt <= seq) {
+		if (cur_old_count > 0 && cur_new_count == 0) {
 			qg->rfer -= num_bytes;
 			qg->rfer_cmpr -= num_bytes;
 			dirty = true;
 		}
 
-		if (qg->old_refcnt < seq)
-			cur_old_count = 0;
-		else
-			cur_old_count = qg->old_refcnt - seq;
-		if (qg->new_refcnt < seq)
-			cur_new_count = 0;
-		else
-			cur_new_count = qg->new_refcnt - seq;
-
 		/*
 		 * If our refcount was the same as the roots previously but our
 		 * new count isn't the same as the number of roots now then we
@@ -1932,6 +1940,11 @@ static int qgroup_shared_accounting(struct btrfs_trans_handle *trans,
 		new_roots = old_roots;
 		old_roots++;
 	}
+
+	/*
+	 * Bump qgroup_seq to avoid seq overlap
+	 * XXX: This makes qgroup_seq mismatch with oper->seq.
+	 */
 	fs_info->qgroup_seq += old_roots + 1;
 
 
-- 
2.3.5



* [PATCH 05/18] btrfs: extent-tree: Use ref_node to replace unneeded parameters in __inc_extent_ref() and __free_extent()
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (3 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 04/18] btrfs: qgroup: Cleanup open-coded old/new_refcnt update and read Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 06/18] btrfs: qgroup: Add function qgroup_update_refcnt() Qu Wenruo
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

__btrfs_inc_extent_ref() and __btrfs_free_extent() already have too
many parameters, and three of them can be extracted from the
btrfs_delayed_ref_node struct.

So pass the btrfs_delayed_ref_node struct as a single parameter to
replace the bytenr/num_bytes/no_quota parameters.

The real objective of this patch is to let btrfs_qgroup_record_ref()
get the delayed_ref_node in the upcoming qgroup patches.

Other callers of btrfs_qgroup_record_ref() are not affected, since the
rest only add/subtract exclusive extents, where the node is not used.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 42 +++++++++++++++++++++---------------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index e0d3fcc..dd6e47e 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -79,11 +79,10 @@ static int update_block_group(struct btrfs_trans_handle *trans,
 			      u64 num_bytes, int alloc);
 static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 				struct btrfs_root *root,
-				u64 bytenr, u64 num_bytes, u64 parent,
+				struct btrfs_delayed_ref_node *node, u64 parent,
 				u64 root_objectid, u64 owner_objectid,
 				u64 owner_offset, int refs_to_drop,
-				struct btrfs_delayed_extent_op *extra_op,
-				int no_quota);
+				struct btrfs_delayed_extent_op *extra_op);
 static void __run_delayed_extent_op(struct btrfs_delayed_extent_op *extent_op,
 				    struct extent_buffer *leaf,
 				    struct btrfs_extent_item *ei);
@@ -1967,10 +1966,9 @@ int btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
 
 static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
 				  struct btrfs_root *root,
-				  u64 bytenr, u64 num_bytes,
+				  struct btrfs_delayed_ref_node *node,
 				  u64 parent, u64 root_objectid,
 				  u64 owner, u64 offset, int refs_to_add,
-				  int no_quota,
 				  struct btrfs_delayed_extent_op *extent_op)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
@@ -1978,8 +1976,11 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
 	struct extent_buffer *leaf;
 	struct btrfs_extent_item *item;
 	struct btrfs_key key;
+	u64 bytenr = node->bytenr;
+	u64 num_bytes = node->num_bytes;
 	u64 refs;
 	int ret;
+	int no_quota = node->no_quota;
 	enum btrfs_qgroup_operation_type type = BTRFS_QGROUP_OPER_ADD_EXCL;
 
 	path = btrfs_alloc_path();
@@ -2087,17 +2088,15 @@ static int run_delayed_data_ref(struct btrfs_trans_handle *trans,
 						 ref->objectid, ref->offset,
 						 &ins, node->ref_mod);
 	} else if (node->action == BTRFS_ADD_DELAYED_REF) {
-		ret = __btrfs_inc_extent_ref(trans, root, node->bytenr,
-					     node->num_bytes, parent,
+		ret = __btrfs_inc_extent_ref(trans, root, node, parent,
 					     ref_root, ref->objectid,
 					     ref->offset, node->ref_mod,
-					     node->no_quota, extent_op);
+					     extent_op);
 	} else if (node->action == BTRFS_DROP_DELAYED_REF) {
-		ret = __btrfs_free_extent(trans, root, node->bytenr,
-					  node->num_bytes, parent,
+		ret = __btrfs_free_extent(trans, root, node, parent,
 					  ref_root, ref->objectid,
 					  ref->offset, node->ref_mod,
-					  extent_op, node->no_quota);
+					  extent_op);
 	} else {
 		BUG();
 	}
@@ -2255,15 +2254,14 @@ static int run_delayed_tree_ref(struct btrfs_trans_handle *trans,
 						ref->level, &ins,
 						node->no_quota);
 	} else if (node->action == BTRFS_ADD_DELAYED_REF) {
-		ret = __btrfs_inc_extent_ref(trans, root, node->bytenr,
-					     node->num_bytes, parent, ref_root,
-					     ref->level, 0, 1, node->no_quota,
+		ret = __btrfs_inc_extent_ref(trans, root, node,
+					     parent, ref_root,
+					     ref->level, 0, 1,
 					     extent_op);
 	} else if (node->action == BTRFS_DROP_DELAYED_REF) {
-		ret = __btrfs_free_extent(trans, root, node->bytenr,
-					  node->num_bytes, parent, ref_root,
-					  ref->level, 0, 1, extent_op,
-					  node->no_quota);
+		ret = __btrfs_free_extent(trans, root, node,
+					  parent, ref_root,
+					  ref->level, 0, 1, extent_op);
 	} else {
 		BUG();
 	}
@@ -5806,11 +5804,10 @@ static void add_pinned_bytes(struct btrfs_fs_info *fs_info, u64 num_bytes,
 
 static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 				struct btrfs_root *root,
-				u64 bytenr, u64 num_bytes, u64 parent,
+				struct btrfs_delayed_ref_node *node, u64 parent,
 				u64 root_objectid, u64 owner_objectid,
 				u64 owner_offset, int refs_to_drop,
-				struct btrfs_delayed_extent_op *extent_op,
-				int no_quota)
+				struct btrfs_delayed_extent_op *extent_op)
 {
 	struct btrfs_key key;
 	struct btrfs_path *path;
@@ -5824,8 +5821,11 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 	int extent_slot = 0;
 	int found_extent = 0;
 	int num_to_del = 1;
+	int no_quota = node->no_quota;
 	u32 item_size;
 	u64 refs;
+	u64 bytenr = node->bytenr;
+	u64 num_bytes = node->num_bytes;
 	int last_ref = 0;
 	enum btrfs_qgroup_operation_type type = BTRFS_QGROUP_OPER_SUB_EXCL;
 	bool skinny_metadata = btrfs_fs_incompat(root->fs_info,
-- 
2.3.5



* [PATCH 06/18] btrfs: qgroup: Add function qgroup_update_refcnt().
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (4 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 05/18] btrfs: extent-tree: Use ref_node to replace unneeded parameters in __inc_extent_ref() and __free_extent() Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 07/18] btrfs: qgroup: Add function qgroup_update_counters() Qu Wenruo
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

This function is used to update the refcnt of qgroups, and is one of
the two core functions of the new qgroup implementation.

This is based on the old update_old/new_refcnt code, but provides a
unified logic and behavior.
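
A sketch of the intended usage in the new accounting path (the actual
caller is only added later in this series, so the exact arguments here
are an assumption):

	u64 seq = fs_info->qgroup_seq;

	/* tag every qgroup reachable from the roots before the change */
	qgroup_update_refcnt(fs_info, old_roots, tmp, qgroups, seq,
			     UPDATE_OLD);
	/* tag every qgroup reachable from the roots after the change */
	qgroup_update_refcnt(fs_info, new_roots, tmp, qgroups, seq,
			     UPDATE_NEW);
	/* qgroup_update_counters() then compares old vs new per qgroup */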

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/qgroup.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 9625644..17a974c 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1702,6 +1702,64 @@ static int qgroup_calc_new_refcnt(struct btrfs_fs_info *fs_info,
 	return 0;
 }
 
+#define UPDATE_NEW	0
+#define UPDATE_OLD	1
+/*
+ * Walk all of the roots that point to the bytenr and adjust their refcnts.
+ */
+static int qgroup_update_refcnt(struct btrfs_fs_info *fs_info,
+				struct ulist *roots, struct ulist *tmp,
+				struct ulist *qgroups, u64 seq, int update_old)
+{
+	struct ulist_node *unode;
+	struct ulist_iterator uiter;
+	struct ulist_node *tmp_unode;
+	struct ulist_iterator tmp_uiter;
+	struct btrfs_qgroup *qg;
+	int ret = 0;
+
+	if (!roots)
+		return 0;
+	ULIST_ITER_INIT(&uiter);
+	while ((unode = ulist_next(roots, &uiter))) {
+		qg = find_qgroup_rb(fs_info, unode->val);
+		if (!qg)
+			continue;
+
+		ulist_reinit(tmp);
+		ret = ulist_add(qgroups, qg->qgroupid, ptr_to_u64(qg),
+				GFP_ATOMIC);
+		if (ret < 0)
+			return ret;
+		ret = ulist_add(tmp, qg->qgroupid, ptr_to_u64(qg), GFP_ATOMIC);
+		if (ret < 0)
+			return ret;
+		ULIST_ITER_INIT(&tmp_uiter);
+		while ((tmp_unode = ulist_next(tmp, &tmp_uiter))) {
+			struct btrfs_qgroup_list *glist;
+
+			qg = u64_to_ptr(tmp_unode->aux);
+			if (update_old)
+				btrfs_qgroup_update_old_refcnt(qg, seq, 1);
+			else
+				btrfs_qgroup_update_new_refcnt(qg, seq, 1);
+			list_for_each_entry(glist, &qg->groups, next_group) {
+				ret = ulist_add(qgroups, glist->group->qgroupid,
+						ptr_to_u64(glist->group),
+						GFP_ATOMIC);
+				if (ret < 0)
+					return ret;
+				ret = ulist_add(tmp, glist->group->qgroupid,
+						ptr_to_u64(glist->group),
+						GFP_ATOMIC);
+				if (ret < 0)
+					return ret;
+			}
+		}
+	}
+	return 0;
+}
+
 /*
  * This adjusts the counters for all referenced qgroups if need be.
  */
-- 
2.3.5



* [PATCH 07/18] btrfs: qgroup: Add function qgroup_update_counters().
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (5 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 06/18] btrfs: qgroup: Add function qgroup_update_refcnt() Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 08/18] btrfs: qgroup: Record possible quota-related extent for qgroup Qu Wenruo
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Add the function qgroup_update_counters(), which updates the related
qgroups' rfer/excl according to old/new_roots.

This is one of the two core functions of the new qgroup implementation.

It is based on the old qgroup_adjust_counters() but with clearer logic
and comments.
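
A worked example (the numbers are invented): one 16K extent, referenced
only by root A before the transaction and by roots A and B after it, so
nr_old_roots = 1 and nr_new_roots = 2:

	qgroup A: cur_old_count = 1, cur_new_count = 1
	  rfer unchanged  (referenced both before and after)
	  excl -= 16K     (was exclusive: cur_old_count == nr_old_roots;
	                   now shared: cur_new_count < nr_new_roots)

	qgroup B: cur_old_count = 0, cur_new_count = 1
	  rfer += 16K     (newly referenced)
	  excl unchanged  (still shared: cur_new_count < nr_new_roots)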

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/qgroup.c | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 120 insertions(+)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 17a974c..a780f8f 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1761,6 +1761,126 @@ static int qgroup_update_refcnt(struct btrfs_fs_info *fs_info,
 }
 
 /*
+ * Update qgroup rfer/excl counters.
+ * Rfer update is easy; the code explains itself.
+ * Excl update is tricky; the update is split into 2 parts.
+ * Part 1: Possible exclusive <-> sharing detect:
+ *	|	A	|	!A	|
+ *  -------------------------------------
+ *  B	|	*	|	-	|
+ *  -------------------------------------
+ *  !B	|	+	|	**	|
+ *  -------------------------------------
+ *
+ * Conditions:
+ * A:	cur_old_roots < nr_old_roots	(not exclusive before)
+ * !A:	cur_old_roots == nr_old_roots	(possible exclusive before)
+ * B:	cur_new_roots < nr_new_roots	(not exclusive now)
+ * !B:	cur_new_roots == nr_new_roots	(possible exclusive now)
+ *
+ * Results:
+ * +: Possible sharing -> exclusive	-: Possible exclusive -> sharing
+ * *: Definitely not changed.		**: Possible unchanged.
+ *
+ * For the !A and !B conditions, the exception is the cur_old/new_roots
+ * == 0 case.
+ * To make the logic clear, we first use conditions A and B to split the
+ * combinations into 4 results.
+ *
+ * Then, for results "+" and "-", check the old/new_roots == 0 case, as
+ * in them only one variant may be 0.
+ *
+ * Lastly, check result **; since there are 2 variants that may be 0,
+ * split them again (2x2).
+ * But this time we don't need to consider other things; the code and
+ * logic are easy to understand now.
+ */
+static int qgroup_update_counters(struct btrfs_fs_info *fs_info,
+				  struct ulist *qgroups,
+				  u64 nr_old_roots,
+				  u64 nr_new_roots,
+				  u64 num_bytes, u64 seq)
+{
+	struct ulist_node *unode;
+	struct ulist_iterator uiter;
+	struct btrfs_qgroup *qg;
+	u64 cur_new_count, cur_old_count;
+
+	ULIST_ITER_INIT(&uiter);
+	while ((unode = ulist_next(qgroups, &uiter))) {
+		bool dirty = false;
+
+		qg = u64_to_ptr(unode->aux);
+		cur_old_count = btrfs_qgroup_get_old_refcnt(qg, seq);
+		cur_new_count = btrfs_qgroup_get_new_refcnt(qg, seq);
+
+		/* Rfer update part */
+		if (cur_old_count == 0 && cur_new_count > 0) {
+			qg->rfer += num_bytes;
+			qg->rfer_cmpr += num_bytes;
+			dirty = true;
+		}
+		if (cur_old_count > 0 && cur_new_count == 0) {
+			qg->rfer -= num_bytes;
+			qg->rfer_cmpr -= num_bytes;
+			dirty = true;
+		}
+
+		/* Excl update part */
+		/* Exclusive/none -> shared case */
+		if (cur_old_count == nr_old_roots &&
+		    cur_new_count < nr_new_roots) {
+			/* Exclusive -> shared */
+			if (cur_old_count != 0) {
+				qg->excl -= num_bytes;
+				qg->excl_cmpr -= num_bytes;
+				dirty = true;
+			}
+		}
+
+		/* Shared -> exclusive/none case */
+		if (cur_old_count < nr_old_roots &&
+		    cur_new_count == nr_new_roots) {
+			/* Shared->exclusive */
+			if (cur_new_count != 0) {
+				qg->excl += num_bytes;
+				qg->excl_cmpr += num_bytes;
+				dirty = true;
+			}
+		}
+
+		/* Exclusive/none -> exclusive/none case */
+		if (cur_old_count == nr_old_roots &&
+		    cur_new_count == nr_new_roots) {
+			if (cur_old_count == 0) {
+				/* None -> exclusive/none */
+
+				if (cur_new_count != 0) {
+					/* None -> exclusive */
+					qg->excl += num_bytes;
+					qg->excl_cmpr += num_bytes;
+					dirty = true;
+				}
+				/* None -> none, nothing changed */
+			} else {
+				/* Exclusive -> exclusive/none */
+
+				if (cur_new_count == 0) {
+					/* Exclusive -> none */
+					qg->excl -= num_bytes;
+					qg->excl_cmpr -= num_bytes;
+					dirty = true;
+				}
+				/* Exclusive -> exclusive, nothing changed */
+			}
+		}
+		if (dirty)
+			qgroup_dirty(fs_info, qg);
+	}
+	return 0;
+}
+
+/*
  * This adjusts the counters for all referenced qgroups if need be.
  */
 static int qgroup_adjust_counters(struct btrfs_fs_info *fs_info,
-- 
2.3.5



* [PATCH 08/18] btrfs: qgroup: Record possible quota-related extent for qgroup.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (6 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 07/18] btrfs: qgroup: Add function qgroup_update_counters() Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 09/18] btrfs: qgroup: Add new function to record old_roots Qu Wenruo
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Add a hook in add_delayed_ref_head() to record quota-related extents in
the delayed_ref_root->dirty_extent_root rb-tree for later qgroup
accounting.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/delayed-ref.c | 58 ++++++++++++++++++++++++++++++++++++++++++++------
 fs/btrfs/delayed-ref.h |  3 +++
 fs/btrfs/qgroup.c      | 26 ++++++++++++++++++++++
 fs/btrfs/qgroup.h      | 17 +++++++++++++++
 fs/btrfs/transaction.c |  1 +
 5 files changed, 98 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index 197f45b..72de81e 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -22,6 +22,7 @@
 #include "ctree.h"
 #include "delayed-ref.h"
 #include "transaction.h"
+#include "qgroup.h"
 
 struct kmem_cache *btrfs_delayed_ref_head_cachep;
 struct kmem_cache *btrfs_delayed_tree_ref_cachep;
@@ -405,12 +406,14 @@ update_existing_head_ref(struct btrfs_delayed_ref_node *existing,
 static noinline struct btrfs_delayed_ref_head *
 add_delayed_ref_head(struct btrfs_fs_info *fs_info,
 		     struct btrfs_trans_handle *trans,
-		     struct btrfs_delayed_ref_node *ref, u64 bytenr,
-		     u64 num_bytes, int action, int is_data)
+		     struct btrfs_delayed_ref_node *ref,
+		     struct btrfs_qgroup_extent_record *qrecord,
+		     u64 bytenr, u64 num_bytes, int action, int is_data)
 {
 	struct btrfs_delayed_ref_head *existing;
 	struct btrfs_delayed_ref_head *head_ref = NULL;
 	struct btrfs_delayed_ref_root *delayed_refs;
+	struct btrfs_qgroup_extent_record *qexisting;
 	int count_mod = 1;
 	int must_insert_reserved = 0;
 
@@ -458,6 +461,18 @@ add_delayed_ref_head(struct btrfs_fs_info *fs_info,
 	INIT_LIST_HEAD(&head_ref->ref_list);
 	head_ref->processing = 0;
 
+	/* Record qgroup extent info if provided */
+	if (qrecord) {
+		qrecord->bytenr = bytenr;
+		qrecord->num_bytes = num_bytes;
+		qrecord->old_roots = NULL;
+
+		qexisting = btrfs_qgroup_insert_dirty_extent(delayed_refs,
+							     qrecord);
+		if (qexisting)
+			kfree(qrecord);
+	}
+
 	spin_lock_init(&head_ref->lock);
 	mutex_init(&head_ref->mutex);
 
@@ -606,6 +621,7 @@ int btrfs_add_delayed_tree_ref(struct btrfs_fs_info *fs_info,
 	struct btrfs_delayed_tree_ref *ref;
 	struct btrfs_delayed_ref_head *head_ref;
 	struct btrfs_delayed_ref_root *delayed_refs;
+	struct btrfs_qgroup_extent_record *record = NULL;
 
 	if (!is_fstree(ref_root) || !fs_info->quota_enabled)
 		no_quota = 0;
@@ -621,6 +637,15 @@ int btrfs_add_delayed_tree_ref(struct btrfs_fs_info *fs_info,
 		return -ENOMEM;
 	}
 
+	if (is_fstree(ref_root)) {
+		record = kmalloc(sizeof(*record), GFP_NOFS);
+		if (!record) {
+			kmem_cache_free(btrfs_delayed_tree_ref_cachep, ref);
+			kmem_cache_free(btrfs_delayed_ref_head_cachep, head_ref);
+			return -ENOMEM;
+		}
+	}
+
 	head_ref->extent_op = extent_op;
 
 	delayed_refs = &trans->transaction->delayed_refs;
@@ -630,7 +655,7 @@ int btrfs_add_delayed_tree_ref(struct btrfs_fs_info *fs_info,
 	 * insert both the head node and the new ref without dropping
 	 * the spin lock
 	 */
-	head_ref = add_delayed_ref_head(fs_info, trans, &head_ref->node,
+	head_ref = add_delayed_ref_head(fs_info, trans, &head_ref->node, record,
 					bytenr, num_bytes, action, 0);
 
 	add_delayed_tree_ref(fs_info, trans, head_ref, &ref->node, bytenr,
@@ -655,6 +680,7 @@ int btrfs_add_delayed_data_ref(struct btrfs_fs_info *fs_info,
 	struct btrfs_delayed_data_ref *ref;
 	struct btrfs_delayed_ref_head *head_ref;
 	struct btrfs_delayed_ref_root *delayed_refs;
+	struct btrfs_qgroup_extent_record *record = NULL;
 
 	if (!is_fstree(ref_root) || !fs_info->quota_enabled)
 		no_quota = 0;
@@ -670,6 +696,16 @@ int btrfs_add_delayed_data_ref(struct btrfs_fs_info *fs_info,
 		return -ENOMEM;
 	}
 
+	if (is_fstree(ref_root)) {
+		record = kmalloc(sizeof(*record), GFP_NOFS);
+		if (!record) {
+			kmem_cache_free(btrfs_delayed_data_ref_cachep, ref);
+			kmem_cache_free(btrfs_delayed_ref_head_cachep,
+					head_ref);
+			return -ENOMEM;
+		}
+	}
+
 	head_ref->extent_op = extent_op;
 
 	delayed_refs = &trans->transaction->delayed_refs;
@@ -679,7 +715,7 @@ int btrfs_add_delayed_data_ref(struct btrfs_fs_info *fs_info,
 	 * insert both the head node and the new ref without dropping
 	 * the spin lock
 	 */
-	head_ref = add_delayed_ref_head(fs_info, trans, &head_ref->node,
+	head_ref = add_delayed_ref_head(fs_info, trans, &head_ref->node, record,
 					bytenr, num_bytes, action, 1);
 
 	add_delayed_data_ref(fs_info, trans, head_ref, &ref->node, bytenr,
@@ -697,19 +733,27 @@ int btrfs_add_delayed_extent_op(struct btrfs_fs_info *fs_info,
 {
 	struct btrfs_delayed_ref_head *head_ref;
 	struct btrfs_delayed_ref_root *delayed_refs;
+	struct btrfs_qgroup_extent_record *record;
 
 	head_ref = kmem_cache_alloc(btrfs_delayed_ref_head_cachep, GFP_NOFS);
 	if (!head_ref)
 		return -ENOMEM;
 
+	/* TODO: Skip this for non-fs tree root and quota disabled case */
+	record = kmalloc(sizeof(*record), GFP_NOFS);
+	if (!record) {
+		kmem_cache_free(btrfs_delayed_ref_head_cachep, head_ref);
+		return -ENOMEM;
+	}
+
 	head_ref->extent_op = extent_op;
 
 	delayed_refs = &trans->transaction->delayed_refs;
 	spin_lock(&delayed_refs->lock);
 
-	add_delayed_ref_head(fs_info, trans, &head_ref->node, bytenr,
-				   num_bytes, BTRFS_UPDATE_DELAYED_HEAD,
-				   extent_op->is_data);
+	add_delayed_ref_head(fs_info, trans, &head_ref->node, record, bytenr,
+			     num_bytes, BTRFS_UPDATE_DELAYED_HEAD,
+			     extent_op->is_data);
 
 	spin_unlock(&delayed_refs->lock);
 	return 0;
diff --git a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h
index a5f6a66..6481de2 100644
--- a/fs/btrfs/delayed-ref.h
+++ b/fs/btrfs/delayed-ref.h
@@ -140,6 +140,9 @@ struct btrfs_delayed_ref_root {
 	/* head ref rbtree */
 	struct rb_root href_root;
 
+	/* dirty extent records */
+	struct rb_root dirty_extent_root;
+
 	/* this spin lock protects the rbtree and the entries inside */
 	spinlock_t lock;
 
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index a780f8f..505f6f3 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1403,6 +1403,32 @@ int btrfs_qgroup_record_ref(struct btrfs_trans_handle *trans,
 	return 0;
 }
 
+struct btrfs_qgroup_extent_record
+*btrfs_qgroup_insert_dirty_extent(struct btrfs_delayed_ref_root *delayed_refs,
+				  struct btrfs_qgroup_extent_record *record)
+{
+	struct rb_node **p = &delayed_refs->dirty_extent_root.rb_node;
+	struct rb_node *parent_node = NULL;
+	struct btrfs_qgroup_extent_record *entry;
+	u64 bytenr = record->bytenr;
+
+	while (*p) {
+		parent_node = *p;
+		entry = rb_entry(parent_node, struct btrfs_qgroup_extent_record,
+				 node);
+		if (bytenr < entry->bytenr)
+			p = &(*p)->rb_left;
+		else if (bytenr > entry->bytenr)
+			p = &(*p)->rb_right;
+		else
+			return entry;
+	}
+
+	rb_link_node(&record->node, parent_node, p);
+	rb_insert_color(&record->node, &delayed_refs->dirty_extent_root);
+	return NULL;
+}
+
 /*
  * The easy accounting, if we are adding/removing the only ref for an extent
  * then this qgroup and all of the parent qgroups get their refrence and
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 18cc68c..7754d08 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -19,6 +19,9 @@
 #ifndef __BTRFS_QGROUP__
 #define __BTRFS_QGROUP__
 
+#include "ulist.h"
+#include "delayed-ref.h"
+
 /*
  * A description of the operations, all of these operations only happen when we
  * are adding the 1st reference for that subvolume in the case of adding space
@@ -58,6 +61,17 @@ struct btrfs_qgroup_operation {
 	struct list_head list;
 };
 
+/*
+ * Record a dirty extent, and info qgroup to update quota on it
+ * TODO: Use kmem cache to alloc it.
+ */
+struct btrfs_qgroup_extent_record {
+	struct rb_node node;
+	u64 bytenr;
+	u64 num_bytes;
+	struct ulist *old_roots;
+};
+
 int btrfs_quota_enable(struct btrfs_trans_handle *trans,
 		       struct btrfs_fs_info *fs_info);
 int btrfs_quota_disable(struct btrfs_trans_handle *trans,
@@ -85,6 +99,9 @@ int btrfs_qgroup_record_ref(struct btrfs_trans_handle *trans,
 			    u64 bytenr, u64 num_bytes,
 			    enum btrfs_qgroup_operation_type type,
 			    int mod_seq);
+struct btrfs_qgroup_extent_record
+*btrfs_qgroup_insert_dirty_extent(struct btrfs_delayed_ref_root *delayed_refs,
+				  struct btrfs_qgroup_extent_record *record);
 int btrfs_delayed_qgroup_accounting(struct btrfs_trans_handle *trans,
 				    struct btrfs_fs_info *fs_info);
 void btrfs_remove_qgroup_operation(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 8be4278..871b405 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -224,6 +224,7 @@ loop:
 	cur_trans->start_time = get_seconds();
 
 	cur_trans->delayed_refs.href_root = RB_ROOT;
+	cur_trans->delayed_refs.dirty_extent_root = RB_ROOT;
 	atomic_set(&cur_trans->delayed_refs.num_entries, 0);
 	cur_trans->delayed_refs.num_heads_ready = 0;
 	cur_trans->delayed_refs.num_heads = 0;
-- 
2.3.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 09/18] btrfs: qgroup: Add new function to record old_roots.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (7 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 08/18] btrfs: qgroup: Record possible quota-related extent for qgroup Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 10/18] btrfs: backref: Add special time_seq == (u64)-1 case for btrfs_find_all_roots() Qu Wenruo
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Add function btrfs_qgroup_prepare_account_extents() to get the
old_roots that qgroup accounting needs.

We do it in commit_transaction(), before switch_roots(), and only
search the commit root, so it gives an accurate view of the previous
transaction.

With old_roots from the previous transaction, we can do accurate
accounting against the current transaction.
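
Roughly, the intended ordering inside btrfs_commit_transaction() looks
like the sketch below (error handling trimmed; the real hunks land in a
later patch of this series):

	/*
	 * fs roots still point at the last commit here, so the
	 * commit-root search gives the old_roots view.
	 */
	ret = btrfs_qgroup_prepare_account_extents(trans, root->fs_info);

	/* ... commit fs roots / switch_roots() ... */

	/* current roots now reflect this transaction: the new_roots view */
	ret = btrfs_qgroup_account_extents(trans, root->fs_info);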

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/qgroup.c | 27 +++++++++++++++++++++++++++
 fs/btrfs/qgroup.h |  2 ++
 2 files changed, 29 insertions(+)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 505f6f3..3040296 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1403,6 +1403,33 @@ int btrfs_qgroup_record_ref(struct btrfs_trans_handle *trans,
 	return 0;
 }
 
+int btrfs_qgroup_prepare_account_extents(struct btrfs_trans_handle *trans,
+					 struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_qgroup_extent_record *record;
+	struct btrfs_delayed_ref_root *delayed_refs;
+	struct rb_node *node;
+	int ret = 0;
+
+	delayed_refs = &trans->transaction->delayed_refs;
+
+	/*
+	 * No need to take the lock, since this function will only be
+	 * called in btrfs_commit_transaction().
+	 */
+	node = rb_first(&delayed_refs->dirty_extent_root);
+	while (node) {
+		record = rb_entry(node, struct btrfs_qgroup_extent_record,
+				  node);
+		ret = btrfs_find_all_roots(NULL, fs_info, record->bytenr, 0,
+					   &record->old_roots);
+		if (ret < 0)
+			break;
+		node = rb_next(node);
+	}
+	return ret;
+}
+
 struct btrfs_qgroup_extent_record
 *btrfs_qgroup_insert_dirty_extent(struct btrfs_delayed_ref_root *delayed_refs,
 				  struct btrfs_qgroup_extent_record *record)
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 7754d08..3b549cb 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -99,6 +99,8 @@ int btrfs_qgroup_record_ref(struct btrfs_trans_handle *trans,
 			    u64 bytenr, u64 num_bytes,
 			    enum btrfs_qgroup_operation_type type,
 			    int mod_seq);
+int btrfs_qgroup_prepare_account_extents(struct btrfs_trans_handle *trans,
+					 struct btrfs_fs_info *fs_info);
 struct btrfs_qgroup_extent_record
 *btrfs_qgroup_insert_dirty_extent(struct btrfs_delayed_ref_root *delayed_refs,
 				  struct btrfs_qgroup_extent_record *record);
-- 
2.3.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 10/18] btrfs: backref: Add special time_seq == (u64)-1 case for btrfs_find_all_roots().
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (8 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 09/18] btrfs: qgroup: Add new function to record old_roots Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 11/18] btrfs: qgroup: Add new qgroup calculation function btrfs_qgroup_account_extents() Qu Wenruo
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Allow btrfs_find_all_roots() to skip all delayed_ref_head locks and
tree locks when doing its tree search.

This is important for the later qgroup implementation, which will call
find_all_roots() after the fs trees are committed.
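
A minimal usage sketch (bytenr here is just a placeholder; this matches
how a later patch in the series calls it from commit_transaction(),
error handling omitted):

	struct ulist *roots = NULL;

	/*
	 * time_seq == (u64)-1: skip delayed refs and tree locks, and
	 * search the current root instead of a commit root.
	 */
	ret = btrfs_find_all_roots(trans, fs_info, bytenr, (u64)-1, &roots);
	...
	ulist_free(roots);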

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/backref.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index a4c31dc..ab9cc42 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -250,8 +250,12 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
 	 * the first item to check. But sometimes, we may enter it with
 	 * slot==nritems. In that case, go to the next leaf before we continue.
 	 */
-	if (path->slots[0] >= btrfs_header_nritems(path->nodes[0]))
-		ret = btrfs_next_old_leaf(root, path, time_seq);
+	if (path->slots[0] >= btrfs_header_nritems(path->nodes[0])) {
+		if (time_seq == (u64)-1)
+			ret = btrfs_next_leaf(root, path);
+		else
+			ret = btrfs_next_old_leaf(root, path, time_seq);
+	}
 
 	while (!ret && count < total_refs) {
 		eb = path->nodes[0];
@@ -291,7 +295,10 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path,
 			eie = NULL;
 		}
 next:
-		ret = btrfs_next_old_item(root, path, time_seq);
+		if (time_seq == (u64)-1)
+			ret = btrfs_next_item(root, path);
+		else
+			ret = btrfs_next_old_item(root, path, time_seq);
 	}
 
 	if (ret > 0)
@@ -334,6 +341,8 @@ static int __resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 
 	if (path->search_commit_root)
 		root_level = btrfs_header_level(root->commit_root);
+	else if (time_seq == (u64)-1)
+		root_level = btrfs_header_level(root->node);
 	else
 		root_level = btrfs_old_root_level(root, time_seq);
 
@@ -343,7 +352,12 @@ static int __resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 	}
 
 	path->lowest_level = level;
-	ret = btrfs_search_old_slot(root, &ref->key_for_search, path, time_seq);
+	if (time_seq == (u64)-1)
+		ret = btrfs_search_slot(NULL, root, &ref->key_for_search, path,
+					0, 0);
+	else
+		ret = btrfs_search_old_slot(root, &ref->key_for_search, path,
+					    time_seq);
 
 	/* root node has been locked, we can release @subvol_srcu safely here */
 	srcu_read_unlock(&fs_info->subvol_srcu, index);
@@ -875,6 +889,11 @@ static int __add_keyed_refs(struct btrfs_fs_info *fs_info,
  * indirect refs to their parent bytenr.
  * When roots are found, they're added to the roots list
  *
+ * If time_seq is set to (u64)-1, it will not search delayed_refs, and
+ * will behave much like the trans == NULL case; the only difference
+ * is that it will not search the commit root.
+ * This special case is for qgroup to search roots in
+ * commit_transaction().
  * FIXME some caching might speed things up
  */
 static int find_parent_nodes(struct btrfs_trans_handle *trans,
@@ -913,6 +932,9 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans,
 		path->skip_locking = 1;
 	}
 
+	if (time_seq == (u64)-1)
+		path->skip_locking = 1;
+
 	/*
 	 * grab both a lock on the path and a lock on the delayed ref head.
 	 * We need both to get a consistent picture of how the refs look
@@ -927,9 +949,10 @@ again:
 	BUG_ON(ret == 0);
 
 #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
-	if (trans && likely(trans->type != __TRANS_DUMMY)) {
+	if (trans && likely(trans->type != __TRANS_DUMMY) &&
+	    time_seq != (u64)-1) {
 #else
-	if (trans) {
+	if (trans && time_seq != (u64)-1) {
 #endif
 		/*
 		 * look if there are updates for this ref queued and lock the
-- 
2.3.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 11/18] btrfs: qgroup: Add new qgroup calculation function btrfs_qgroup_account_extents().
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (9 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 10/18] btrfs: backref: Add special time_seq == (u64)-1 case for btrfs_find_all_roots() Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 12/18] btrfs: qgroup: Switch rescan to new mechanism Qu Wenruo
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

The new btrfs_qgroup_account_extents() function should be called in
btrfs_commit_transaction(); it updates all the qgroups according to
delayed_ref_root->dirty_extent_root.

The new function handles both normal operation during
commit_transaction() and rescan, in a unified way with clearer logic.
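
As a concrete example: suppose an extent's old_roots is {5} and its
new_roots is {5, 257} (say root 257 just cloned a reference to it).
Qgroup 5 then keeps the extent in its referenced bytes but drops it
from its exclusive bytes, while qgroup 257 adds it to referenced bytes
only, since the extent is now shared.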

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.h |   0
 fs/btrfs/qgroup.c      | 116 +++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/qgroup.h      |   2 +
 3 files changed, 118 insertions(+)
 create mode 100644 fs/btrfs/extent-tree.h

diff --git a/fs/btrfs/extent-tree.h b/fs/btrfs/extent-tree.h
new file mode 100644
index 0000000..e69de29
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 3040296..be5717f 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2346,6 +2346,122 @@ static int btrfs_qgroup_account(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
+static int
+btrfs_qgroup_account_extent(struct btrfs_trans_handle *trans,
+			    struct btrfs_fs_info *fs_info,
+			    u64 bytenr, u64 num_bytes,
+			    struct ulist *old_roots, struct ulist *new_roots)
+{
+	struct ulist *qgroups = NULL;
+	struct ulist *tmp = NULL;
+	u64 seq;
+	u64 nr_new_roots = 0;
+	u64 nr_old_roots = 0;
+	int ret = 0;
+
+	if (new_roots)
+		nr_new_roots = new_roots->nnodes;
+	if (old_roots)
+		nr_old_roots = old_roots->nnodes;
+
+	if (!fs_info->quota_enabled)
+		return 0;
+	BUG_ON(!fs_info->quota_root);
+
+	qgroups = ulist_alloc(GFP_NOFS);
+	if (!qgroups) {
+		ret = -ENOMEM;
+		goto out_free;
+	}
+	tmp = ulist_alloc(GFP_NOFS);
+	if (!tmp) {
+		ret = -ENOMEM;
+		goto out_free;
+	}
+
+	mutex_lock(&fs_info->qgroup_rescan_lock);
+	if (fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_RESCAN) {
+		if (fs_info->qgroup_rescan_progress.objectid <= bytenr) {
+			mutex_unlock(&fs_info->qgroup_rescan_lock);
+			ret = 0;
+			goto out_free;
+		}
+	}
+	mutex_unlock(&fs_info->qgroup_rescan_lock);
+
+	spin_lock(&fs_info->qgroup_lock);
+	seq = fs_info->qgroup_seq;
+
+	/* Update old refcnts using old_roots */
+	ret = qgroup_update_refcnt(fs_info, old_roots, tmp, qgroups, seq,
+				   UPDATE_OLD);
+	if (ret < 0)
+		goto out;
+
+	/* Update new refcnts using new_roots */
+	ret = qgroup_update_refcnt(fs_info, new_roots, tmp, qgroups, seq,
+				   UPDATE_NEW);
+	if (ret < 0)
+		goto out;
+
+	qgroup_update_counters(fs_info, qgroups, nr_old_roots, nr_new_roots,
+			       num_bytes, seq);
+
+	/*
+	 * Bump qgroup_seq to avoid seq overlap
+	 */
+	fs_info->qgroup_seq += max(nr_old_roots, nr_new_roots) + 1;
+out:
+	spin_unlock(&fs_info->qgroup_lock);
+out_free:
+	ulist_free(tmp);
+	ulist_free(qgroups);
+	ulist_free(old_roots);
+	ulist_free(new_roots);
+	return ret;
+}
+
+int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans,
+				 struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_qgroup_extent_record *record;
+	struct btrfs_delayed_ref_root *delayed_refs;
+	struct ulist *new_roots = NULL;
+	struct rb_node *node;
+	int ret = 0;
+
+	delayed_refs = &trans->transaction->delayed_refs;
+	while ((node = rb_first(&delayed_refs->dirty_extent_root))) {
+		record = rb_entry(node, struct btrfs_qgroup_extent_record,
+				  node);
+
+		if (!ret) {
+			/*
+			 * Use (u64)-1 as time_seq for a special search which
+			 * doesn't lock the tree or delayed_refs and searches
+			 * the current root. It's safe inside commit_transaction().
+			 */
+			ret = btrfs_find_all_roots(trans, fs_info,
+					record->bytenr, (u64)-1, &new_roots);
+			if (ret < 0)
+				goto cleanup;
+			ret = btrfs_qgroup_account_extent(trans, fs_info,
+					record->bytenr, record->num_bytes,
+					record->old_roots, new_roots);
+			record->old_roots = NULL;
+			new_roots = NULL;
+		}
+cleanup:
+		ulist_free(record->old_roots);
+		ulist_free(new_roots);
+		new_roots = NULL;
+		rb_erase(node, &delayed_refs->dirty_extent_root);
+		kfree(record);
+
+	}
+	return ret;
+}
+
 /*
  * Needs to be called everytime we run delayed refs, even if there is an error
  * in order to cleanup outstanding operations.
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 3b549cb..c3b08d6 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -104,6 +104,8 @@ int btrfs_qgroup_prepare_account_extents(struct btrfs_trans_handle *trans,
 struct btrfs_qgroup_extent_record
 *btrfs_qgroup_insert_dirty_extent(struct btrfs_delayed_ref_root *delayed_refs,
 				  struct btrfs_qgroup_extent_record *record);
+int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans,
+				 struct btrfs_fs_info *fs_info);
 int btrfs_delayed_qgroup_accounting(struct btrfs_trans_handle *trans,
 				    struct btrfs_fs_info *fs_info);
 void btrfs_remove_qgroup_operation(struct btrfs_trans_handle *trans,
-- 
2.3.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 12/18] btrfs: qgroup: Switch rescan to new mechanism.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (10 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 11/18] btrfs: qgroup: Add new qgroup calculation function btrfs_qgroup_account_extents() Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 13/18] btrfs: qgroup: Switch to new extent-oriented qgroup mechanism Qu Wenruo
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Switch rescan to the new extent-oriented mechanism.

As rescan is also extent based, the new mechanism is a perfect match
for it.

With the re-designed internal functions, rescan becomes quite easy:
just call btrfs_find_all_roots() and then
btrfs_qgroup_account_extent().
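
Per extent item, that boils down to the following (as in the hunk
below; error handling omitted):

	ret = btrfs_find_all_roots(NULL, fs_info, found.objectid, 0, &roots);
	/* rescan has no recorded old view, so pass old_roots as NULL */
	ret = btrfs_qgroup_account_extent(trans, fs_info, found.objectid,
					  num_bytes, NULL, roots);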

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/qgroup.c | 43 +++++++------------------------------------
 1 file changed, 7 insertions(+), 36 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index be5717f..579f94a 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2877,15 +2877,13 @@ void assert_qgroups_uptodate(struct btrfs_trans_handle *trans)
  */
 static int
 qgroup_rescan_leaf(struct btrfs_fs_info *fs_info, struct btrfs_path *path,
-		   struct btrfs_trans_handle *trans, struct ulist *qgroups,
-		   struct ulist *tmp, struct extent_buffer *scratch_leaf)
+		   struct btrfs_trans_handle *trans,
+		   struct extent_buffer *scratch_leaf)
 {
 	struct btrfs_key found;
 	struct ulist *roots = NULL;
 	struct seq_list tree_mod_seq_elem = {};
 	u64 num_bytes;
-	u64 seq;
-	int new_roots;
 	int slot;
 	int ret;
 
@@ -2935,33 +2933,15 @@ qgroup_rescan_leaf(struct btrfs_fs_info *fs_info, struct btrfs_path *path,
 		else
 			num_bytes = found.offset;
 
-		ulist_reinit(qgroups);
 		ret = btrfs_find_all_roots(NULL, fs_info, found.objectid, 0,
 					   &roots);
 		if (ret < 0)
 			goto out;
-		spin_lock(&fs_info->qgroup_lock);
-		seq = fs_info->qgroup_seq;
-		fs_info->qgroup_seq += roots->nnodes + 1; /* max refcnt */
-
-		new_roots = 0;
-		ret = qgroup_calc_old_refcnt(fs_info, 0, tmp, roots, qgroups,
-					     seq, &new_roots, 1);
-		if (ret < 0) {
-			spin_unlock(&fs_info->qgroup_lock);
-			ulist_free(roots);
-			goto out;
-		}
-
-		ret = qgroup_adjust_counters(fs_info, 0, num_bytes, qgroups,
-					     seq, 0, new_roots, 1);
-		if (ret < 0) {
-			spin_unlock(&fs_info->qgroup_lock);
-			ulist_free(roots);
+		/* For rescan, just pass old_roots as NULL */
+		ret = btrfs_qgroup_account_extent(trans, fs_info,
+				found.objectid, num_bytes, NULL, roots);
+		if (ret < 0)
 			goto out;
-		}
-		spin_unlock(&fs_info->qgroup_lock);
-		ulist_free(roots);
 	}
 out:
 	btrfs_put_tree_mod_seq(fs_info, &tree_mod_seq_elem);
@@ -2975,19 +2955,12 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 						     qgroup_rescan_work);
 	struct btrfs_path *path;
 	struct btrfs_trans_handle *trans = NULL;
-	struct ulist *tmp = NULL, *qgroups = NULL;
 	struct extent_buffer *scratch_leaf = NULL;
 	int err = -ENOMEM;
 
 	path = btrfs_alloc_path();
 	if (!path)
 		goto out;
-	qgroups = ulist_alloc(GFP_NOFS);
-	if (!qgroups)
-		goto out;
-	tmp = ulist_alloc(GFP_NOFS);
-	if (!tmp)
-		goto out;
 	scratch_leaf = kmalloc(sizeof(*scratch_leaf), GFP_NOFS);
 	if (!scratch_leaf)
 		goto out;
@@ -3003,7 +2976,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 			err = -EINTR;
 		} else {
 			err = qgroup_rescan_leaf(fs_info, path, trans,
-						 qgroups, tmp, scratch_leaf);
+						 scratch_leaf);
 		}
 		if (err > 0)
 			btrfs_commit_transaction(trans, fs_info->fs_root);
@@ -3013,8 +2986,6 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 
 out:
 	kfree(scratch_leaf);
-	ulist_free(qgroups);
-	ulist_free(tmp);
 	btrfs_free_path(path);
 
 	mutex_lock(&fs_info->qgroup_rescan_lock);
-- 
2.3.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 13/18] btrfs: qgroup: Switch to new extent-oriented qgroup mechanism.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (11 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 12/18] btrfs: qgroup: Switch rescan to new mechanism Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 14/18] btrfs: qgroup: Switch self test to " Qu Wenruo
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Switch from the old ref_node-based qgroup mechanism to the
extent-based one for normal operations.

The new mechanism should hugely reduce the overhead of the btrfs quota
system, and furthermore, the code and logic should be cleaner and
easier to maintain.
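
For reference, the recording half of the mechanism (added in patch 08)
looks roughly like the sketch below at delayed-ref time; the duplicate
handling shown is only an assumption based on the insert helper's
return convention, and allocation error handling is omitted:

	/* called with delayed_refs->lock held */
	struct btrfs_qgroup_extent_record *record;

	record = kmalloc(sizeof(*record), GFP_NOFS);
	record->bytenr = bytenr;
	record->num_bytes = num_bytes;
	record->old_roots = NULL;	/* filled in at commit time */

	/* returns the existing record if this bytenr is already tracked */
	if (btrfs_qgroup_insert_dirty_extent(delayed_refs, record))
		kfree(record);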

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 110 +++++--------------------------------------------
 fs/btrfs/transaction.c |  18 ++++++++
 2 files changed, 28 insertions(+), 100 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index dd6e47e..70723dc 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1997,26 +1997,8 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
 					   bytenr, num_bytes, parent,
 					   root_objectid, owner, offset,
 					   refs_to_add, extent_op);
-	if ((ret < 0 && ret != -EAGAIN) || (!ret && no_quota))
+	if ((ret < 0 && ret != -EAGAIN) || !ret)
 		goto out;
-	/*
-	 * Ok we were able to insert an inline extent and it appears to be a new
-	 * reference, deal with the qgroup accounting.
-	 */
-	if (!ret && !no_quota) {
-		ASSERT(root->fs_info->quota_enabled);
-		leaf = path->nodes[0];
-		btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
-		item = btrfs_item_ptr(leaf, path->slots[0],
-				      struct btrfs_extent_item);
-		if (btrfs_extent_refs(leaf, item) > (u64)refs_to_add)
-			type = BTRFS_QGROUP_OPER_ADD_SHARED;
-		btrfs_release_path(path);
-
-		ret = btrfs_qgroup_record_ref(trans, fs_info, root_objectid,
-					      bytenr, num_bytes, type, 0);
-		goto out;
-	}
 
 	/*
 	 * Ok we had -EAGAIN which means we didn't have space to insert and
@@ -2036,13 +2018,6 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
 	btrfs_mark_buffer_dirty(leaf);
 	btrfs_release_path(path);
 
-	if (!no_quota) {
-		ret = btrfs_qgroup_record_ref(trans, fs_info, root_objectid,
-					      bytenr, num_bytes, type, 0);
-		if (ret)
-			goto out;
-	}
-
 	path->reada = 1;
 	path->leave_spinning = 1;
 	/* now insert the actual backref */
@@ -2808,9 +2783,6 @@ again:
 		goto again;
 	}
 out:
-	ret = btrfs_delayed_qgroup_accounting(trans, root->fs_info);
-	if (ret)
-		return ret;
 	assert_qgroups_uptodate(trans);
 	return 0;
 }
@@ -6070,18 +6042,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 	}
 	btrfs_release_path(path);
 
-	/* Deal with the quota accounting */
-	if (!ret && last_ref && !no_quota) {
-		int mod_seq = 0;
-
-		if (owner_objectid >= BTRFS_FIRST_FREE_OBJECTID &&
-		    type == BTRFS_QGROUP_OPER_SUB_SHARED)
-			mod_seq = 1;
-
-		ret = btrfs_qgroup_record_ref(trans, info, root_objectid,
-					      bytenr, num_bytes, type,
-					      mod_seq);
-	}
 out:
 	btrfs_free_path(path);
 	return ret;
@@ -7017,13 +6977,6 @@ static int alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
 	btrfs_mark_buffer_dirty(path->nodes[0]);
 	btrfs_free_path(path);
 
-	/* Always set parent to 0 here since its exclusive anyway. */
-	ret = btrfs_qgroup_record_ref(trans, fs_info, root_objectid,
-				      ins->objectid, ins->offset,
-				      BTRFS_QGROUP_OPER_ADD_EXCL, 0);
-	if (ret)
-		return ret;
-
 	ret = update_block_group(trans, root, ins->objectid, ins->offset, 1);
 	if (ret) { /* -ENOENT, logic error */
 		btrfs_err(fs_info, "update block group failed for %llu %llu",
@@ -7105,14 +7058,6 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
 	btrfs_mark_buffer_dirty(leaf);
 	btrfs_free_path(path);
 
-	if (!no_quota) {
-		ret = btrfs_qgroup_record_ref(trans, fs_info, root_objectid,
-					      ins->objectid, num_bytes,
-					      BTRFS_QGROUP_OPER_ADD_EXCL, 0);
-		if (ret)
-			return ret;
-	}
-
 	ret = update_block_group(trans, root, ins->objectid, root->nodesize,
 				 1);
 	if (ret) { /* -ENOENT, logic error */
@@ -7454,12 +7399,18 @@ reada:
 	wc->reada_slot = slot;
 }
 
+/*
+ * TODO: Modify the related functions to add the related node/leaf to
+ * dirty_extent_root for later qgroup accounting.
+ *
+ * Currently, this function does nothing.
+ */
 static int account_leaf_items(struct btrfs_trans_handle *trans,
 			      struct btrfs_root *root,
 			      struct extent_buffer *eb)
 {
 	int nr = btrfs_header_nritems(eb);
-	int i, extent_type, ret;
+	int i, extent_type;
 	struct btrfs_key key;
 	struct btrfs_file_extent_item *fi;
 	u64 bytenr, num_bytes;
@@ -7482,13 +7433,6 @@ static int account_leaf_items(struct btrfs_trans_handle *trans,
 			continue;
 
 		num_bytes = btrfs_file_extent_disk_num_bytes(eb, fi);
-
-		ret = btrfs_qgroup_record_ref(trans, root->fs_info,
-					      root->objectid,
-					      bytenr, num_bytes,
-					      BTRFS_QGROUP_OPER_SUB_SUBTREE, 0);
-		if (ret)
-			return ret;
 	}
 	return 0;
 }
@@ -7557,6 +7501,8 @@ static int adjust_slots_upwards(struct btrfs_root *root,
 
 /*
  * root_eb is the subtree root and is locked before this function is called.
+ * TODO: Modify this function to add all nodes (including completely
+ * shared ones) to dirty_extent_root, so they get accounted in qgroup.
  */
 static int account_shared_subtree(struct btrfs_trans_handle *trans,
 				  struct btrfs_root *root,
@@ -7630,16 +7576,6 @@ walk_down:
 			btrfs_tree_read_lock(eb);
 			btrfs_set_lock_blocking_rw(eb, BTRFS_READ_LOCK);
 			path->locks[level] = BTRFS_READ_LOCK_BLOCKING;
-
-			ret = btrfs_qgroup_record_ref(trans, root->fs_info,
-						root->objectid,
-						child_bytenr,
-						root->nodesize,
-						BTRFS_QGROUP_OPER_SUB_SUBTREE,
-						0);
-			if (ret)
-				goto out;
-
 		}
 
 		if (level == 0) {
@@ -8232,24 +8168,6 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 				goto out_end_trans;
 			}
 
-			/*
-			 * Qgroup update accounting is run from
-			 * delayed ref handling. This usually works
-			 * out because delayed refs are normally the
-			 * only way qgroup updates are added. However,
-			 * we may have added updates during our tree
-			 * walk so run qgroups here to make sure we
-			 * don't lose any updates.
-			 */
-			ret = btrfs_delayed_qgroup_accounting(trans,
-							      root->fs_info);
-			if (ret)
-				printk_ratelimited(KERN_ERR "BTRFS: Failure %d "
-						   "running qgroup updates "
-						   "during snapshot delete. "
-						   "Quota is out of sync, "
-						   "rescan required.\n", ret);
-
 			btrfs_end_transaction_throttle(trans, tree_root);
 			if (!for_reloc && btrfs_need_cleaner_sleep(root)) {
 				pr_debug("BTRFS: drop snapshot early exit\n");
@@ -8303,14 +8221,6 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 	}
 	root_dropped = true;
 out_end_trans:
-	ret = btrfs_delayed_qgroup_accounting(trans, tree_root->fs_info);
-	if (ret)
-		printk_ratelimited(KERN_ERR "BTRFS: Failure %d "
-				   "running qgroup updates "
-				   "during snapshot delete. "
-				   "Quota is out of sync, "
-				   "rescan required.\n", ret);
-
 	btrfs_end_transaction_throttle(trans, tree_root);
 out_free:
 	kfree(wc);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 871b405..eaea170 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1927,6 +1927,13 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 		goto scrub_continue;
 	}
 
+	/* Record old roots for later qgroup accounting */
+	ret = btrfs_qgroup_prepare_account_extents(trans, root->fs_info);
+	if (ret) {
+		mutex_unlock(&root->fs_info->reloc_mutex);
+		goto scrub_continue;
+	}
+
 	/*
 	 * make sure none of the code above managed to slip in a
 	 * delayed item
@@ -1968,6 +1975,17 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 	 */
 	btrfs_free_log_root_tree(trans, root->fs_info);
 
+	/*
+	 * Since the fs roots are all committed, we can get accurate
+	 * new_roots. So let's do the quota accounting.
+	 */
+	ret = btrfs_qgroup_account_extents(trans, root->fs_info);
+	if (ret < 0) {
+		mutex_unlock(&root->fs_info->tree_log_mutex);
+		mutex_unlock(&root->fs_info->reloc_mutex);
+		goto scrub_continue;
+	}
+
 	ret = commit_cowonly_roots(trans, root);
 	if (ret) {
 		mutex_unlock(&root->fs_info->tree_log_mutex);
-- 
2.3.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 14/18] btrfs: qgroup: Switch self test to extent-oriented qgroup mechanism.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (12 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 13/18] btrfs: qgroup: Switch to new extent-oriented qgroup mechanism Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 15/18] btrfs: qgroup: Cleanup the old ref_node-oriented mechanism Qu Wenruo
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Since the self-test transaction doesn't have delayed_ref_roots, use
find_all_roots() and export btrfs_qgroup_account_extent() to simulate
it.
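
The resulting test pattern for each extent operation is (as in the
hunks below; error handling omitted):

	ret = btrfs_find_all_roots(&trans, fs_info, bytenr, 0, &old_roots);
	/* ... insert/remove the extent ref being tested ... */
	ret = btrfs_find_all_roots(&trans, fs_info, bytenr, 0, &new_roots);
	ret = btrfs_qgroup_account_extent(&trans, fs_info, bytenr, num_bytes,
					  old_roots, new_roots);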

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/qgroup.c             |   2 +-
 fs/btrfs/qgroup.h             |   5 ++
 fs/btrfs/tests/qgroup-tests.c | 109 ++++++++++++++++++++++++++++++++----------
 3 files changed, 89 insertions(+), 27 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 579f94a..cac7ded 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2346,7 +2346,7 @@ static int btrfs_qgroup_account(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
-static int
+int
 btrfs_qgroup_account_extent(struct btrfs_trans_handle *trans,
 			    struct btrfs_fs_info *fs_info,
 			    u64 bytenr, u64 num_bytes,
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index c3b08d6..80d7df8 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -104,6 +104,11 @@ int btrfs_qgroup_prepare_account_extents(struct btrfs_trans_handle *trans,
 struct btrfs_qgroup_extent_record
 *btrfs_qgroup_insert_dirty_extent(struct btrfs_delayed_ref_root *delayed_refs,
 				  struct btrfs_qgroup_extent_record *record);
+int
+btrfs_qgroup_account_extent(struct btrfs_trans_handle *trans,
+			    struct btrfs_fs_info *fs_info,
+			    u64 bytenr, u64 num_bytes,
+			    struct ulist *old_roots, struct ulist *new_roots);
 int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans,
 				 struct btrfs_fs_info *fs_info);
 int btrfs_delayed_qgroup_accounting(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/tests/qgroup-tests.c b/fs/btrfs/tests/qgroup-tests.c
index 73f299e..db61e3e 100644
--- a/fs/btrfs/tests/qgroup-tests.c
+++ b/fs/btrfs/tests/qgroup-tests.c
@@ -21,6 +21,7 @@
 #include "../transaction.h"
 #include "../disk-io.h"
 #include "../qgroup.h"
+#include "../backref.h"
 
 static void init_dummy_trans(struct btrfs_trans_handle *trans)
 {
@@ -227,6 +228,8 @@ static int test_no_shared_qgroup(struct btrfs_root *root)
 {
 	struct btrfs_trans_handle trans;
 	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct ulist *old_roots = NULL;
+	struct ulist *new_roots = NULL;
 	int ret;
 
 	init_dummy_trans(&trans);
@@ -238,10 +241,15 @@ static int test_no_shared_qgroup(struct btrfs_root *root)
 		return ret;
 	}
 
-	ret = btrfs_qgroup_record_ref(&trans, fs_info, 5, 4096, 4096,
-				      BTRFS_QGROUP_OPER_ADD_EXCL, 0);
+	/*
+	 * Since the test trans doesn't have the complicated delayed refs,
+	 * we can only call btrfs_qgroup_account_extent() directly to test
+	 * quota.
+	 */
+	ret = btrfs_find_all_roots(&trans, fs_info, 4096, 0, &old_roots);
 	if (ret) {
-		test_msg("Couldn't add space to a qgroup %d\n", ret);
+		ulist_free(old_roots);
+		test_msg("Couldn't find old roots: %d\n", ret);
 		return ret;
 	}
 
@@ -249,9 +257,18 @@ static int test_no_shared_qgroup(struct btrfs_root *root)
 	if (ret)
 		return ret;
 
-	ret = btrfs_delayed_qgroup_accounting(&trans, fs_info);
+	ret = btrfs_find_all_roots(&trans, fs_info, 4096, 0, &new_roots);
+	if (ret) {
+		ulist_free(old_roots);
+		ulist_free(new_roots);
+		test_msg("Couldn't find old roots: %d\n", ret);
+		return ret;
+	}
+
+	ret = btrfs_qgroup_account_extent(&trans, fs_info, 4096, 4096,
+					  old_roots, new_roots);
 	if (ret) {
-		test_msg("Delayed qgroup accounting failed %d\n", ret);
+		test_msg("Couldn't account space for a qgroup %d\n", ret);
 		return ret;
 	}
 
@@ -259,21 +276,32 @@ static int test_no_shared_qgroup(struct btrfs_root *root)
 		test_msg("Qgroup counts didn't match expected values\n");
 		return -EINVAL;
 	}
+	old_roots = NULL;
+	new_roots = NULL;
+
+	ret = btrfs_find_all_roots(&trans, fs_info, 4096, 0, &old_roots);
+	if (ret) {
+		ulist_free(old_roots);
+		test_msg("Couldn't find old roots: %d\n", ret);
+		return ret;
+	}
 
 	ret = remove_extent_item(root, 4096, 4096);
 	if (ret)
 		return -EINVAL;
 
-	ret = btrfs_qgroup_record_ref(&trans, fs_info, 5, 4096, 4096,
-				      BTRFS_QGROUP_OPER_SUB_EXCL, 0);
+	ret = btrfs_find_all_roots(&trans, fs_info, 4096, 0, &new_roots);
 	if (ret) {
-		test_msg("Couldn't remove space from the qgroup %d\n", ret);
-		return -EINVAL;
+		ulist_free(old_roots);
+		ulist_free(new_roots);
+		test_msg("Couldn't find old roots: %d\n", ret);
+		return ret;
 	}
 
-	ret = btrfs_delayed_qgroup_accounting(&trans, fs_info);
+	ret = btrfs_qgroup_account_extent(&trans, fs_info, 4096, 4096,
+					  old_roots, new_roots);
 	if (ret) {
-		test_msg("Qgroup accounting failed %d\n", ret);
+		test_msg("Couldn't account space for a qgroup %d\n", ret);
 		return -EINVAL;
 	}
 
@@ -294,6 +322,8 @@ static int test_multiple_refs(struct btrfs_root *root)
 {
 	struct btrfs_trans_handle trans;
 	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct ulist *old_roots = NULL;
+	struct ulist *new_roots = NULL;
 	int ret;
 
 	init_dummy_trans(&trans);
@@ -307,20 +337,29 @@ static int test_multiple_refs(struct btrfs_root *root)
 		return ret;
 	}
 
+	ret = btrfs_find_all_roots(&trans, fs_info, 4096, 0, &old_roots);
+	if (ret) {
+		ulist_free(old_roots);
+		test_msg("Couldn't find old roots: %d\n", ret);
+		return ret;
+	}
+
 	ret = insert_normal_tree_ref(root, 4096, 4096, 0, 5);
 	if (ret)
 		return ret;
 
-	ret = btrfs_qgroup_record_ref(&trans, fs_info, 5, 4096, 4096,
-				      BTRFS_QGROUP_OPER_ADD_EXCL, 0);
+	ret = btrfs_find_all_roots(&trans, fs_info, 4096, 0, &new_roots);
 	if (ret) {
-		test_msg("Couldn't add space to a qgroup %d\n", ret);
+		ulist_free(old_roots);
+		ulist_free(new_roots);
+		test_msg("Couldn't find old roots: %d\n", ret);
 		return ret;
 	}
 
-	ret = btrfs_delayed_qgroup_accounting(&trans, fs_info);
+	ret = btrfs_qgroup_account_extent(&trans, fs_info, 4096, 4096,
+					  old_roots, new_roots);
 	if (ret) {
-		test_msg("Delayed qgroup accounting failed %d\n", ret);
+		test_msg("Couldn't account space for a qgroup %d\n", ret);
 		return ret;
 	}
 
@@ -329,20 +368,29 @@ static int test_multiple_refs(struct btrfs_root *root)
 		return -EINVAL;
 	}
 
+	ret = btrfs_find_all_roots(&trans, fs_info, 4096, 0, &old_roots);
+	if (ret) {
+		ulist_free(old_roots);
+		test_msg("Couldn't find old roots: %d\n", ret);
+		return ret;
+	}
+
 	ret = add_tree_ref(root, 4096, 4096, 0, 256);
 	if (ret)
 		return ret;
 
-	ret = btrfs_qgroup_record_ref(&trans, fs_info, 256, 4096, 4096,
-				      BTRFS_QGROUP_OPER_ADD_SHARED, 0);
+	ret = btrfs_find_all_roots(&trans, fs_info, 4096, 0, &new_roots);
 	if (ret) {
-		test_msg("Qgroup record ref failed %d\n", ret);
+		ulist_free(old_roots);
+		ulist_free(new_roots);
+		test_msg("Couldn't find old roots: %d\n", ret);
 		return ret;
 	}
 
-	ret = btrfs_delayed_qgroup_accounting(&trans, fs_info);
+	ret = btrfs_qgroup_account_extent(&trans, fs_info, 4096, 4096,
+					  old_roots, new_roots);
 	if (ret) {
-		test_msg("Qgroup accounting failed %d\n", ret);
+		test_msg("Couldn't account space for a qgroup %d\n", ret);
 		return ret;
 	}
 
@@ -356,20 +404,29 @@ static int test_multiple_refs(struct btrfs_root *root)
 		return -EINVAL;
 	}
 
+	ret = btrfs_find_all_roots(&trans, fs_info, 4096, 0, &old_roots);
+	if (ret) {
+		ulist_free(old_roots);
+		test_msg("Couldn't find old roots: %d\n", ret);
+		return ret;
+	}
+
 	ret = remove_extent_ref(root, 4096, 4096, 0, 256);
 	if (ret)
 		return ret;
 
-	ret = btrfs_qgroup_record_ref(&trans, fs_info, 256, 4096, 4096,
-				      BTRFS_QGROUP_OPER_SUB_SHARED, 0);
+	ret = btrfs_find_all_roots(&trans, fs_info, 4096, 0, &new_roots);
 	if (ret) {
-		test_msg("Qgroup record ref failed %d\n", ret);
+		ulist_free(old_roots);
+		ulist_free(new_roots);
+		test_msg("Couldn't find old roots: %d\n", ret);
 		return ret;
 	}
 
-	ret = btrfs_delayed_qgroup_accounting(&trans, fs_info);
+	ret = btrfs_qgroup_account_extent(&trans, fs_info, 4096, 4096,
+					  old_roots, new_roots);
 	if (ret) {
-		test_msg("Qgroup accounting failed %d\n", ret);
+		test_msg("Couldn't account space for a qgroup %d\n", ret);
 		return ret;
 	}
 
-- 
2.3.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 15/18] btrfs: qgroup: Cleanup the old ref_node-oriented mechanism.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (13 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 14/18] btrfs: qgroup: Switch self test to " Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 16/18] btrfs: ulist: Add ulist_del() function Qu Wenruo
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Goodbye, old mechanism.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/ctree.h             |   2 +-
 fs/btrfs/extent-tree.c       |   5 -
 fs/btrfs/qgroup.c            | 905 +------------------------------------------
 fs/btrfs/qgroup.h            |  49 ---
 include/trace/events/btrfs.h |  55 ---
 5 files changed, 3 insertions(+), 1013 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index f9c89ca..be6f9ce 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1703,7 +1703,7 @@ struct btrfs_fs_info {
 	/* list of dirty qgroups to be written at next commit */
 	struct list_head dirty_qgroups;
 
-	/* used by btrfs_qgroup_record_ref for an efficient tree traversal */
+	/* used by qgroup for an efficient tree traversal */
 	u64 qgroup_seq;
 
 	/* qgroup rescan items */
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 70723dc..265c36b 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1981,7 +1981,6 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
 	u64 refs;
 	int ret;
 	int no_quota = node->no_quota;
-	enum btrfs_qgroup_operation_type type = BTRFS_QGROUP_OPER_ADD_EXCL;
 
 	path = btrfs_alloc_path();
 	if (!path)
@@ -2009,8 +2008,6 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
 	btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
 	item = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item);
 	refs = btrfs_extent_refs(leaf, item);
-	if (refs)
-		type = BTRFS_QGROUP_OPER_ADD_SHARED;
 	btrfs_set_extent_refs(leaf, item, refs + refs_to_add);
 	if (extent_op)
 		__run_delayed_extent_op(extent_op, leaf, item);
@@ -5799,7 +5796,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 	u64 bytenr = node->bytenr;
 	u64 num_bytes = node->num_bytes;
 	int last_ref = 0;
-	enum btrfs_qgroup_operation_type type = BTRFS_QGROUP_OPER_SUB_EXCL;
 	bool skinny_metadata = btrfs_fs_incompat(root->fs_info,
 						 SKINNY_METADATA);
 
@@ -5980,7 +5976,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 	refs -= refs_to_drop;
 
 	if (refs > 0) {
-		type = BTRFS_QGROUP_OPER_SUB_SHARED;
 		if (extent_op)
 			__run_delayed_extent_op(extent_op, leaf, ei);
 		/*
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index cac7ded..01612d7 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -34,6 +34,7 @@
 #include "extent_io.h"
 #include "qgroup.h"
 
+
 /* TODO XXX FIXME
  *  - subvol delete -> delete when ref goes to 0? delete limits also?
  *  - reorganize keys
@@ -1237,172 +1238,6 @@ out:
 	return ret;
 }
 
-static int comp_oper_exist(struct btrfs_qgroup_operation *oper1,
-			   struct btrfs_qgroup_operation *oper2)
-{
-	/*
-	 * Ignore seq and type here, we're looking for any operation
-	 * at all related to this extent on that root.
-	 */
-	if (oper1->bytenr < oper2->bytenr)
-		return -1;
-	if (oper1->bytenr > oper2->bytenr)
-		return 1;
-	if (oper1->ref_root < oper2->ref_root)
-		return -1;
-	if (oper1->ref_root > oper2->ref_root)
-		return 1;
-	return 0;
-}
-
-static int qgroup_oper_exists(struct btrfs_fs_info *fs_info,
-			      struct btrfs_qgroup_operation *oper)
-{
-	struct rb_node *n;
-	struct btrfs_qgroup_operation *cur;
-	int cmp;
-
-	spin_lock(&fs_info->qgroup_op_lock);
-	n = fs_info->qgroup_op_tree.rb_node;
-	while (n) {
-		cur = rb_entry(n, struct btrfs_qgroup_operation, n);
-		cmp = comp_oper_exist(cur, oper);
-		if (cmp < 0) {
-			n = n->rb_right;
-		} else if (cmp) {
-			n = n->rb_left;
-		} else {
-			spin_unlock(&fs_info->qgroup_op_lock);
-			return -EEXIST;
-		}
-	}
-	spin_unlock(&fs_info->qgroup_op_lock);
-	return 0;
-}
-
-static int comp_oper(struct btrfs_qgroup_operation *oper1,
-		     struct btrfs_qgroup_operation *oper2)
-{
-	if (oper1->bytenr < oper2->bytenr)
-		return -1;
-	if (oper1->bytenr > oper2->bytenr)
-		return 1;
-	if (oper1->seq < oper2->seq)
-		return -1;
-	if (oper1->seq > oper2->seq)
-		return 1;
-	if (oper1->ref_root < oper2->ref_root)
-		return -1;
-	if (oper1->ref_root > oper2->ref_root)
-		return 1;
-	if (oper1->type < oper2->type)
-		return -1;
-	if (oper1->type > oper2->type)
-		return 1;
-	return 0;
-}
-
-static int insert_qgroup_oper(struct btrfs_fs_info *fs_info,
-			      struct btrfs_qgroup_operation *oper)
-{
-	struct rb_node **p;
-	struct rb_node *parent = NULL;
-	struct btrfs_qgroup_operation *cur;
-	int cmp;
-
-	spin_lock(&fs_info->qgroup_op_lock);
-	p = &fs_info->qgroup_op_tree.rb_node;
-	while (*p) {
-		parent = *p;
-		cur = rb_entry(parent, struct btrfs_qgroup_operation, n);
-		cmp = comp_oper(cur, oper);
-		if (cmp < 0) {
-			p = &(*p)->rb_right;
-		} else if (cmp) {
-			p = &(*p)->rb_left;
-		} else {
-			spin_unlock(&fs_info->qgroup_op_lock);
-			return -EEXIST;
-		}
-	}
-	rb_link_node(&oper->n, parent, p);
-	rb_insert_color(&oper->n, &fs_info->qgroup_op_tree);
-	spin_unlock(&fs_info->qgroup_op_lock);
-	return 0;
-}
-
-/*
- * Record a quota operation for processing later on.
- * @trans: the transaction we are adding the delayed op to.
- * @fs_info: the fs_info for this fs.
- * @ref_root: the root of the reference we are acting on,
- * @bytenr: the bytenr we are acting on.
- * @num_bytes: the number of bytes in the reference.
- * @type: the type of operation this is.
- * @mod_seq: do we need to get a sequence number for looking up roots.
- *
- * We just add it to our trans qgroup_ref_list and carry on and process these
- * operations in order at some later point.  If the reference root isn't a fs
- * root then we don't bother with doing anything.
- *
- * MUST BE HOLDING THE REF LOCK.
- */
-int btrfs_qgroup_record_ref(struct btrfs_trans_handle *trans,
-			    struct btrfs_fs_info *fs_info, u64 ref_root,
-			    u64 bytenr, u64 num_bytes,
-			    enum btrfs_qgroup_operation_type type, int mod_seq)
-{
-	struct btrfs_qgroup_operation *oper;
-	int ret;
-
-	if (!is_fstree(ref_root) || !fs_info->quota_enabled)
-		return 0;
-
-	oper = kmalloc(sizeof(*oper), GFP_NOFS);
-	if (!oper)
-		return -ENOMEM;
-
-	oper->ref_root = ref_root;
-	oper->bytenr = bytenr;
-	oper->num_bytes = num_bytes;
-	oper->type = type;
-	oper->seq = atomic_inc_return(&fs_info->qgroup_op_seq);
-	INIT_LIST_HEAD(&oper->elem.list);
-	oper->elem.seq = 0;
-
-	trace_btrfs_qgroup_record_ref(oper);
-
-	if (type == BTRFS_QGROUP_OPER_SUB_SUBTREE) {
-		/*
-		 * If any operation for this bytenr/ref_root combo
-		 * exists, then we know it's not exclusively owned and
-		 * shouldn't be queued up.
-		 *
-		 * This also catches the case where we have a cloned
-		 * extent that gets queued up multiple times during
-		 * drop snapshot.
-		 */
-		if (qgroup_oper_exists(fs_info, oper)) {
-			kfree(oper);
-			return 0;
-		}
-	}
-
-	ret = insert_qgroup_oper(fs_info, oper);
-	if (ret) {
-		/* Shouldn't happen so have an assert for developers */
-		ASSERT(0);
-		kfree(oper);
-		return ret;
-	}
-	list_add_tail(&oper->list, &trans->qgroup_ref_list);
-
-	if (mod_seq)
-		btrfs_get_tree_mod_seq(fs_info, &oper->elem);
-
-	return 0;
-}
-
 int btrfs_qgroup_prepare_account_extents(struct btrfs_trans_handle *trans,
 					 struct btrfs_fs_info *fs_info)
 {
@@ -1456,305 +1291,6 @@ struct btrfs_qgroup_extent_record
 	return NULL;
 }
 
-/*
- * The easy accounting, if we are adding/removing the only ref for an extent
- * then this qgroup and all of the parent qgroups get their refrence and
- * exclusive counts adjusted.
- */
-static int qgroup_excl_accounting(struct btrfs_fs_info *fs_info,
-				  struct btrfs_qgroup_operation *oper)
-{
-	struct btrfs_qgroup *qgroup;
-	struct ulist *tmp;
-	struct btrfs_qgroup_list *glist;
-	struct ulist_node *unode;
-	struct ulist_iterator uiter;
-	int sign = 0;
-	int ret = 0;
-
-	tmp = ulist_alloc(GFP_NOFS);
-	if (!tmp)
-		return -ENOMEM;
-
-	spin_lock(&fs_info->qgroup_lock);
-	if (!fs_info->quota_root)
-		goto out;
-	qgroup = find_qgroup_rb(fs_info, oper->ref_root);
-	if (!qgroup)
-		goto out;
-	switch (oper->type) {
-	case BTRFS_QGROUP_OPER_ADD_EXCL:
-		sign = 1;
-		break;
-	case BTRFS_QGROUP_OPER_SUB_EXCL:
-		sign = -1;
-		break;
-	default:
-		ASSERT(0);
-	}
-	qgroup->rfer += sign * oper->num_bytes;
-	qgroup->rfer_cmpr += sign * oper->num_bytes;
-
-	WARN_ON(sign < 0 && qgroup->excl < oper->num_bytes);
-	qgroup->excl += sign * oper->num_bytes;
-	qgroup->excl_cmpr += sign * oper->num_bytes;
-
-	qgroup_dirty(fs_info, qgroup);
-
-	/* Get all of the parent groups that contain this qgroup */
-	list_for_each_entry(glist, &qgroup->groups, next_group) {
-		ret = ulist_add(tmp, glist->group->qgroupid,
-				ptr_to_u64(glist->group), GFP_ATOMIC);
-		if (ret < 0)
-			goto out;
-	}
-
-	/* Iterate all of the parents and adjust their reference counts */
-	ULIST_ITER_INIT(&uiter);
-	while ((unode = ulist_next(tmp, &uiter))) {
-		qgroup = u64_to_ptr(unode->aux);
-		qgroup->rfer += sign * oper->num_bytes;
-		qgroup->rfer_cmpr += sign * oper->num_bytes;
-		WARN_ON(sign < 0 && qgroup->excl < oper->num_bytes);
-		qgroup->excl += sign * oper->num_bytes;
-		qgroup->excl_cmpr += sign * oper->num_bytes;
-		qgroup_dirty(fs_info, qgroup);
-
-		/* Add any parents of the parents */
-		list_for_each_entry(glist, &qgroup->groups, next_group) {
-			ret = ulist_add(tmp, glist->group->qgroupid,
-					ptr_to_u64(glist->group), GFP_ATOMIC);
-			if (ret < 0)
-				goto out;
-		}
-	}
-	ret = 0;
-out:
-	spin_unlock(&fs_info->qgroup_lock);
-	ulist_free(tmp);
-	return ret;
-}
-
-/*
- * Walk all of the roots that pointed to our bytenr and adjust their refcnts as
- * properly.
- */
-static int qgroup_calc_old_refcnt(struct btrfs_fs_info *fs_info,
-				  u64 root_to_skip, struct ulist *tmp,
-				  struct ulist *roots, struct ulist *qgroups,
-				  u64 seq, int *old_roots, int rescan)
-{
-	struct ulist_node *unode;
-	struct ulist_iterator uiter;
-	struct ulist_node *tmp_unode;
-	struct ulist_iterator tmp_uiter;
-	struct btrfs_qgroup *qg;
-	int ret;
-
-	ULIST_ITER_INIT(&uiter);
-	while ((unode = ulist_next(roots, &uiter))) {
-		/* We don't count our current root here */
-		if (unode->val == root_to_skip)
-			continue;
-		qg = find_qgroup_rb(fs_info, unode->val);
-		if (!qg)
-			continue;
-		/*
-		 * We could have a pending removal of this same ref so we may
-		 * not have actually found our ref root when doing
-		 * btrfs_find_all_roots, so we need to keep track of how many
-		 * old roots we find in case we removed ours and added a
-		 * different one at the same time.  I don't think this could
-		 * happen in practice but that sort of thinking leads to pain
-		 * and suffering and to the dark side.
-		 */
-		(*old_roots)++;
-
-		ulist_reinit(tmp);
-		ret = ulist_add(qgroups, qg->qgroupid, ptr_to_u64(qg),
-				GFP_ATOMIC);
-		if (ret < 0)
-			return ret;
-		ret = ulist_add(tmp, qg->qgroupid, ptr_to_u64(qg), GFP_ATOMIC);
-		if (ret < 0)
-			return ret;
-		ULIST_ITER_INIT(&tmp_uiter);
-		while ((tmp_unode = ulist_next(tmp, &tmp_uiter))) {
-			struct btrfs_qgroup_list *glist;
-			int mod;
-
-			qg = u64_to_ptr(tmp_unode->aux);
-			/*
-			 * We use this sequence number to keep from having to
-			 * run the whole list and 0 out the refcnt every time.
-			 * We basically use sequnce as the known 0 count and
-			 * then add 1 everytime we see a qgroup.  This is how we
-			 * get how many of the roots actually point up to the
-			 * upper level qgroups in order to determine exclusive
-			 * counts.
-			 *
-			 * For rescan none of the extent is recorded before so
-			 * we just don't add old_refcnt.
-			 */
-			if (rescan)
-				mod = 0;
-			else
-				mod = 1;
-			btrfs_qgroup_update_old_refcnt(qg, seq, mod);
-			btrfs_qgroup_update_new_refcnt(qg, seq, 1);
-			list_for_each_entry(glist, &qg->groups, next_group) {
-				ret = ulist_add(qgroups, glist->group->qgroupid,
-						ptr_to_u64(glist->group),
-						GFP_ATOMIC);
-				if (ret < 0)
-					return ret;
-				ret = ulist_add(tmp, glist->group->qgroupid,
-						ptr_to_u64(glist->group),
-						GFP_ATOMIC);
-				if (ret < 0)
-					return ret;
-			}
-		}
-	}
-	return 0;
-}
-
-/*
- * We need to walk forward in our operation tree and account for any roots that
- * were deleted after we made this operation.
- */
-static int qgroup_account_deleted_refs(struct btrfs_fs_info *fs_info,
-				       struct btrfs_qgroup_operation *oper,
-				       struct ulist *tmp,
-				       struct ulist *qgroups, u64 seq,
-				       int *old_roots)
-{
-	struct ulist_node *unode;
-	struct ulist_iterator uiter;
-	struct btrfs_qgroup *qg;
-	struct btrfs_qgroup_operation *tmp_oper;
-	struct rb_node *n;
-	int ret;
-
-	ulist_reinit(tmp);
-
-	/*
-	 * We only walk forward in the tree since we're only interested in
-	 * removals that happened _after_  our operation.
-	 */
-	spin_lock(&fs_info->qgroup_op_lock);
-	n = rb_next(&oper->n);
-	spin_unlock(&fs_info->qgroup_op_lock);
-	if (!n)
-		return 0;
-	tmp_oper = rb_entry(n, struct btrfs_qgroup_operation, n);
-	while (tmp_oper->bytenr == oper->bytenr) {
-		/*
-		 * If it's not a removal we don't care, additions work out
-		 * properly with our refcnt tracking.
-		 */
-		if (tmp_oper->type != BTRFS_QGROUP_OPER_SUB_SHARED &&
-		    tmp_oper->type != BTRFS_QGROUP_OPER_SUB_EXCL)
-			goto next;
-		qg = find_qgroup_rb(fs_info, tmp_oper->ref_root);
-		if (!qg)
-			goto next;
-		ret = ulist_add(qgroups, qg->qgroupid, ptr_to_u64(qg),
-				GFP_ATOMIC);
-		if (ret) {
-			if (ret < 0)
-				return ret;
-			/*
-			 * We only want to increase old_roots if this qgroup is
-			 * not already in the list of qgroups.  If it is already
-			 * there then that means it must have been re-added or
-			 * the delete will be discarded because we had an
-			 * existing ref that we haven't looked up yet.  In this
-			 * case we don't want to increase old_roots.  So if ret
-			 * == 1 then we know that this is the first time we've
-			 * seen this qgroup and we can bump the old_roots.
-			 */
-			(*old_roots)++;
-			ret = ulist_add(tmp, qg->qgroupid, ptr_to_u64(qg),
-					GFP_ATOMIC);
-			if (ret < 0)
-				return ret;
-		}
-next:
-		spin_lock(&fs_info->qgroup_op_lock);
-		n = rb_next(&tmp_oper->n);
-		spin_unlock(&fs_info->qgroup_op_lock);
-		if (!n)
-			break;
-		tmp_oper = rb_entry(n, struct btrfs_qgroup_operation, n);
-	}
-
-	/* Ok now process the qgroups we found */
-	ULIST_ITER_INIT(&uiter);
-	while ((unode = ulist_next(tmp, &uiter))) {
-		struct btrfs_qgroup_list *glist;
-
-		qg = u64_to_ptr(unode->aux);
-		btrfs_qgroup_update_old_refcnt(qg, seq, 1);
-		btrfs_qgroup_update_new_refcnt(qg, seq, 1);
-		list_for_each_entry(glist, &qg->groups, next_group) {
-			ret = ulist_add(qgroups, glist->group->qgroupid,
-					ptr_to_u64(glist->group), GFP_ATOMIC);
-			if (ret < 0)
-				return ret;
-			ret = ulist_add(tmp, glist->group->qgroupid,
-					ptr_to_u64(glist->group), GFP_ATOMIC);
-			if (ret < 0)
-				return ret;
-		}
-	}
-	return 0;
-}
-
-/* Add refcnt for the newly added reference. */
-static int qgroup_calc_new_refcnt(struct btrfs_fs_info *fs_info,
-				  struct btrfs_qgroup_operation *oper,
-				  struct btrfs_qgroup *qgroup,
-				  struct ulist *tmp, struct ulist *qgroups,
-				  u64 seq)
-{
-	struct ulist_node *unode;
-	struct ulist_iterator uiter;
-	struct btrfs_qgroup *qg;
-	int ret;
-
-	ulist_reinit(tmp);
-	ret = ulist_add(qgroups, qgroup->qgroupid, ptr_to_u64(qgroup),
-			GFP_ATOMIC);
-	if (ret < 0)
-		return ret;
-	ret = ulist_add(tmp, qgroup->qgroupid, ptr_to_u64(qgroup),
-			GFP_ATOMIC);
-	if (ret < 0)
-		return ret;
-	ULIST_ITER_INIT(&uiter);
-	while ((unode = ulist_next(tmp, &uiter))) {
-		struct btrfs_qgroup_list *glist;
-
-		qg = u64_to_ptr(unode->aux);
-		if (oper->type == BTRFS_QGROUP_OPER_ADD_SHARED)
-			btrfs_qgroup_update_new_refcnt(qg, seq, 1);
-		else
-			btrfs_qgroup_update_old_refcnt(qg, seq, 1);
-		list_for_each_entry(glist, &qg->groups, next_group) {
-			ret = ulist_add(tmp, glist->group->qgroupid,
-					ptr_to_u64(glist->group), GFP_ATOMIC);
-			if (ret < 0)
-				return ret;
-			ret = ulist_add(qgroups, glist->group->qgroupid,
-					ptr_to_u64(glist->group), GFP_ATOMIC);
-			if (ret < 0)
-				return ret;
-		}
-	}
-	return 0;
-}
-
 #define UPDATE_NEW	0
 #define UPDATE_OLD	1
 /*
@@ -1816,6 +1352,7 @@ static int qgroup_update_refcnt(struct btrfs_fs_info *fs_info,
 /*
  * Update qgroup rfer/excl counters.
  * Rfer update is easy, codes can explain themselves.
+ *
  * Excl update is tricky, the update is split into 2 part.
  * Part 1: Possible exclusive <-> sharing detect:
  *	|	A	|	!A	|
@@ -1933,419 +1470,6 @@ static int qgroup_update_counters(struct btrfs_fs_info *fs_info,
 	return 0;
 }
 
-/*
- * This adjusts the counters for all referenced qgroups if need be.
- */
-static int qgroup_adjust_counters(struct btrfs_fs_info *fs_info,
-				  u64 root_to_skip, u64 num_bytes,
-				  struct ulist *qgroups, u64 seq,
-				  int old_roots, int new_roots, int rescan)
-{
-	struct ulist_node *unode;
-	struct ulist_iterator uiter;
-	struct btrfs_qgroup *qg;
-	u64 cur_new_count, cur_old_count;
-
-	ULIST_ITER_INIT(&uiter);
-	while ((unode = ulist_next(qgroups, &uiter))) {
-		bool dirty = false;
-
-		qg = u64_to_ptr(unode->aux);
-		cur_old_count = btrfs_qgroup_get_old_refcnt(qg, seq);
-		cur_new_count = btrfs_qgroup_get_new_refcnt(qg, seq);
-
-		/*
-		 * Wasn't referenced before but is now, add to the reference
-		 * counters.
-		 */
-		if (cur_old_count == 0 && cur_new_count > 0) {
-			qg->rfer += num_bytes;
-			qg->rfer_cmpr += num_bytes;
-			dirty = true;
-		}
-
-		/*
-		 * Was referenced before but isn't now, subtract from the
-		 * reference counters.
-		 */
-		if (cur_old_count > 0 && cur_new_count == 0) {
-			qg->rfer -= num_bytes;
-			qg->rfer_cmpr -= num_bytes;
-			dirty = true;
-		}
-
-		/*
-		 * If our refcount was the same as the roots previously but our
-		 * new count isn't the same as the number of roots now then we
-		 * went from having a exclusive reference on this range to not.
-		 */
-		if (old_roots && cur_old_count == old_roots &&
-		    (cur_new_count != new_roots || new_roots == 0)) {
-			WARN_ON(cur_new_count != new_roots && new_roots == 0);
-			qg->excl -= num_bytes;
-			qg->excl_cmpr -= num_bytes;
-			dirty = true;
-		}
-
-		/*
-		 * If we didn't reference all the roots before but now we do we
-		 * have an exclusive reference to this range.
-		 */
-		if ((!old_roots || (old_roots && cur_old_count != old_roots))
-		    && cur_new_count == new_roots) {
-			qg->excl += num_bytes;
-			qg->excl_cmpr += num_bytes;
-			dirty = true;
-		}
-
-		if (dirty)
-			qgroup_dirty(fs_info, qg);
-	}
-	return 0;
-}
-
-/*
- * If we removed a data extent and there were other references for that bytenr
- * then we need to lookup all referenced roots to make sure we still don't
- * reference this bytenr.  If we do then we can just discard this operation.
- */
-static int check_existing_refs(struct btrfs_trans_handle *trans,
-			       struct btrfs_fs_info *fs_info,
-			       struct btrfs_qgroup_operation *oper)
-{
-	struct ulist *roots = NULL;
-	struct ulist_node *unode;
-	struct ulist_iterator uiter;
-	int ret = 0;
-
-	ret = btrfs_find_all_roots(trans, fs_info, oper->bytenr,
-				   oper->elem.seq, &roots);
-	if (ret < 0)
-		return ret;
-	ret = 0;
-
-	ULIST_ITER_INIT(&uiter);
-	while ((unode = ulist_next(roots, &uiter))) {
-		if (unode->val == oper->ref_root) {
-			ret = 1;
-			break;
-		}
-	}
-	ulist_free(roots);
-	btrfs_put_tree_mod_seq(fs_info, &oper->elem);
-
-	return ret;
-}
-
-/*
- * If we share a reference across multiple roots then we may need to adjust
- * various qgroups referenced and exclusive counters.  The basic premise is this
- *
- * 1) We have seq to represent a 0 count.  Instead of looping through all of the
- * qgroups and resetting their refcount to 0 we just constantly bump this
- * sequence number to act as the base reference count.  This means that if
- * anybody is equal to or below this sequence they were never referenced.  We
- * jack this sequence up by the number of roots we found each time in order to
- * make sure we don't have any overlap.
- *
- * 2) We first search all the roots that reference the area _except_ the root
- * we're acting on currently.  This makes up the old_refcnt of all the qgroups
- * before.
- *
- * 3) We walk all of the qgroups referenced by the root we are currently acting
- * on, and will either adjust old_refcnt in the case of a removal or the
- * new_refcnt in the case of an addition.
- *
- * 4) Finally we walk all the qgroups that are referenced by this range
- * including the root we are acting on currently.  We will adjust the counters
- * based on the number of roots we had and will have after this operation.
- *
- * Take this example as an illustration
- *
- *			[qgroup 1/0]
- *		     /         |          \
- *		[qg 0/0]   [qg 0/1]	[qg 0/2]
- *		   \          |            /
- *		  [	   extent	    ]
- *
- * Say we are adding a reference that is covered by qg 0/0.  The first step
- * would give a refcnt of 1 to qg 0/1 and 0/2 and a refcnt of 2 to qg 1/0 with
- * old_roots being 2.  Because it is adding new_roots will be 1.  We then go
- * through qg 0/0 which will get the new_refcnt set to 1 and add 1 to qg 1/0's
- * new_refcnt, bringing it to 3.  We then walk through all of the qgroups, we
- * notice that the old refcnt for qg 0/0 < the new refcnt, so we added a
- * reference and thus must add the size to the referenced bytes.  Everything
- * else is the same so nothing else changes.
- */
-static int qgroup_shared_accounting(struct btrfs_trans_handle *trans,
-				    struct btrfs_fs_info *fs_info,
-				    struct btrfs_qgroup_operation *oper)
-{
-	struct ulist *roots = NULL;
-	struct ulist *qgroups, *tmp;
-	struct btrfs_qgroup *qgroup;
-	struct seq_list elem = {};
-	u64 seq;
-	int old_roots = 0;
-	int new_roots = 0;
-	int ret = 0;
-
-	if (oper->elem.seq) {
-		ret = check_existing_refs(trans, fs_info, oper);
-		if (ret < 0)
-			return ret;
-		if (ret)
-			return 0;
-	}
-
-	qgroups = ulist_alloc(GFP_NOFS);
-	if (!qgroups)
-		return -ENOMEM;
-
-	tmp = ulist_alloc(GFP_NOFS);
-	if (!tmp) {
-		ulist_free(qgroups);
-		return -ENOMEM;
-	}
-
-	btrfs_get_tree_mod_seq(fs_info, &elem);
-	ret = btrfs_find_all_roots(trans, fs_info, oper->bytenr, elem.seq,
-				   &roots);
-	btrfs_put_tree_mod_seq(fs_info, &elem);
-	if (ret < 0) {
-		ulist_free(qgroups);
-		ulist_free(tmp);
-		return ret;
-	}
-	spin_lock(&fs_info->qgroup_lock);
-	qgroup = find_qgroup_rb(fs_info, oper->ref_root);
-	if (!qgroup)
-		goto out;
-	seq = fs_info->qgroup_seq;
-
-	/*
-	 * So roots is the list of all the roots currently pointing at the
-	 * bytenr, including the ref we are adding if we are adding, or not if
-	 * we are removing a ref.  So we pass in the ref_root to skip that root
-	 * in our calculations.  We set old_refnct and new_refcnt cause who the
-	 * hell knows what everything looked like before, and it doesn't matter
-	 * except...
-	 */
-	ret = qgroup_calc_old_refcnt(fs_info, oper->ref_root, tmp, roots, qgroups,
-				     seq, &old_roots, 0);
-	if (ret < 0)
-		goto out;
-
-	/*
-	 * Now adjust the refcounts of the qgroups that care about this
-	 * reference, either the old_count in the case of removal or new_count
-	 * in the case of an addition.
-	 */
-	ret = qgroup_calc_new_refcnt(fs_info, oper, qgroup, tmp, qgroups,
-				     seq);
-	if (ret < 0)
-		goto out;
-
-	/*
-	 * ...in the case of removals.  If we had a removal before we got around
-	 * to processing this operation then we need to find that guy and count
-	 * his references as if they really existed so we don't end up screwing
-	 * up the exclusive counts.  Then whenever we go to process the delete
-	 * everything will be grand and we can account for whatever exclusive
-	 * changes need to be made there.  We also have to pass in old_roots so
-	 * we have an accurate count of the roots as it pertains to this
-	 * operations view of the world.
-	 */
-	ret = qgroup_account_deleted_refs(fs_info, oper, tmp, qgroups, seq,
-					  &old_roots);
-	if (ret < 0)
-		goto out;
-
-	/*
-	 * We are adding our root, need to adjust up the number of roots,
-	 * otherwise old_roots is the number of roots we want.
-	 */
-	if (oper->type == BTRFS_QGROUP_OPER_ADD_SHARED) {
-		new_roots = old_roots + 1;
-	} else {
-		new_roots = old_roots;
-		old_roots++;
-	}
-
-	/*
-	 * Bump qgroup_seq to avoid seq overlap
-	 * XXX: This makes qgroup_seq mismatch with oper->seq.
-	 */
-	fs_info->qgroup_seq += old_roots + 1;
-
-
-	/*
-	 * And now the magic happens, bless Arne for having a pretty elegant
-	 * solution for this.
-	 */
-	qgroup_adjust_counters(fs_info, oper->ref_root, oper->num_bytes,
-			       qgroups, seq, old_roots, new_roots, 0);
-out:
-	spin_unlock(&fs_info->qgroup_lock);
-	ulist_free(qgroups);
-	ulist_free(roots);
-	ulist_free(tmp);
-	return ret;
-}
-
-/*
- * Process a reference to a shared subtree. This type of operation is
- * queued during snapshot removal when we encounter extents which are
- * shared between more than one root.
- */
-static int qgroup_subtree_accounting(struct btrfs_trans_handle *trans,
-				     struct btrfs_fs_info *fs_info,
-				     struct btrfs_qgroup_operation *oper)
-{
-	struct ulist *roots = NULL;
-	struct ulist_node *unode;
-	struct ulist_iterator uiter;
-	struct btrfs_qgroup_list *glist;
-	struct ulist *parents;
-	int ret = 0;
-	int err;
-	struct btrfs_qgroup *qg;
-	u64 root_obj = 0;
-	struct seq_list elem = {};
-
-	parents = ulist_alloc(GFP_NOFS);
-	if (!parents)
-		return -ENOMEM;
-
-	btrfs_get_tree_mod_seq(fs_info, &elem);
-	ret = btrfs_find_all_roots(trans, fs_info, oper->bytenr,
-				   elem.seq, &roots);
-	btrfs_put_tree_mod_seq(fs_info, &elem);
-	if (ret < 0)
-		goto out;
-
-	if (roots->nnodes != 1)
-		goto out;
-
-	ULIST_ITER_INIT(&uiter);
-	unode = ulist_next(roots, &uiter); /* Only want 1 so no need to loop */
-	/*
-	 * If we find our ref root then that means all refs
-	 * this extent has to the root have not yet been
-	 * deleted. In that case, we do nothing and let the
-	 * last ref for this bytenr drive our update.
-	 *
-	 * This can happen for example if an extent is
-	 * referenced multiple times in a snapshot (clone,
-	 * etc). If we are in the middle of snapshot removal,
-	 * queued updates for such an extent will find the
-	 * root if we have not yet finished removing the
-	 * snapshot.
-	 */
-	if (unode->val == oper->ref_root)
-		goto out;
-
-	root_obj = unode->val;
-	BUG_ON(!root_obj);
-
-	spin_lock(&fs_info->qgroup_lock);
-	qg = find_qgroup_rb(fs_info, root_obj);
-	if (!qg)
-		goto out_unlock;
-
-	qg->excl += oper->num_bytes;
-	qg->excl_cmpr += oper->num_bytes;
-	qgroup_dirty(fs_info, qg);
-
-	/*
-	 * Adjust counts for parent groups. First we find all
-	 * parents, then in the 2nd loop we do the adjustment
-	 * while adding parents of the parents to our ulist.
-	 */
-	list_for_each_entry(glist, &qg->groups, next_group) {
-		err = ulist_add(parents, glist->group->qgroupid,
-				ptr_to_u64(glist->group), GFP_ATOMIC);
-		if (err < 0) {
-			ret = err;
-			goto out_unlock;
-		}
-	}
-
-	ULIST_ITER_INIT(&uiter);
-	while ((unode = ulist_next(parents, &uiter))) {
-		qg = u64_to_ptr(unode->aux);
-		qg->excl += oper->num_bytes;
-		qg->excl_cmpr += oper->num_bytes;
-		qgroup_dirty(fs_info, qg);
-
-		/* Add any parents of the parents */
-		list_for_each_entry(glist, &qg->groups, next_group) {
-			err = ulist_add(parents, glist->group->qgroupid,
-					ptr_to_u64(glist->group), GFP_ATOMIC);
-			if (err < 0) {
-				ret = err;
-				goto out_unlock;
-			}
-		}
-	}
-
-out_unlock:
-	spin_unlock(&fs_info->qgroup_lock);
-
-out:
-	ulist_free(roots);
-	ulist_free(parents);
-	return ret;
-}
-
-/*
- * btrfs_qgroup_account_ref is called for every ref that is added to or deleted
- * from the fs. First, all roots referencing the extent are searched, and
- * then the space is accounted accordingly to the different roots. The
- * accounting algorithm works in 3 steps documented inline.
- */
-static int btrfs_qgroup_account(struct btrfs_trans_handle *trans,
-				struct btrfs_fs_info *fs_info,
-				struct btrfs_qgroup_operation *oper)
-{
-	int ret = 0;
-
-	if (!fs_info->quota_enabled)
-		return 0;
-
-	BUG_ON(!fs_info->quota_root);
-
-	mutex_lock(&fs_info->qgroup_rescan_lock);
-	if (fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_RESCAN) {
-		if (fs_info->qgroup_rescan_progress.objectid <= oper->bytenr) {
-			mutex_unlock(&fs_info->qgroup_rescan_lock);
-			return 0;
-		}
-	}
-	mutex_unlock(&fs_info->qgroup_rescan_lock);
-
-	ASSERT(is_fstree(oper->ref_root));
-
-	trace_btrfs_qgroup_account(oper);
-
-	switch (oper->type) {
-	case BTRFS_QGROUP_OPER_ADD_EXCL:
-	case BTRFS_QGROUP_OPER_SUB_EXCL:
-		ret = qgroup_excl_accounting(fs_info, oper);
-		break;
-	case BTRFS_QGROUP_OPER_ADD_SHARED:
-	case BTRFS_QGROUP_OPER_SUB_SHARED:
-		ret = qgroup_shared_accounting(trans, fs_info, oper);
-		break;
-	case BTRFS_QGROUP_OPER_SUB_SUBTREE:
-		ret = qgroup_subtree_accounting(trans, fs_info, oper);
-		break;
-	default:
-		ASSERT(0);
-	}
-	return ret;
-}
-
 int
 btrfs_qgroup_account_extent(struct btrfs_trans_handle *trans,
 			    struct btrfs_fs_info *fs_info,
@@ -2463,31 +1587,6 @@ cleanup:
 }
 
 /*
- * Needs to be called everytime we run delayed refs, even if there is an error
- * in order to cleanup outstanding operations.
- */
-int btrfs_delayed_qgroup_accounting(struct btrfs_trans_handle *trans,
-				    struct btrfs_fs_info *fs_info)
-{
-	struct btrfs_qgroup_operation *oper;
-	int ret = 0;
-
-	while (!list_empty(&trans->qgroup_ref_list)) {
-		oper = list_first_entry(&trans->qgroup_ref_list,
-					struct btrfs_qgroup_operation, list);
-		list_del_init(&oper->list);
-		if (!ret || !trans->aborted)
-			ret = btrfs_qgroup_account(trans, fs_info, oper);
-		spin_lock(&fs_info->qgroup_op_lock);
-		rb_erase(&oper->n, &fs_info->qgroup_op_tree);
-		spin_unlock(&fs_info->qgroup_op_lock);
-		btrfs_put_tree_mod_seq(fs_info, &oper->elem);
-		kfree(oper);
-	}
-	return ret;
-}
-
-/*
  * called from commit_transaction. Writes all changed qgroups to disk.
  */
 int btrfs_run_qgroups(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 80d7df8..236567f 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -23,45 +23,6 @@
 #include "delayed-ref.h"
 
 /*
- * A description of the operations, all of these operations only happen when we
- * are adding the 1st reference for that subvolume in the case of adding space
- * or on the last reference delete in the case of subtraction.  The only
- * exception is the last one, which is added for confusion.
- *
- * BTRFS_QGROUP_OPER_ADD_EXCL: adding bytes where this subvolume is the only
- * one pointing at the bytes we are adding.  This is called on the first
- * allocation.
- *
- * BTRFS_QGROUP_OPER_ADD_SHARED: adding bytes where this bytenr is going to be
- * shared between subvols.  This is called on the creation of a ref that already
- * has refs from a different subvolume, so basically reflink.
- *
- * BTRFS_QGROUP_OPER_SUB_EXCL: removing bytes where this subvolume is the only
- * one referencing the range.
- *
- * BTRFS_QGROUP_OPER_SUB_SHARED: removing bytes where this subvolume shares with
- * refs with other subvolumes.
- */
-enum btrfs_qgroup_operation_type {
-	BTRFS_QGROUP_OPER_ADD_EXCL,
-	BTRFS_QGROUP_OPER_ADD_SHARED,
-	BTRFS_QGROUP_OPER_SUB_EXCL,
-	BTRFS_QGROUP_OPER_SUB_SHARED,
-	BTRFS_QGROUP_OPER_SUB_SUBTREE,
-};
-
-struct btrfs_qgroup_operation {
-	u64 ref_root;
-	u64 bytenr;
-	u64 num_bytes;
-	u64 seq;
-	enum btrfs_qgroup_operation_type type;
-	struct seq_list elem;
-	struct rb_node n;
-	struct list_head list;
-};
-
-/*
  * Record a dirty extent, and info qgroup to update quota on it
  * TODO: Use kmem cache to alloc it.
  */
@@ -94,11 +55,6 @@ int btrfs_limit_qgroup(struct btrfs_trans_handle *trans,
 int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info);
 void btrfs_free_qgroup_config(struct btrfs_fs_info *fs_info);
 struct btrfs_delayed_extent_op;
-int btrfs_qgroup_record_ref(struct btrfs_trans_handle *trans,
-			    struct btrfs_fs_info *fs_info, u64 ref_root,
-			    u64 bytenr, u64 num_bytes,
-			    enum btrfs_qgroup_operation_type type,
-			    int mod_seq);
 int btrfs_qgroup_prepare_account_extents(struct btrfs_trans_handle *trans,
 					 struct btrfs_fs_info *fs_info);
 struct btrfs_qgroup_extent_record
@@ -111,11 +67,6 @@ btrfs_qgroup_account_extent(struct btrfs_trans_handle *trans,
 			    struct ulist *old_roots, struct ulist *new_roots);
 int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans,
 				 struct btrfs_fs_info *fs_info);
-int btrfs_delayed_qgroup_accounting(struct btrfs_trans_handle *trans,
-				    struct btrfs_fs_info *fs_info);
-void btrfs_remove_qgroup_operation(struct btrfs_trans_handle *trans,
-				   struct btrfs_fs_info *fs_info,
-				   struct btrfs_qgroup_operation *oper);
 int btrfs_run_qgroups(struct btrfs_trans_handle *trans,
 		      struct btrfs_fs_info *fs_info);
 int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans,
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 1faecea..c494060 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -1117,61 +1117,6 @@ DEFINE_EVENT(btrfs__workqueue_done, btrfs_workqueue_destroy,
 	TP_ARGS(wq)
 );
 
-#define show_oper_type(type)						\
-	__print_symbolic(type,						\
-		{ BTRFS_QGROUP_OPER_ADD_EXCL, 	"OPER_ADD_EXCL" },	\
-		{ BTRFS_QGROUP_OPER_ADD_SHARED, "OPER_ADD_SHARED" },	\
-		{ BTRFS_QGROUP_OPER_SUB_EXCL, 	"OPER_SUB_EXCL" },	\
-		{ BTRFS_QGROUP_OPER_SUB_SHARED,	"OPER_SUB_SHARED" })
-
-DECLARE_EVENT_CLASS(btrfs_qgroup_oper,
-
-	TP_PROTO(struct btrfs_qgroup_operation *oper),
-
-	TP_ARGS(oper),
-
-	TP_STRUCT__entry(
-		__field(	u64,  ref_root		)
-		__field(	u64,  bytenr		)
-		__field(	u64,  num_bytes		)
-		__field(	u64,  seq		)
-		__field(	int,  type		)
-		__field(	u64,  elem_seq		)
-	),
-
-	TP_fast_assign(
-		__entry->ref_root	= oper->ref_root;
-		__entry->bytenr		= oper->bytenr,
-		__entry->num_bytes	= oper->num_bytes;
-		__entry->seq 		= oper->seq;
-		__entry->type		= oper->type;
-		__entry->elem_seq	= oper->elem.seq;
-	),
-
-	TP_printk("ref_root = %llu, bytenr = %llu, num_bytes = %llu, "
-		  "seq = %llu, elem.seq = %llu, type = %s",
-		  (unsigned long long)__entry->ref_root,
-		  (unsigned long long)__entry->bytenr,
-		  (unsigned long long)__entry->num_bytes,
-		  (unsigned long long)__entry->seq,
-		  (unsigned long long)__entry->elem_seq,
-		  show_oper_type(__entry->type))
-);
-
-DEFINE_EVENT(btrfs_qgroup_oper, btrfs_qgroup_account,
-
-	TP_PROTO(struct btrfs_qgroup_operation *oper),
-
-	TP_ARGS(oper)
-);
-
-DEFINE_EVENT(btrfs_qgroup_oper, btrfs_qgroup_record_ref,
-
-	TP_PROTO(struct btrfs_qgroup_operation *oper),
-
-	TP_ARGS(oper)
-);
-
 #endif /* _TRACE_BTRFS_H */
 
 /* This part must be outside protection */
-- 
2.3.5



* [PATCH 16/18] btrfs: ulist: Add ulist_del() function.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (14 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 15/18] btrfs: qgroup: Cleanup the old ref_node-oriented mechanism Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 17/18] btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots Qu Wenruo
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

This function deletes the unode with the given (val, aux) pair.
With this patch, the debug-only seqnum no longer has any meaning,
so remove it.

This is used by later patches to skip the snapshot root.
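
As an illustration of the intended semantics, a minimal usage sketch
(error checks omitted; the values are arbitrary):

	struct ulist *ul = ulist_alloc(GFP_NOFS);

	ulist_add(ul, 257, 0, GFP_NOFS);	/* node (val=257, aux=0) */

	ulist_del(ul, 257, 1);	/* returns 1: val found but aux differs */
	ulist_del(ul, 258, 0);	/* returns 1: val not found */
	ulist_del(ul, 257, 0);	/* returns 0: both match, node removed */

	ulist_free(ul);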

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/ulist.c | 47 ++++++++++++++++++++++++++++++++++++-----------
 fs/btrfs/ulist.h |  1 +
 2 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/ulist.c b/fs/btrfs/ulist.c
index 840a38b..91feb2b 100644
--- a/fs/btrfs/ulist.c
+++ b/fs/btrfs/ulist.c
@@ -132,6 +132,15 @@ static struct ulist_node *ulist_rbtree_search(struct ulist *ulist, u64 val)
 	return NULL;
 }
 
+static void ulist_rbtree_erase(struct ulist *ulist, struct ulist_node *node)
+{
+	rb_erase(&node->rb_node, &ulist->root);
+	list_del(&node->list);
+	kfree(node);
+	BUG_ON(ulist->nnodes == 0);
+	ulist->nnodes--;
+}
+
 static int ulist_rbtree_insert(struct ulist *ulist, struct ulist_node *ins)
 {
 	struct rb_node **p = &ulist->root.rb_node;
@@ -197,9 +206,6 @@ int ulist_add_merge(struct ulist *ulist, u64 val, u64 aux,
 
 	node->val = val;
 	node->aux = aux;
-#ifdef CONFIG_BTRFS_DEBUG
-	node->seqnum = ulist->nnodes;
-#endif
 
 	ret = ulist_rbtree_insert(ulist, node);
 	ASSERT(!ret);
@@ -209,6 +215,33 @@ int ulist_add_merge(struct ulist *ulist, u64 val, u64 aux,
 	return 1;
 }
 
+/*
+ * ulist_del - delete one node from ulist
+ * @ulist:	ulist to remove node from
+ * @val:	value to delete
+ * @aux:	aux to delete
+ *
+ * The deletion will only be done when *BOTH* val and aux match.
+ * Return 0 on successful deletion.
+ * Return > 0 if the node is not found.
+ */
+int ulist_del(struct ulist *ulist, u64 val, u64 aux)
+{
+	struct ulist_node *node;
+
+	node = ulist_rbtree_search(ulist, val);
+	/* Not found */
+	if (!node)
+		return 1;
+
+	if (node->aux != aux)
+		return 1;
+
+	/* Found and delete */
+	ulist_rbtree_erase(ulist, node);
+	return 0;
+}
+
 /**
  * ulist_next - iterate ulist
  * @ulist:	ulist to iterate
@@ -237,15 +270,7 @@ struct ulist_node *ulist_next(struct ulist *ulist, struct ulist_iterator *uiter)
 		uiter->cur_list = uiter->cur_list->next;
 	} else {
 		uiter->cur_list = ulist->nodes.next;
-#ifdef CONFIG_BTRFS_DEBUG
-		uiter->i = 0;
-#endif
 	}
 	node = list_entry(uiter->cur_list, struct ulist_node, list);
-#ifdef CONFIG_BTRFS_DEBUG
-	ASSERT(node->seqnum == uiter->i);
-	ASSERT(uiter->i >= 0 && uiter->i < ulist->nnodes);
-	uiter->i++;
-#endif
 	return node;
 }
diff --git a/fs/btrfs/ulist.h b/fs/btrfs/ulist.h
index 4c29db6..a01a2c4 100644
--- a/fs/btrfs/ulist.h
+++ b/fs/btrfs/ulist.h
@@ -57,6 +57,7 @@ void ulist_free(struct ulist *ulist);
 int ulist_add(struct ulist *ulist, u64 val, u64 aux, gfp_t gfp_mask);
 int ulist_add_merge(struct ulist *ulist, u64 val, u64 aux,
 		    u64 *old_aux, gfp_t gfp_mask);
+int ulist_del(struct ulist *ulist, u64 val, u64 aux);
 
 /* just like ulist_add_merge() but take a pointer for the aux data */
 static inline int ulist_add_merge_ptr(struct ulist *ulist, u64 val, void *aux,
-- 
2.3.5



* [PATCH 17/18] btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (15 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 16/18] btrfs: ulist: Add ulist_del() function Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
  2015-04-21  6:21 ` [PATCH 18/18] btrfs: qgroup: Make snapshot accounting work with new extent-oriented qgroup Qu Wenruo
       [not found] ` <CAP9B-QkT5dY7GkMgiumq6HCszB0aSjcjYidcYFFzSvCFcCEU3A@mail.gmail.com>
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

This is used by the later qgroup fix patches for snapshots.

Snapshot accounting is currently done by btrfs_qgroup_inherit(), but the
new extent-oriented quota mechanism would also account the extents
touched by btrfs_copy_root() and other snapshot operations, counting
them twice.

So add the ability to skip a given qgroup, keeping snapshot accounting
correct.
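
As a rough illustration of how the skip is meant to be used (based on
the helpers added below and their caller in the next patch; error
handling omitted):

	/*
	 * Bracket snapshot creation so old/new_roots never contain the
	 * new snapshot's root; btrfs_qgroup_inherit() accounts for it
	 * instead.
	 */
	btrfs_set_skip_qgroup(trans, objectid);

	/* ... create the snapshot; ulist_del() drops objectid from
	 * every old/new_roots ulist built for this transaction ...
	 */

	btrfs_clear_skip_qgroup(trans);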

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/delayed-ref.h |  8 ++++++++
 fs/btrfs/qgroup.c      |  8 ++++++++
 fs/btrfs/transaction.c |  1 +
 fs/btrfs/transaction.h | 23 +++++++++++++++++++++++
 4 files changed, 40 insertions(+)

diff --git a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h
index 6481de2..07b701c 100644
--- a/fs/btrfs/delayed-ref.h
+++ b/fs/btrfs/delayed-ref.h
@@ -165,6 +165,14 @@ struct btrfs_delayed_ref_root {
 	int flushing;
 
 	u64 run_delayed_start;
+
+	/*
+	 * Tell qgroup to skip the given root.
+	 * This is for snapshots: btrfs_qgroup_inherit() will manually
+	 * modify the counters for a snapshot and its source, so we should
+	 * skip the snapshot in new_roots/old_roots or it gets counted twice.
+	 */
+	u64 qgroup_to_skip;
 };
 
 extern struct kmem_cache *btrfs_delayed_ref_head_cachep;
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 01612d7..029a695 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1244,9 +1244,11 @@ int btrfs_qgroup_prepare_account_extents(struct btrfs_trans_handle *trans,
 	struct btrfs_qgroup_extent_record *record;
 	struct btrfs_delayed_ref_root *delayed_refs;
 	struct rb_node *node;
+	u64 qgroup_to_skip;
 	int ret = 0;
 
 	delayed_refs = &trans->transaction->delayed_refs;
+	qgroup_to_skip = delayed_refs->qgroup_to_skip;
 
 	/*
 	 * No need to do lock, since this function will only be called in
@@ -1260,6 +1262,8 @@ int btrfs_qgroup_prepare_account_extents(struct btrfs_trans_handle *trans,
 					   &record->old_roots);
 		if (ret < 0)
 			break;
+		if (qgroup_to_skip)
+			ulist_del(record->old_roots, qgroup_to_skip, 0);
 		node = rb_next(node);
 	}
 	return ret;
@@ -1552,9 +1556,11 @@ int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans,
 	struct btrfs_delayed_ref_root *delayed_refs;
 	struct ulist *new_roots = NULL;
 	struct rb_node *node;
+	u64 qgroup_to_skip;
 	int ret = 0;
 
 	delayed_refs = &trans->transaction->delayed_refs;
+	qgroup_to_skip = delayed_refs->qgroup_to_skip;
 	while ((node = rb_first(&delayed_refs->dirty_extent_root))) {
 		record = rb_entry(node, struct btrfs_qgroup_extent_record,
 				  node);
@@ -1569,6 +1575,8 @@ int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans,
 					record->bytenr, (u64)-1, &new_roots);
 			if (ret < 0)
 				goto cleanup;
+			if (qgroup_to_skip)
+				ulist_del(new_roots, qgroup_to_skip, 0);
 			ret = btrfs_qgroup_account_extent(trans, fs_info,
 					record->bytenr, record->num_bytes,
 					record->old_roots, new_roots);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index eaea170..2b50ce4 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -230,6 +230,7 @@ loop:
 	cur_trans->delayed_refs.num_heads = 0;
 	cur_trans->delayed_refs.flushing = 0;
 	cur_trans->delayed_refs.run_delayed_start = 0;
+	cur_trans->delayed_refs.qgroup_to_skip = 0;
 
 	/*
 	 * although the tree mod log is per file system and not per transaction,
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index 937050a..c7cb8aa 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -141,6 +141,29 @@ static inline void btrfs_set_inode_last_trans(struct btrfs_trans_handle *trans,
 	BTRFS_I(inode)->last_log_commit = BTRFS_I(inode)->root->last_log_commit;
 }
 
+/*
+ * Make the qgroup code skip the given qgroupid; the old/new_roots built
+ * for qgroup accounting will not contain that qgroupid.
+ */
+static inline void btrfs_set_skip_qgroup(struct btrfs_trans_handle *trans,
+					 u64 qgroupid)
+{
+	struct btrfs_delayed_ref_root *delayed_refs;
+
+	delayed_refs = &trans->transaction->delayed_refs;
+	WARN_ON(delayed_refs->qgroup_to_skip);
+	delayed_refs->qgroup_to_skip = qgroupid;
+}
+
+static inline void btrfs_clear_skip_qgroup(struct btrfs_trans_handle *trans)
+{
+	struct btrfs_delayed_ref_root *delayed_refs;
+
+	delayed_refs = &trans->transaction->delayed_refs;
+	WARN_ON(!delayed_refs->qgroup_to_skip);
+	delayed_refs->qgroup_to_skip = 0;
+}
+
 int btrfs_end_transaction(struct btrfs_trans_handle *trans,
 			  struct btrfs_root *root);
 struct btrfs_trans_handle *btrfs_start_transaction(struct btrfs_root *root,
-- 
2.3.5



* [PATCH 18/18] btrfs: qgroup: Make snapshot accounting work with new extent-oriented qgroup.
  2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
                   ` (16 preceding siblings ...)
  2015-04-21  6:21 ` [PATCH 17/18] btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots Qu Wenruo
@ 2015-04-21  6:21 ` Qu Wenruo
       [not found] ` <CAP9B-QkT5dY7GkMgiumq6HCszB0aSjcjYidcYFFzSvCFcCEU3A@mail.gmail.com>
  18 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-21  6:21 UTC (permalink / raw)
  To: linux-btrfs

Make snapshot accounting work with the new extent-oriented mechanism by
skipping the snapshot's root in new/old_roots inside
create_pending_snapshot().
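
The resulting order of operations in create_pending_snapshot() is
roughly the following (a condensed sketch of the flow this patch
establishes; error handling and unrelated steps omitted):

	btrfs_set_skip_qgroup(trans, objectid);
	/* ... snapshot creation, root/dir/uuid item insertion ... */
	btrfs_run_delayed_refs(trans, root, (unsigned long)-1);
	btrfs_qgroup_prepare_account_extents(trans, fs_info);
	btrfs_qgroup_account_extents(trans, fs_info);
	btrfs_qgroup_inherit(trans, fs_info, root->root_key.objectid,
			     objectid, pending->inherit);
	btrfs_clear_skip_qgroup(trans);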

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/transaction.c | 53 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 2b50ce4..0f3c1b8 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1286,6 +1286,12 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 	if (pending->error)
 		goto no_free_objectid;
 
+	/*
+	 * Make qgroup skip the new snapshot's qgroupid, as it will be
+	 * accounted for by the later btrfs_qgroup_inherit().
+	 */
+	btrfs_set_skip_qgroup(trans, objectid);
+
 	btrfs_reloc_pre_snapshot(trans, pending, &to_reserve);
 
 	if (to_reserve > 0) {
@@ -1294,7 +1300,7 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 						     to_reserve,
 						     BTRFS_RESERVE_NO_FLUSH);
 		if (pending->error)
-			goto no_free_objectid;
+			goto clear_skip_qgroup;
 	}
 
 	key.objectid = objectid;
@@ -1392,25 +1398,6 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 		btrfs_abort_transaction(trans, root, ret);
 		goto fail;
 	}
-
-	/*
-	 * We need to flush delayed refs in order to make sure all of our quota
-	 * operations have been done before we call btrfs_qgroup_inherit.
-	 */
-	ret = btrfs_run_delayed_refs(trans, root, (unsigned long)-1);
-	if (ret) {
-		btrfs_abort_transaction(trans, root, ret);
-		goto fail;
-	}
-
-	ret = btrfs_qgroup_inherit(trans, fs_info,
-				   root->root_key.objectid,
-				   objectid, pending->inherit);
-	if (ret) {
-		btrfs_abort_transaction(trans, root, ret);
-		goto fail;
-	}
-
 	/* see comments in should_cow_block() */
 	set_bit(BTRFS_ROOT_FORCE_COW, &root->state);
 	smp_wmb();
@@ -1493,11 +1480,37 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 			goto fail;
 		}
 	}
+
+	ret = btrfs_run_delayed_refs(trans, root, (unsigned long)-1);
+	if (ret) {
+		btrfs_abort_transaction(trans, root, ret);
+		goto fail;
+	}
+
+	/*
+	 * Account qgroup counters before btrfs_qgroup_inherit().
+	 */
+	ret = btrfs_qgroup_prepare_account_extents(trans, fs_info);
+	if (ret)
+		goto fail;
+	ret = btrfs_qgroup_account_extents(trans, fs_info);
+	if (ret)
+		goto fail;
+	ret = btrfs_qgroup_inherit(trans, fs_info,
+				   root->root_key.objectid,
+				   objectid, pending->inherit);
+	if (ret) {
+		btrfs_abort_transaction(trans, root, ret);
+		goto fail;
+	}
+
 fail:
 	pending->error = ret;
 dir_item_existed:
 	trans->block_rsv = rsv;
 	trans->bytes_reserved = 0;
+clear_skip_qgroup:
+	btrfs_clear_skip_qgroup(trans);
 no_free_objectid:
 	kfree(new_root_item);
 root_item_alloc_fail:
-- 
2.3.5



* Re: [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup
       [not found] ` <CAP9B-QkT5dY7GkMgiumq6HCszB0aSjcjYidcYFFzSvCFcCEU3A@mail.gmail.com>
@ 2015-04-29  1:55   ` Qu Wenruo
  0 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-29  1:55 UTC (permalink / raw)
  To: Shilong Wang; +Cc: linux-btrfs



-------- Original Message  --------
Subject: Re: [PATCH 00/18] New extent-oriented qgroup mechanism with 
minor cleanup
From: Shilong Wang <wangshilong1991@gmail.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Date: 2015-04-21 15:24

> Hi Qu,
>
> The logic looks cleaner to me. ^_^
> Just two points from me.
>
> 2015-04-21 14:21 GMT+08:00 Qu Wenruo <quwenruo@cn.fujitsu.com>:
>
>     [BRIEF]
>     This patchset mainly introduces a new qgroup mechanism, which is
>     originally used to fix a bug in fstest/btrfs/057.
>
>     [WORKFLOW]
>     The new mechanism works like the following:
>     0) Previous transaction is done.
>         The new mechanism highly depends on commit_transaction.
>         So without commit_transaction(), qgroup is not updated.
>
>     1) Record dirty extents.
>         Now it has a hook in add_delayed_ref_head() which will record a
>         possible dirty extent into
>         trans->transaction->delayed_refs.dirty_extent_root.
>
> Hmm... so if users disable quota, we still do this allocation;
> can we avoid it when qgroups are not in use?
>
>
>     2) Record old_roots for dirty extents
>         old_roots is get from commit root, so it is complete accurate.
>
>     3) Record new_roots for dirty extents
>         New_roots is get from current root, so we must ensure delayed_refs
>         are all handled. It is quite easy done in commit_transaction().
>
>     4) Update qgroup counters with old/new_roots.
>
>     5) Transaction is committed.
>
>
> So now we do all qgroup accounting during transaction commit.
> One problem is that functions like btrfs_find_all_roots() are a
> little time-consuming, which means that with thousands of snapshots
> the commit thread might be blocked for a long time...
>
> Can we do some benchmarking with/without quota enabled (thousands of
> snapshots)? How big is the impact?
>
> Best Regards,
> Wang Shilong
Oh, my fault.

I only just realized that the patchset I sent was an old one, lacking
the later fixes that avoid qgroup extent recording (qrecord) when quota
is disabled.
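
Such a fix presumably amounts to an early bail-out in the record hook,
before the extent record is allocated. A sketch, based on the
fs_info->quota_enabled flag seen in the removed code above rather than
on the actual v2 diff:

	if (!fs_info->quota_enabled)
		return 0;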

I'll send the latest version soon.

Thanks.
Qu
>
>
>     [ADVANTAGES]
>     The advantages is obvious:
>     1) No more heavy/confusing ref_node operation.
>         Old ref_node based qgroup makes each qgroup enabled ref_node to call
>         btrfs_find_all_roots(), which is quite expensive considering the
>         number of ref_nodes.
>         Also we don't ever need to consider whether pass no_quota as 0 or 1,
>         as dirty extents are handled automatically, no meaningful no_quota
>         option any more.
>
>
>
>     2) Accuracy from design.
>         Now only few things can break accuracy of new qgroup.(*)
>         File clone? Leaf COW/splitting? As long as btrfs tree roots are in
>         consistent status, qgroup won't be wrong.
>         No more annoying and hard-to-debug btrfs/057 warning anymore.
>
>         *: Although snapshot create/delete qgroup reassign will break the
>         mechanism, but create can be handled by a small exception.
>         For snapshot delete/reassign? It never works as designed.
>
>     [PATCHES]
>     Patch 1~3 are cleanup for delayed_refs codes, which makes the logic and
>     codes for merging ref_nodes more clear
>
>     Patch 4~5 are cleanup for qgroups. Just replace some open-coded
>     functions
>     and reduce parameters.
>
>     Patch 6~11 are the new mechanism. Some functions like
>     qgroup_update_refcnt() and qgroup_update_counters() seems much like the
>     old functions, but for a better patch split, I have to copy some
>     function with new names and switch to new mechanism.
>     Nobody want to review a patch with 700+ inserts and 1400+ deletions,
>     right?
>
>     Patch 12~15 are switching to the new mechanism, and remove all the
>     unneeded codes. Oh yeah! Just press down 'd' !!!
>
>     Patch 16~18 are fix a btrfs/022. In theory, for the new mechanism the
>     best method for snapshot create/delete and qgroup reassign, is to mark
>     all related extent dirty and let the new mechanism to handle it.
>     But it's too expensive, so add a small exception for snapshot create to
>     resolve it.
>     For snapshot delete and qgroup reassign, it doesn't work now anyway.
>
>     Qu Wenruo (18):
>        btrfs: backref: Don't merge refs which are not for same block.
>        btrfs: delayed-ref: Use list to replace the ref_root in ref_head.
>        btrfs: delayed-ref: Cleanup the unneeded functions.
>        btrfs: qgroup: Cleanup open-coded old/new_refcnt update and read.
>        btrfs: extent-tree: Use ref_node to replace unneeded parameters in
>          __inc_extent_ref() and __free_extent()
>        btrfs: qgroup: Add function qgroup_update_refcnt().
>        btrfs: qgroup: Add function qgroup_update_counters().
>        btrfs: qgroup: Record possible quota-related extent for qgroup.
>        btrfs: qgroup: Add new function to record old_roots.
>        btrfs: backref: Add special time_seq == (u64)-1 case for
>          btrfs_find_all_roots().
>        btrfs: qgroup: Add new qgroup calculation function
>          btrfs_qgroup_account_extents().
>        btrfs: qgroup: Switch rescan to new mechanism.
>        btrfs: qgroup: Switch to new extent-oriented qgroup mechanism.
>        btrfs: qgroup: Switch self test to extent-oriented qgroup mechanism.
>        btrfs: qgroup: Cleanup the old ref_node-oriented mechanism.
>        btrfs: ulist: Add ulist_del() function.
>        btrfs: qgroup: Add the ability to skip given qgroup for
>     old/new_roots.
>        btrfs: qgroup: Make snapshot accounting work with new extent-oriented
>             qgroup.
>
>       fs/btrfs/backref.c            |   50 +-
>       fs/btrfs/ctree.h              |    2 +-
>       fs/btrfs/delayed-ref.c        |  372 +++++---------
>       fs/btrfs/delayed-ref.h        |   29 +-
>       fs/btrfs/disk-io.c            |    8 +-
>       fs/btrfs/extent-tree.c        |  203 ++------
>       fs/btrfs/extent-tree.h        |    0
>       fs/btrfs/qgroup.c             | 1090
>     ++++++++++-------------------------------
>       fs/btrfs/qgroup.h             |   61 +--
>       fs/btrfs/tests/qgroup-tests.c |  109 ++++-
>       fs/btrfs/transaction.c        |   73 ++-
>       fs/btrfs/transaction.h        |   23 +
>       fs/btrfs/ulist.c              |   47 +-
>       fs/btrfs/ulist.h              |    1 +
>       include/trace/events/btrfs.h  |   55 ---
>       15 files changed, 711 insertions(+), 1412 deletions(-)
>       create mode 100644 fs/btrfs/extent-tree.h
>
>     --
>     2.3.5
>
>     --
>     To unsubscribe from this list: send the line "unsubscribe
>     linux-btrfs" in
>     the body of a message to majordomo@vger.kernel.org
>     More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>


* [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup
@ 2015-04-29  2:28 Qu Wenruo
  0 siblings, 0 replies; 21+ messages in thread
From: Qu Wenruo @ 2015-04-29  2:28 UTC (permalink / raw)
  To: linux-btrfs

[BRIEF]
This patchset mainly introduces a new qgroup mechanism, which was
originally written to fix a bug in fstest/btrfs/057.

[WORKFLOW]
The new mechanism works like the following:
0) Previous transaction is done.
   The new mechanism depends heavily on commit_transaction().
   Without commit_transaction(), qgroups are not updated.

1) Record dirty extents.
   Now it has a hook in add_delayed_ref_head() which will record a
   possible dirty extent into
   trans->transaction->delayed_refs.dirty_extent_root.

2) Record old_roots for dirty extents
   old_roots is read from the commit root, so it is completely accurate.

3) Record new_roots for dirty extents
   new_roots is read from the current root, so we must ensure all
   delayed_refs are handled. That is easily done in commit_transaction().

4) Update qgroup counters with old/new_roots.

5) Transaction is committed.
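
In code terms, steps 2) through 4) map roughly onto two calls made at
commit time (a sketch using the functions introduced in this series;
locking and error handling omitted):

	/* in btrfs_commit_transaction(), once delayed refs have run */
	btrfs_qgroup_prepare_account_extents(trans, fs_info); /* old_roots */
	btrfs_qgroup_account_extents(trans, fs_info); /* new_roots, counters */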

[ADVANTAGES]
The advantages are obvious:
1) No more heavy/confusing ref_node operation.
   The old ref_node-based qgroup code made every qgroup-enabled ref_node
   call btrfs_find_all_roots(), which is quite expensive considering the
   number of ref_nodes.
   Also, we no longer need to consider whether to pass no_quota as 0 or
   1; dirty extents are handled automatically, so the no_quota option is
   no longer meaningful.

2) Accuracy from design.
   Now only a few things can break the accuracy of the new qgroup
   mechanism.(*)
   File clone? Leaf COW/splitting? As long as the btrfs tree roots are
   in a consistent state, qgroup won't be wrong.
   No more annoying and hard-to-debug btrfs/057 warnings.

   *: Snapshot create/delete and qgroup reassign would break the
   mechanism, but create can be handled by a small exception.
   As for snapshot delete/reassign, they never worked as designed anyway.

[PATCHES]
Patches 1~3 are cleanups for the delayed_refs code, which make the logic
and code for merging ref_nodes clearer.

Patches 4~5 are cleanups for qgroups; they just replace some open-coded
functions and reduce parameters.

Patches 6~11 are the new mechanism. Some functions like
qgroup_update_refcnt() and qgroup_update_counters() seem much like the
old functions, but for a better patch split I had to copy some functions
under new names and then switch to the new mechanism.
Nobody wants to review a patch with 700+ insertions and 1400+ deletions,
right?

Patches 12~15 switch to the new mechanism and remove all the unneeded
code. Oh yeah! Just hold down 'd'!!!

Patches 16~18 fix btrfs/022. In theory, the best method for the new
mechanism to handle snapshot create/delete and qgroup reassign is to
mark all related extents dirty and let the new mechanism handle them.
But that is too expensive, so add a small exception for snapshot create
to resolve it.
Snapshot delete and qgroup reassign don't work now anyway.

Changelog:
v2:
   Fix a memory leak in the quota-disabled case.
   (In fact, I had formatted the wrong patchset from an old branch.)
   Only patches 08 and 11 are affected.
   Thanks to Wang Shilong for pointing it out.

Qu Wenruo (18):
  btrfs: backref: Don't merge refs which are not for same block.
  btrfs: delayed-ref: Use list to replace the ref_root in ref_head.
  btrfs: delayed-ref: Cleanup the unneeded functions.
  btrfs: qgroup: Cleanup open-coded old/new_refcnt update and read.
  btrfs: extent-tree: Use ref_node to replace unneeded parameters in
    __inc_extent_ref() and __free_extent()
  btrfs: qgroup: Add function qgroup_update_refcnt().
  btrfs: qgroup: Add function qgroup_update_counters().
  btrfs: qgroup: Record possible quota-related extent for qgroup.
  btrfs: qgroup: Add new function to record old_roots.
  btrfs: backref: Add special time_seq == (u64)-1 case for    
    btrfs_find_all_roots().
  btrfs: qgroup: Add new qgroup calculation function    
    btrfs_qgroup_account_extents().
  btrfs: qgroup: Switch rescan to new mechanism.
  btrfs: qgroup: Switch to new extent-oriented qgroup mechanism.
  btrfs: qgroup: Switch self test to extent-oriented qgroup mechanism.
  btrfs: qgroup: Cleanup the old ref_node-oriented mechanism.
  btrfs: ulist: Add ulist_del() function.
  btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots.
  btrfs: qgroup: Make snapshot accounting work with new extent-oriented 
       qgroup.

 fs/btrfs/backref.c            |   50 +-
 fs/btrfs/ctree.h              |    2 +-
 fs/btrfs/delayed-ref.c        |  364 +++++---------
 fs/btrfs/delayed-ref.h        |   29 +-
 fs/btrfs/disk-io.c            |    8 +-
 fs/btrfs/extent-tree.c        |  203 ++------
 fs/btrfs/extent-tree.h        |    0
 fs/btrfs/qgroup.c             | 1090 ++++++++++-------------------------------
 fs/btrfs/qgroup.h             |   61 +--
 fs/btrfs/tests/qgroup-tests.c |  109 ++++-
 fs/btrfs/transaction.c        |   73 ++-
 fs/btrfs/transaction.h        |   23 +
 fs/btrfs/ulist.c              |   47 +-
 fs/btrfs/ulist.h              |    1 +
 include/trace/events/btrfs.h  |   55 ---
 15 files changed, 703 insertions(+), 1412 deletions(-)
 create mode 100644 fs/btrfs/extent-tree.h

-- 
2.3.6




Thread overview: 21+ messages
2015-04-21  6:21 [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
2015-04-21  6:21 ` [PATCH 01/18] btrfs: backref: Don't merge refs which are not for same block Qu Wenruo
2015-04-21  6:21 ` [PATCH 02/18] btrfs: delayed-ref: Use list to replace the ref_root in ref_head Qu Wenruo
2015-04-21  6:21 ` [PATCH 03/18] btrfs: delayed-ref: Cleanup the unneeded functions Qu Wenruo
2015-04-21  6:21 ` [PATCH 04/18] btrfs: qgroup: Cleanup open-coded old/new_refcnt update and read Qu Wenruo
2015-04-21  6:21 ` [PATCH 05/18] btrfs: extent-tree: Use ref_node to replace unneeded parameters in __inc_extent_ref() and __free_extent() Qu Wenruo
2015-04-21  6:21 ` [PATCH 06/18] btrfs: qgroup: Add function qgroup_update_refcnt() Qu Wenruo
2015-04-21  6:21 ` [PATCH 07/18] btrfs: qgroup: Add function qgroup_update_counters() Qu Wenruo
2015-04-21  6:21 ` [PATCH 08/18] btrfs: qgroup: Record possible quota-related extent for qgroup Qu Wenruo
2015-04-21  6:21 ` [PATCH 09/18] btrfs: qgroup: Add new function to record old_roots Qu Wenruo
2015-04-21  6:21 ` [PATCH 10/18] btrfs: backref: Add special time_seq == (u64)-1 case for btrfs_find_all_roots() Qu Wenruo
2015-04-21  6:21 ` [PATCH 11/18] btrfs: qgroup: Add new qgroup calculation function btrfs_qgroup_account_extents() Qu Wenruo
2015-04-21  6:21 ` [PATCH 12/18] btrfs: qgroup: Switch rescan to new mechanism Qu Wenruo
2015-04-21  6:21 ` [PATCH 13/18] btrfs: qgroup: Switch to new extent-oriented qgroup mechanism Qu Wenruo
2015-04-21  6:21 ` [PATCH 14/18] btrfs: qgroup: Switch self test to " Qu Wenruo
2015-04-21  6:21 ` [PATCH 15/18] btrfs: qgroup: Cleanup the old ref_node-oriented mechanism Qu Wenruo
2015-04-21  6:21 ` [PATCH 16/18] btrfs: ulist: Add ulist_del() function Qu Wenruo
2015-04-21  6:21 ` [PATCH 17/18] btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots Qu Wenruo
2015-04-21  6:21 ` [PATCH 18/18] btrfs: qgroup: Make snapshot accounting work with new extent-oriented qgroup Qu Wenruo
     [not found] ` <CAP9B-QkT5dY7GkMgiumq6HCszB0aSjcjYidcYFFzSvCFcCEU3A@mail.gmail.com>
2015-04-29  1:55   ` [PATCH 00/18] New extent-oriented qgroup mechanism with minor cleanup Qu Wenruo
  -- strict thread matches above, loose matches on Subject: below --
2015-04-29  2:28 Qu Wenruo
