linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] btrfs: update qgroups in drop snapshot, V2
@ 2015-11-05 22:37 Mark Fasheh
  2015-11-05 22:37 ` [PATCH 1/3] Btrfs: use btrfs_get_fs_root in resolve_indirect_ref Mark Fasheh
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Mark Fasheh @ 2015-11-05 22:37 UTC (permalink / raw)
  To: linux-btrfs; +Cc: jbacik, clm, quwenruo, mfasheh

Hi,

The following 3 patches fix a regression introduced in Linux
4.2 where btrfs_drop_snapshot() wasn't updating qgroups, resulting in
them going bad.

The original e-mail pointing this out is below:

http://www.spinics.net/lists/linux-btrfs/msg46093.html

The first patch is from Josef and fix bugs in our counting of
roots (which is critical for qgroups to work correctly). It was previously
sent to the list:

http://www.spinics.net/lists/linux-btrfs/msg47035.html

Truth be told, most of the time fixing this was spent figuring out that and
another issue (which has also been fixed).  Once I realized I was seeing a
bug and we fixed it correctly, my drop snapshot patch got dramatically
smaller.

I also re-added some of the tracing in qgroup.c that we recently
lost. It is again possible to debug qgroup operations on a live
system, allowing us to find issues like the two above by narrowing
down our operations and manually walking through them via
cat sys/debug/tracing.

The entire patch series can be tested in xfstests test btrfs/104.

Thanks,
	--Mark

Changes from V1, thanks to Qu for his comments:
  - lock around call to btrfs_qgroup_insert_dirty_extent()
  - check whether qgroups are enabled in account_leaf_items()

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/3] Btrfs: use btrfs_get_fs_root in resolve_indirect_ref
  2015-11-05 22:37 [PATCH 0/3] btrfs: update qgroups in drop snapshot, V2 Mark Fasheh
@ 2015-11-05 22:37 ` Mark Fasheh
  2015-11-05 22:37 ` [PATCH 2/3] btrfs: Add qgroup tracing Mark Fasheh
  2015-11-05 22:38 ` [PATCH 3/3] btrfs: qgroup: account shared subtree during snapshot delete Mark Fasheh
  2 siblings, 0 replies; 4+ messages in thread
From: Mark Fasheh @ 2015-11-05 22:37 UTC (permalink / raw)
  To: linux-btrfs; +Cc: jbacik, clm, quwenruo, mfasheh

From: Josef Bacik <jbacik@fb.com>

The backref code will look up the fs_root we're trying to resolve our indirect
refs for, unfortunately we use btrfs_read_fs_root_no_name, which returns -ENOENT
if the ref is 0.  This isn't helpful for the qgroup stuff with snapshot delete
as it won't be able to search down the snapshot we are deleting, which will
cause us to miss roots.  So use btrfs_get_fs_root and send false for check_ref
so we can always get the root we're looking for.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
---
 fs/btrfs/backref.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 9a2ec79..0e9da72 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -355,7 +355,7 @@ static int __resolve_indirect_ref(struct btrfs_fs_info *fs_info,
 
 	index = srcu_read_lock(&fs_info->subvol_srcu);
 
-	root = btrfs_read_fs_root_no_name(fs_info, &root_key);
+	root = btrfs_get_fs_root(fs_info, &root_key, false);
 	if (IS_ERR(root)) {
 		srcu_read_unlock(&fs_info->subvol_srcu, index);
 		ret = PTR_ERR(root);
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/3] btrfs: Add qgroup tracing
  2015-11-05 22:37 [PATCH 0/3] btrfs: update qgroups in drop snapshot, V2 Mark Fasheh
  2015-11-05 22:37 ` [PATCH 1/3] Btrfs: use btrfs_get_fs_root in resolve_indirect_ref Mark Fasheh
@ 2015-11-05 22:37 ` Mark Fasheh
  2015-11-05 22:38 ` [PATCH 3/3] btrfs: qgroup: account shared subtree during snapshot delete Mark Fasheh
  2 siblings, 0 replies; 4+ messages in thread
From: Mark Fasheh @ 2015-11-05 22:37 UTC (permalink / raw)
  To: linux-btrfs; +Cc: jbacik, clm, quwenruo, mfasheh

This patch adds tracepoints to the qgroup code on both the reporting side
(insert_dirty_extents) and the accounting side. Taken together it allows us
to see what qgroup operations have happened, and what their result was.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
---
 fs/btrfs/qgroup.c            | 10 +++++
 include/trace/events/btrfs.h | 88 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index d904ee1..b068209 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1461,6 +1461,8 @@ struct btrfs_qgroup_extent_record
 	struct btrfs_qgroup_extent_record *entry;
 	u64 bytenr = record->bytenr;
 
+	trace_btrfs_qgroup_insert_dirty_extent(record);
+
 	while (*p) {
 		parent_node = *p;
 		entry = rb_entry(parent_node, struct btrfs_qgroup_extent_record,
@@ -1591,6 +1593,9 @@ static int qgroup_update_counters(struct btrfs_fs_info *fs_info,
 		cur_old_count = btrfs_qgroup_get_old_refcnt(qg, seq);
 		cur_new_count = btrfs_qgroup_get_new_refcnt(qg, seq);
 
+		trace_qgroup_update_counters(qg->qgroupid, cur_old_count,
+					     cur_new_count);
+
 		/* Rfer update part */
 		if (cur_old_count == 0 && cur_new_count > 0) {
 			qg->rfer += num_bytes;
@@ -1684,6 +1689,9 @@ btrfs_qgroup_account_extent(struct btrfs_trans_handle *trans,
 		goto out_free;
 	BUG_ON(!fs_info->quota_root);
 
+	trace_btrfs_qgroup_account_extent(bytenr, num_bytes, nr_old_roots,
+					  nr_new_roots);
+
 	qgroups = ulist_alloc(GFP_NOFS);
 	if (!qgroups) {
 		ret = -ENOMEM;
@@ -1753,6 +1761,8 @@ int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans,
 		record = rb_entry(node, struct btrfs_qgroup_extent_record,
 				  node);
 
+		trace_btrfs_qgroup_account_extents(record);
+
 		if (!ret) {
 			/*
 			 * Use (u64)-1 as time_seq to do special search, which
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 0b73af9..9d7b545 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -23,7 +23,7 @@ struct map_lookup;
 struct extent_buffer;
 struct btrfs_work;
 struct __btrfs_workqueue;
-struct btrfs_qgroup_operation;
+struct btrfs_qgroup_extent_record;
 
 #define show_ref_type(type)						\
 	__print_symbolic(type,						\
@@ -1117,6 +1117,92 @@ DEFINE_EVENT(btrfs__workqueue_done, btrfs_workqueue_destroy,
 	TP_ARGS(wq)
 );
 
+DECLARE_EVENT_CLASS(btrfs_qgroup_extent,
+	TP_PROTO(struct btrfs_qgroup_extent_record *rec),
+
+	TP_ARGS(rec),
+
+	TP_STRUCT__entry(
+		__field(	u64,  bytenr		)
+		__field(	u64,  num_bytes		)
+	),
+
+	TP_fast_assign(
+		__entry->bytenr		= rec->bytenr,
+		__entry->num_bytes	= rec->num_bytes;
+	),
+
+	TP_printk("bytenr = %llu, num_bytes = %llu",
+		  (unsigned long long)__entry->bytenr,
+		  (unsigned long long)__entry->num_bytes)
+);
+
+DEFINE_EVENT(btrfs_qgroup_extent, btrfs_qgroup_account_extents,
+
+	TP_PROTO(struct btrfs_qgroup_extent_record *rec),
+
+	TP_ARGS(rec)
+);
+
+DEFINE_EVENT(btrfs_qgroup_extent, btrfs_qgroup_insert_dirty_extent,
+
+	TP_PROTO(struct btrfs_qgroup_extent_record *rec),
+
+	TP_ARGS(rec)
+);
+
+TRACE_EVENT(btrfs_qgroup_account_extent,
+
+	TP_PROTO(u64 bytenr, u64 num_bytes, u64 nr_old_roots, u64 nr_new_roots),
+
+	TP_ARGS(bytenr, num_bytes, nr_old_roots, nr_new_roots),
+
+	TP_STRUCT__entry(
+		__field(	u64,  bytenr			)
+		__field(	u64,  num_bytes			)
+		__field(	u64,  nr_old_roots		)
+		__field(	u64,  nr_new_roots		)
+	),
+
+	TP_fast_assign(
+		__entry->bytenr		= bytenr;
+		__entry->num_bytes	= num_bytes;
+		__entry->nr_old_roots	= nr_old_roots;
+		__entry->nr_new_roots	= nr_new_roots;
+	),
+
+	TP_printk("bytenr = %llu, num_bytes = %llu, nr_old_roots = %llu, "
+		  "nr_new_roots = %llu",
+		  __entry->bytenr,
+		  __entry->num_bytes,
+		  __entry->nr_old_roots,
+		  __entry->nr_new_roots)
+);
+
+TRACE_EVENT(qgroup_update_counters,
+
+	TP_PROTO(u64 qgid, u64 cur_old_count, u64 cur_new_count),
+
+	TP_ARGS(qgid, cur_old_count, cur_new_count),
+
+	TP_STRUCT__entry(
+		__field(	u64,  qgid			)
+		__field(	u64,  cur_old_count		)
+		__field(	u64,  cur_new_count		)
+	),
+
+	TP_fast_assign(
+		__entry->qgid		= qgid;
+		__entry->cur_old_count	= cur_old_count;
+		__entry->cur_new_count	= cur_new_count;
+	),
+
+	    TP_printk("	qgid = %llu, cur_old_count = %llu, cur_new_count = %llu",
+		  __entry->qgid,
+		  __entry->cur_old_count,
+		  __entry->cur_new_count)
+);
+
 #endif /* _TRACE_BTRFS_H */
 
 /* This part must be outside protection */
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 3/3] btrfs: qgroup: account shared subtree during snapshot delete
  2015-11-05 22:37 [PATCH 0/3] btrfs: update qgroups in drop snapshot, V2 Mark Fasheh
  2015-11-05 22:37 ` [PATCH 1/3] Btrfs: use btrfs_get_fs_root in resolve_indirect_ref Mark Fasheh
  2015-11-05 22:37 ` [PATCH 2/3] btrfs: Add qgroup tracing Mark Fasheh
@ 2015-11-05 22:38 ` Mark Fasheh
  2 siblings, 0 replies; 4+ messages in thread
From: Mark Fasheh @ 2015-11-05 22:38 UTC (permalink / raw)
  To: linux-btrfs; +Cc: jbacik, clm, quwenruo, mfasheh

Commit 0ed4792 ('btrfs: qgroup: Switch to new extent-oriented qgroup
mechanism.') removed our qgroup accounting during
btrfs_drop_snapshot(). Predictably, this results in qgroup numbers
going bad shortly after a snapshot is removed.

Fix this by adding a dirty extent record when we encounter extents during
our shared subtree walk. This effectively restores the functionality we had
with the original shared subtree walking code in 1152651 (btrfs: qgroup:
account shared subtrees during snapshot delete).

The idea with the original patch (and this one) is that shared subtrees can
get skipped during drop_snapshot. The shared subtree walk then allows us a
chance to visit those extents and add them to the qgroup work for later
processing. This ultimately makes the accounting for drop snapshot work.

The new qgroup code nicely handles all the other extents during the tree
walk via the ref dec/inc functions so we don't have to add actions beyond
what we had originally.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
---
 fs/btrfs/extent-tree.c | 47 ++++++++++++++++++++++++++++++++++++++++-------
 fs/btrfs/qgroup.c      |  2 ++
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 601d7d4..410b46d 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7850,21 +7850,47 @@ reada:
 }
 
 /*
- * TODO: Modify related function to add related node/leaf to dirty_extent_root,
- * for later qgroup accounting.
- *
- * Current, this function does nothing.
+ * These may not be seen by the usual inc/dec ref code so we have to
+ * add them here.
  */
+static int record_one_subtree_extent(struct btrfs_trans_handle *trans,
+				     struct btrfs_root *root, u64 bytenr,
+				     u64 num_bytes)
+{
+	struct btrfs_qgroup_extent_record *qrecord;
+	struct btrfs_delayed_ref_root *delayed_refs;
+
+	qrecord = kmalloc(sizeof(*qrecord), GFP_NOFS);
+	if (!qrecord)
+		return -ENOMEM;
+
+	qrecord->bytenr = bytenr;
+	qrecord->num_bytes = num_bytes;
+	qrecord->old_roots = NULL;
+
+	delayed_refs = &trans->transaction->delayed_refs;
+	spin_lock(&delayed_refs->lock);
+	if (btrfs_qgroup_insert_dirty_extent(delayed_refs, qrecord))
+		kfree(qrecord);
+	spin_unlock(&delayed_refs->lock);
+
+	return 0;
+}
+
 static int account_leaf_items(struct btrfs_trans_handle *trans,
 			      struct btrfs_root *root,
 			      struct extent_buffer *eb)
 {
 	int nr = btrfs_header_nritems(eb);
-	int i, extent_type;
+	int i, extent_type, ret;
 	struct btrfs_key key;
 	struct btrfs_file_extent_item *fi;
 	u64 bytenr, num_bytes;
 
+	/* We can be called directly from walk_up_proc() */
+	if (!root->fs_info->quota_enabled)
+		return 0;
+
 	for (i = 0; i < nr; i++) {
 		btrfs_item_key_to_cpu(eb, &key, i);
 
@@ -7883,6 +7909,10 @@ static int account_leaf_items(struct btrfs_trans_handle *trans,
 			continue;
 
 		num_bytes = btrfs_file_extent_disk_num_bytes(eb, fi);
+
+		ret = record_one_subtree_extent(trans, root, bytenr, num_bytes);
+		if (ret)
+			return ret;
 	}
 	return 0;
 }
@@ -7951,8 +7981,6 @@ static int adjust_slots_upwards(struct btrfs_root *root,
 
 /*
  * root_eb is the subtree root and is locked before this function is called.
- * TODO: Modify this function to mark all (including complete shared node)
- * to dirty_extent_root to allow it get accounted in qgroup.
  */
 static int account_shared_subtree(struct btrfs_trans_handle *trans,
 				  struct btrfs_root *root,
@@ -8030,6 +8058,11 @@ walk_down:
 			btrfs_tree_read_lock(eb);
 			btrfs_set_lock_blocking_rw(eb, BTRFS_READ_LOCK);
 			path->locks[level] = BTRFS_READ_LOCK_BLOCKING;
+
+			ret = record_one_subtree_extent(trans, root, child_bytenr,
+							root->nodesize);
+			if (ret)
+				goto out;
 		}
 
 		if (level == 0) {
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index b068209..ce1cdcf 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1461,6 +1461,8 @@ struct btrfs_qgroup_extent_record
 	struct btrfs_qgroup_extent_record *entry;
 	u64 bytenr = record->bytenr;
 
+	assert_spin_locked(&delayed_refs->lock);
+
 	trace_btrfs_qgroup_insert_dirty_extent(record);
 
 	while (*p) {
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-11-05 22:38 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-05 22:37 [PATCH 0/3] btrfs: update qgroups in drop snapshot, V2 Mark Fasheh
2015-11-05 22:37 ` [PATCH 1/3] Btrfs: use btrfs_get_fs_root in resolve_indirect_ref Mark Fasheh
2015-11-05 22:37 ` [PATCH 2/3] btrfs: Add qgroup tracing Mark Fasheh
2015-11-05 22:38 ` [PATCH 3/3] btrfs: qgroup: account shared subtree during snapshot delete Mark Fasheh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).