From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Mark Fasheh <mfasheh@suse.de>, <linux-btrfs@vger.kernel.org>
Cc: <jbacik@fb.com>, <clm@fb.com>
Subject: Re: [PATCH 4/4] btrfs: qgroup: account shared subtree during snapshot delete
Date: Mon, 2 Nov 2015 09:59:01 +0800 [thread overview]
Message-ID: <5636C365.2080908@cn.fujitsu.com> (raw)
In-Reply-To: <1442952948-21389-5-git-send-email-mfasheh@suse.de>
Mark Fasheh wrote on 2015/09/22 13:15 -0700:
> Commit 0ed4792 ('btrfs: qgroup: Switch to new extent-oriented qgroup
> mechanism.') removed our qgroup accounting during
> btrfs_drop_snapshot(). Predictably, this results in qgroup numbers
> going bad shortly after a snapshot is removed.
>
> Fix this by adding a dirty extent record when we encounter extents during
> our shared subtree walk. This effectively restores the functionality we had
> with the original shared subtree walking code in 1152651 (btrfs: qgroup:
> account shared subtrees during snapshot delete).
>
> The idea with the original patch (and this one) is that shared subtrees can
> get skipped during drop_snapshot. The shared subtree walk then allows us a
> chance to visit those extents and add them to the qgroup work for later
> processing. This ultimately makes the accounting for drop snapshot work.
>
> The new qgroup code nicely handles all the other extents during the tree
> walk via the ref dec/inc functions so we don't have to add actions beyond
> what we had originally.
>
> Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Hi Mark,
Despite the performance regression reported from Stefan Priebe,
there is another problem, I'll comment inlined below.
> ---
> fs/btrfs/extent-tree.c | 41 ++++++++++++++++++++++++++++++++++-------
> 1 file changed, 34 insertions(+), 7 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 3a70e6c..89be620 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -7757,17 +7757,37 @@ reada:
> }
>
> /*
> - * TODO: Modify related function to add related node/leaf to dirty_extent_root,
> - * for later qgroup accounting.
> - *
> - * Current, this function does nothing.
> + * These may not be seen by the usual inc/dec ref code so we have to
> + * add them here.
> */
> +static int record_one_subtree_extent(struct btrfs_trans_handle *trans,
> + struct btrfs_root *root, u64 bytenr,
> + u64 num_bytes)
> +{
> + struct btrfs_qgroup_extent_record *qrecord;
> + struct btrfs_delayed_ref_root *delayed_refs;
> +
> + qrecord = kmalloc(sizeof(*qrecord), GFP_NOFS);
> + if (!qrecord)
> + return -ENOMEM;
> +
> + qrecord->bytenr = bytenr;
> + qrecord->num_bytes = num_bytes;
> + qrecord->old_roots = NULL;
> +
> + delayed_refs = &trans->transaction->delayed_refs;
> + if (btrfs_qgroup_insert_dirty_extent(delayed_refs, qrecord))
> + kfree(qrecord);
1) Unprotected dirty_extent_root.
Unfortunately, btrfs_qgroup_insert_dirty_exntet() is not protected by
any lock/mutex.
And I'm sorry not to add comment about that.
In fact, btrfs_qgroup_insert_dirty_extent should always be used with
delayed_refs->lock hold.
Just like add_delayed_ref_head(), where every caller of
add_delayed_ref_head() holds delayed_refs->lock.
So here you will nned to hold delayed_refs->lock.
2) Performance regression.(Reported by Stefan Priebe)
The performance regression is not caused by your codes, at least not
completely.
It's my fault not adding enough comment for insert_dirty_extent()
function. (just like 1, I must say I'm a bad reviewer until there is bug
report)
As I was only expecting it called inside add_delayed_ref_head(),
and caller of add_delayed_ref_head() has judged whether qgroup is
enabled before calling add_delayed_ref_head().
So for qgroup disabled case, insert_dirty_extent() won't ever be called.
As a result, if you want to call btrfs_qgroup_insert_dirty_extent() out
of add_delayed_ref_head(), you will need to handle the
delayed_refs->lock and judge whether qgroup is enabled.
BTW, if it's OK for you, you can also further improve the performance of
qgroup by using kmem_cache for struct btrfs_qgroup_extent_record.
I assume the kmalloc() may be one performance hot spot considering the
amount it called in qgroup enabled case.
Thanks,
Qu
> +
> + return 0;
> +}
> +
> static int account_leaf_items(struct btrfs_trans_handle *trans,
> struct btrfs_root *root,
> struct extent_buffer *eb)
> {
> int nr = btrfs_header_nritems(eb);
> - int i, extent_type;
> + int i, extent_type, ret;
> struct btrfs_key key;
> struct btrfs_file_extent_item *fi;
> u64 bytenr, num_bytes;
> @@ -7790,6 +7810,10 @@ static int account_leaf_items(struct btrfs_trans_handle *trans,
> continue;
>
> num_bytes = btrfs_file_extent_disk_num_bytes(eb, fi);
> +
> + ret = record_one_subtree_extent(trans, root, bytenr, num_bytes);
> + if (ret)
> + return ret;
> }
> return 0;
> }
> @@ -7858,8 +7882,6 @@ static int adjust_slots_upwards(struct btrfs_root *root,
>
> /*
> * root_eb is the subtree root and is locked before this function is called.
> - * TODO: Modify this function to mark all (including complete shared node)
> - * to dirty_extent_root to allow it get accounted in qgroup.
> */
> static int account_shared_subtree(struct btrfs_trans_handle *trans,
> struct btrfs_root *root,
> @@ -7937,6 +7959,11 @@ walk_down:
> btrfs_tree_read_lock(eb);
> btrfs_set_lock_blocking_rw(eb, BTRFS_READ_LOCK);
> path->locks[level] = BTRFS_READ_LOCK_BLOCKING;
> +
> + ret = record_one_subtree_extent(trans, root, child_bytenr,
> + root->nodesize);
> + if (ret)
> + goto out;
> }
>
> if (level == 0) {
>
next prev parent reply other threads:[~2015-11-02 1:59 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-22 20:15 [PATCH 0/4] btrfs: update qgroups in drop snapshot Mark Fasheh
2015-09-22 20:15 ` [PATCH 1/4] Btrfs: use btrfs_get_fs_root in resolve_indirect_ref Mark Fasheh
2015-09-22 20:15 ` [PATCH 2/4] Btrfs: keep dropped roots in cache until transaction commit, V2 Mark Fasheh
2015-09-22 20:15 ` [PATCH 3/4] btrfs: Add qgroup tracing Mark Fasheh
2015-09-22 20:15 ` [PATCH 4/4] btrfs: qgroup: account shared subtree during snapshot delete Mark Fasheh
2015-11-02 1:59 ` Qu Wenruo [this message]
2015-11-03 23:56 ` Mark Fasheh
2015-11-04 1:10 ` Qu Wenruo
2015-09-22 21:12 ` [PATCH 0/4] btrfs: update qgroups in drop snapshot Mark Fasheh
2015-09-23 1:40 ` Qu Wenruo
2015-09-23 21:49 ` Mark Fasheh
2015-09-24 5:47 ` Duncan
2015-09-24 6:29 ` Qu Wenruo
2015-09-23 3:58 ` Qu Wenruo
2015-09-23 8:50 ` Holger Hoffstätte
2015-09-23 22:08 ` Mark Fasheh
2015-09-25 3:17 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5636C365.2080908@cn.fujitsu.com \
--to=quwenruo@cn.fujitsu.com \
--cc=clm@fb.com \
--cc=jbacik@fb.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=mfasheh@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.