From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:49190 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753284AbcDVSMZ (ORCPT ); Fri, 22 Apr 2016 14:12:25 -0400 Subject: Re: [PATCH v4] btrfs: qgroup: Fix qgroup accounting when creating snapshot To: Qu Wenruo , , , References: <1460711302-2478-1-git-send-email-quwenruo@cn.fujitsu.com> From: Josef Bacik Message-ID: <571A697B.6050502@fb.com> Date: Fri, 22 Apr 2016 14:12:11 -0400 MIME-Version: 1.0 In-Reply-To: <1460711302-2478-1-git-send-email-quwenruo@cn.fujitsu.com> Content-Type: text/plain; charset="windows-1252"; format=flowed Sender: linux-btrfs-owner@vger.kernel.org List-ID: On 04/15/2016 05:08 AM, Qu Wenruo wrote: > Current btrfs qgroup design implies a requirement that after calling > btrfs_qgroup_account_extents() there must be a commit root switch. > > Normally this is OK, as btrfs_qgroup_accounting_extents() is only called > inside btrfs_commit_transaction() just be commit_cowonly_roots(). > > However there is a exception at create_pending_snapshot(), which will > call btrfs_qgroup_account_extents() but no any commit root switch. > > In case of creating a snapshot whose parent root is itself (create a > snapshot of fs tree), it will corrupt qgroup by the following trace: > (skipped unrelated data) > ====== > btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1 > qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0 > qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384 > btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0 > ====== > > The problem here is in first qgroup_account_extent(), the > nr_new_roots of the extent is 1, which means its reference got > increased, and qgroup increased its rfer and excl. > > But at second qgroup_account_extent(), its reference got decreased, but > between these two qgroup_account_extent(), there is no switch roots. > This leads to the same nr_old_roots, and this extent just got ignored by > qgroup, which means this extent is wrongly accounted. > > Fix it by call commit_cowonly_roots() after qgroup_account_extent() in > create_pending_snapshot(), with needed preparation. > > Reported-by: Mark Fasheh > Signed-off-by: Qu Wenruo > --- > v2: > Fix a soft lockup caused by missing switch_commit_root() call. > Fix a warning caused by dirty-but-not-committed root. > v3: > Fix a bug which will cause qgroup accounting for dropping snapshot > wrong > v4: > Fix a bug caused by non-cowed btree modification. > > To Filipe: > I'm sorry I didn't wait for your reply on the dropped roots. > I reverted back the version where we deleted dropped roots in > switch_commit_roots(). > > As I think as long as we called btrfs_qgroup_prepare_account_extents() > and btrfs_qgroup_account_extents(), it should have already accounted > extents for dropped roots, and then we are OK to drop them. > > It would be very nice if you could point out what I missed. > Thanks > Qu > --- > fs/btrfs/transaction.c | 117 +++++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 93 insertions(+), 24 deletions(-) > > diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c > index 43885e5..92f8193 100644 > --- a/fs/btrfs/transaction.c > +++ b/fs/btrfs/transaction.c > @@ -311,10 +311,11 @@ loop: > * when the transaction commits > */ > static int record_root_in_trans(struct btrfs_trans_handle *trans, > - struct btrfs_root *root) > + struct btrfs_root *root, > + int force) > { > - if (test_bit(BTRFS_ROOT_REF_COWS, &root->state) && > - root->last_trans < trans->transid) { > + if ((test_bit(BTRFS_ROOT_REF_COWS, &root->state) && > + root->last_trans < trans->transid) || force) { > WARN_ON(root == root->fs_info->extent_root); > WARN_ON(root->commit_root != root->node); > > @@ -331,7 +332,7 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans, > smp_wmb(); > > spin_lock(&root->fs_info->fs_roots_radix_lock); > - if (root->last_trans == trans->transid) { > + if (root->last_trans == trans->transid && !force) { > spin_unlock(&root->fs_info->fs_roots_radix_lock); > return 0; > } > @@ -402,7 +403,7 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans, > return 0; > > mutex_lock(&root->fs_info->reloc_mutex); > - record_root_in_trans(trans, root); > + record_root_in_trans(trans, root, 0); > mutex_unlock(&root->fs_info->reloc_mutex); > > return 0; > @@ -1311,6 +1312,80 @@ int btrfs_defrag_root(struct btrfs_root *root) > } > > /* > + * Do all special snapshot related qgroup dirty hack. > + * > + * Will do all needed qgroup inherit and dirty hack like switch commit > + * roots inside one transaction and write all btree into disk, to make > + * qgroup works. > + */ > +static int qgroup_account_snapshot(struct btrfs_trans_handle *trans, > + struct btrfs_root *src, > + struct btrfs_root *parent, > + struct btrfs_qgroup_inherit *inherit, > + u64 dst_objectid) > +{ > + struct btrfs_fs_info *fs_info = src->fs_info; > + int ret; > + > + /* > + * We are going to commit transaction, see btrfs_commit_transaction() > + * comment for reason locking tree_log_mutex > + */ > + mutex_lock(&fs_info->tree_log_mutex); > + > + ret = commit_fs_roots(trans, src); > + if (ret) > + goto out; > + ret = btrfs_qgroup_prepare_account_extents(trans, fs_info); > + if (ret < 0) > + goto out; > + ret = btrfs_qgroup_account_extents(trans, fs_info); > + if (ret < 0) > + goto out; > + > + /* Now qgroup are all updated, we can inherit it to new qgroups */ > + ret = btrfs_qgroup_inherit(trans, fs_info, > + src->root_key.objectid, dst_objectid, > + inherit); > + if (ret < 0) > + goto out; > + > + /* > + * Now we do a simplified commit transaction, which will: > + * 1) commit all subvolume and extent tree > + * To ensure all subvolume and extent tree have a valid > + * commit_root to accounting later insert_dir_item() > + * 2) write all btree blocks onto disk > + * This is to make sure later btree modification will be cowed > + * Or commit_root can be populated and cause wrong qgroup numbers > + * In this simplified commit, we don't really care about other trees > + * like chunk and root tree, as they won't affect qgroup. > + * And we don't write super to avoid half committed status. > + */ > + ret = commit_cowonly_roots(trans, src); > + if (ret) > + goto out; > + switch_commit_roots(trans->transaction, fs_info); > + ret = btrfs_write_and_wait_transaction(trans, src); > + if (ret) > + btrfs_std_error(fs_info, ret, > + "Error while writing out transaction for qgroup"); > + > +out: > + mutex_unlock(&fs_info->tree_log_mutex); > + > + /* > + * Force parent root to be updated, as we recorded it before so its > + * last_trans == cur_transid. > + * Or it won't be committed again onto disk after later > + * insert_dir_item() > + */ > + if (!ret) > + record_root_in_trans(trans, parent, 1); > + return ret; > +} NACK, holy shit we aren't adding a special transaction commit only for qgroup snapshots. Figure out a different way. Thanks, Josef