[PATCH v3 0/2] btrfs: qgroup: stale qgroups related impromvents

Linux Btrfs filesystem development
 help / color / mirror / Atom feed

* [PATCH v3 0/2] btrfs: qgroup: stale qgroups related impromvents
@ 2024-05-07  6:58 Qu Wenruo
  2024-05-07  6:58 ` [PATCH v3 1/2] btrfs: slightly loose the requirement for qgroup removal Qu Wenruo
  2024-05-07  6:58 ` [PATCH v3 2/2] btrfs: automatically remove the subvolume qgroup Qu Wenruo
  0 siblings, 2 replies; 7+ messages in thread
From: Qu Wenruo @ 2024-05-07  6:58 UTC (permalink / raw)
  To: linux-btrfs

[CHANGELOG]
v3:
- Do not allow dropping non-zero qgroups for squota
  Due to the design of squota, a fully dropped subvolume can still have
  non-zero qgroups, as in the extent tree the delta is still using the
  original owner (the dropped subvolume).

  So we can not drop any non-zero squota qgroups

- Slight change on the can_delete_qgroup() condition
  To co-operate with above change.

v2:
- Do more sanity checks before deleting a qgroup

- Make squota to handle auto deleted qgroup more gracefully
  Unfortunately the behavior change would affect btrfs/301, as the
  fully deleted subvolume would make the test case to cause bash grammar
  error (since the qgroup is gone with the subvolume).

  Cc Boris for extra comments on squota compatibility and future
  btrfs/311 updates ideas.

We have two problems in recent qgroup code:

- One can not delete a fully removed qgroup if the drop hits
  drop_subtree_threshold
  As hitting drop_subtree_threshold would mark qgroup inconsistent and
  skip all accounting, this would leave qgroup number untouched (thus
  non-zero), and btrfs refuses to delete qgroup with non-zero rfer/excl
  numbers.

  This would be addressed by the first patch, allowing qgroup
  deletion as long as it doesn't have any child nor a corresponding
  subvolume.

- Deleted snapshot would leave a stale qgroup
  This is a long existing problem.

  Although previous pushes all failed, just let me try it again.

  The idea is commit current transaction if needed (full accounting mode
  and qgroup numbers are consistent), then try to remove the subvolume
  qgroup after it is fully dropped.

  For full qgroup mode, no matter if qgroup is inconsistent, we will
  auto remove the dropped 0 level qgroup.
  (If consistent, the qgroup numbers should all be 0, and a safe
   removal. If inconsistent, we still remove the qgroup, leaving
   higher level qgroup incorrect, but since it's already inconsistent,
   we need a rescan anyway).

  For squota mode, we only do the auto reap if the qgroup numbers are
  already 0.
  Otherwise the qgroup numbers would be needed for future accounting.

The behavior change would only affect btrfs/301, which still expects the
dropped subvolume would still leave its qgroup untouched.
But that can be easily fixed by explicitly echoing "0" for dropped
subvolume.

Qu Wenruo (2):
  btrfs: slightly loose the requirement for qgroup removal
  btrfs: automatically remove the subvolume qgroup

 fs/btrfs/extent-tree.c |   8 +++
 fs/btrfs/qgroup.c      | 120 ++++++++++++++++++++++++++++++++++++++---
 fs/btrfs/qgroup.h      |   2 +
 3 files changed, 123 insertions(+), 7 deletions(-)

-- 
2.45.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v3 1/2] btrfs: slightly loose the requirement for qgroup removal
  2024-05-07  6:58 [PATCH v3 0/2] btrfs: qgroup: stale qgroups related impromvents Qu Wenruo
@ 2024-05-07  6:58 ` Qu Wenruo
  2024-05-20 22:46   ` Boris Burkov
  2024-05-07  6:58 ` [PATCH v3 2/2] btrfs: automatically remove the subvolume qgroup Qu Wenruo
  1 sibling, 1 reply; 7+ messages in thread
From: Qu Wenruo @ 2024-05-07  6:58 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
Currently if one is utilizing "qgroups/drop_subtree_threshold" sysfs,
and a snapshot with level higher than that value is dropped, btrfs will
not be able to delete the qgroup until next qgroup rescan:

  uuid=ffffffff-eeee-dddd-cccc-000000000000

  wipefs -fa $dev
  mkfs.btrfs -f $dev -O quota -s 4k -n 4k -U $uuid
  mount $dev $mnt

  btrfs subvolume create $mnt/subv1/
  for (( i = 0; i < 1024; i++ )); do
  	xfs_io -f -c "pwrite 0 2k" $mnt/subv1/file_$i > /dev/null
  done
  sync
  btrfs subv snapshot $mnt/subv1 $mnt/snapshot
  btrfs quota enable $mnt
  btrfs quota rescan -w $mnt
  sync
  echo 1 > /sys/fs/btrfs/$uuid/qgroups/drop_subtree_threshold
  btrfs subvolume delete $mnt/snapshot
  btrfs subv sync $mnt
  btrfs qgroup show -prce --sync $mnt
  btrfs qgroup destroy 0/257 $mnt
  umount $mnt

The final qgroup removal would fail with the following error:

  ERROR: unable to destroy quota group: Device or resource busy

[CAUSE]
The above script would generate a subvolume of level 2, then snapshot
it, enable qgroup, set the drop_subtree_threshold, then drop the
snapshot.

Since the subvolume drop would meet the threshold, qgroup would be
marked inconsistent and skip accounting to avoid hanging the system at
transaction commit.

But currently we do not allow a qgroup with any rfer/excl numbers to be
dropped, and this is not really compatible with the new
drop_subtree_threshold behavior.

[FIX]
Only require the strong zero rfer/excl/rfer_cmpr/excl_cmpr for squota
mode.
This is due to the fact that squota can never go inconsistent, and it
can have dropped subvolume but with non-zero qgroup numbers for future
accounting.

For full qgroup mode, we only check if there is a subvolume for it.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/qgroup.c | 82 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 75 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index eb28141d5c37..d89f16366a1c 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1728,13 +1728,51 @@ int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
 	return ret;
 }
 
-static bool qgroup_has_usage(struct btrfs_qgroup *qgroup)
+static bool can_delete_qgroup(struct btrfs_fs_info *fs_info,
+			      struct btrfs_qgroup *qgroup)
 {
-	return (qgroup->rfer > 0 || qgroup->rfer_cmpr > 0 ||
-		qgroup->excl > 0 || qgroup->excl_cmpr > 0 ||
-		qgroup->rsv.values[BTRFS_QGROUP_RSV_DATA] > 0 ||
-		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PREALLOC] > 0 ||
-		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PERTRANS] > 0);
+	struct btrfs_key key;
+	struct btrfs_path *path;
+	int ret;
+
+	/*
+	 * Squota would never be inconsistent, but there can still be
+	 * case where a dropped subvolume still has qgroup numbers, and
+	 * squota relies on such qgroup for future accounting.
+	 *
+	 * So for squota, do not allow dropping any non-zero qgroup.
+	 */
+	if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_SIMPLE) {
+		if (qgroup->rfer || qgroup->excl || qgroup->excl_cmpr ||
+		    qgroup->rfer_cmpr)
+			return false;
+	}
+
+	/* For higher level qgroup, we can only delete it if it has no child. */
+	if (btrfs_qgroup_level(qgroup->qgroupid)) {
+		if (!list_empty(&qgroup->members))
+			return false;
+		return true;
+	}
+
+	/*
+	 * For level-0 qgroups, we can only delete it if it has no subvolume
+	 * for it.
+	 * This means even a subvolume is unlinked but not yet fully dropped,
+	 * we can not delete the qgroup.
+	 */
+	key.objectid = qgroup->qgroupid;
+	key.type = BTRFS_ROOT_ITEM_KEY;
+	key.offset = -1ULL;
+	path = btrfs_alloc_path();
+	if (!path)
+		return false;
+
+	ret = btrfs_find_root(fs_info->tree_root, &key, path, NULL, NULL);
+	btrfs_free_path(path);
+	if (ret > 0)
+		return true;
+	return false;
 }
 
 int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
@@ -1756,7 +1794,7 @@ int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
 		goto out;
 	}
 
-	if (is_fstree(qgroupid) && qgroup_has_usage(qgroup)) {
+	if (!can_delete_qgroup(fs_info, qgroup)) {
 		ret = -EBUSY;
 		goto out;
 	}
@@ -1781,6 +1819,36 @@ int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
 	}
 
 	spin_lock(&fs_info->qgroup_lock);
+	/*
+	 * Warn on reserved space. The subvolume should has no child nor
+	 * corresponding subvolume.
+	 * Thus its reserved space should all be zero, no matter if qgroup
+	 * is consistent or the mode.
+	 */
+	WARN_ON(qgroup->rsv.values[BTRFS_QGROUP_RSV_DATA] ||
+		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PREALLOC] ||
+		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PERTRANS]);
+	/*
+	 * The same for rfer/excl numbers, but that's only if our qgroup is
+	 * consistent and if it's in regular qgroup mode.
+	 * For simple mode it's not as accurate thus we can hit non-zero values
+	 * very frequently.
+	 */
+	if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_FULL &&
+	    !(fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT)) {
+		if (WARN_ON(qgroup->rfer || qgroup->excl ||
+			    qgroup->rfer_cmpr || qgroup->excl_cmpr)) {
+			btrfs_warn_rl(fs_info,
+"to be deleted qgroup %u/%llu has non-zero numbers, rfer %llu rfer_cmpr %llu excl %llu excl_cmpr %llu",
+				      btrfs_qgroup_level(qgroup->qgroupid),
+				      btrfs_qgroup_subvolid(qgroup->qgroupid),
+				      qgroup->rfer,
+				      qgroup->rfer_cmpr,
+				      qgroup->excl,
+				      qgroup->excl_cmpr);
+			qgroup_mark_inconsistent(fs_info);
+		}
+	}
 	del_qgroup_rb(fs_info, qgroupid);
 	spin_unlock(&fs_info->qgroup_lock);
 
-- 
2.45.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v3 2/2] btrfs: automatically remove the subvolume qgroup
  2024-05-07  6:58 [PATCH v3 0/2] btrfs: qgroup: stale qgroups related impromvents Qu Wenruo
  2024-05-07  6:58 ` [PATCH v3 1/2] btrfs: slightly loose the requirement for qgroup removal Qu Wenruo
@ 2024-05-07  6:58 ` Qu Wenruo
  2024-05-20 22:50   ` Boris Burkov
  1 sibling, 1 reply; 7+ messages in thread
From: Qu Wenruo @ 2024-05-07  6:58 UTC (permalink / raw)
  To: linux-btrfs

Currently if we fully removed a subvolume (not only unlinked, but fully
dropped its root item), its qgroup would not be removed.

Thus we have "btrfs qgroup clear-stale" to handle such 0 level qgroups.

This patch changes the behavior by automatically removing the qgroup of
a fully dropped subvolume when possible:

- Full qgroup but still consistent
  We can and should remove the qgroup.
  The qgroup numbers should be 0, without any rsv.

- Full qgroup but inconsistent
  Can happen with drop_subtree_threshold feature (skip accounting
  and mark qgroup inconsistent).

  We can and should remove the qgroup.
  Higher level qgroup numbers will be incorrect, but since qgroup
  is already inconsistent, it should not be a problem.

- Squota mode
  This is the special case, we can only drop the qgroup if its numbers
  are all 0.

  This would be handled by can_delete_qgroup(), so we only need to check
  the return value and ignore the -EBUSY error.

Link: https://bugzilla.suse.com/show_bug.cgi?id=1222847
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent-tree.c |  8 ++++++++
 fs/btrfs/qgroup.c      | 38 ++++++++++++++++++++++++++++++++++++++
 fs/btrfs/qgroup.h      |  2 ++
 3 files changed, 48 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 47d48233b592..21e07b698625 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5833,6 +5833,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, int update_ref, int for_reloc)
 	struct btrfs_root_item *root_item = &root->root_item;
 	struct walk_control *wc;
 	struct btrfs_key key;
+	const u64 rootid = btrfs_root_id(root);
 	int err = 0;
 	int ret;
 	int level;
@@ -6063,6 +6064,13 @@ int btrfs_drop_snapshot(struct btrfs_root *root, int update_ref, int for_reloc)
 	kfree(wc);
 	btrfs_free_path(path);
 out:
+	if (!err && root_dropped) {
+		ret = btrfs_qgroup_cleanup_dropped_subvolume(fs_info, rootid);
+		if (ret < 0)
+			btrfs_warn_rl(fs_info,
+				      "failed to cleanup qgroup 0/%llu: %d",
+				      rootid, ret);
+	}
 	/*
 	 * We were an unfinished drop root, check to see if there are any
 	 * pending, and if not clear and wake up any waiters.
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index d89f16366a1c..d894a7e2bf30 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1864,6 +1864,44 @@ int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
 	return ret;
 }
 
+int btrfs_qgroup_cleanup_dropped_subvolume(struct btrfs_fs_info *fs_info,
+					   u64 subvolid)
+{
+	struct btrfs_trans_handle *trans;
+	int ret;
+
+	if (!is_fstree(subvolid) || !btrfs_qgroup_enabled(fs_info) ||
+	    !fs_info->quota_root)
+		return 0;
+
+	/*
+	 * Commit current transaction to make sure all the rfer/excl numbers
+	 * get updated.
+	 */
+	trans = btrfs_start_transaction(fs_info->quota_root, 0);
+	if (IS_ERR(trans))
+		return PTR_ERR(trans);
+
+	ret = btrfs_commit_transaction(trans);
+	if (ret < 0)
+		return ret;
+
+	/* Start new trans to delete the qgroup info and limit items. */
+	trans = btrfs_start_transaction(fs_info->quota_root, 2);
+	if (IS_ERR(trans))
+		return PTR_ERR(trans);
+	ret = btrfs_remove_qgroup(trans, subvolid);
+	btrfs_end_transaction(trans);
+	/*
+	 * It's squota and the subvolume still has numbers needed
+	 * for future accounting, in this case we can not delete.
+	 * Just skip it.
+	 */
+	if (ret == -EBUSY)
+		ret = 0;
+	return ret;
+}
+
 int btrfs_limit_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid,
 		       struct btrfs_qgroup_limit *limit)
 {
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 706640be0ec2..3f93856a02e1 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -327,6 +327,8 @@ int btrfs_del_qgroup_relation(struct btrfs_trans_handle *trans, u64 src,
 			      u64 dst);
 int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid);
 int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid);
+int btrfs_qgroup_cleanup_dropped_subvolume(struct btrfs_fs_info *fs_info,
+					   u64 subvolid);
 int btrfs_limit_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid,
 		       struct btrfs_qgroup_limit *limit);
 int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info);
-- 
2.45.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 1/2] btrfs: slightly loose the requirement for qgroup removal
  2024-05-07  6:58 ` [PATCH v3 1/2] btrfs: slightly loose the requirement for qgroup removal Qu Wenruo
@ 2024-05-20 22:46   ` Boris Burkov
  0 siblings, 0 replies; 7+ messages in thread
From: Boris Burkov @ 2024-05-20 22:46 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Tue, May 07, 2024 at 04:28:10PM +0930, Qu Wenruo wrote:
> [BUG]
> Currently if one is utilizing "qgroups/drop_subtree_threshold" sysfs,
> and a snapshot with level higher than that value is dropped, btrfs will
> not be able to delete the qgroup until next qgroup rescan:
> 
>   uuid=ffffffff-eeee-dddd-cccc-000000000000
> 
>   wipefs -fa $dev
>   mkfs.btrfs -f $dev -O quota -s 4k -n 4k -U $uuid
>   mount $dev $mnt
> 
>   btrfs subvolume create $mnt/subv1/
>   for (( i = 0; i < 1024; i++ )); do
>   	xfs_io -f -c "pwrite 0 2k" $mnt/subv1/file_$i > /dev/null
>   done
>   sync
>   btrfs subv snapshot $mnt/subv1 $mnt/snapshot
>   btrfs quota enable $mnt
>   btrfs quota rescan -w $mnt
>   sync
>   echo 1 > /sys/fs/btrfs/$uuid/qgroups/drop_subtree_threshold
>   btrfs subvolume delete $mnt/snapshot
>   btrfs subv sync $mnt
>   btrfs qgroup show -prce --sync $mnt
>   btrfs qgroup destroy 0/257 $mnt
>   umount $mnt
> 
> The final qgroup removal would fail with the following error:
> 
>   ERROR: unable to destroy quota group: Device or resource busy
> 
> [CAUSE]
> The above script would generate a subvolume of level 2, then snapshot
> it, enable qgroup, set the drop_subtree_threshold, then drop the
> snapshot.
> 
> Since the subvolume drop would meet the threshold, qgroup would be
> marked inconsistent and skip accounting to avoid hanging the system at
> transaction commit.
> 
> But currently we do not allow a qgroup with any rfer/excl numbers to be
> dropped, and this is not really compatible with the new
> drop_subtree_threshold behavior.
> 
> [FIX]
> Only require the strong zero rfer/excl/rfer_cmpr/excl_cmpr for squota
> mode.
> This is due to the fact that squota can never go inconsistent, and it
> can have dropped subvolume but with non-zero qgroup numbers for future
> accounting.
> 
> For full qgroup mode, we only check if there is a subvolume for it.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Sorry, I got dragged into other stuff and didn't get around to reviewing
this. LGTM!

Reviewed-by: Boris Burkov <boris@bur.io>
> ---
>  fs/btrfs/qgroup.c | 82 +++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 75 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index eb28141d5c37..d89f16366a1c 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -1728,13 +1728,51 @@ int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
>  	return ret;
>  }
>  
> -static bool qgroup_has_usage(struct btrfs_qgroup *qgroup)
> +static bool can_delete_qgroup(struct btrfs_fs_info *fs_info,
> +			      struct btrfs_qgroup *qgroup)
>  {
> -	return (qgroup->rfer > 0 || qgroup->rfer_cmpr > 0 ||
> -		qgroup->excl > 0 || qgroup->excl_cmpr > 0 ||
> -		qgroup->rsv.values[BTRFS_QGROUP_RSV_DATA] > 0 ||
> -		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PREALLOC] > 0 ||
> -		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PERTRANS] > 0);
> +	struct btrfs_key key;
> +	struct btrfs_path *path;
> +	int ret;
> +
> +	/*
> +	 * Squota would never be inconsistent, but there can still be
> +	 * case where a dropped subvolume still has qgroup numbers, and
> +	 * squota relies on such qgroup for future accounting.
> +	 *
> +	 * So for squota, do not allow dropping any non-zero qgroup.
> +	 */
> +	if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_SIMPLE) {

nit: you can chain this condition with an and rather than a nested if.

> +		if (qgroup->rfer || qgroup->excl || qgroup->excl_cmpr ||
> +		    qgroup->rfer_cmpr)
> +			return false;
> +	}
> +
> +	/* For higher level qgroup, we can only delete it if it has no child. */
> +	if (btrfs_qgroup_level(qgroup->qgroupid)) {
> +		if (!list_empty(&qgroup->members))
> +			return false;
> +		return true;
> +	}
> +
> +	/*
> +	 * For level-0 qgroups, we can only delete it if it has no subvolume
> +	 * for it.
> +	 * This means even a subvolume is unlinked but not yet fully dropped,
> +	 * we can not delete the qgroup.
> +	 */
> +	key.objectid = qgroup->qgroupid;
> +	key.type = BTRFS_ROOT_ITEM_KEY;
> +	key.offset = -1ULL;
> +	path = btrfs_alloc_path();
> +	if (!path)

I suppose ideally this should be ENOMEM, not false => EBUSY

> +		return false;
> +
> +	ret = btrfs_find_root(fs_info->tree_root, &key, path, NULL, NULL);
> +	btrfs_free_path(path);
> +	if (ret > 0)
> +		return true;
> +	return false;
>  }
>  
>  int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
> @@ -1756,7 +1794,7 @@ int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
>  		goto out;
>  	}
>  
> -	if (is_fstree(qgroupid) && qgroup_has_usage(qgroup)) {
> +	if (!can_delete_qgroup(fs_info, qgroup)) {
>  		ret = -EBUSY;
>  		goto out;
>  	}
> @@ -1781,6 +1819,36 @@ int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
>  	}
>  
>  	spin_lock(&fs_info->qgroup_lock);
> +	/*
> +	 * Warn on reserved space. The subvolume should has no child nor
> +	 * corresponding subvolume.
> +	 * Thus its reserved space should all be zero, no matter if qgroup
> +	 * is consistent or the mode.
> +	 */
> +	WARN_ON(qgroup->rsv.values[BTRFS_QGROUP_RSV_DATA] ||
> +		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PREALLOC] ||
> +		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PERTRANS]);
> +	/*
> +	 * The same for rfer/excl numbers, but that's only if our qgroup is
> +	 * consistent and if it's in regular qgroup mode.
> +	 * For simple mode it's not as accurate thus we can hit non-zero values
> +	 * very frequently.
> +	 */
> +	if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_FULL &&
> +	    !(fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT)) {
> +		if (WARN_ON(qgroup->rfer || qgroup->excl ||
> +			    qgroup->rfer_cmpr || qgroup->excl_cmpr)) {
> +			btrfs_warn_rl(fs_info,
> +"to be deleted qgroup %u/%llu has non-zero numbers, rfer %llu rfer_cmpr %llu excl %llu excl_cmpr %llu",
> +				      btrfs_qgroup_level(qgroup->qgroupid),
> +				      btrfs_qgroup_subvolid(qgroup->qgroupid),
> +				      qgroup->rfer,
> +				      qgroup->rfer_cmpr,
> +				      qgroup->excl,
> +				      qgroup->excl_cmpr);
> +			qgroup_mark_inconsistent(fs_info);
> +		}
> +	}
>  	del_qgroup_rb(fs_info, qgroupid);
>  	spin_unlock(&fs_info->qgroup_lock);
>  
> -- 
> 2.45.0
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 2/2] btrfs: automatically remove the subvolume qgroup
  2024-05-07  6:58 ` [PATCH v3 2/2] btrfs: automatically remove the subvolume qgroup Qu Wenruo
@ 2024-05-20 22:50   ` Boris Burkov
  2024-05-20 23:03     ` Qu Wenruo
  0 siblings, 1 reply; 7+ messages in thread
From: Boris Burkov @ 2024-05-20 22:50 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Tue, May 07, 2024 at 04:28:11PM +0930, Qu Wenruo wrote:
> Currently if we fully removed a subvolume (not only unlinked, but fully
> dropped its root item), its qgroup would not be removed.
> 
> Thus we have "btrfs qgroup clear-stale" to handle such 0 level qgroups.
> 
> This patch changes the behavior by automatically removing the qgroup of
> a fully dropped subvolume when possible:
> 
> - Full qgroup but still consistent
>   We can and should remove the qgroup.
>   The qgroup numbers should be 0, without any rsv.
> 
> - Full qgroup but inconsistent
>   Can happen with drop_subtree_threshold feature (skip accounting
>   and mark qgroup inconsistent).
> 
>   We can and should remove the qgroup.
>   Higher level qgroup numbers will be incorrect, but since qgroup
>   is already inconsistent, it should not be a problem.
> 
> - Squota mode
>   This is the special case, we can only drop the qgroup if its numbers
>   are all 0.
> 
>   This would be handled by can_delete_qgroup(), so we only need to check
>   the return value and ignore the -EBUSY error.
> 
> Link: https://bugzilla.suse.com/show_bug.cgi?id=1222847
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Boris Burkov <boris@bur.io>

> ---
>  fs/btrfs/extent-tree.c |  8 ++++++++
>  fs/btrfs/qgroup.c      | 38 ++++++++++++++++++++++++++++++++++++++
>  fs/btrfs/qgroup.h      |  2 ++
>  3 files changed, 48 insertions(+)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 47d48233b592..21e07b698625 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -5833,6 +5833,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, int update_ref, int for_reloc)
>  	struct btrfs_root_item *root_item = &root->root_item;
>  	struct walk_control *wc;
>  	struct btrfs_key key;
> +	const u64 rootid = btrfs_root_id(root);
>  	int err = 0;
>  	int ret;
>  	int level;
> @@ -6063,6 +6064,13 @@ int btrfs_drop_snapshot(struct btrfs_root *root, int update_ref, int for_reloc)
>  	kfree(wc);
>  	btrfs_free_path(path);
>  out:
> +	if (!err && root_dropped) {
> +		ret = btrfs_qgroup_cleanup_dropped_subvolume(fs_info, rootid);
> +		if (ret < 0)
> +			btrfs_warn_rl(fs_info,
> +				      "failed to cleanup qgroup 0/%llu: %d",
> +				      rootid, ret);
> +	}
>  	/*
>  	 * We were an unfinished drop root, check to see if there are any
>  	 * pending, and if not clear and wake up any waiters.
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index d89f16366a1c..d894a7e2bf30 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -1864,6 +1864,44 @@ int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
>  	return ret;
>  }
>  
> +int btrfs_qgroup_cleanup_dropped_subvolume(struct btrfs_fs_info *fs_info,
> +					   u64 subvolid)
> +{
> +	struct btrfs_trans_handle *trans;
> +	int ret;
> +
> +	if (!is_fstree(subvolid) || !btrfs_qgroup_enabled(fs_info) ||
> +	    !fs_info->quota_root)
> +		return 0;
> +
> +	/*
> +	 * Commit current transaction to make sure all the rfer/excl numbers
> +	 * get updated.
> +	 */
> +	trans = btrfs_start_transaction(fs_info->quota_root, 0);
> +	if (IS_ERR(trans))
> +		return PTR_ERR(trans);
> +
> +	ret = btrfs_commit_transaction(trans);
> +	if (ret < 0)
> +		return ret;
> +
> +	/* Start new trans to delete the qgroup info and limit items. */
> +	trans = btrfs_start_transaction(fs_info->quota_root, 2);
> +	if (IS_ERR(trans))
> +		return PTR_ERR(trans);
> +	ret = btrfs_remove_qgroup(trans, subvolid);
> +	btrfs_end_transaction(trans);
> +	/*
> +	 * It's squota and the subvolume still has numbers needed
> +	 * for future accounting, in this case we can not delete.
> +	 * Just skip it.
> +	 */

Maybe throw in an ASSERT or WARN or whatever you think is best checking
for squota mode, if we are sure this shouldn't happen for normal qgroup?

> +	if (ret == -EBUSY)
> +		ret = 0;
> +	return ret;
> +}
> +
>  int btrfs_limit_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid,
>  		       struct btrfs_qgroup_limit *limit)
>  {
> diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
> index 706640be0ec2..3f93856a02e1 100644
> --- a/fs/btrfs/qgroup.h
> +++ b/fs/btrfs/qgroup.h
> @@ -327,6 +327,8 @@ int btrfs_del_qgroup_relation(struct btrfs_trans_handle *trans, u64 src,
>  			      u64 dst);
>  int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid);
>  int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid);
> +int btrfs_qgroup_cleanup_dropped_subvolume(struct btrfs_fs_info *fs_info,
> +					   u64 subvolid);
>  int btrfs_limit_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid,
>  		       struct btrfs_qgroup_limit *limit);
>  int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info);
> -- 
> 2.45.0
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 2/2] btrfs: automatically remove the subvolume qgroup
  2024-05-20 22:50   ` Boris Burkov
@ 2024-05-20 23:03     ` Qu Wenruo
  2024-05-21  1:05       ` Qu Wenruo
  0 siblings, 1 reply; 7+ messages in thread
From: Qu Wenruo @ 2024-05-20 23:03 UTC (permalink / raw)
  To: Boris Burkov; +Cc: linux-btrfs



在 2024/5/21 08:20, Boris Burkov 写道:
[...]
>> +	/*
>> +	 * It's squota and the subvolume still has numbers needed
>> +	 * for future accounting, in this case we can not delete.
>> +	 * Just skip it.
>> +	 */
> 
> Maybe throw in an ASSERT or WARN or whatever you think is best checking
> for squota mode, if we are sure this shouldn't happen for normal qgroup?

Sounds good.

Would add an ASSERT() for making sure it's squota mode.

Thanks,
Qu

> 
>> +	if (ret == -EBUSY)
>> +		ret = 0;
>> +	return ret;
>> +}
>> +
>>   int btrfs_limit_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid,
>>   		       struct btrfs_qgroup_limit *limit)
>>   {
>> diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
>> index 706640be0ec2..3f93856a02e1 100644
>> --- a/fs/btrfs/qgroup.h
>> +++ b/fs/btrfs/qgroup.h
>> @@ -327,6 +327,8 @@ int btrfs_del_qgroup_relation(struct btrfs_trans_handle *trans, u64 src,
>>   			      u64 dst);
>>   int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid);
>>   int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid);
>> +int btrfs_qgroup_cleanup_dropped_subvolume(struct btrfs_fs_info *fs_info,
>> +					   u64 subvolid);
>>   int btrfs_limit_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid,
>>   		       struct btrfs_qgroup_limit *limit);
>>   int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info);
>> -- 
>> 2.45.0
>>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 2/2] btrfs: automatically remove the subvolume qgroup
  2024-05-20 23:03     ` Qu Wenruo
@ 2024-05-21  1:05       ` Qu Wenruo
  0 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2024-05-21  1:05 UTC (permalink / raw)
  To: Boris Burkov; +Cc: linux-btrfs



在 2024/5/21 08:33, Qu Wenruo 写道:
> 
> 
> 在 2024/5/21 08:20, Boris Burkov 写道:
> [...]
>>> +    /*
>>> +     * It's squota and the subvolume still has numbers needed
>>> +     * for future accounting, in this case we can not delete.
>>> +     * Just skip it.
>>> +     */
>>
>> Maybe throw in an ASSERT or WARN or whatever you think is best checking
>> for squota mode, if we are sure this shouldn't happen for normal qgroup?
> 
> Sounds good.
> 
> Would add an ASSERT() for making sure it's squota mode.

After more thought, I believe ASSERT() can lead to false alerts.

The problem here is, we do not have any extra race prevention here, 
really rely on one time call on btrfs_remove_qgroup() to do the proper 
locking.

But after btrfs_remove_qgroup() returned -EBUSY, we can race with qgroup 
disabling, thus doing an ASSERT() without the proper lock context can 
lead to false alert, e.g. the qgroup is disabled after 
btrfs_remove_qgroup() call.

So I'm afraid we can not do extra checks here.

Thanks,
Qu
> 
> Thanks,
> Qu
> 
>>
>>> +    if (ret == -EBUSY)
>>> +        ret = 0;
>>> +    return ret;
>>> +}
>>> +
>>>   int btrfs_limit_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid,
>>>                  struct btrfs_qgroup_limit *limit)
>>>   {
>>> diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
>>> index 706640be0ec2..3f93856a02e1 100644
>>> --- a/fs/btrfs/qgroup.h
>>> +++ b/fs/btrfs/qgroup.h
>>> @@ -327,6 +327,8 @@ int btrfs_del_qgroup_relation(struct 
>>> btrfs_trans_handle *trans, u64 src,
>>>                     u64 dst);
>>>   int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 
>>> qgroupid);
>>>   int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 
>>> qgroupid);
>>> +int btrfs_qgroup_cleanup_dropped_subvolume(struct btrfs_fs_info 
>>> *fs_info,
>>> +                       u64 subvolid);
>>>   int btrfs_limit_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid,
>>>                  struct btrfs_qgroup_limit *limit);
>>>   int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info);
>>> -- 
>>> 2.45.0
>>>
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-05-21  1:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-07  6:58 [PATCH v3 0/2] btrfs: qgroup: stale qgroups related impromvents Qu Wenruo
2024-05-07  6:58 ` [PATCH v3 1/2] btrfs: slightly loose the requirement for qgroup removal Qu Wenruo
2024-05-20 22:46   ` Boris Burkov
2024-05-07  6:58 ` [PATCH v3 2/2] btrfs: automatically remove the subvolume qgroup Qu Wenruo
2024-05-20 22:50   ` Boris Burkov
2024-05-20 23:03     ` Qu Wenruo
2024-05-21  1:05       ` Qu Wenruo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox