Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: Boris Burkov <boris@bur.io>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v3 1/2] btrfs: slightly loose the requirement for qgroup removal
Date: Mon, 20 May 2024 15:46:34 -0700	[thread overview]
Message-ID: <20240520224634.GA1820897@zen.localdomain> (raw)
In-Reply-To: <16337d00a7946336cd742573327c8714db278331.1715064550.git.wqu@suse.com>

On Tue, May 07, 2024 at 04:28:10PM +0930, Qu Wenruo wrote:
> [BUG]
> Currently if one is utilizing "qgroups/drop_subtree_threshold" sysfs,
> and a snapshot with level higher than that value is dropped, btrfs will
> not be able to delete the qgroup until next qgroup rescan:
> 
>   uuid=ffffffff-eeee-dddd-cccc-000000000000
> 
>   wipefs -fa $dev
>   mkfs.btrfs -f $dev -O quota -s 4k -n 4k -U $uuid
>   mount $dev $mnt
> 
>   btrfs subvolume create $mnt/subv1/
>   for (( i = 0; i < 1024; i++ )); do
>   	xfs_io -f -c "pwrite 0 2k" $mnt/subv1/file_$i > /dev/null
>   done
>   sync
>   btrfs subv snapshot $mnt/subv1 $mnt/snapshot
>   btrfs quota enable $mnt
>   btrfs quota rescan -w $mnt
>   sync
>   echo 1 > /sys/fs/btrfs/$uuid/qgroups/drop_subtree_threshold
>   btrfs subvolume delete $mnt/snapshot
>   btrfs subv sync $mnt
>   btrfs qgroup show -prce --sync $mnt
>   btrfs qgroup destroy 0/257 $mnt
>   umount $mnt
> 
> The final qgroup removal would fail with the following error:
> 
>   ERROR: unable to destroy quota group: Device or resource busy
> 
> [CAUSE]
> The above script would generate a subvolume of level 2, then snapshot
> it, enable qgroup, set the drop_subtree_threshold, then drop the
> snapshot.
> 
> Since the subvolume drop would meet the threshold, qgroup would be
> marked inconsistent and skip accounting to avoid hanging the system at
> transaction commit.
> 
> But currently we do not allow a qgroup with any rfer/excl numbers to be
> dropped, and this is not really compatible with the new
> drop_subtree_threshold behavior.
> 
> [FIX]
> Only require the strong zero rfer/excl/rfer_cmpr/excl_cmpr for squota
> mode.
> This is due to the fact that squota can never go inconsistent, and it
> can have dropped subvolume but with non-zero qgroup numbers for future
> accounting.
> 
> For full qgroup mode, we only check if there is a subvolume for it.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Sorry, I got dragged into other stuff and didn't get around to reviewing
this. LGTM!

Reviewed-by: Boris Burkov <boris@bur.io>
> ---
>  fs/btrfs/qgroup.c | 82 +++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 75 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index eb28141d5c37..d89f16366a1c 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -1728,13 +1728,51 @@ int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
>  	return ret;
>  }
>  
> -static bool qgroup_has_usage(struct btrfs_qgroup *qgroup)
> +static bool can_delete_qgroup(struct btrfs_fs_info *fs_info,
> +			      struct btrfs_qgroup *qgroup)
>  {
> -	return (qgroup->rfer > 0 || qgroup->rfer_cmpr > 0 ||
> -		qgroup->excl > 0 || qgroup->excl_cmpr > 0 ||
> -		qgroup->rsv.values[BTRFS_QGROUP_RSV_DATA] > 0 ||
> -		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PREALLOC] > 0 ||
> -		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PERTRANS] > 0);
> +	struct btrfs_key key;
> +	struct btrfs_path *path;
> +	int ret;
> +
> +	/*
> +	 * Squota would never be inconsistent, but there can still be
> +	 * case where a dropped subvolume still has qgroup numbers, and
> +	 * squota relies on such qgroup for future accounting.
> +	 *
> +	 * So for squota, do not allow dropping any non-zero qgroup.
> +	 */
> +	if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_SIMPLE) {

nit: you can chain this condition with an and rather than a nested if.

> +		if (qgroup->rfer || qgroup->excl || qgroup->excl_cmpr ||
> +		    qgroup->rfer_cmpr)
> +			return false;
> +	}
> +
> +	/* For higher level qgroup, we can only delete it if it has no child. */
> +	if (btrfs_qgroup_level(qgroup->qgroupid)) {
> +		if (!list_empty(&qgroup->members))
> +			return false;
> +		return true;
> +	}
> +
> +	/*
> +	 * For level-0 qgroups, we can only delete it if it has no subvolume
> +	 * for it.
> +	 * This means even a subvolume is unlinked but not yet fully dropped,
> +	 * we can not delete the qgroup.
> +	 */
> +	key.objectid = qgroup->qgroupid;
> +	key.type = BTRFS_ROOT_ITEM_KEY;
> +	key.offset = -1ULL;
> +	path = btrfs_alloc_path();
> +	if (!path)

I suppose ideally this should be ENOMEM, not false => EBUSY

> +		return false;
> +
> +	ret = btrfs_find_root(fs_info->tree_root, &key, path, NULL, NULL);
> +	btrfs_free_path(path);
> +	if (ret > 0)
> +		return true;
> +	return false;
>  }
>  
>  int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
> @@ -1756,7 +1794,7 @@ int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
>  		goto out;
>  	}
>  
> -	if (is_fstree(qgroupid) && qgroup_has_usage(qgroup)) {
> +	if (!can_delete_qgroup(fs_info, qgroup)) {
>  		ret = -EBUSY;
>  		goto out;
>  	}
> @@ -1781,6 +1819,36 @@ int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
>  	}
>  
>  	spin_lock(&fs_info->qgroup_lock);
> +	/*
> +	 * Warn on reserved space. The subvolume should has no child nor
> +	 * corresponding subvolume.
> +	 * Thus its reserved space should all be zero, no matter if qgroup
> +	 * is consistent or the mode.
> +	 */
> +	WARN_ON(qgroup->rsv.values[BTRFS_QGROUP_RSV_DATA] ||
> +		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PREALLOC] ||
> +		qgroup->rsv.values[BTRFS_QGROUP_RSV_META_PERTRANS]);
> +	/*
> +	 * The same for rfer/excl numbers, but that's only if our qgroup is
> +	 * consistent and if it's in regular qgroup mode.
> +	 * For simple mode it's not as accurate thus we can hit non-zero values
> +	 * very frequently.
> +	 */
> +	if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_FULL &&
> +	    !(fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT)) {
> +		if (WARN_ON(qgroup->rfer || qgroup->excl ||
> +			    qgroup->rfer_cmpr || qgroup->excl_cmpr)) {
> +			btrfs_warn_rl(fs_info,
> +"to be deleted qgroup %u/%llu has non-zero numbers, rfer %llu rfer_cmpr %llu excl %llu excl_cmpr %llu",
> +				      btrfs_qgroup_level(qgroup->qgroupid),
> +				      btrfs_qgroup_subvolid(qgroup->qgroupid),
> +				      qgroup->rfer,
> +				      qgroup->rfer_cmpr,
> +				      qgroup->excl,
> +				      qgroup->excl_cmpr);
> +			qgroup_mark_inconsistent(fs_info);
> +		}
> +	}
>  	del_qgroup_rb(fs_info, qgroupid);
>  	spin_unlock(&fs_info->qgroup_lock);
>  
> -- 
> 2.45.0
> 

  reply	other threads:[~2024-05-20 22:44 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-07  6:58 [PATCH v3 0/2] btrfs: qgroup: stale qgroups related impromvents Qu Wenruo
2024-05-07  6:58 ` [PATCH v3 1/2] btrfs: slightly loose the requirement for qgroup removal Qu Wenruo
2024-05-20 22:46   ` Boris Burkov [this message]
2024-05-07  6:58 ` [PATCH v3 2/2] btrfs: automatically remove the subvolume qgroup Qu Wenruo
2024-05-20 22:50   ` Boris Burkov
2024-05-20 23:03     ` Qu Wenruo
2024-05-21  1:05       ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240520224634.GA1820897@zen.localdomain \
    --to=boris@bur.io \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox