From: Boris Burkov <boris@bur.io>
To: Mark Harmstone <mark@harmstone.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v3 17/17] btrfs: add stripe removal pending flag
Date: Tue, 14 Oct 2025 22:05:37 -0700
Message-ID: <20251015050537.GJ1702774@zen.localdomain>
In-Reply-To: <20251009112814.13942-18-mark@harmstone.com>
On Thu, Oct 09, 2025 at 12:28:12PM +0100, Mark Harmstone wrote:
> If the filesystem is unmounted while the async discard of a fully remapped
> block group is in progress, its unused device extents will never be freed.
>
> To counter this, add a new flag BTRFS_BLOCK_GROUP_STRIPE_REMOVAL_PENDING
> to say that this has been interrupted. Set it in the transaction in which
> the last identity remap has been removed, clear it when we remove the
> device extents, and if we encounter it on mount queue that block group
> up for discard.
I don't see how this is special for remapped block groups.
In read_one_block_group(), we queue empty block groups for async
discard unconditionally. There is a comment to this effect in discard.c
about crashes and mounts behaving the same.
And either way, if we go down at any point before we remove the bg, then
we will be re-discarding everything we already discarded (possibly the
whole thing) so this is optimistic anyway, right?
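To make the two points above concrete, here is a toy model in plain C. It is a sketch under stated assumptions, not kernel code: struct toy_bg, toy_mount_block_group() and toy_discard_step() are hypothetical names, and the model assumes that discard progress lives only in memory. It shows that an empty block group is queued again on every mount without any persisted flag, and that work done before going down is simply repeated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a block group; NOT the kernel's struct btrfs_block_group. */
struct toy_bg {
	uint64_t length;         /* size of the block group in bytes */
	uint64_t used;           /* bytes still allocated */
	uint64_t discard_cursor; /* in-memory progress only (assumption) */
	bool queued_for_discard; /* set when queued for async discard */
};

/*
 * Hypothetical mount-time hook. An empty block group is queued whenever
 * async discard is enabled; no persisted "interrupted" flag is consulted.
 */
static void toy_mount_block_group(struct toy_bg *bg, bool discard_async)
{
	assert(bg != NULL);

	/* Progress made before the unmount or crash is not remembered. */
	bg->discard_cursor = 0;
	bg->queued_for_discard = discard_async && bg->used == 0;
}

/* Hypothetical discard step: trims up to max bytes, returns bytes trimmed. */
static uint64_t toy_discard_step(struct toy_bg *bg, uint64_t max)
{
	uint64_t left;
	uint64_t n;

	assert(bg != NULL);
	if (!bg->queued_for_discard)
		return 0;

	left = bg->length - bg->discard_cursor;
	n = left < max ? left : max;
	bg->discard_cursor += n;
	return n;
}
```

Calling toy_mount_block_group() a second time on a partially discarded toy_bg resets the cursor and queues it again, which is the "re-discarding everything" case described above.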
I don't think the benefit of this is worth the special case compared to
normal unused bgs, and I don't think it makes sense to "take advantage"
of this format change opportunity to also add the persisted "needs
discard" bit for all discard.
A persisted "discard state" on the extents would actually be
interesting, I think, but I don't think that is in scope or even
necessarily a good idea.
Thanks,
Boris
>
> Signed-off-by: Mark Harmstone <mark@harmstone.com>
> ---
> fs/btrfs/block-group.c | 35 ++++++++++++++++++++++++++++++++-
> fs/btrfs/free-space-cache.c | 5 +++++
> fs/btrfs/relocation.c | 18 +++++++++++++++++
> include/uapi/linux/btrfs_tree.h | 1 +
> 4 files changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index a7dfa6c95223..851d76ce8ec9 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2530,6 +2530,16 @@ static int read_one_block_group(struct btrfs_fs_info *info,
> inc_block_group_ro(cache, 1);
> }
>
> + if (cache->flags & BTRFS_BLOCK_GROUP_STRIPE_REMOVAL_PENDING) {
> + btrfs_get_block_group(cache);
> + spin_lock(&info->unused_bgs_lock);
> + list_add_tail(&cache->bg_list, &info->fully_remapped_bgs);
> + spin_unlock(&info->unused_bgs_lock);
> +
> + if (btrfs_test_opt(info, DISCARD_ASYNC))
> + btrfs_discard_queue_work(&info->discard_ctl, cache);
> + }
> +
> return 0;
> error:
> btrfs_put_block_group(cache);
> @@ -4828,6 +4838,29 @@ void btrfs_mark_bg_fully_remapped(struct btrfs_block_group *bg,
>
> spin_unlock(&fs_info->unused_bgs_lock);
>
> - if (btrfs_test_opt(fs_info, DISCARD_ASYNC))
> + if (btrfs_test_opt(fs_info, DISCARD_ASYNC)) {
> + bool bg_already_dirty = true;
> +
> + spin_lock(&bg->lock);
> + bg->flags |= BTRFS_BLOCK_GROUP_STRIPE_REMOVAL_PENDING;
> + spin_unlock(&bg->lock);
> +
> + spin_lock(&trans->transaction->dirty_bgs_lock);
> + if (list_empty(&bg->dirty_list)) {
> + list_add_tail(&bg->dirty_list,
> + &trans->transaction->dirty_bgs);
> + bg_already_dirty = false;
> + btrfs_get_block_group(bg);
> + }
> + spin_unlock(&trans->transaction->dirty_bgs_lock);
> +
> + /*
> + * Modified block groups are accounted for in
> + * the delayed_refs_rsv.
> + */
> + if (!bg_already_dirty)
> + btrfs_inc_delayed_refs_rsv_bg_updates(trans->fs_info);
> +
> btrfs_discard_queue_work(&fs_info->discard_ctl, bg);
> + }
> }
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index 56f27487b632..813b82294341 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -3065,6 +3065,7 @@ bool btrfs_is_free_space_trimmed(struct btrfs_block_group *block_group)
> bool ret = true;
>
> if (block_group->flags & BTRFS_BLOCK_GROUP_REMAPPED &&
> + !(block_group->flags & BTRFS_BLOCK_GROUP_STRIPE_REMOVAL_PENDING) &&
> block_group->remap_bytes == 0 &&
> block_group->identity_remap_count == 0) {
> return true;
> @@ -3845,6 +3846,9 @@ void btrfs_trim_fully_remapped_block_group(struct btrfs_block_group *bg)
> struct btrfs_trans_handle *trans;
> struct btrfs_chunk_map *map;
>
> + if (!(bg->flags & BTRFS_BLOCK_GROUP_STRIPE_REMOVAL_PENDING))
> + goto skip_discard;
> +
> while (bg->discard_cursor < end) {
> u64 trimmed;
>
> @@ -3897,6 +3901,7 @@ void btrfs_trim_fully_remapped_block_group(struct btrfs_block_group *bg)
>
> btrfs_free_chunk_map(map);
>
> +skip_discard:
> if (bg->used == 0) {
> spin_lock(&fs_info->unused_bgs_lock);
> list_move_tail(&bg->bg_list, &fs_info->unused_bgs);
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 7bad8d65d145..a179e4a8e960 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -4733,6 +4733,7 @@ int btrfs_last_identity_remap_gone(struct btrfs_trans_handle *trans,
> struct btrfs_block_group *bg)
> {
> int ret;
> + bool bg_already_dirty = true;
> BTRFS_PATH_AUTO_FREE(path);
>
> ret = btrfs_remove_dev_extents(trans, chunk);
> @@ -4757,6 +4758,23 @@ int btrfs_last_identity_remap_gone(struct btrfs_trans_handle *trans,
>
> btrfs_remove_bg_from_sinfo(bg);
>
> + spin_lock(&bg->lock);
> + bg->flags &= ~BTRFS_BLOCK_GROUP_STRIPE_REMOVAL_PENDING;
> + spin_unlock(&bg->lock);
> +
> + spin_lock(&trans->transaction->dirty_bgs_lock);
> + if (list_empty(&bg->dirty_list)) {
> + list_add_tail(&bg->dirty_list,
> + &trans->transaction->dirty_bgs);
> + bg_already_dirty = false;
> + btrfs_get_block_group(bg);
> + }
> + spin_unlock(&trans->transaction->dirty_bgs_lock);
> +
> + /* Modified block groups are accounted for in the delayed_refs_rsv. */
> + if (!bg_already_dirty)
> + btrfs_inc_delayed_refs_rsv_bg_updates(trans->fs_info);
> +
> path = btrfs_alloc_path();
> if (!path)
> return -ENOMEM;
> diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
> index 89bcb80081a6..36a7d1a3cbe3 100644
> --- a/include/uapi/linux/btrfs_tree.h
> +++ b/include/uapi/linux/btrfs_tree.h
> @@ -1173,6 +1173,7 @@ struct btrfs_dev_replace_item {
> #define BTRFS_BLOCK_GROUP_RAID1C4 (1ULL << 10)
> #define BTRFS_BLOCK_GROUP_REMAPPED (1ULL << 11)
> #define BTRFS_BLOCK_GROUP_REMAP (1ULL << 12)
> +#define BTRFS_BLOCK_GROUP_STRIPE_REMOVAL_PENDING (1ULL << 13)
> #define BTRFS_BLOCK_GROUP_RESERVED (BTRFS_AVAIL_ALLOC_BIT_SINGLE | \
> BTRFS_SPACE_INFO_GLOBAL_RSV)
>
> --
> 2.49.1
>