Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: Mark Harmstone <mark@harmstone.com>,
	linux-btrfs@vger.kernel.org, fdmanana@kernel.org
Subject: Re: [PATCH 2/2] btrfs: clear spurious free-space entries
Date: Wed, 24 Sep 2025 06:35:43 +0930	[thread overview]
Message-ID: <a7ae58b2-6bdc-4f44-bb3f-9068fe51673d@suse.com> (raw)
In-Reply-To: <20250923155523.31617-2-mark@harmstone.com>



在 2025/9/24 01:24, Mark Harmstone 写道:
> Version 6.16.1 of btrfs-progs fixes a broken btrfs check test for
> spurious entries in the free-space tree, those that don't belong to any
> block group. Unfortunately mkfs.btrfs had been generating these, meaning
> that these filesystems will now fail btrfs check.
> 
> Add a compat flag BTRFS_FEATURE_COMPAT_NO_SPURIOUS_FREE_SPACE, and if on
> mount we find this isn't set, clean any spurious entries from the
> beginning of the free-space tree.

I found this compat flag a little overkilled.

Are we really going to introduce a new compat flag every time there is 
something wrong with the free space tree?

> 
> Signed-off-by: Mark Harmstone <mark@harmstone.com>
> ---
>   fs/btrfs/disk-io.c         |  10 ++++
>   fs/btrfs/free-space-tree.c | 115 +++++++++++++++++++++++++++++++++++++
>   fs/btrfs/free-space-tree.h |   1 +
>   include/uapi/linux/btrfs.h |   2 +
>   4 files changed, 128 insertions(+)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 21c2a19d690f..224369c450e4 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3077,6 +3077,16 @@ int btrfs_start_pre_rw_mount(struct btrfs_fs_info *fs_info)
>   		}
>   	}
>   
> +	if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
> +	    !btrfs_fs_compat(fs_info, NO_SPURIOUS_FREE_SPACE)) {
> +		ret = btrfs_remove_spurious_free_space(fs_info);
> +		if (ret) {
> +			btrfs_warn(fs_info,
> +				   "failed to remove spurious free space: %d",
> +				   ret);
> +		}
> +	}
> +
>   	/*
>   	 * btrfs_find_orphan_roots() is responsible for finding all the dead
>   	 * roots (with 0 refs), flag them with BTRFS_ROOT_DEAD_TREE and load
> diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
> index dad0b492a663..5980710cf6b5 100644
> --- a/fs/btrfs/free-space-tree.c
> +++ b/fs/btrfs/free-space-tree.c
> @@ -1722,3 +1722,118 @@ int btrfs_load_free_space_tree(struct btrfs_caching_control *caching_ctl)
>   	else
>   		return load_free_space_extents(caching_ctl, path, extent_count);
>   }
> +
> +/*
> + * Earlier versions of mkfs.btrfs created spurious entries at the beginning of
> + * the free-space tree, before the start of any block group.
> + * If the compat flag NO_SPURIOUS_FREE_SPACE is not set, clean these up and
> + * set the flag so we know we don't have to check again.
> + */
> +int btrfs_remove_spurious_free_space(struct btrfs_fs_info *fs_info)
> +{
> +	struct btrfs_root *fst;
> +	struct btrfs_trans_handle *trans;
> +	struct btrfs_key key;
> +	struct extent_buffer *leaf;
> +	struct btrfs_block_group *bg;
> +	u64 bg_start;
> +	BTRFS_PATH_AUTO_FREE(path);
> +	int ret, ret2;
> +	unsigned int entries_to_remove = 0;
> +
> +	struct btrfs_key root_key = {
> +		.objectid = BTRFS_FREE_SPACE_TREE_OBJECTID,
> +		.type = BTRFS_ROOT_ITEM_KEY,
> +		.offset = 0,
> +	};
> +
> +	path = btrfs_alloc_path();
> +	if (!path)
> +		return -ENOMEM;
> +
> +	fst = btrfs_grab_root(btrfs_global_root(fs_info, &root_key));
> +	if (!fst)
> +		return -EINVAL;
> +
> +	trans = btrfs_start_transaction(fst, 0);
> +	if (IS_ERR(trans)) {
> +		ret = PTR_ERR(trans);
> +		goto end;
> +	}
> +
> +	key.objectid = 0;
> +	key.type = 0;
> +	key.offset = 0;
> +
> +	ret = btrfs_search_slot(trans, fst, &key, path, 0, 0);
> +	if (ret < 0)
> +		goto end_trans;
> +
> +	while (true) {
> +		leaf = path->nodes[0];
> +		if (path->slots[0] >= btrfs_header_nritems(leaf)) {
> +			ret = btrfs_next_leaf(fst, path);
> +			if (ret < 0)
> +				goto end_trans;
> +			if (ret > 0)
> +				break;
> +			leaf = path->nodes[0];
> +		}
> +
> +		btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
> +
> +		bg = btrfs_lookup_first_block_group(fs_info, key.objectid);

We can do the lookup out of the loop. With a parameter 0 for @bytenr.

As we just need to delete any entry before the first bg, there is no 
need to lookup the bg every time.

Furthermore, since the stale entries are just from the temporary chunks, 
they should be pretty small, thus we are able to afford the check at 
every mount.

> +		if (!bg)
> +			break;
> +
> +		bg_start = bg->start;
> +
> +		btrfs_put_block_group(bg);
> +
> +		if (key.objectid >= bg_start)
> +			break;
> +
> +		entries_to_remove++;
> +
> +		path->slots[0]++;
> +	}
> +
> +	if (entries_to_remove == 0) {
> +		ret = 0;
> +		goto end_trans;
> +	}
> +
> +	btrfs_release_path(path);
> +
> +	key.objectid = 0;
> +	key.type = 0;
> +	key.offset = 0;
> +
> +	ret = btrfs_search_slot(trans, fst, &key, path, -1, 1);
> +	if (ret < 0)
> +		goto end_trans;
> +
> +	ret = btrfs_del_items(trans, fst, path, 0, entries_to_remove);

What if @entries_to_remove is larger than the items in the leaf?
I know there should not be that many entries, but we should still take 
extra cares just in case.

Thanks,
Qu
> +	if (ret)
> +		btrfs_abort_transaction(trans, ret);
> +
> +end_trans:
> +	btrfs_release_path(path);
> +
> +	if (!ret)
> +		btrfs_set_fs_compat(fs_info, NO_SPURIOUS_FREE_SPACE);
> +
> +	ret2 = btrfs_commit_transaction(trans);
> +	if (!ret)
> +		ret = ret2;
> +
> +	if (!ret && entries_to_remove > 0) {
> +		btrfs_info(fs_info, "removed %u spurious free-space entries",
> +			   entries_to_remove);
> +	}
> +
> +end:
> +	btrfs_put_root(fst);
> +
> +	return ret;
> +}
> diff --git a/fs/btrfs/free-space-tree.h b/fs/btrfs/free-space-tree.h
> index 3d9a5d4477fc..b501c41acf3b 100644
> --- a/fs/btrfs/free-space-tree.h
> +++ b/fs/btrfs/free-space-tree.h
> @@ -35,6 +35,7 @@ int btrfs_add_to_free_space_tree(struct btrfs_trans_handle *trans,
>   				 u64 start, u64 size);
>   int btrfs_remove_from_free_space_tree(struct btrfs_trans_handle *trans,
>   				      u64 start, u64 size);
> +int btrfs_remove_spurious_free_space(struct btrfs_fs_info *fs_info);
>   
>   #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
>   struct btrfs_free_space_info *
> diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
> index 8e710bbb688e..6219e2b8e334 100644
> --- a/include/uapi/linux/btrfs.h
> +++ b/include/uapi/linux/btrfs.h
> @@ -337,6 +337,8 @@ struct btrfs_ioctl_fs_info_args {
>   #define BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE	(1ULL << 14)
>   #define BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA	(1ULL << 16)
>   
> +#define BTRFS_FEATURE_COMPAT_NO_SPURIOUS_FREE_SPACE	(1ULL << 0)
> +
>   struct btrfs_ioctl_feature_flags {
>   	__u64 compat_flags;
>   	__u64 compat_ro_flags;


  parent reply	other threads:[~2025-09-23 21:05 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-23 15:54 [PATCH 1/2] btrfs: add compat flag functions Mark Harmstone
2025-09-23 15:54 ` [PATCH 2/2] btrfs: clear spurious free-space entries Mark Harmstone
2025-09-23 16:35   ` Filipe Manana
2025-09-23 21:05   ` Qu Wenruo [this message]
2025-09-25 11:32     ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a7ae58b2-6bdc-4f44-bb3f-9068fe51673d@suse.com \
    --to=wqu@suse.com \
    --cc=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=mark@harmstone.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox