From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 5/5] btrfs: defrag: allow defrag_one_cluster() to large extent which is not a target
Date: Fri, 4 Feb 2022 16:56:46 +0000 [thread overview]
Message-ID: <Yf1azilXKXv/13lf@debian9.Home> (raw)
In-Reply-To: <6aa51c24be3675deba0c696198d2c64e6113a11f.1643961719.git.wqu@suse.com>
On Fri, Feb 04, 2022 at 04:11:59PM +0800, Qu Wenruo wrote:
> In the rework of btrfs_defrag_file(), we always call
> defrag_one_cluster() and increase the offset by cluster size, which is
> only 256K.
>
> But there are cases where we have a large extent (e.g. 128M) which
> doesn't need to be defragged at all.
>
> Before the refactor, we can directly skip the range, but now we have to
> scan that extent map again and again until the cluster moves after the
> non-target extent.
>
> Fix the problem by allow defrag_one_cluster() to increase
> btrfs_defrag_ctrl::last_scanned to the end of an extent, if and only if
> the last extent of the cluster is not a target.
>
> The test script looks like this:
>
> mkfs.btrfs -f $dev > /dev/null
>
> mount $dev $mnt
>
> # As btrfs ioctl uses 32M as extent_threshold
> xfs_io -f -c "pwrite 0 64M" $mnt/file1
> sync
> # Some fragemented range to defrag
> xfs_io -s -c "pwrite 65548k 4k" \
> -c "pwrite 65544k 4k" \
> -c "pwrite 65540k 4k" \
> -c "pwrite 65536k 4k" \
> $mnt/file1
> sync
>
> echo "=== before ==="
> xfs_io -c "fiemap -v" $mnt/file1
> echo "=== after ==="
> btrfs fi defrag $mnt/file1
> sync
> xfs_io -c "fiemap -v" $mnt/file1
> umount $mnt
>
> With extra ftrace put into defrag_one_cluster(), before the patch it
> would result tons of loops:
>
> (As defrag_one_cluster() is inlined, the function name is its caller)
>
> btrfs-126062 [005] ..... 4682.816026: btrfs_defrag_file: r/i=5/257 start=0 len=262144
> btrfs-126062 [005] ..... 4682.816027: btrfs_defrag_file: r/i=5/257 start=262144 len=262144
> btrfs-126062 [005] ..... 4682.816028: btrfs_defrag_file: r/i=5/257 start=524288 len=262144
> btrfs-126062 [005] ..... 4682.816028: btrfs_defrag_file: r/i=5/257 start=786432 len=262144
> btrfs-126062 [005] ..... 4682.816028: btrfs_defrag_file: r/i=5/257 start=1048576 len=262144
> ...
> btrfs-126062 [005] ..... 4682.816043: btrfs_defrag_file: r/i=5/257 start=67108864 len=262144
>
> But with this patch there will be just one loop, then directly to the
> end of the extent:
>
> btrfs-130471 [014] ..... 5434.029558: defrag_one_cluster: r/i=5/257 start=0 len=262144
> btrfs-130471 [014] ..... 5434.029559: defrag_one_cluster: r/i=5/257 start=67108864 len=16384
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/ioctl.c | 40 ++++++++++++++++++++++++++++++++++------
> 1 file changed, 34 insertions(+), 6 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 567a662df118..999173d0925b 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1185,10 +1185,11 @@ struct defrag_target_range {
> */
> static int defrag_collect_targets(struct btrfs_inode *inode,
> const struct btrfs_defrag_ctrl *ctrl,
> - u64 start, u32 len, bool locked,
> - struct list_head *target_list)
> + u64 start, u32 len, u64 *last_scanned_ret,
> + bool locked, struct list_head *target_list)
> {
> bool do_compress = ctrl->flags & BTRFS_DEFRAG_RANGE_COMPRESS;
> + bool last_is_target = false;
> u64 cur = start;
> int ret = 0;
>
> @@ -1198,6 +1199,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
> bool next_mergeable = true;
> u64 range_len;
>
> + last_is_target = false;
> em = defrag_lookup_extent(&inode->vfs_inode, cur, locked);
> if (!em)
> break;
> @@ -1269,6 +1271,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
> }
>
> add:
> + last_is_target = true;
> range_len = min(extent_map_end(em), start + len) - cur;
> /*
> * This one is a good target, check if it can be merged into
> @@ -1312,6 +1315,17 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
> kfree(entry);
> }
> }
> + if (!ret && last_scanned_ret) {
> + /*
> + * If the last extent is not a target, the caller can skip to
> + * the end of that extent.
> + * Otherwise, we can only go the end of the spcified range.
> + */
> + if (!last_is_target)
> + *last_scanned_ret = cur;
> + else
> + *last_scanned_ret = start + len;
> + }
> return ret;
> }
>
> @@ -1382,6 +1396,7 @@ static int defrag_one_range(struct btrfs_inode *inode,
> const u32 sectorsize = inode->root->fs_info->sectorsize;
> u64 last_index = (start + len - 1) >> PAGE_SHIFT;
> u64 start_index = start >> PAGE_SHIFT;
> + u64 last_scanned;
> unsigned int nr_pages = last_index - start_index + 1;
> int ret = 0;
> int i;
> @@ -1416,8 +1431,8 @@ static int defrag_one_range(struct btrfs_inode *inode,
> * And this time we have extent locked already, pass @locked = true
> * so that we won't relock the extent range and cause deadlock.
> */
> - ret = defrag_collect_targets(inode, ctrl, start, len, true,
> - &target_list);
> + ret = defrag_collect_targets(inode, ctrl, start, len, &last_scanned,
> + true, &target_list);
> if (ret < 0)
> goto unlock_extent;
>
> @@ -1434,6 +1449,8 @@ static int defrag_one_range(struct btrfs_inode *inode,
> list_del_init(&entry->list);
> kfree(entry);
> }
> + if (!ret)
> + ctrl->last_scanned = last_scanned;
> unlock_extent:
> unlock_extent_cached(&inode->io_tree, start_index << PAGE_SHIFT,
> (last_index << PAGE_SHIFT) + PAGE_SIZE - 1,
> @@ -1461,13 +1478,14 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
> struct btrfs_defrag_ctrl *ctrl, u32 len)
> {
> const u32 sectorsize = inode->root->fs_info->sectorsize;
> + u64 orig_start = ctrl->last_scanned;
Can be made const.
It helps to read the code and avoid regressions where we update the
wrong variable by mistake - the const makes it easy to trigger.
Just a minor comment like in the previous patches. I won't make you send
another version just because of that.
I'll reply to cover letter with the review tag.
Thanks.
> struct defrag_target_range *entry;
> struct defrag_target_range *tmp;
> LIST_HEAD(target_list);
> int ret;
>
> ret = defrag_collect_targets(inode, ctrl, ctrl->last_scanned, len,
> - false, &target_list);
> + NULL, false, &target_list);
> if (ret < 0)
> goto out;
>
> @@ -1486,6 +1504,15 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
> (ctrl->max_sectors_to_defrag -
> ctrl->sectors_defragged) * sectorsize);
>
> + /*
> + * If defrag_one_range() has updated ctrl::last_scanned,
> + * our range may already be invalid (e.g. hole punched).
> + * Skip if our range is before ctrl::last_scanned, as there is
> + * no need to defrag the range anymore.
> + */
> + if (entry->start + range_len <= ctrl->last_scanned)
> + continue;
> +
> if (ra)
> page_cache_sync_readahead(inode->vfs_inode.i_mapping,
> ra, NULL, entry->start >> PAGE_SHIFT,
> @@ -1500,6 +1527,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
> list_del_init(&entry->list);
> kfree(entry);
> }
> + if (ret >= 0)
> + ctrl->last_scanned = max(ctrl->last_scanned, orig_start + len);
> return ret;
> }
>
> @@ -1645,7 +1674,6 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
> btrfs_inode_unlock(inode, 0);
> if (ret < 0)
> break;
> - ctrl->last_scanned = cluster_end + 1;
> if (ret > 0) {
> ret = 0;
> break;
> --
> 2.35.0
>
next prev parent reply other threads:[~2022-02-04 16:56 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-04 8:11 [PATCH v2 0/5] btrfs: defrag: don't waste CPU time on non-target extent Qu Wenruo
2022-02-04 8:11 ` [PATCH v2 1/5] btrfs: uapi: introduce BTRFS_DEFRAG_RANGE_MASK for later sanity check Qu Wenruo
2022-02-04 16:40 ` Filipe Manana
2022-02-04 8:11 ` [PATCH v2 2/5] btrfs: defrag: introduce btrfs_defrag_ctrl structure for later usage Qu Wenruo
2022-02-04 8:11 ` [PATCH v2 3/5] btrfs: defrag: use btrfs_defrag_ctrl to replace btrfs_ioctl_defrag_range_args for btrfs_defrag_file() Qu Wenruo
2022-02-04 16:45 ` Filipe Manana
2022-02-04 8:11 ` [PATCH v2 4/5] btrfs: defrag: make btrfs_defrag_file() to report accurate number of defragged sectors Qu Wenruo
2022-02-04 8:11 ` [PATCH v2 5/5] btrfs: defrag: allow defrag_one_cluster() to large extent which is not a target Qu Wenruo
2022-02-04 16:56 ` Filipe Manana [this message]
2022-02-04 17:20 ` [PATCH v2 0/5] btrfs: defrag: don't waste CPU time on non-target extent Filipe Manana
2022-02-05 0:10 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yf1azilXKXv/13lf@debian9.Home \
--to=fdmanana@kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).