linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 5/5] btrfs: defrag: allow defrag_one_cluster() to large extent which is not a target
Date: Fri, 4 Feb 2022 16:56:46 +0000	[thread overview]
Message-ID: <Yf1azilXKXv/13lf@debian9.Home> (raw)
In-Reply-To: <6aa51c24be3675deba0c696198d2c64e6113a11f.1643961719.git.wqu@suse.com>

On Fri, Feb 04, 2022 at 04:11:59PM +0800, Qu Wenruo wrote:
> In the rework of btrfs_defrag_file(), we always call
> defrag_one_cluster() and increase the offset by cluster size, which is
> only 256K.
> 
> But there are cases where we have a large extent (e.g. 128M) which
> doesn't need to be defragged at all.
> 
> Before the refactor, we can directly skip the range, but now we have to
> scan that extent map again and again until the cluster moves after the
> non-target extent.
> 
> Fix the problem by allow defrag_one_cluster() to increase
> btrfs_defrag_ctrl::last_scanned to the end of an extent, if and only if
> the last extent of the cluster is not a target.
> 
> The test script looks like this:
> 
> 	mkfs.btrfs -f $dev > /dev/null
> 
> 	mount $dev $mnt
> 
> 	# As btrfs ioctl uses 32M as extent_threshold
> 	xfs_io -f -c "pwrite 0 64M" $mnt/file1
> 	sync
> 	# Some fragemented range to defrag
> 	xfs_io -s -c "pwrite 65548k 4k" \
> 		  -c "pwrite 65544k 4k" \
> 		  -c "pwrite 65540k 4k" \
> 		  -c "pwrite 65536k 4k" \
> 		  $mnt/file1
> 	sync
> 
> 	echo "=== before ==="
> 	xfs_io -c "fiemap -v" $mnt/file1
> 	echo "=== after ==="
> 	btrfs fi defrag $mnt/file1
> 	sync
> 	xfs_io -c "fiemap -v" $mnt/file1
> 	umount $mnt
> 
> With extra ftrace put into defrag_one_cluster(), before the patch it
> would result tons of loops:
> 
> (As defrag_one_cluster() is inlined, the function name is its caller)
> 
>   btrfs-126062  [005] .....  4682.816026: btrfs_defrag_file: r/i=5/257 start=0 len=262144
>   btrfs-126062  [005] .....  4682.816027: btrfs_defrag_file: r/i=5/257 start=262144 len=262144
>   btrfs-126062  [005] .....  4682.816028: btrfs_defrag_file: r/i=5/257 start=524288 len=262144
>   btrfs-126062  [005] .....  4682.816028: btrfs_defrag_file: r/i=5/257 start=786432 len=262144
>   btrfs-126062  [005] .....  4682.816028: btrfs_defrag_file: r/i=5/257 start=1048576 len=262144
>   ...
>   btrfs-126062  [005] .....  4682.816043: btrfs_defrag_file: r/i=5/257 start=67108864 len=262144
> 
> But with this patch there will be just one loop, then directly to the
> end of the extent:
> 
>   btrfs-130471  [014] .....  5434.029558: defrag_one_cluster: r/i=5/257 start=0 len=262144
>   btrfs-130471  [014] .....  5434.029559: defrag_one_cluster: r/i=5/257 start=67108864 len=16384
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/ioctl.c | 40 ++++++++++++++++++++++++++++++++++------
>  1 file changed, 34 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 567a662df118..999173d0925b 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1185,10 +1185,11 @@ struct defrag_target_range {
>   */
>  static int defrag_collect_targets(struct btrfs_inode *inode,
>  				  const struct btrfs_defrag_ctrl *ctrl,
> -				  u64 start, u32 len, bool locked,
> -				  struct list_head *target_list)
> +				  u64 start, u32 len, u64 *last_scanned_ret,
> +				  bool locked, struct list_head *target_list)
>  {
>  	bool do_compress = ctrl->flags & BTRFS_DEFRAG_RANGE_COMPRESS;
> +	bool last_is_target = false;
>  	u64 cur = start;
>  	int ret = 0;
>  
> @@ -1198,6 +1199,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>  		bool next_mergeable = true;
>  		u64 range_len;
>  
> +		last_is_target = false;
>  		em = defrag_lookup_extent(&inode->vfs_inode, cur, locked);
>  		if (!em)
>  			break;
> @@ -1269,6 +1271,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>  		}
>  
>  add:
> +		last_is_target = true;
>  		range_len = min(extent_map_end(em), start + len) - cur;
>  		/*
>  		 * This one is a good target, check if it can be merged into
> @@ -1312,6 +1315,17 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>  			kfree(entry);
>  		}
>  	}
> +	if (!ret && last_scanned_ret) {
> +		/*
> +		 * If the last extent is not a target, the caller can skip to
> +		 * the end of that extent.
> +		 * Otherwise, we can only go the end of the spcified range.
> +		 */
> +		if (!last_is_target)
> +			*last_scanned_ret = cur;
> +		else
> +			*last_scanned_ret = start + len;
> +	}
>  	return ret;
>  }
>  
> @@ -1382,6 +1396,7 @@ static int defrag_one_range(struct btrfs_inode *inode,
>  	const u32 sectorsize = inode->root->fs_info->sectorsize;
>  	u64 last_index = (start + len - 1) >> PAGE_SHIFT;
>  	u64 start_index = start >> PAGE_SHIFT;
> +	u64 last_scanned;
>  	unsigned int nr_pages = last_index - start_index + 1;
>  	int ret = 0;
>  	int i;
> @@ -1416,8 +1431,8 @@ static int defrag_one_range(struct btrfs_inode *inode,
>  	 * And this time we have extent locked already, pass @locked = true
>  	 * so that we won't relock the extent range and cause deadlock.
>  	 */
> -	ret = defrag_collect_targets(inode, ctrl, start, len, true,
> -				     &target_list);
> +	ret = defrag_collect_targets(inode, ctrl, start, len, &last_scanned,
> +				     true, &target_list);
>  	if (ret < 0)
>  		goto unlock_extent;
>  
> @@ -1434,6 +1449,8 @@ static int defrag_one_range(struct btrfs_inode *inode,
>  		list_del_init(&entry->list);
>  		kfree(entry);
>  	}
> +	if (!ret)
> +		ctrl->last_scanned = last_scanned;
>  unlock_extent:
>  	unlock_extent_cached(&inode->io_tree, start_index << PAGE_SHIFT,
>  			     (last_index << PAGE_SHIFT) + PAGE_SIZE - 1,
> @@ -1461,13 +1478,14 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
>  			      struct btrfs_defrag_ctrl *ctrl, u32 len)
>  {
>  	const u32 sectorsize = inode->root->fs_info->sectorsize;
> +	u64 orig_start = ctrl->last_scanned;

Can be made const.
It helps to read the code and avoid regressions where we update the
wrong variable by mistake - the const makes it easy to trigger.

Just a minor comment like in the previous patches. I won't make you send
another version just because of that.

I'll reply to cover letter with the review tag.

Thanks.

>  	struct defrag_target_range *entry;
>  	struct defrag_target_range *tmp;
>  	LIST_HEAD(target_list);
>  	int ret;
>  
>  	ret = defrag_collect_targets(inode, ctrl, ctrl->last_scanned, len,
> -				     false, &target_list);
> +				     NULL, false, &target_list);
>  	if (ret < 0)
>  		goto out;
>  
> @@ -1486,6 +1504,15 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
>  					  (ctrl->max_sectors_to_defrag -
>  					   ctrl->sectors_defragged) * sectorsize);
>  
> +		/*
> +		 * If defrag_one_range() has updated ctrl::last_scanned,
> +		 * our range may already be invalid (e.g. hole punched).
> +		 * Skip if our range is before ctrl::last_scanned, as there is
> +		 * no need to defrag the range anymore.
> +		 */
> +		if (entry->start + range_len <= ctrl->last_scanned)
> +			continue;
> +
>  		if (ra)
>  			page_cache_sync_readahead(inode->vfs_inode.i_mapping,
>  				ra, NULL, entry->start >> PAGE_SHIFT,
> @@ -1500,6 +1527,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
>  		list_del_init(&entry->list);
>  		kfree(entry);
>  	}
> +	if (ret >= 0)
> +		ctrl->last_scanned = max(ctrl->last_scanned, orig_start + len);
>  	return ret;
>  }
>  
> @@ -1645,7 +1674,6 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
>  		btrfs_inode_unlock(inode, 0);
>  		if (ret < 0)
>  			break;
> -		ctrl->last_scanned = cluster_end + 1;
>  		if (ret > 0) {
>  			ret = 0;
>  			break;
> -- 
> 2.35.0
> 

  reply	other threads:[~2022-02-04 16:56 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-04  8:11 [PATCH v2 0/5] btrfs: defrag: don't waste CPU time on non-target extent Qu Wenruo
2022-02-04  8:11 ` [PATCH v2 1/5] btrfs: uapi: introduce BTRFS_DEFRAG_RANGE_MASK for later sanity check Qu Wenruo
2022-02-04 16:40   ` Filipe Manana
2022-02-04  8:11 ` [PATCH v2 2/5] btrfs: defrag: introduce btrfs_defrag_ctrl structure for later usage Qu Wenruo
2022-02-04  8:11 ` [PATCH v2 3/5] btrfs: defrag: use btrfs_defrag_ctrl to replace btrfs_ioctl_defrag_range_args for btrfs_defrag_file() Qu Wenruo
2022-02-04 16:45   ` Filipe Manana
2022-02-04  8:11 ` [PATCH v2 4/5] btrfs: defrag: make btrfs_defrag_file() to report accurate number of defragged sectors Qu Wenruo
2022-02-04  8:11 ` [PATCH v2 5/5] btrfs: defrag: allow defrag_one_cluster() to large extent which is not a target Qu Wenruo
2022-02-04 16:56   ` Filipe Manana [this message]
2022-02-04 17:20 ` [PATCH v2 0/5] btrfs: defrag: don't waste CPU time on non-target extent Filipe Manana
2022-02-05  0:10   ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yf1azilXKXv/13lf@debian9.Home \
    --to=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).