public inbox for linux-btrfs@vger.kernel.org
From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: avoid defragging extents whose next extents are not targets
Date: Tue, 15 Mar 2022 11:15:44 +0000
Message-ID: <YjB1YO95Vycuhlzo@debian9.Home>
In-Reply-To: <795e3dee8c4789f845e5e14bfc02c992b86fa2d9.1647306224.git.wqu@suse.com>

On Tue, Mar 15, 2022 at 09:07:52AM +0800, Qu Wenruo wrote:
> [BUG]
> There is a report that autodefrag is defragging a single sector, which
> is a complete waste of IO and does not help with defragging:
> 
>    btrfs-cleaner-808     defrag_one_locked_range: root=256 ino=651122 start=0 len=4096
> 
> [CAUSE]
> In defrag_collect_targets(), we check if the current range (A) can be merged
> with next one (B).
> 
> If mergeable, we will add range A into target for defrag.
> 
> However there is a catch for autodefrag: when checking mergeability against
> range B, we intentionally pass 0 as @newer_than, hoping to get a
> higher chance to merge with the next extent.
> 
> But in the next iteration, range B will be looked up by
> defrag_lookup_extent() with a non-zero @newer_than.
> 
> And if range B is not really newer, it will be rejected directly, causing
> only range A to be defragged, while we expect to defrag both range A and
> B.
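
As a rough illustration of the mismatch described above, here is a toy
user-space model, not the kernel code. The names toy_extent, toy_lookup and
toy_count_targets are hypothetical, and the model only keeps the generation
check plus the "adjacent to the previous target" rule. It assumes extents
are sorted by start offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an extent map; not the kernel's struct extent_map. */
struct toy_extent {
	uint64_t start;
	uint64_t len;
	uint64_t generation;
};

/* Models a lookup that rejects extents older than min_gen. */
static const struct toy_extent *toy_lookup(const struct toy_extent *ext,
					   size_t n, uint64_t offset,
					   uint64_t min_gen)
{
	for (size_t i = 0; i < n; i++) {
		if (offset >= ext[i].start &&
		    offset < ext[i].start + ext[i].len)
			return ext[i].generation >= min_gen ? &ext[i] : NULL;
	}
	return NULL;
}

/*
 * Count how many extents would become defrag targets.
 * next_min_gen is the generation used when checking the next extent:
 * 0 models the behaviour described in [CAUSE], newer_than models the fix.
 */
static size_t toy_count_targets(const struct toy_extent *ext, size_t n,
				uint64_t newer_than, uint64_t next_min_gen)
{
	size_t targets = 0;
	bool have_last = false;
	uint64_t last_end = 0;

	assert(ext != NULL && n > 0);
	for (size_t i = 0; i < n; i++) {
		uint64_t end = ext[i].start + ext[i].len;
		bool next_ok;

		/* The current extent is always looked up with newer_than. */
		if (!toy_lookup(ext, n, ext[i].start, newer_than))
			continue;
		/* The next extent is checked with next_min_gen. */
		next_ok = toy_lookup(ext, n, end, next_min_gen) != NULL;
		/* Without a mergeable next extent, only accept the current
		 * one if it directly follows the previous target. */
		if (!next_ok && !(have_last && last_end == ext[i].start))
			continue;
		targets++;
		have_last = true;
		last_end = end;
	}
	return targets;
}
```

In this model, with range A = {start 0, len 4096, generation 10}, range
B = {start 4096, len 4096, generation 5} and newer_than = 8, passing 0 as
next_min_gen leaves A as a lone 4K target, while passing 8 leaves no target
at all.
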
> 
> [FIX]
> Since the root cause is the difference between the check conditions of
> defrag_check_next_extent() and defrag_collect_targets(), we fix it by:
> 
> 1. Pass @newer_than to defrag_check_next_extent()
> 2. Pass @extent_thresh to defrag_check_next_extent()
> 
> This makes the check between defrag_collect_targets() and
> defrag_check_next_extent() more consistent.
> 
> While there are still some minor differences, the remaining checks
> focus on runtime flags like writeback/delalloc, which are mostly
> transient and safe to be checked only in defrag_collect_targets().
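
The idea of keeping both checks consistent could be sketched as a single
predicate applied to the current and the next extent. This is only a
simplified user-space sketch: sketch_extent, sketch_is_candidate and
sketch_pair_is_target are hypothetical names, and the transient
writeback/delalloc checks are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified extent description (not struct extent_map). */
struct sketch_extent {
	uint64_t len;
	uint64_t generation;
};

/* One predicate shared by the current-extent and next-extent checks. */
static bool sketch_is_candidate(const struct sketch_extent *e,
				uint32_t extent_thresh, uint64_t newer_than)
{
	/* Skip extents older than the requested generation. */
	if (e->generation < newer_than)
		return false;
	/* Skip extents that are already large enough. */
	if (e->len >= extent_thresh)
		return false;
	return true;
}

/* The current extent is only worth defragging if the next one qualifies too. */
static bool sketch_pair_is_target(const struct sketch_extent *cur,
				  const struct sketch_extent *next,
				  uint32_t extent_thresh, uint64_t newer_than)
{
	if (next == NULL)
		return false;
	return sketch_is_candidate(cur, extent_thresh, newer_than) &&
	       sketch_is_candidate(next, extent_thresh, newer_than);
}
```

Because both extents go through the same function, a next extent that would
be rejected on the following iteration also prevents the current extent from
being picked on its own.
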
> 
> Issue: 423#issuecomment-1066981856

Where is the issue exactly? It's the first time I'm seeing an Issue tag
for kernel patches. Is this a github issue? If so, which repo? There are
several repos we use for btrfs:

https://github.com/btrfs/linux
https://github.com/kdave/btrfs-devel
https://github.com/kdave/btrfs-progs
https://github.com/btrfs/fstests
etc

Can't we use a Link tag with a URL? That removes any doubt about where the
issue is and makes it easier to look at.

> Signed-off-by: Qu Wenruo <wqu@suse.com>

Doesn't this miss a Fixes tag and a CC stable tag for 5.16?

This is fixing code added in 5.16, and given that users are reporting
autodefrag causing disruptive amounts of IO, I don't see why it doesn't
have a CC tag for stable.

The change itself looks good. Thanks.

Reviewed-by: Filipe Manana <fdmanana@suse.com>

> ---
>  fs/btrfs/ioctl.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 3d3d6e2f110a..7d7520a2e281 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1189,7 +1189,7 @@ static u32 get_extent_max_capacity(const struct extent_map *em)
>  }
>  
>  static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> -				     bool locked)
> +				     u32 extent_thresh, u64 newer_than, bool locked)
>  {
>  	struct extent_map *next;
>  	bool ret = false;
> @@ -1199,11 +1199,13 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>  		return false;
>  
>  	/*
> -	 * We want to check if the next extent can be merged with the current
> -	 * one, which can be an extent created in a past generation, so we pass
> -	 * a minimum generation of 0 to defrag_lookup_extent().
> +	 * Here we need to pass @newer_then when checking the next extent, or
> +	 * we will hit a case we mark current extent for defrag, but the next
> +	 * one will not be a target.
> +	 * This will just cause extra IO without really reducing the fragments.
>  	 */
> -	next = defrag_lookup_extent(inode, em->start + em->len, 0, locked);
> +	next = defrag_lookup_extent(inode, em->start + em->len, newer_than,
> +				    locked);
>  	/* No more em or hole */
>  	if (!next || next->block_start >= EXTENT_MAP_LAST_BYTE)
>  		goto out;
> @@ -1215,6 +1217,13 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>  	 */
>  	if (next->len >= get_extent_max_capacity(em))
>  		goto out;
> +	/* Skip older extent */
> +	if (next->generation < newer_than)
> +		goto out;
> +	/* Also check extent size */
> +	if (next->len >= extent_thresh)
> +		goto out;
> +
>  	ret = true;
>  out:
>  	free_extent_map(next);
> @@ -1420,7 +1429,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>  			goto next;
>  
>  		next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
> -							  locked);
> +						extent_thresh, newer_than, locked);
>  		if (!next_mergeable) {
>  			struct defrag_target_range *last;
>  
> -- 
> 2.35.1
> 

Thread overview: 3+ messages
2022-03-15  1:07 [PATCH] btrfs: avoid defragging extents whose next extents are not targets Qu Wenruo
2022-03-15 11:15 ` Filipe Manana [this message]
2022-03-15 11:19   ` Qu Wenruo
