public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 2/2] btrfs: defrag: don't use merged extent map for their generation check
Date: Wed, 9 Feb 2022 15:46:20 +0000	[thread overview]
Message-ID: <YgPhzFA3230j2mlf@debian9.Home> (raw)
In-Reply-To: <fe880742bbee1e1c219c7b468300724a3336107d.1644301903.git.wqu@suse.com>

On Tue, Feb 08, 2022 at 02:36:31PM +0800, Qu Wenruo wrote:
> For extent maps, if they are not compressed extents and are adjacent by
> logical addresses and file offsets, they can be merged into one larger
> extent map.
> 
> Such merged extent map will have the higher generation of all the
> original ones.
> 
> This behavior won't affect fsync code, as only extents read from disks
> can be merged.

That is not true. An extent created by a write can be merged after a
fsync runs and logs the extent. So yes, extent maps created by writes
can be merged, but only after an fsync.

> 
> But this brings a problem for autodefrag, as it relies on accurate
> extent_map::generation to determine if one extent should be defragged.
> 
> For merged extent maps, their higher generation can mark some older
> extents to be defragged while the original extent map doesn't meet the
> minimal generation threshold.
> 
> Thus this will cause extra IO.
> 
> So solve the problem, here we introduce a new flag, EXTENT_FLAG_MERGED, to
> indicate if the extent map is merged from one or more ems.
> 
> And for autodefrag, if we find a merged extent map, and its generation
> meets the generation requirement, we just don't use this one, and go
> back to defrag_get_extent() to read em from disk.

Instead of saying from disk, I would say from the subvolume btree. That
may or may not require reading a leaf, and any nodes in the path from the
root to the leaf, from disk.

> 
> This could cause more read IO, but should result less defrag data write,
> so in the long run it should be a win for autodefrag.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/extent_map.c |  2 ++
>  fs/btrfs/extent_map.h |  8 ++++++++
>  fs/btrfs/ioctl.c      | 14 ++++++++++++++
>  3 files changed, 24 insertions(+)
> 
> diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
> index 5a36add21305..c28ceddefae4 100644
> --- a/fs/btrfs/extent_map.c
> +++ b/fs/btrfs/extent_map.c
> @@ -261,6 +261,7 @@ static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em)
>  			em->mod_len = (em->mod_len + em->mod_start) - merge->mod_start;
>  			em->mod_start = merge->mod_start;
>  			em->generation = max(em->generation, merge->generation);
> +			set_bit(EXTENT_FLAG_MERGED, &em->flags);
>  
>  			rb_erase_cached(&merge->rb_node, &tree->map);
>  			RB_CLEAR_NODE(&merge->rb_node);
> @@ -278,6 +279,7 @@ static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em)
>  		RB_CLEAR_NODE(&merge->rb_node);
>  		em->mod_len = (merge->mod_start + merge->mod_len) - em->mod_start;
>  		em->generation = max(em->generation, merge->generation);
> +		set_bit(EXTENT_FLAG_MERGED, &em->flags);
>  		free_extent_map(merge);
>  	}
>  }
> diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
> index 8e217337dff9..d2fa32ffe304 100644
> --- a/fs/btrfs/extent_map.h
> +++ b/fs/btrfs/extent_map.h
> @@ -25,6 +25,8 @@ enum {
>  	EXTENT_FLAG_FILLING,
>  	/* filesystem extent mapping type */
>  	EXTENT_FLAG_FS_MAPPING,
> +	/* This em is merged from two or more physically adjacent ems */
> +	EXTENT_FLAG_MERGED,
>  };
>  
>  struct extent_map {
> @@ -40,6 +42,12 @@ struct extent_map {
>  	u64 ram_bytes;
>  	u64 block_start;
>  	u64 block_len;
> +
> +	/*
> +	 * Generation of the extent map, for merged em it's the highest
> +	 * generation of all merged ems.
> +	 * For non-merged extents, it's from btrfs_file_extent_item::generation.

Missing the (u64)-1 special case, a temporary value set when writeback starts
and changed when the ordered extent completes.

> +	 */
>  	u64 generation;
>  	unsigned long flags;
>  	/* Used for chunk mappings, flag EXTENT_FLAG_FS_MAPPING must be set */
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index fb4c25e5ff5f..3a5ada561298 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1169,6 +1169,20 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
>  	em = lookup_extent_mapping(em_tree, start, sectorsize);
>  	read_unlock(&em_tree->lock);
>  
> +	/*
> +	 * We can get a merged extent, in that case, we need to re-search
> +	 * tree to get the original em for defrag.
> +	 *
> +	 * If @newer_than is 0 or em::geneartion < newer_than, we can trust

em::geneartion -> em::generation

> +	 * this em, as either we don't care about the geneartion, or the

geneartion, -> generation

The rest looks fine.
Thanks.

> +	 * merged extent map will be rejected anyway.
> +	 */
> +	if (em && test_bit(EXTENT_FLAG_MERGED, &em->flags) &&
> +	    newer_than && em->generation >= newer_than) {
> +		free_extent_map(em);
> +		em = NULL;
> +	}
> +
>  	if (!em) {
>  		struct extent_state *cached = NULL;
>  		u64 end = start + sectorsize - 1;
> -- 
> 2.35.0
> 

  reply	other threads:[~2022-02-09 15:46 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-08  6:36 [PATCH v2 0/2] btrfs: defrag: bring back the old file extent search behavior and address merged extent map generation problem Qu Wenruo
2022-02-08  6:36 ` [PATCH v2 1/2] btrfs: defrag: bring back the old file extent search behavior Qu Wenruo
2022-02-09 15:35   ` Filipe Manana
2022-02-08  6:36 ` [PATCH v2 2/2] btrfs: defrag: don't use merged extent map for their generation check Qu Wenruo
2022-02-09 15:46   ` Filipe Manana [this message]
2022-02-10  0:40     ` Qu Wenruo
2022-02-10 15:49       ` Filipe Manana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YgPhzFA3230j2mlf@debian9.Home \
    --to=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox