Linux XFS filesystem development
From: "Darrick J. Wong" <djwong@kernel.org>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: linux-xfs@vger.kernel.org, Carlos Maiolino <cem@kernel.org>,
	Christoph Hellwig <hch@lst.de>,
	Hans Holmberg <hans.holmberg@wdc.com>
Subject: Re: [PATCH v4] xfs: do not tightly pack-write large files
Date: Tue, 14 Oct 2025 14:04:06 -0700
Message-ID: <20251014210406.GD6188@frogsfrogsfrogs>
In-Reply-To: <20251014041945.760013-1-dlemoal@kernel.org>

On Tue, Oct 14, 2025 at 01:19:45PM +0900, Damien Le Moal wrote:
> When using a zoned realtime device, tightly packing data blocks
> belonging to multiple closed files into the same realtime group (RTG)
> is very effective at improving write performance. This is especially
> true with SMR HDDs, as this can reduce, and even eliminate, disk head
> seeks.
> 
> However, such tight packing does not make sense for large files that
> require at least a full RTG. If tight packing placement is applied to
> such files, the VM writeback thread switching between inodes causes
> the large files to become fragmented, thus increasing the garbage
> collection penalty later when the RTG needs to be reclaimed.
> 
> This problem can be avoided with a simple heuristic: if the size of the
> inode being written back is at least equal to the RTG size, do not use
> tight-packing. Modify xfs_zoned_pack_tight() to always return false in
> this case.
> 
> With this change, a multi-writer workload writing files of 256 MB on a
> file system backed by an SMR HDD with 256 MB zone size as a realtime
> device sees all files occupying exactly one RTG (i.e. one device zone),
> thus completely removing the heavy fragmentation observed without this
> change.
> 
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>

Seems reasonable to me, it's like tail packing of the old days.
Only now the blocks are 256M, like mkp says. ;)

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
> Changes from v1:
>  - Improved commit message
>  - Improved code comments
> Changes from v2:
>  - Fixed typos in the commit message
> Changes from v3:
>  - Changed code comment as suggested by Christoph.
> 
>  fs/xfs/xfs_zone_alloc.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> index 1147bacb2da8..1b462cd5d8fa 100644
> --- a/fs/xfs/xfs_zone_alloc.c
> +++ b/fs/xfs/xfs_zone_alloc.c
> @@ -614,14 +614,25 @@ static inline enum rw_hint xfs_inode_write_hint(struct xfs_inode *ip)
>  }
>  
>  /*
> - * Try to pack inodes that are written back after they were closed tight instead
> - * of trying to open new zones for them or spread them to the least recently
> - * used zone.  This optimizes the data layout for workloads that untar or copy
> - * a lot of small files.  Right now this does not separate multiple such
> + * Try to tightly pack small files that are written back after they were closed
> + * instead of trying to open new zones for them or spread them to the least
> + * recently used zone. This optimizes the data layout for workloads that untar
> + * or copy a lot of small files. Right now this does not separate multiple such
>   * streams.
>   */
>  static inline bool xfs_zoned_pack_tight(struct xfs_inode *ip)
>  {
> +	struct xfs_mount *mp = ip->i_mount;
> +	size_t zone_capacity =
> +		XFS_FSB_TO_B(mp, mp->m_groups[XG_TYPE_RTG].blocks);
> +
> +	/*
> +	 * Do not pack write files that are already using a full zone to avoid
> +	 * fragmentation.
> +	 */
> +	if (i_size_read(VFS_I(ip)) >= zone_capacity)
> +		return false;
> +
>  	return !inode_is_open_for_write(VFS_I(ip)) &&
>  		!(ip->i_diflags & XFS_DIFLAG_APPEND);
>  }
> -- 
> 2.51.0
> 
> 
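
For anyone following along outside the kernel tree, the decision made
by xfs_zoned_pack_tight() after this patch can be modeled in userspace
roughly as below. This is only a sketch under stated assumptions:
struct model_inode, its fields, and model_pack_tight() are hypothetical
stand-ins for the inode state, i_size_read(),
inode_is_open_for_write() and XFS_DIFLAG_APPEND used in the patch, and
the zone capacity is passed in by the caller instead of being derived
from the mount.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the inode state that the
 * heuristic inspects. None of these names exist in the kernel.
 */
struct model_inode {
	uint64_t size_bytes;     /* models i_size_read(VFS_I(ip)) */
	bool     open_for_write; /* models inode_is_open_for_write() */
	bool     append_only;    /* models ip->i_diflags & XFS_DIFLAG_APPEND */
};

/*
 * Return true if the file is a candidate for tight packing.
 * zone_capacity is the RTG size in bytes, supplied by the caller.
 */
static bool model_pack_tight(const struct model_inode *ip,
			     uint64_t zone_capacity)
{
	/* Files that fill at least one whole zone are never packed. */
	if (ip->size_bytes >= zone_capacity)
		return false;

	/* Otherwise, pack only closed files that are not append-only. */
	return !ip->open_for_write && !ip->append_only;
}
```

With a 256 MB zone capacity, a closed 256 MB file is rejected by the
size check, while a closed 4 KB file without the append flag is still
packed tightly, which matches the behavior described in the commit
message.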

Thread overview: 4+ messages
2025-10-14  4:19 [PATCH v4] xfs: do not tightly pack-write large files Damien Le Moal
2025-10-14  4:25 ` Christoph Hellwig
2025-10-14 21:04 ` Darrick J. Wong [this message]
2025-10-21  9:37 ` Carlos Maiolino