linux-xfs.vger.kernel.org archive mirror
* [PATCH v4] xfs: do not tightly pack-write large files
From: Damien Le Moal @ 2025-10-14  4:19 UTC
  To: linux-xfs, Carlos Maiolino; +Cc: Christoph Hellwig, Hans Holmberg

When using a zoned realtime device, tightly packing data blocks
belonging to multiple closed files into the same realtime group (RTG)
is very efficient at improving write performance. This is especially
true with SMR HDDs as this can reduce, and even suppress, disk head
seeks.

However, such tight packing does not make sense for large files that
require at least a full RTG. If tight packing is applied to such files,
the VM writeback thread switching between inodes results in the large
files being fragmented, thus increasing the garbage collection penalty
later when the RTG needs to be reclaimed.

This problem can be avoided with a simple heuristic: if the size of the
inode being written back is at least equal to the RTG size, do not use
tight-packing. Modify xfs_zoned_pack_tight() to always return false in
this case.
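
The decision described above can be modeled outside the kernel. What
follows is a minimal user-space sketch, not the kernel code: struct
model_inode, its fields and model_pack_tight() are hypothetical
stand-ins for the inode state, inode_is_open_for_write() and
XFS_DIFLAG_APPEND, and the zone capacity is assumed to be passed in as
a byte count rather than derived from the mount.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode state the heuristic looks at. */
struct model_inode {
	uint64_t size_bytes;	/* current file size in bytes */
	bool open_for_write;	/* a writer still holds the file open */
	bool append_only;	/* models the XFS_DIFLAG_APPEND flag */
};

/*
 * Return true if the file is a candidate for tight packing.
 * Files at least as large as one zone are never packed, so that they
 * are not interleaved with other files when writeback switches inodes.
 */
static bool model_pack_tight(const struct model_inode *ino,
			     uint64_t zone_capacity_bytes)
{
	if (ino->size_bytes >= zone_capacity_bytes)
		return false;

	/* Only closed, non-append-only files are packed. */
	return !ino->open_for_write && !ino->append_only;
}
```

With a zone capacity of 256 MB, a closed 256 MB file yields false and
is therefore not packed with other files, while a closed 4 KB file
yields true and remains a candidate for tight packing.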

With this change, a multi-writer workload writing files of 256 MB on a
file system backed by an SMR HDD with 256 MB zone size as a realtime
device sees all files occupying exactly one RTG (i.e. one device zone),
thus completely removing the heavy fragmentation observed without this
change.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
Changes from v1:
 - Improved commit message
 - Improved code comments
Changes from v2:
 - Fixed typos in the commit message
Changes from v3:
 - Changed code comment as suggested by Christoph.

 fs/xfs/xfs_zone_alloc.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
index 1147bacb2da8..1b462cd5d8fa 100644
--- a/fs/xfs/xfs_zone_alloc.c
+++ b/fs/xfs/xfs_zone_alloc.c
@@ -614,14 +614,25 @@ static inline enum rw_hint xfs_inode_write_hint(struct xfs_inode *ip)
 }
 
 /*
- * Try to pack inodes that are written back after they were closed tight instead
- * of trying to open new zones for them or spread them to the least recently
- * used zone.  This optimizes the data layout for workloads that untar or copy
- * a lot of small files.  Right now this does not separate multiple such
+ * Try to tightly pack small files that are written back after they were closed
+ * instead of trying to open new zones for them or spread them to the least
+ * recently used zone. This optimizes the data layout for workloads that untar
+ * or copy a lot of small files. Right now this does not separate multiple such
  * streams.
  */
 static inline bool xfs_zoned_pack_tight(struct xfs_inode *ip)
 {
+	struct xfs_mount *mp = ip->i_mount;
+	size_t zone_capacity =
+		XFS_FSB_TO_B(mp, mp->m_groups[XG_TYPE_RTG].blocks);
+
+	/*
+	 * Do not pack write files that are already using a full zone to avoid
+	 * fragmentation.
+	 */
+	if (i_size_read(VFS_I(ip)) >= zone_capacity)
+		return false;
+
 	return !inode_is_open_for_write(VFS_I(ip)) &&
 		!(ip->i_diflags & XFS_DIFLAG_APPEND);
 }
-- 
2.51.0



* Re: [PATCH v4] xfs: do not tightly pack-write large files
From: Christoph Hellwig @ 2025-10-14  4:25 UTC
  To: Damien Le Moal
  Cc: linux-xfs, Carlos Maiolino, Christoph Hellwig, Hans Holmberg

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH v4] xfs: do not tightly pack-write large files
From: Darrick J. Wong @ 2025-10-14 21:04 UTC
  To: Damien Le Moal
  Cc: linux-xfs, Carlos Maiolino, Christoph Hellwig, Hans Holmberg

On Tue, Oct 14, 2025 at 01:19:45PM +0900, Damien Le Moal wrote:
> When using a zoned realtime device, tightly packing data blocks
> belonging to multiple closed files into the same realtime group (RTG)
> is very efficient at improving write performance. This is especially
> true with SMR HDDs as this can reduce, and even suppress, disk head
> seeks.
> 
> However, such tight packing does not make sense for large files that
> require at least a full RTG. If tight packing is applied to such files,
> the VM writeback thread switching between inodes results in the large
> files being fragmented, thus increasing the garbage collection penalty
> later when the RTG needs to be reclaimed.
> 
> This problem can be avoided with a simple heuristic: if the size of the
> inode being written back is at least equal to the RTG size, do not use
> tight-packing. Modify xfs_zoned_pack_tight() to always return false in
> this case.
> 
> With this change, a multi-writer workload writing files of 256 MB on a
> file system backed by an SMR HDD with 256 MB zone size as a realtime
> device sees all files occupying exactly one RTG (i.e. one device zone),
> thus completely removing the heavy fragmentation observed without this
> change.
> 
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>

Seems reasonable to me, it's like tail packing of the old days.
Only now the blocks are 256M, like mkp says. ;)

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D



* Re: [PATCH v4] xfs: do not tightly pack-write large files
From: Carlos Maiolino @ 2025-10-21  9:37 UTC
  To: linux-xfs, Damien Le Moal; +Cc: Christoph Hellwig, Hans Holmberg

On Tue, 14 Oct 2025 13:19:45 +0900, Damien Le Moal wrote:
> When using a zoned realtime device, tightly packing data blocks
> belonging to multiple closed files into the same realtime group (RTG)
> is very efficient at improving write performance. This is especially
> true with SMR HDDs as this can reduce, and even suppress, disk head
> seeks.
> 
> However, such tight packing does not make sense for large files that
> require at least a full RTG. If tight packing is applied to such files,
> the VM writeback thread switching between inodes results in the large
> files being fragmented, thus increasing the garbage collection penalty
> later when the RTG needs to be reclaimed.
> 
> [...]

Applied to for-next, thanks!

[1/1] xfs: do not tightly pack-write large files
      commit: b00bcb190eef35ae4da3c424b8a72f287e69f650

Best regards,
-- 
Carlos Maiolino <cem@kernel.org>


