* [PATCH] xfs: do not tight-pack write large files
From: Damien Le Moal @ 2025-10-13 6:45 UTC (permalink / raw)
To: linux-xfs, Carlos Maiolino; +Cc: Christoph Hellwig, Hans Holmberg

The tight-packing data block allocation, which writes the blocks of closed
files in the same zone, is very efficient at improving write performance
on HDDs by reducing, and even eliminating, disk head seeks. However,
such tight packing does not make sense for large files that require at
least a full realtime block group (i.e. a zone). If tight-packing
placement is applied to such files, the VM writeback thread switching
between inodes results in the large file being fragmented, thus
increasing the garbage collection penalty later when the used realtime
block group/zone needs to be reclaimed.

This problem can be avoided with a simple heuristic: if the size of the
inode being written back is at least equal to the realtime block group
size, do not use tight-packing. Modify xfs_zoned_pack_tight() to always
return false in this case.

With this change, a multi-writer workload writing files of 256 MB on a
file system backed by an SMR HDD with 256 MB zone size sees all files
occupying exactly one zone, thus completely removing the heavy
fragmentation observed without this change.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
fs/xfs/xfs_zone_alloc.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
index 1147bacb2da8..c51788550c7c 100644
--- a/fs/xfs/xfs_zone_alloc.c
+++ b/fs/xfs/xfs_zone_alloc.c
@@ -622,6 +622,17 @@ static inline enum rw_hint xfs_inode_write_hint(struct xfs_inode *ip)
  */
 static inline bool xfs_zoned_pack_tight(struct xfs_inode *ip)
 {
+	struct xfs_mount *mp = ip->i_mount;
+	size_t zone_capacity =
+		XFS_FSB_TO_B(mp, mp->m_groups[XG_TYPE_RTG].blocks);
+
+	/*
+	 * Do not pack tight large files that are already using a full group
+	 * (zone) to avoid fragmentation.
+	 */
+	if (i_size_read(VFS_I(ip)) >= zone_capacity)
+		return false;
+
 	return !inode_is_open_for_write(VFS_I(ip)) &&
 		!(ip->i_diflags & XFS_DIFLAG_APPEND);
 }
--
2.51.0
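
The heuristic described in the commit message can be modelled outside the kernel. The following is a minimal userspace sketch, not the kernel code: struct model_inode, model_fsb_to_bytes() and model_pack_tight() are hypothetical names introduced here for illustration, and the conversion from filesystem blocks to bytes is assumed to be a left shift by the block-size log2 (which is what XFS_FSB_TO_B() is understood to do).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode state the kernel helper inspects. */
struct model_inode {
	uint64_t size_bytes;     /* models i_size_read(VFS_I(ip)) */
	bool     open_for_write; /* models inode_is_open_for_write(VFS_I(ip)) */
	bool     append_only;    /* models ip->i_diflags & XFS_DIFLAG_APPEND */
};

/*
 * Assumption: block-to-byte conversion is a shift by log2(block size),
 * e.g. blocklog = 12 for 4096-byte filesystem blocks.
 */
static uint64_t model_fsb_to_bytes(uint64_t blocks, unsigned int blocklog)
{
	return blocks << blocklog;
}

/*
 * Model of the patched decision: files at least as large as one zone are
 * never tight-packed; smaller files keep the pre-existing rule (closed
 * for writing and not append-only).
 */
static bool model_pack_tight(const struct model_inode *ino,
			     uint64_t zone_capacity_bytes)
{
	if (ino->size_bytes >= zone_capacity_bytes)
		return false;

	return !ino->open_for_write && !ino->append_only;
}
```

Under these assumptions, a group of 65536 blocks of 4096 bytes gives a zone capacity of 256 MiB, so a closed 256 MiB file is no longer tight-packed, while a closed 64 MiB file still is.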
* Re: [PATCH] xfs: do not tight-pack write large files
From: Christoph Hellwig @ 2025-10-13 7:09 UTC (permalink / raw)
To: Damien Le Moal
Cc: linux-xfs, Carlos Maiolino, Christoph Hellwig, Hans Holmberg
On Mon, Oct 13, 2025 at 03:45:12PM +0900, Damien Le Moal wrote:
> The tight-packing data block allocation, which writes the blocks of closed
> files in the same zone, is very efficient at improving write performance
> on HDDs by reducing, and even eliminating, disk head seeks. However,
> such tight packing does not make sense for large files that require at
> least a full realtime block group (i.e. a zone). If tight-packing
> placement is applied to such files, the VM writeback thread switching
> between inodes results in the large file being fragmented, thus
> increasing the garbage collection penalty later when the used realtime
> block group/zone needs to be reclaimed.
>
> This problem can be avoided with a simple heuristic: if the size of the
> inode being written back is at least equal to the realtime block group
> size, do not use tight-packing. Modify xfs_zoned_pack_tight() to always
> return false in this case.
>
> With this change, a multi-writer workload writing files of 256 MB on a
> file system backed by an SMR HDD with 256 MB zone size sees all files
> occupying exactly one zone, thus completely removing the heavy
> fragmentation observed without this change.
>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
> fs/xfs/xfs_zone_alloc.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> index 1147bacb2da8..c51788550c7c 100644
> --- a/fs/xfs/xfs_zone_alloc.c
> +++ b/fs/xfs/xfs_zone_alloc.c
> @@ -622,6 +622,17 @@ static inline enum rw_hint xfs_inode_write_hint(struct xfs_inode *ip)
>   */
>  static inline bool xfs_zoned_pack_tight(struct xfs_inode *ip)
>  {
> +	struct xfs_mount *mp = ip->i_mount;
> +	size_t zone_capacity =
> +		XFS_FSB_TO_B(mp, mp->m_groups[XG_TYPE_RTG].blocks);
> +
> +	/*
> +	 * Do not pack tight large files that are already using a full group

I'm not a native speaker, but shouldn't this be ordered differently:

	Do not pack large files that are already using a full group (zone)
	to avoid fragmentation?

Also, I'd say either zone or RTG, but not mix both names, to avoid confusion.

Otherwise this looks good to me.