From: "Darrick J. Wong" <djwong@kernel.org>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: linux-xfs@vger.kernel.org, Carlos Maiolino <cem@kernel.org>,
Christoph Hellwig <hch@lst.de>,
Hans Holmberg <hans.holmberg@wdc.com>
Subject: Re: [PATCH v4] xfs: do not tightly pack-write large files
Date: Tue, 14 Oct 2025 14:04:06 -0700
Message-ID: <20251014210406.GD6188@frogsfrogsfrogs>
In-Reply-To: <20251014041945.760013-1-dlemoal@kernel.org>

On Tue, Oct 14, 2025 at 01:19:45PM +0900, Damien Le Moal wrote:
> When using a zoned realtime device, tight packing of data blocks
> belonging to multiple closed files into the same realtime group (RTG)
> is very efficient at improving write performance. This is especially
> true with SMR HDDs as this can reduce, and even suppress, disk head
> seeks.
>
> However, such tight packing does not make sense for large files that
> require at least a full RTG. If tight packing placement is applied to
> such files, the VM writeback thread switching between inodes results in
> the large files being fragmented, thus increasing the garbage collection
> penalty later when the RTG needs to be reclaimed.
>
> This problem can be avoided with a simple heuristic: if the size of the
> inode being written back is at least equal to the RTG size, do not use
> tight-packing. Modify xfs_zoned_pack_tight() to always return false in
> this case.
>
> With this change, a multi-writer workload writing files of 256 MB on a
> file system backed by an SMR HDD with 256 MB zone size as a realtime
> device sees all files occupying exactly one RTG (i.e. one device zone),
> thus completely removing the heavy fragmentation observed without this
> change.
>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>

Seems reasonable to me, it's like tail packing of the old days.
Only now the blocks are 256M, like mkp says. ;)

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
> Changes from v1:
> - Improved commit message
> - Improved code comments
> Changes from v2:
> - Fixed typos in the commit message
> Changes from v3:
> - Changed code comment as suggested by Christoph.
>
> fs/xfs/xfs_zone_alloc.c | 19 +++++++++++++++----
> 1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c
> index 1147bacb2da8..1b462cd5d8fa 100644
> --- a/fs/xfs/xfs_zone_alloc.c
> +++ b/fs/xfs/xfs_zone_alloc.c
> @@ -614,14 +614,25 @@ static inline enum rw_hint xfs_inode_write_hint(struct xfs_inode *ip)
>  }
>
>  /*
> - * Try to pack inodes that are written back after they were closed tight instead
> - * of trying to open new zones for them or spread them to the least recently
> - * used zone. This optimizes the data layout for workloads that untar or copy
> - * a lot of small files. Right now this does not separate multiple such
> + * Try to tightly pack small files that are written back after they were closed
> + * instead of trying to open new zones for them or spread them to the least
> + * recently used zone. This optimizes the data layout for workloads that untar
> + * or copy a lot of small files. Right now this does not separate multiple such
>   * streams.
>   */
>  static inline bool xfs_zoned_pack_tight(struct xfs_inode *ip)
>  {
> +	struct xfs_mount *mp = ip->i_mount;
> +	size_t zone_capacity =
> +		XFS_FSB_TO_B(mp, mp->m_groups[XG_TYPE_RTG].blocks);
> +
> +	/*
> +	 * Do not pack write files that are already using a full zone to avoid
> +	 * fragmentation.
> +	 */
> +	if (i_size_read(VFS_I(ip)) >= zone_capacity)
> +		return false;
> +
>  	return !inode_is_open_for_write(VFS_I(ip)) &&
>  		!(ip->i_diflags & XFS_DIFLAG_APPEND);
>  }
> --
> 2.51.0
>
>
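For anyone who wants to poke at the heuristic outside the kernel, here is a
rough userspace model of the decision the patch makes. It is only a sketch,
not the kernel code: struct model_inode, model_zone_capacity() and
model_pack_tight() are hypothetical names, and the 4 KiB block size used in
the note below is an assumption picked so that 65536 blocks add up to the
256 MB zone mentioned in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode state that the heuristic looks at. */
struct model_inode {
	uint64_t size_bytes;     /* models i_size_read(VFS_I(ip)) */
	bool     open_for_write; /* models inode_is_open_for_write() */
	bool     append_only;    /* models ip->i_diflags & XFS_DIFLAG_APPEND */
};

/*
 * Zone capacity in bytes: blocks per realtime group times the block size.
 * Models XFS_FSB_TO_B(mp, mp->m_groups[XG_TYPE_RTG].blocks).
 */
static uint64_t model_zone_capacity(uint64_t rtg_blocks, uint32_t block_size)
{
	assert(block_size != 0);
	return rtg_blocks * (uint64_t)block_size;
}

/* Mirrors the patched decision: never pack tight once a file fills a zone. */
static bool model_pack_tight(const struct model_inode *ip,
			     uint64_t zone_capacity)
{
	if (ip->size_bytes >= zone_capacity)
		return false;

	return !ip->open_for_write && !ip->append_only;
}
```

With an assumed 4 KiB block size and 65536 blocks per RTG, the capacity is
256 MiB, so a closed 1 MiB file is still packed tight, while a closed file of
exactly 256 MiB is not and can end up with a zone to itself.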
Thread overview: 4+ messages
2025-10-14  4:19 [PATCH v4] xfs: do not tightly pack-write large files Damien Le Moal
2025-10-14  4:25 ` Christoph Hellwig
2025-10-14 21:04 ` Darrick J. Wong [this message]
2025-10-21  9:37 ` Carlos Maiolino