From: "Darrick J. Wong" <djwong@kernel.org>
To: Wengang Wang <wen.gang.wang@oracle.com>
Cc: "linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH] xfs: dont remain new blocks in cowfork for unshare
Date: Fri, 31 May 2024 09:00:53 -0700 [thread overview]
Message-ID: <20240531160053.GQ52987@frogsfrogsfrogs> (raw)
In-Reply-To: <DA28C74B-7514-48E2-BC86-DA9A9824CDA5@oracle.com>
On Mon, May 20, 2024 at 10:42:02PM +0000, Wengang Wang wrote:
> Thanks Darrick for review, pls see inlines:
>
> > On May 20, 2024, at 11:08 AM, Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Fri, May 17, 2024 at 02:26:21PM -0700, Wengang Wang wrote:
> >> Unsharing blocks is implemented by doing CoW to those blocks. That has a side
> >> effect that some new allocatd blocks remain in inode Cow fork. As unsharing blocks
> >
> > allocated
> >
> >> has no hint that future writes would like come to the blocks that follow the
> >> unshared ones, the extra blocks in Cow fork is meaningless.
> >>
> >> This patch makes that no new blocks caused by unshare remain in Cow fork.
> >> The change in xfs_get_extsz_hint() makes the new blocks have more change to be
> >> contigurous in unshare path when there are multiple extents to unshare.
> >
> > contiguous
> >
> Sorry for typos.
>
> > Aha, so you're trying to combat fragmentation by making unshare use
> > delayed allocation so that we try to allocate one big extent all at once
> > instead of doing this piece by piece. Or maybe you also don't want
> > unshare to preallocate cow extents beyond the range requested?
> >
>
> Yes, The main purpose is for the later (avoid preallocating beyond).
But the user set an extent size hint, so presumably they want us to (try
to) obey that even for unshare operations, right?
> The patch also makes unshare use delayed allocation for bigger extent.
If there's a good reason for not trying, can we avoid the iflag by
detecting IOMAP_UNSHARE in the @flags parameter to
xfs_buffered_write_iomap_begin and thereby use delalloc if there isn't
an extent size hint set?
(IOWs I don't really like that an upper layer of the fs sets a flag for
a lower layer to catch based on the context of whatever operation it's
doing, and in the meantime another thread could observe that state and
make different decisions.)
> >> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
> >> ---
> >> fs/xfs/xfs_inode.c | 17 ++++++++++++++++
> >> fs/xfs/xfs_inode.h | 48 +++++++++++++++++++++++---------------------
> >> fs/xfs/xfs_reflink.c | 7 +++++--
> >> 3 files changed, 47 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> >> index d55b42b2480d..ade945c8d783 100644
> >> --- a/fs/xfs/xfs_inode.c
> >> +++ b/fs/xfs/xfs_inode.c
> >> @@ -58,6 +58,15 @@ xfs_get_extsz_hint(
> >> */
> >> if (xfs_is_always_cow_inode(ip))
> >> return 0;
> >> +
> >> + /*
> >> + * let xfs_buffered_write_iomap_begin() do delayed allocation
> >> + * in unshare path so that the new blocks have more chance to
> >> + * be contigurous
"contiguous"
> >> + */
> >> + if (xfs_iflags_test(ip, XFS_IUNSHARE))
> >> + return 0;
> >
> > What if the inode is a realtime file? Will this work with the rt
> > delalloc support coming online in 6.10?
>
> This XFS_IUNSHARE is not set in xfs_reflink_unshare() for rt inodes.
> So rt inodes will keep current behavior.
<nod> Please rebase this patch against 6.10-rc1 now that it's out.
--D
> >
> >> +
> >> if ((ip->i_diflags & XFS_DIFLAG_EXTSIZE) && ip->i_extsize)
> >> return ip->i_extsize;
> >> if (XFS_IS_REALTIME_INODE(ip))
> >> @@ -77,6 +86,14 @@ xfs_get_cowextsz_hint(
> >> {
> >> xfs_extlen_t a, b;
> >>
> >> + /*
> >> + * in unshare path, allocate exactly the number of the blocks to be
> >> + * unshared so that no new blocks caused the unshare operation remain
> >> + * in Cow fork after the unshare is done
> >> + */
> >> + if (xfs_iflags_test(ip, XFS_IUNSHARE))
> >> + return 1;
> >
> > Aha, so this is also about turning off speculative preallocations
> > outside the range that's being unshared?
>
> Yes.
>
> >
> >> +
> >> a = 0;
> >> if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE)
> >> a = ip->i_cowextsize;
> >> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> >> index ab46ffb3ac19..6a8ad68dac1e 100644
> >> --- a/fs/xfs/xfs_inode.h
> >> +++ b/fs/xfs/xfs_inode.h
> >> @@ -207,13 +207,13 @@ xfs_new_eof(struct xfs_inode *ip, xfs_fsize_t new_size)
> >> * i_flags helper functions
> >> */
> >> static inline void
> >> -__xfs_iflags_set(xfs_inode_t *ip, unsigned short flags)
> >> +__xfs_iflags_set(xfs_inode_t *ip, unsigned long flags)
> >
> > I think this is already queued for 6.10.
>
> Good to know.
>
> Thanks,
> Wengang
>
next prev parent reply other threads:[~2024-05-31 16:00 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-17 21:26 [PATCH] xfs: dont remain new blocks in cowfork for unshare Wengang Wang
2024-05-20 18:08 ` Darrick J. Wong
2024-05-20 22:42 ` Wengang Wang
2024-05-31 15:46 ` Wengang Wang
2024-05-31 16:00 ` Darrick J. Wong [this message]
2024-05-31 16:13 ` Christoph Hellwig
2024-05-31 17:53 ` Wengang Wang
2024-06-06 18:16 ` Wengang Wang
2024-06-11 17:00 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240531160053.GQ52987@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=wen.gang.wang@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox