public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Wengang Wang <wen.gang.wang@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 6/9] spaceman/defrag: workaround kernel xfs_reflink_try_clear_inode_flag()
Date: Thu, 1 Aug 2024 08:25:39 +1000	[thread overview]
Message-ID: <Zqq544tguW6FWIkm@dread.disaster.area> (raw)
In-Reply-To: <20240709191028.2329-7-wen.gang.wang@oracle.com>

On Tue, Jul 09, 2024 at 12:10:25PM -0700, Wengang Wang wrote:
> xfs_reflink_try_clear_inode_flag() takes very long in case file has huge number
> of extents and none of the extents are shared.
> 
> workaround:
> share the first real extent so that xfs_reflink_try_clear_inode_flag() returns
> quickly to save cpu times and speed up defrag significantly.
> 
> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
> ---
>  spaceman/defrag.c | 174 +++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 172 insertions(+), 2 deletions(-)

I had some insight on this late last night. The source of the issue
is that both the kernel and the defrag algorithm are walking
forwards across the file. Hence as we get t higher offsets in the
file during defrag which unshares shared ranges, we are moving the
first shared range to be higher in the file.

Hence the act of unsharing the file in ascending offset order
results in the ascending offset search for shared extents done by
the kernel growing in time.

The solution to this is to make defrag work backwards through the
file, so it leaves the low offset shared extents intact for the
kernel to find until the defrag process unshares them. At which
point the kernel will clear the reflink flag and the searching
stops.

IOWs, we either need to change the kernel code to do reverse order
shared extent searching, or change the defrag operation to work in
reverse sequential order, and then the performance problem relating
to unsharing having to determine if the file being defragged is
still shared or not goes away.

That said, I still think that the fact the defrag is completely
unsharing the source file is wrong. If we leave shared extents
intact as we defrag the file, then this problem doesn't need solving
at all because xfs_reflink_try_clear_inode_flag() will hit the same
shared extent near the front of the file every time...

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  parent reply	other threads:[~2024-07-31 22:25 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-09 19:10 [PATCH 0/9] introduce defrag to xfs_spaceman Wengang Wang
2024-07-09 19:10 ` [PATCH 1/9] xfsprogs: introduce defrag command to spaceman Wengang Wang
2024-07-09 21:18   ` Darrick J. Wong
2024-07-11 21:54     ` Wengang Wang
2024-07-15 21:30       ` Wengang Wang
2024-07-15 22:44         ` Darrick J. Wong
2024-07-09 19:10 ` [PATCH 2/9] spaceman/defrag: pick up segments from target file Wengang Wang
2024-07-09 21:50   ` [PATCH 2/9] spaceman/defrag: pick up segments from target fileOM Darrick J. Wong
2024-07-11 22:37     ` Wengang Wang
2024-07-15 23:40   ` [PATCH 2/9] spaceman/defrag: pick up segments from target file Dave Chinner
2024-07-16 20:23     ` Wengang Wang
2024-07-17  4:11       ` Dave Chinner
2024-07-18 19:03         ` Wengang Wang
2024-07-19  4:59           ` Dave Chinner
2024-07-19  4:01         ` Christoph Hellwig
2024-07-24 19:22         ` Wengang Wang
2024-07-30 22:13           ` Dave Chinner
2024-07-09 19:10 ` [PATCH 3/9] spaceman/defrag: defrag segments Wengang Wang
2024-07-09 21:57   ` Darrick J. Wong
2024-07-11 22:49     ` Wengang Wang
2024-07-12 19:07       ` Wengang Wang
2024-07-15 22:42         ` Darrick J. Wong
2024-07-16  0:08   ` Dave Chinner
2024-07-18 18:06     ` Wengang Wang
2024-07-09 19:10 ` [PATCH 4/9] spaceman/defrag: ctrl-c handler Wengang Wang
2024-07-09 21:08   ` Darrick J. Wong
2024-07-11 22:58     ` Wengang Wang
2024-07-15 22:56       ` Darrick J. Wong
2024-07-16 16:21         ` Wengang Wang
2024-07-09 19:10 ` [PATCH 5/9] spaceman/defrag: exclude shared segments on low free space Wengang Wang
2024-07-09 21:05   ` Darrick J. Wong
2024-07-11 23:08     ` Wengang Wang
2024-07-15 22:58       ` Darrick J. Wong
2024-07-09 19:10 ` [PATCH 6/9] spaceman/defrag: workaround kernel xfs_reflink_try_clear_inode_flag() Wengang Wang
2024-07-09 20:51   ` Darrick J. Wong
2024-07-11 23:11     ` Wengang Wang
2024-07-16  0:25   ` Dave Chinner
2024-07-18 18:24     ` Wengang Wang
2024-07-31 22:25   ` Dave Chinner [this message]
2024-07-09 19:10 ` [PATCH 7/9] spaceman/defrag: sleeps between segments Wengang Wang
2024-07-09 20:46   ` Darrick J. Wong
2024-07-11 23:26     ` Wengang Wang
2024-07-11 23:30     ` Wengang Wang
2024-07-09 19:10 ` [PATCH 8/9] spaceman/defrag: readahead for better performance Wengang Wang
2024-07-09 20:27   ` Darrick J. Wong
2024-07-11 23:29     ` Wengang Wang
2024-07-16  0:56   ` Dave Chinner
2024-07-18 18:40     ` Wengang Wang
2024-07-31  3:10       ` Dave Chinner
2024-08-02 18:31         ` Wengang Wang
2024-07-09 19:10 ` [PATCH 9/9] spaceman/defrag: warn on extsize Wengang Wang
2024-07-09 20:21   ` Darrick J. Wong
2024-07-11 23:36     ` Wengang Wang
2024-07-16  0:29       ` Dave Chinner
2024-07-22 18:01         ` Wengang Wang
2024-07-30 22:43           ` Dave Chinner
2024-07-15 23:03 ` [PATCH 0/9] introduce defrag to xfs_spaceman Dave Chinner
2024-07-16 19:45   ` Wengang Wang
2024-07-31  2:51     ` Dave Chinner
2024-08-02 18:14       ` Wengang Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zqq544tguW6FWIkm@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=wen.gang.wang@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox