From: Christoph Hellwig <hch@infradead.org>
To: "Steinar H. Gunderson" <steinar+kernel@gunderson.no>
Cc: Dave Chinner <david@fromorbit.com>, linux-xfs@vger.kernel.org
Subject: Re: Slow deduplication
Date: Mon, 3 Mar 2025 06:03:07 -0800
Message-ID: <Z8W2m8U9uniM8AAc@infradead.org>
In-Reply-To: <20250302214933.dkp743wxlo624aj7@sesse.net>
On Sun, Mar 02, 2025 at 10:49:33PM +0100, Steinar H. Gunderson wrote:
> On Mon, Mar 03, 2025 at 08:35:57AM +1100, Dave Chinner wrote:
> > This does comparison one folio at a time and does no readahead.
> > Hence if the data isn't already in cache, it is doing synchronous
> > small reads and waiting for every single one of them. This really
> > should use an internal interface that is capable of issuing
> > readahead...
>
> Yes, I noticed that if I do dummy read() of each extent first,
> it becomes _massively_ faster. I'm not sure if I trust posix_fadvise()
> to just do MADV_WILLNEED given the manpage; would it work (and give
> roughly the same readahead that read() seems to be doing)?
The right thing to do is to just issue readahead in
vfs_dedupe_file_range_compare. The ractl structure is a bit odd, so
it'll need slightly more careful thought than just a hacked-up
one-liner, but it should still be relatively simple. I can look into
it once I find a little time if no one beats me to it.
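
Until the kernel side issues readahead itself, the dummy read() workaround
quoted above can be sketched in userspace roughly as follows. This is only
an illustration, not code from this thread: the helper name warm_range()
and the 1 MiB chunk size are assumptions, and it uses only pread() to pull
a byte range into the page cache and throw the data away.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and chunk size are assumptions): read
 * [off, off + len) of fd and discard the data, so that the range is
 * likely to be in the page cache afterwards.
 * Returns the number of bytes actually read (short at EOF), or -1 on error.
 */
static ssize_t warm_range(int fd, off_t off, size_t len)
{
	enum { CHUNK = 1 << 20 };	/* 1 MiB scratch buffer, arbitrary */
	size_t done = 0;
	char *buf;

	assert(fd >= 0);

	buf = malloc(CHUNK);
	if (!buf)
		return -1;

	while (done < len) {
		size_t want = len - done < CHUNK ? len - done : CHUNK;
		ssize_t n = pread(fd, buf, want, off + (off_t)done);

		if (n < 0) {		/* no EINTR retry in this sketch */
			free(buf);
			return -1;
		}
		if (n == 0)		/* hit end of file */
			break;
		done += (size_t)n;
	}

	free(buf);
	return (ssize_t)done;
}
```

A caller would presumably run this over both the source and the destination
range right before issuing the FIDEDUPERANGE ioctl for them, so that the
folio-by-folio compare finds the data already cached. That costs an extra
copy to userspace, which is exactly why doing the readahead in the kernel
is the better fix.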