public inbox for linux-xfs@vger.kernel.org
From: nate <linux-xfs@linuxpowered.net>
To: linux-xfs@vger.kernel.org
Subject: Re: XFS reflink copy to different filesystem performance question
Date: Wed, 16 Mar 2022 10:08:30 -0700	[thread overview]
Message-ID: <e99689e6c1232ffb564b0c2aecd8b0dd@linuxpowered.net> (raw)
In-Reply-To: <20220316083333.GQ3927073@dread.disaster.area>

On 2022-03-16 1:33, Dave Chinner wrote:

> Yeah, Veeam appears to use the shared data extent functionality in
> XFS for deduplication and cloning. reflink is the user facing name
> for space efficient file cloning (via cp --reflink).

I read bits and pieces about cp --reflink; I guess using that would be a
more "standard" *nix way of using dedupe? For example, cp --reflink and
then using rsync to do a delta sync against the new copy (to get the
updates)? Not that I have a need to do this, just curious about the
workflow.
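
Something like this is roughly what I was picturing (file names are just
made up for illustration); from what I've read, rsync's --inplace flag
should rewrite only the changed blocks in the clone, so the unchanged
blocks stay shared with the original:

  # clone the backup file without copying any data blocks
  cp --reflink=always /backups/full.vbk /backups/full-copy.vbk
  # then pull just the changed blocks into the clone from the live copy
  rsync --inplace otherhost:/data/full.vbk /backups/full-copy.vbk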

> I'm guessing that you're trying to copy a deduplicated file,
> resulting in the same physical blocks being read over and over again
> at different file offsets and causing the disks to seek because it's
> not physically sequential data.

Thanks for confirming that, it's what I suspected.

[..]

> Maybe they are doing that with FIEMAP to resolve deduplicated
> regions and caching them, or they have some other information in
> their backup/deduplication data store that allows them to optimise
> the IO. You'll need to actually run things like strace on the copies
> to find out exactly what it is doing....

OK, thanks for the info. I do see that a couple of times there are
periods of lots of disk reads on the source with no writes happening on
the destination; I guess it is sorting through what it needs to get.
One of those stretches lasted about 20 minutes.
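
If I get a chance I'll try to capture what it is doing during one of
those quiet stretches with something along these lines (the PID being
whatever the Veeam copy process is):

  # attach to the running process and log file I/O calls with timestamps
  strace -f -tt -e trace=read,pread64,write,pwrite64,lseek,ioctl \
      -o /tmp/veeam-copy.strace -p <PID>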

> No, they don't exist because largely reading a reflinked file
> performs no differently to reading a non-shared file.

Good to know; it certainly would be nice if there were at least a way to
identify a file as having X number of links.
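
From what I can tell, the closest thing available today is checking
whether individual extents are flagged as shared (not how many owners
they have), e.g. with FIEMAP via filefrag or xfs_io (file name is just
an example):

  # extents with "shared" in the flags column have other owners
  filefrag -v /backups/full.vbk
  # or the xfs_io equivalent, opened read-only
  xfs_io -r -c "fiemap -v" /backups/full.vbk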

> To do that efficiently (i.e. without a full filesystem scan) you
> need to look up the filesystem reverse mapping table to find all the
> owners of pointers to a given block.  I bet you didn't make the
> filesystem with "-m rmapbt=1" to enable that functionality - nobody
> does that unless they have a reason to because it's not enabled by
> default (yet).

I'm sure I did not do that either, but I can do that if you think it
would be advantageous. I do plan to ship this DL380Gen10 XFS system to
another location and am happy to reformat the XFS volume with that extra
option if it would be useful.
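
When I rebuild it I'll probably run something like this (device name is
just an example; I believe reflink is already the default on recent
xfsprogs, but spelling it out doesn't hurt):

  # enable the reverse-mapping btree at mkfs time
  mkfs.xfs -f -m rmapbt=1,reflink=1 /dev/sdb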

I don't anticipate needing to deal directly with this reflinked data;
I'll just let Veeam do its thing. Thanks for clearing things up for
me so quickly!

nate



Thread overview: 5+ messages
2022-03-16  0:45 XFS reflink copy to different filesystem performance question nate
2022-03-16  8:33 ` Dave Chinner
2022-03-16 17:08   ` nate [this message]
2022-03-16 22:23     ` Dave Chinner
2022-03-17 16:43       ` nate
