From: nate <linux-xfs@linuxpowered.net>
To: linux-xfs@vger.kernel.org
Subject: Re: XFS reflink copy to different filesystem performance question
Date: Thu, 17 Mar 2022 09:43:55 -0700
Message-ID: <3d9539b0f931cbb28dc26d68806f0b11@linuxpowered.net>
In-Reply-To: <20220316222304.GR3927073@dread.disaster.area>
On 2022-03-16 15:23, Dave Chinner wrote:
> reflink is not dedupe. file clones simply make a copy by reference,
> so it doesn't duplicate the data in the first place. IOWs, it ends
> up with a single physical copy that has multiple references to it.
>
> dedupe is done by a different operation, which requires comparing
> the data in two different locations and if they are the same
> reducing it to a single physical copy with multiple references.
Yeah, sorry, I didn't phrase that statement right, but I understand
the situation.
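
For reference, the "copy by reference" that Dave describes can be sketched
from userspace with the Linux FICLONE ioctl. This is a minimal sketch, not
anything posted in the thread: the helper name clone_or_copy is hypothetical,
and the fallback to an ordinary data copy is an assumption about what a
caller would want when the filesystem cannot share extents (for example,
when source and destination are on different filesystems).

```python
import fcntl
import shutil

# FICLONE is _IOW(0x94, 9, int) in <linux/fs.h>.
FICLONE = 0x40049409


def clone_or_copy(src, dst):
    """Clone src to dst by reference if possible (hypothetical helper).

    Returns True when the destination shares extents with the source,
    False when we had to fall back to copying the data.
    """
    with open(src, "rb") as s, open(dst, "wb") as d:
        try:
            # Ask the filesystem to make dst reference src's extents.
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
            return True
        except OSError:
            # No reflink support here (or a cross-filesystem copy):
            # duplicate the data the ordinary way.
            shutil.copyfileobj(s, d)
            return False
```

No data is compared at any point, which is the difference from dedupe: the
clone simply never creates a second physical copy.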
> IIUC, you are asking about whether you can run a reflink copy on
> the destination before you run rsync, then do a delta sync using
> rsync to only move the changed blocks, so only store the changed
> blocks in the backup image?
>
> If so, then yes. This is how a reflink-based file-level backup farm
> would work. It is very similar to a hardlink based farm, but instead
> of keeping a repository of every version of every file that is
> backed up in an object store and then creating the directory
> structure via hardlinks to the object store, it creates the new
> directory structure with reflink copies of the previous version and
> then does delta updates to the files directly.
ok thanks
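
The workflow described above (reflink the previous backup, then delta-sync
into the copy) might look roughly like the sketch below. The function names
and paths are hypothetical, and the option choices are assumptions rather
than anything stated in the thread: cp --reflink=always is the GNU coreutils
way to request a clone, and rsync needs --inplace so that it rewrites only
changed blocks in the cloned files instead of writing a fresh temporary file
and renaming it over the clone, which would give up the shared extents.
--no-whole-file asks rsync to use its delta algorithm even when both paths
are local.

```python
import subprocess


def snapshot_commands(source, previous, current):
    """Build the two commands for one backup generation (hypothetical helper).

    Assumes `current` does not exist yet, so cp creates it as a copy of
    `previous` rather than copying into it.
    """
    clone = ["cp", "-a", "--reflink=always", previous, current]
    sync = [
        "rsync", "-a", "--delete",
        "--inplace",          # update cloned files in place
        "--no-whole-file",    # force delta transfer for local paths
        source.rstrip("/") + "/",
        current.rstrip("/") + "/",
    ]
    return [clone, sync]


def run_snapshot(source, previous, current):
    """Run the clone step, then the delta sync; stop on the first failure."""
    for cmd in snapshot_commands(source, previous, current):
        subprocess.run(cmd, check=True)
```

Only the blocks rsync actually rewrites should end up with their own
physical copy; everything else stays shared with the previous generation.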
> I haven't confirmed anything, just made a guess same as you have.
Well, good enough for me, thanks anyway!
> That sounds more like the dedupe process searching for duplicate
> blocks to dedupe....
I think so too.
> You can use FIEMAP (filefrag(1) or xfs_bmap(8)) to tell you if a
> specific extent is shared or not. But it cannot tell you how many
> references there are to it, nor what file those references belong
> to. For that, you need root permissions, ioctl_getfsmap(2) and
> rmapbt=1 support in your filesystem.
Sounds more complex than I would like to deal with.
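
For anyone who does want the simpler half of that, the FIEMAP check can be
sketched in a few lines. This is a rough illustration under assumptions: the
function names are hypothetical, the struct layouts and constants are taken
from <linux/fiemap.h> and <linux/fs.h>, and only the first max_extents
extents of the file are examined. As Dave says, it reports whether an extent
is shared, not how many references exist or who holds them.

```python
import fcntl
import struct

FS_IOC_FIEMAP = 0xC020660B       # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1           # flush dirty data before mapping
FIEMAP_EXTENT_LAST = 0x1         # last extent in the file
FIEMAP_EXTENT_SHARED = 0x2000    # extent is referenced by more than one owner

HEADER = struct.Struct("=QQIIII")      # struct fiemap, 32 bytes
EXTENT = struct.Struct("=QQQQQIIII")   # struct fiemap_extent, 56 bytes


def parse_extents(buf):
    """Decode a FIEMAP reply into (logical, physical, length, shared) tuples."""
    mapped = HEADER.unpack_from(buf, 0)[3]   # fm_mapped_extents
    result = []
    for i in range(mapped):
        fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        logical, physical, length, flags = fields[0], fields[1], fields[2], fields[5]
        result.append((logical, physical, length, bool(flags & FIEMAP_EXTENT_SHARED)))
    return result


def file_extents(path, max_extents=256):
    """Ask the kernel for the extent map of `path` (hypothetical helper)."""
    buf = bytearray(HEADER.size + max_extents * EXTENT.size)
    # Map the whole file: start 0, length ~0, room for max_extents records.
    HEADER.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF, FIEMAP_FLAG_SYNC, 0, max_extents, 0)
    with open(path, "rb") as f:
        fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf, True)
    return parse_extents(buf)
```

Something like any(shared for _, _, _, shared in file_extents(path)) would
then say whether a file still shares any of its mapped extents.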
> Unless you have an immediate use for filesystem metadata level
> introspection (generally unlikely), there's no need to enable it.
ok thanks for the info.
I am leaving the list now, thanks a bunch for the replies.
nate