From: nate <linux-xfs@linuxpowered.net>
To: linux-xfs@vger.kernel.org
Subject: Re: XFS reflink copy to different filesystem performance question
Date: Thu, 17 Mar 2022 09:43:55 -0700
Message-ID: <3d9539b0f931cbb28dc26d68806f0b11@linuxpowered.net>
In-Reply-To: <20220316222304.GR3927073@dread.disaster.area>

On 2022-03-16 15:23, Dave Chinner wrote:
> reflink is not dedupe. file clones simply make a copy by reference,
> so it doesn't duplicate the data in the first place. IOWs, it ends
> up with a single physical copy that has multiple references to it.
> 
> dedupe is done by a different operation, which requires comparing
> the data in two different locations and if they are the same
> reducing it to a single physical copy with multiple references.

Yeah, sorry, I didn't phrase that statement right, but I understand
the situation.
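
To make the clone/dedupe distinction above concrete, here is a toy
in-memory model. It is only an illustration, not how XFS implements
either operation: the BlockStore class and its method names are
hypothetical, each "file" is a single block, and overwriting an
existing file is not handled.

```python
class BlockStore:
    """Toy model: files are names pointing at reference-counted blocks."""

    def __init__(self):
        self.blocks = {}   # block id -> bytes
        self.refs = {}     # block id -> number of files referencing it
        self.files = {}    # file name -> block id
        self._next_id = 0

    def write(self, name, data):
        # A plain write always allocates a new physical block.
        bid = self._next_id
        self._next_id += 1
        self.blocks[bid] = data
        self.refs[bid] = 1
        self.files[name] = bid

    def clone(self, src, dst):
        # Reflink-style clone: no data is read or compared, the new
        # file simply takes another reference to the same block.
        bid = self.files[src]
        self.refs[bid] += 1
        self.files[dst] = bid

    def dedupe(self, a, b):
        # Dedupe: compare the data in two locations first, and only
        # if it matches collapse them into one shared block.
        ba, bb = self.files[a], self.files[b]
        if ba == bb or self.blocks[ba] != self.blocks[bb]:
            return False
        self.refs[bb] -= 1
        if self.refs[bb] == 0:
            del self.blocks[bb]
            del self.refs[bb]
        self.files[b] = ba
        self.refs[ba] += 1
        return True
```

In this model a clone never creates a second physical copy, while two
independent writes of the same data stay duplicated until dedupe
compares them.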

> IIUC, you are asking about whether you can run a reflink copy on
> the destination before you run rsync, then do a delta sync using
> rsync to only move the changed blocks, so only store the changed
> blocks in the backup image?
> 
> If so, then yes. This is how a reflink-based file-level backup farm
> would work. It is very similar to a hardlink based farm, but instead
> of keeping a repository of every version of every file that is
> backed up in an object store and then creating the directory
> structure via hardlinks to the object store, it creates the new
> directory structure with reflink copies of the previous version and
> then does delta updates to the files directly.

ok thanks
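
For reference, the workflow described above could be sketched roughly
as follows. This is a minimal sketch, not a tested backup tool: the
backup_commands helper and all paths are hypothetical. It assumes GNU
cp and rsync, and it assumes rsync needs --inplace so that changed
files are updated in place rather than rewritten as new files, which
would stop sharing blocks with the previous version. --no-whole-file
is there because rsync defaults to whole-file transfers when both
paths are local.

```python
import shlex


def backup_commands(source, previous, new):
    """Return the two command lines for one reflink-based backup run.

    source   -- tree being backed up (hypothetical path)
    previous -- last backup directory on the reflink-capable filesystem
    new      -- backup directory to create for this run
    """
    # Step 1: clone the previous backup by reference; no data is copied.
    clone = ["cp", "-a", "--reflink=always", previous, new]
    # Step 2: delta-sync into the clone, updating files in place so
    # that only the blocks rsync rewrites stop being shared.
    sync = [
        "rsync", "-a", "--delete", "--inplace", "--no-whole-file",
        source.rstrip("/") + "/", new.rstrip("/") + "/",
    ]
    return clone, sync


if __name__ == "__main__":
    for cmd in backup_commands("/data", "/backup/2022-03-16",
                               "/backup/2022-03-17"):
        print(shlex.join(cmd))
```

The helper only builds the command lines, so they can be inspected
before anything is run with subprocess or pasted into a shell.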


> I haven't confirmed anything, just made a guess same as you have.

Well, good enough for me, thanks anyway!


> That sounds more like the dedupe process searching for duplicate
> blocks to dedupe....

I think so too.

> You can use FIEMAP (filefrag(1) or xfs_bmap(8)) to tell you if a
> specific extent is shared or not. But it cannot tell you how many
> references there are to it, nor what file those references belong
> to. For that, you need root permissions, ioctl_getfsmap(2) and
> rmapbt=1 support in your filesystem.

Sounds more complex than I would like to deal with.
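
For anyone who does want to try the FIEMAP route, here is a rough
sketch of asking the kernel directly instead of parsing filefrag
output. The constants and struct layouts are written from my
understanding of linux/fiemap.h and should be checked against the
header. The function names are made up, and the sketch only looks at
the first max_extents extents of a file. As noted above, it can only
say whether an extent is shared, not how many references exist or
which files hold them.

```python
import fcntl
import struct

FS_IOC_FIEMAP = 0xC020660B       # _IOWR('f', 11, struct fiemap)
FIEMAP_EXTENT_LAST = 0x0001      # last extent in the file
FIEMAP_EXTENT_SHARED = 0x2000    # extent is referenced by other files

# struct fiemap header: fm_start, fm_length, fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved
HEADER = struct.Struct("=QQIIII")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# fe_reserved64[2], fe_flags, fe_reserved[3]
EXTENT = struct.Struct("=QQQ2Q4I")


def decode_extents(buf):
    """Return (logical, physical, length, shared) per mapped extent."""
    mapped = HEADER.unpack_from(buf, 0)[3]
    result = []
    for i in range(mapped):
        fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        logical, physical, length, flags = (
            fields[0], fields[1], fields[2], fields[5])
        result.append(
            (logical, physical, length, bool(flags & FIEMAP_EXTENT_SHARED)))
    return result


def query_extents(path, max_extents=256):
    """Run FIEMAP over the whole file and decode the reply."""
    buf = bytearray(HEADER.size + max_extents * EXTENT.size)
    # Map from offset 0 to the maximum length, with room for max_extents.
    HEADER.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF, 0, 0, max_extents, 0)
    with open(path, "rb") as fh:
        fcntl.ioctl(fh.fileno(), FS_IOC_FIEMAP, buf)
    return decode_extents(buf)
```

Calling query_extents() on a file in a reflinked backup tree should
report shared=True for extents still shared with another version, on
filesystems that support FIEMAP.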

> Unless you have an immediate use for filesystem metadata level
> introspection (generally unlikely), there's no need to enable it.

ok thanks for the info.

I am leaving the list now, thanks a bunch for the replies.

nate

Thread overview: 5+ messages
2022-03-16  0:45 XFS reflink copy to different filesystem performance question nate
2022-03-16  8:33 ` Dave Chinner
2022-03-16 17:08   ` nate
2022-03-16 22:23     ` Dave Chinner
2022-03-17 16:43       ` nate [this message]
