* BTRFS deduplication
@ 2011-05-12 5:52 Swâmi Petaramesh
2011-05-12 15:41 ` Josef Bacik
0 siblings, 1 reply; 10+ messages in thread
From: Swâmi Petaramesh @ 2011-05-12 5:52 UTC (permalink / raw)
To: Linux BTRFS
Hi again list,
I've seen in a message dating back to January that offline deduplication
has been implemented in BTRFS, but I can't find it in my btrfs-tools
0.19+20100601-3ubuntu2.
Has it reached release, or not yet? How could I give it a try?
I've seen a discussion about whether deduplication should be done
offline or online. My use case is to back up a number of laptops,
all having about the same software and many files in common, to a single
backup server using rsync. I would be very much interested in online
deduplication, because I don't have the "n" times the storage space that
offline dedup might temporarily need, and because performance isn't
crucial for this application, as backups can be done overnight...
Thanks in advance :-)
* Re: BTRFS deduplication
2011-05-12 5:52 BTRFS deduplication Swâmi Petaramesh
@ 2011-05-12 15:41 ` Josef Bacik
0 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2011-05-12 15:41 UTC (permalink / raw)
To: Swâmi Petaramesh; +Cc: Linux BTRFS
On Thu, May 12, 2011 at 07:52:20AM +0200, Swâmi Petaramesh wrote:
> Hi again list,
>
> I've seen in a message dating back to january that offline deduplication
> has been implemented in BTRFS, but I can't find it in my btrfs-tools
> 0.19+20100601-3ubuntu2
>
> Has it reached release, or not yet ? How could I give it a try ?
>
> I've seen a discussion about whether deduplication should be made
> offline or online ; my usage case it to backup a number of laptops
> having all about the same software and many files in common, to a single
> backup server using rsync, I would be very much interested in online
> deduplication - because I don't have "n" times the storage space, that
> offline dedup might temporarily need, and because performance isn't
> crucial for this application, as backups can be done overnight...
>
> Thanks in advance :-)
>
So the btrfs-progs patch only exists on the mailing list and the kernel patch is
sitting in my git tree. This was more of a weekend project and less of a
serious attempt at an actual solution. It could be cleaned up and actually
used, but I'm not at all interested in doing that :). Thanks,
Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* BTRFS Deduplication
@ 2017-09-11 6:05 shally verma
2017-09-11 6:46 ` Qu Wenruo
0 siblings, 1 reply; 10+ messages in thread
From: shally verma @ 2017-09-11 6:05 UTC (permalink / raw)
To: linux-btrfs; +Cc: Verma, Shally
I was going through the BTRFS Deduplication page
(https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
"As such, xfs_io, is able to perform deduplication on a BTRFS file system," ..
Following this, I went on to the xfs_io link https://linux.die.net/man/8/xfs_io
As I understand it, this is a set of commands that allows us to do different
operations on an "xfs" filesystem.
In the command set mentioned there, I couldn't see which command
invokes the dedupe task,
or how this works with BTRFS.
So, can anyone help here and point out what I am missing?
Thanks
Shally
* Re: BTRFS Deduplication
2017-09-11 6:05 BTRFS Deduplication shally verma
@ 2017-09-11 6:46 ` Qu Wenruo
2017-09-11 7:54 ` shally verma
0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2017-09-11 6:46 UTC (permalink / raw)
To: shally verma, linux-btrfs; +Cc: Verma, Shally
On 2017-09-11 14:05, shally verma wrote:
> I was going through BTRFS Deduplication page
> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
>
> "As such, xfs_io, is able to perform deduplication on a BTRFS file system," ..
>
> following this, I followed on to xfs_io link https://linux.die.net/man/8/xfs_io
>
> As I understand, these are set of commands allow us to do different
> operations on "xfs" filesystem.
Nope, it's just a tool for issuing various read/write calls and ioctls.
In fact most of its commands are fs independent.
Only a limited number of operations are XFS-only.
It's only for historical reasons that it's still named xfs_io.
I won't be surprised if one day it's split off as an independent tool.
> and command set mentioned here, couldn't see which is command to
> invoke dedupe task.
"dedupe" and "reflink" command.
> and how this works with BTRFS.
A fs that supports the FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use it
to determine whether two ranges contain identical data.
And if they are identical, we use the FICLONERANGE or BTRFS_IOC_CLONE_RANGE
ioctl to reflink one to the other, freeing one of them.
BTW nowadays, such dedupe and reflink ioctls are generic in the VFS.
The file_operations structure now includes both clone_file_range() and
dedupe_file_range() callbacks.
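A minimal sketch of issuing the generic FIDEDUPERANGE ioctl from userspace,
assuming the struct layout and request number given in <linux/fs.h> and
ioctl_fideduperange(2). Per that man page, and per the 09:25 follow-up later
in this thread, the ioctl compares the ranges and, when they match, shares
the extents in the same call. The helper names are hypothetical, and the
hard-coded request number assumes the common _IOWR encoding (x86, ARM); some
architectures encode it differently.

```python
import fcntl
import struct

# _IOWR(0x94, 54, struct file_dedupe_range), common ioctl encoding (assumption).
FIDEDUPERANGE = 0xC0189436
FILE_DEDUPE_RANGE_SAME = 0      # ranges matched and now share extents
FILE_DEDUPE_RANGE_DIFFERS = 1   # contents differed, nothing was changed

# struct file_dedupe_range: src_offset, src_length, dest_count, reserved1, reserved2
HDR = struct.Struct("=QQHHI")
# struct file_dedupe_range_info: dest_fd, dest_offset, bytes_deduped, status, reserved
INFO = struct.Struct("=qQQiI")


def pack_request(src_off, length, dst_fd, dst_off):
    """Build a file_dedupe_range carrying exactly one destination entry."""
    return bytearray(HDR.pack(src_off, length, 1, 0, 0)
                     + INFO.pack(dst_fd, dst_off, 0, 0, 0))


def unpack_result(buf):
    """Return (status, bytes_deduped) for the single destination entry."""
    _fd, _off, bytes_deduped, status, _rsv = INFO.unpack_from(buf, HDR.size)
    return status, bytes_deduped


def dedupe_range(src_fd, src_off, length, dst_fd, dst_off):
    """Ask the kernel to dedupe dst against src; raises OSError if unsupported."""
    buf = pack_request(src_off, length, dst_fd, dst_off)
    fcntl.ioctl(src_fd, FIDEDUPERANGE, buf)  # mutable buffer: results are copied back
    return unpack_result(buf)
```

To try it, open the source read-only and the destination for writing on the
same filesystem and pass their file descriptors. A negative status is a
per-destination errno reported by the kernel rather than a Python exception.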
Thanks,
Qu
>
> So, can anyone help here and point me what am I missing here.
>
> Thanks
> Shally
* Re: BTRFS Deduplication
2017-09-11 6:46 ` Qu Wenruo
@ 2017-09-11 7:54 ` shally verma
2017-09-11 8:12 ` Qu Wenruo
0 siblings, 1 reply; 10+ messages in thread
From: shally verma @ 2017-09-11 7:54 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs, Verma, Shally
On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2017-09-11 14:05, shally verma wrote:
>>
>> I was going through BTRFS Deduplication page
>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
>>
>> "As such, xfs_io, is able to perform deduplication on a BTRFS file
>> system," ..
>>
>> following this, I followed on to xfs_io link
>> https://linux.die.net/man/8/xfs_io
>>
>> As I understand, these are set of commands allow us to do different
>> operations on "xfs" filesystem.
>
>
> Nope, it's just a tool triggering different read/write or ioctls.
> In fact most of its command is fs independent.
> Only a limited number of operations are only supported by XFS.
>
> It's just due to historical reasons it's still named as xfs_io.
>
> I won't be surprised if one day it's split as an independent tool.
>
>> and command set mentioned here, couldn't see which is command to
>> invoke dedupe task.
>
>
> "dedupe" and "reflink" command.
Oh. That means the page linked from the BTRFS Wiki page is not updated
with this. I googled another page that documents these two
commands in xfs_io here:
https://www.systutorials.com/docs/linux/man/8-xfs_io/
Maybe the Wiki needs an update here.
>
>> and how this works with BTRFS.
>
>
> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use it to
> determine if two ranges are containing identical data.
>
> And if they are identical, we use FICLONERANGE or BTRFS_IOC_CLONE_RANGE
> ioctl to reflink one to another, freeing one of them.
>
> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS.
> file_operations structure now includes both clone_file_range() and
> dedupe_file_range() callbacks now.
Yea. I understand that part. So going by the descriptions of "dedupe" and
"reflink", it seems that through these commands one can do the deduplication part
and NOT the duplicate-finding part. That's still out of the xfs_io command's scope.
Is that understanding correct?
Thanks
Shally
>
> Thanks,
> Qu
>>
>>
>> So, can anyone help here and point me what am I missing here.
>>
>> Thanks
>> Shally
* Re: BTRFS Deduplication
2017-09-11 7:54 ` shally verma
@ 2017-09-11 8:12 ` Qu Wenruo
2017-09-11 8:57 ` shally verma
0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2017-09-11 8:12 UTC (permalink / raw)
To: shally verma; +Cc: linux-btrfs, Verma, Shally
On 2017-09-11 15:54, shally verma wrote:
> On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>> On 2017-09-11 14:05, shally verma wrote:
>>>
>>> I was going through BTRFS Deduplication page
>>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
>>>
>>> "As such, xfs_io, is able to perform deduplication on a BTRFS file
>>> system," ..
>>>
>>> following this, I followed on to xfs_io link
>>> https://linux.die.net/man/8/xfs_io
>>>
>>> As I understand, these are set of commands allow us to do different
>>> operations on "xfs" filesystem.
>>
>>
>> Nope, it's just a tool triggering different read/write or ioctls.
>> In fact most of its command is fs independent.
>> Only a limited number of operations are only supported by XFS.
>>
>> It's just due to historical reasons it's still named as xfs_io.
>>
>> I won't be surprised if one day it's split as an independent tool.
>>
>>> and command set mentioned here, couldn't see which is command to
>>> invoke dedupe task.
>>
>>
>> "dedupe" and "reflink" command.
> Oh. That means page link referred on BTRFS Wiki page is not updated
> with this. I googled another page that has reference of these two
> command in xfs_io here
> https://www.systutorials.com/docs/linux/man/8-xfs_io/
> May be Wiki need an update here.
If XFS has a regularly updated online man page, we can just use that.
(But unfortunately, not every fs user tool uses asciidoc like btrfs,
which can generate both man pages and html).
>
>>
>>> and how this works with BTRFS.
>>
>>
>> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use it to
>> determine if two ranges are containing identical data.
>>
>> And if they are identical, we use FICLONERANGE or BTRFS_IOC_CLONE_RANGE
>> ioctl to reflink one to another, freeing one of them.
>>
>> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS.
>> file_operations structure now includes both clone_file_range() and
>> dedupe_file_range() callbacks now.
> Yea. Understand that part. So going by description of "dedupe" and
> "reflink", seems through these commands, one can do deduplication part
> and NOT duplicate find part.
Yes, one doesn't need to call the "dedupe" ioctl if one already knows some
data is identical, and can go straight to reflink.
> That's still out of xfs_io command scope.
Not sure what you mean by scope here, sorry for that.
Since xfs_io can be used to find duplication, and can remove
duplication, I don't find anything strange in that wiki page.
(Especially considering how popular the tool is, you can't find any
handier tool than xfs_io)
Thanks,
Qu
> Is that understanding correct?
> Thanks
> Shally
>>
>> Thanks,
>> Qu
>>>
>>>
>>> So, can anyone help here and point me what am I missing here.
>>>
>>> Thanks
>>> Shally
* Re: BTRFS Deduplication
2017-09-11 8:12 ` Qu Wenruo
@ 2017-09-11 8:57 ` shally verma
2017-09-11 9:14 ` Qu Wenruo
0 siblings, 1 reply; 10+ messages in thread
From: shally verma @ 2017-09-11 8:57 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs, Verma, Shally
On Mon, Sep 11, 2017 at 1:42 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2017-09-11 15:54, shally verma wrote:
>>
>> On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com>
>> wrote:
>>>
>>>
>>>
>>> On 2017-09-11 14:05, shally verma wrote:
>>>>
>>>>
>>>> I was going through BTRFS Deduplication page
>>>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
>>>>
>>>> "As such, xfs_io, is able to perform deduplication on a BTRFS file
>>>> system," ..
>>>>
>>>> following this, I followed on to xfs_io link
>>>> https://linux.die.net/man/8/xfs_io
>>>>
>>>> As I understand, these are set of commands allow us to do different
>>>> operations on "xfs" filesystem.
>>>
>>>
>>>
>>> Nope, it's just a tool triggering different read/write or ioctls.
>>> In fact most of its command is fs independent.
>>> Only a limited number of operations are only supported by XFS.
>>>
>>> It's just due to historical reasons it's still named as xfs_io.
>>>
>>> I won't be surprised if one day it's split as an independent tool.
>>>
>>>> and command set mentioned here, couldn't see which is command to
>>>> invoke dedupe task.
>>>
>>>
>>>
>>> "dedupe" and "reflink" command.
>>
>> Oh. That means page link referred on BTRFS Wiki page is not updated
>> with this. I googled another page that has reference of these two
>> command in xfs_io here
>> https://www.systutorials.com/docs/linux/man/8-xfs_io/
>> May be Wiki need an update here.
>
>
> If XFS has a regularly updated online man page, we can just use that.
> (But unfortunately, not every fs user tools use asciidoc like btrfs, which
> can generate both man page and html).
>
>>
>>>
>>>> and how this works with BTRFS.
>>>
>>>
>>>
>>> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use it
>>> to
>>> determine if two ranges are containing identical data.
>>>
>>> And if they are identical, we use FICLONERANGE or BTRFS_IOC_CLONE_RANGE
>>> ioctl to reflink one to another, freeing one of them.
>>>
>>> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS.
>>> file_operations structure now includes both clone_file_range() and
>>> dedupe_file_range() callbacks now.
>>
>> Yea. Understand that part. So going by description of "dedupe" and
>> "reflink", seems through these commands, one can do deduplication part
>> and NOT duplicate find part.
>
>
> Yes, one don't need to call "dedupe" ioctl if they already knows some data
> is identical and can go reflink straightforward.
>
>> That's still out of xfs_io command scope.
>
>
> Not sure what the scope here you mean, sorry for that.
>
By "scope", I meant the duplicate-finding part, but that contradicts the statement
you have just written below:
> Since xfs_io can be used to find duplication,
Since the "dedupe" command takes as input only a "source file" and the src and
dst offsets within it, it can deduplicate the content within a
file, whereas the actual FS dedupe IOCTL can first check whether two extents are
identical and, if yes, then deduplicate them.
Is that correct?
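For what it's worth, a minimal sketch of how the two commands are laid out on
the command line, assuming the synopsis in current xfs_io(8) man pages:
"dedupe src_file src_offset dst_offset length" and
"reflink src_file [src_offset dst_offset length]". In both, src_file is a
second file named as an argument and the destination is whichever file xfs_io
has open, so the two ranges need not sit in the same file. The helper names
and paths are hypothetical, and the code only builds the argv lists; nothing
is executed.

```python
def _plain(path):
    # Reject blanks rather than guess how xfs_io tokenizes the -c string.
    if not path or any(ch.isspace() for ch in path):
        raise ValueError(f"unsupported path for this sketch: {path!r}")
    return path


def dedupe_argv(src, dst, src_off, dst_off, length):
    """argv for: share dst's range with src's range, only if contents match."""
    cmd = f"dedupe {_plain(src)} {src_off} {dst_off} {length}"
    return ["xfs_io", "-c", cmd, dst]


def reflink_argv(src, dst, src_off, dst_off, length):
    """argv for: make dst's range a reflink of src's range, with no comparison."""
    cmd = f"reflink {_plain(src)} {src_off} {dst_off} {length}"
    return ["xfs_io", "-c", cmd, dst]


if __name__ == "__main__":
    # Hypothetical paths on a btrfs mount; print the commands instead of running them.
    print(" ".join(dedupe_argv("/mnt/btrfs/a", "/mnt/btrfs/b", 0, 0, 131072)))
    print(" ".join(reflink_argv("/mnt/btrfs/a", "/mnt/btrfs/b", 0, 0, 131072)))
```

Either list can be handed to subprocess.run(argv, check=True) once the paths
point at real files on a filesystem that supports these operations.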
Thanks
Shally
> and can remove duplication, I
> don't find anything strange in that wiki page.
> (Especially considering how popular the tool is, you can't find any more
> handy tool than xfs_io)
>
> Thanks,
> Qu
>
>
>> Is that understanding correct?
>> Thanks
>> Shally
>>>
>>>
>>> Thanks,
>>> Qu
>>>>
>>>>
>>>>
>>>> So, can anyone help here and point me what am I missing here.
>>>>
>>>> Thanks
>>>> Shally
* Re: BTRFS Deduplication
2017-09-11 8:57 ` shally verma
@ 2017-09-11 9:14 ` Qu Wenruo
2017-09-11 9:25 ` Qu Wenruo
0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2017-09-11 9:14 UTC (permalink / raw)
To: shally verma; +Cc: linux-btrfs, Verma, Shally
On 2017-09-11 16:57, shally verma wrote:
> On Mon, Sep 11, 2017 at 1:42 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>> On 2017-09-11 15:54, shally verma wrote:
>>>
>>> On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com>
>>> wrote:
>>>>
>>>>
>>>>
>>>> On 2017-09-11 14:05, shally verma wrote:
>>>>>
>>>>>
>>>>> I was going through BTRFS Deduplication page
>>>>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
>>>>>
>>>>> "As such, xfs_io, is able to perform deduplication on a BTRFS file
>>>>> system," ..
>>>>>
>>>>> following this, I followed on to xfs_io link
>>>>> https://linux.die.net/man/8/xfs_io
>>>>>
>>>>> As I understand, these are set of commands allow us to do different
>>>>> operations on "xfs" filesystem.
>>>>
>>>>
>>>>
>>>> Nope, it's just a tool triggering different read/write or ioctls.
>>>> In fact most of its command is fs independent.
>>>> Only a limited number of operations are only supported by XFS.
>>>>
>>>> It's just due to historical reasons it's still named as xfs_io.
>>>>
>>>> I won't be surprised if one day it's split as an independent tool.
>>>>
>>>>> and command set mentioned here, couldn't see which is command to
>>>>> invoke dedupe task.
>>>>
>>>>
>>>>
>>>> "dedupe" and "reflink" command.
>>>
>>> Oh. That means page link referred on BTRFS Wiki page is not updated
>>> with this. I googled another page that has reference of these two
>>> command in xfs_io here
>>> https://www.systutorials.com/docs/linux/man/8-xfs_io/
>>> May be Wiki need an update here.
>>
>>
>> If XFS has a regularly updated online man page, we can just use that.
>> (But unfortunately, not every fs user tools use asciidoc like btrfs, which
>> can generate both man page and html).
>>
>>>
>>>>
>>>>> and how this works with BTRFS.
>>>>
>>>>
>>>>
>>>> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use it
>>>> to
>>>> determine if two ranges are containing identical data.
>>>>
>>>> And if they are identical, we use FICLONERANGE or BTRFS_IOC_CLONE_RANGE
>>>> ioctl to reflink one to another, freeing one of them.
>>>>
>>>> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS.
>>>> file_operations structure now includes both clone_file_range() and
>>>> dedupe_file_range() callbacks now.
>>>
>>> Yea. Understand that part. So going by description of "dedupe" and
>>> "reflink", seems through these commands, one can do deduplication part
>>> and NOT duplicate find part.
>>
>>
>> Yes, one don't need to call "dedupe" ioctl if they already knows some data
>> is identical and can go reflink straightforward.
>>
>>> That's still out of xfs_io command scope.
>>
>>
>> Not sure what the scope here you mean, sorry for that.
>>
> By "scope", I meant duplicate find part but that contradicts statement
> you just written below:
>> Since xfs_io can be used to find duplication,
>
> Since "dedupe" command input only a "source file" and src and
> dst_offset within that, so it can deduplicate the content within a
> file where actual FS dedupe IOCTL can first ensure if two extents are
> identical and if yes, then deduplicate them.
By "deduplicate", if you mean "removing duplication" then the xfs_io
"dedupe" command itself doesn't do that.
The old btrfs ioctl name describes this better: FILE_EXTENT_SAME.
The "dedupe" command itself only verifies that they have the same content.
So to make it clear, the "dedupe" command and ioctl only do the
*verification* work.
"Reflink" will really remove the duplication (or even non-duplicated
data if you really want).
But please be careful, "reflink" is much like copy, so it can be
executed on file ranges with different contents.
In that case, reflink can free some space, but it also modifies the content.
So for full de-duplication, one must go through the full *verify* then
*reflink* cycle.
Although the "dedupe" (FILE_EXTENT_SAME) ioctl provides one verification
method, it's not the only solution.
But anyway, the "dedupe" and "reflink" commands provided by xfs_io do
provide every piece needed to do de-duplication, so the wiki is still correct
IMHO.
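The reflink caveat can be made concrete with a minimal sketch of the
FICLONERANGE ioctl, assuming the struct file_clone_range layout and request
number from <linux/fs.h> as documented in ioctl_ficlonerange(2). The call
never looks at the data: whatever the destination range held is replaced by
shared references to the source extents. The helper names are hypothetical,
and the hard-coded request number assumes the common _IOW encoding (x86, ARM).

```python
import fcntl
import struct

# _IOW(0x94, 13, struct file_clone_range), common ioctl encoding (assumption).
FICLONERANGE = 0x4020940D

# struct file_clone_range: src_fd, src_offset, src_length, dest_offset
CLONE = struct.Struct("=qQQQ")


def pack_clone(src_fd, src_off, length, dst_off):
    """Serialize the argument block for FICLONERANGE."""
    return CLONE.pack(src_fd, src_off, length, dst_off)


def reflink_range(src_fd, src_off, length, dst_fd, dst_off):
    """Make dst share src's extents for the range, WITHOUT comparing contents.

    The ioctl is issued on the destination descriptor; OSError is raised if
    the filesystem does not support it or the ranges are not acceptable.
    """
    fcntl.ioctl(dst_fd, FICLONERANGE, pack_clone(src_fd, src_off, length, dst_off))
```

Because nothing is compared here, a caller that wants de-duplication rather
than a copy has to verify the ranges itself first, or use the dedupe ioctl,
which does the comparison in the kernel.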
Thanks,
Qu
>
> Is that correct?
>
> Thanks
> Shally
>
>> and can remove duplication, I
>> don't find anything strange in that wiki page.
>> (Especially considering how popular the tool is, you can't find any more
>> handy tool than xfs_io)
>>
>> Thanks,
>> Qu
>>
>>
>>> Is that understanding correct?
>>> Thanks
>>> Shally
>>>>
>>>>
>>>> Thanks,
>>>> Qu
>>>>>
>>>>>
>>>>>
>>>>> So, can anyone help here and point me what am I missing here.
>>>>>
>>>>> Thanks
>>>>> Shally
* Re: BTRFS Deduplication
2017-09-11 9:14 ` Qu Wenruo
@ 2017-09-11 9:25 ` Qu Wenruo
2017-09-11 9:27 ` shally verma
0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2017-09-11 9:25 UTC (permalink / raw)
To: shally verma; +Cc: linux-btrfs, Verma, Shally
On 2017-09-11 17:14, Qu Wenruo wrote:
>
>
> On 2017-09-11 16:57, shally verma wrote:
>> On Mon, Sep 11, 2017 at 1:42 PM, Qu Wenruo <quwenruo.btrfs@gmx.com>
>> wrote:
>>>
>>>
>>> On 2017-09-11 15:54, shally verma wrote:
>>>>
>>>> On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com>
>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2017-09-11 14:05, shally verma wrote:
>>>>>>
>>>>>>
>>>>>> I was going through BTRFS Deduplication page
>>>>>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
>>>>>>
>>>>>> "As such, xfs_io, is able to perform deduplication on a BTRFS file
>>>>>> system," ..
>>>>>>
>>>>>> following this, I followed on to xfs_io link
>>>>>> https://linux.die.net/man/8/xfs_io
>>>>>>
>>>>>> As I understand, these are set of commands allow us to do different
>>>>>> operations on "xfs" filesystem.
>>>>>
>>>>>
>>>>>
>>>>> Nope, it's just a tool triggering different read/write or ioctls.
>>>>> In fact most of its command is fs independent.
>>>>> Only a limited number of operations are only supported by XFS.
>>>>>
>>>>> It's just due to historical reasons it's still named as xfs_io.
>>>>>
>>>>> I won't be surprised if one day it's split as an independent tool.
>>>>>
>>>>>> and command set mentioned here, couldn't see which is command to
>>>>>> invoke dedupe task.
>>>>>
>>>>>
>>>>>
>>>>> "dedupe" and "reflink" command.
>>>>
>>>> Oh. That means page link referred on BTRFS Wiki page is not updated
>>>> with this. I googled another page that has reference of these two
>>>> command in xfs_io here
>>>> https://www.systutorials.com/docs/linux/man/8-xfs_io/
>>>> May be Wiki need an update here.
>>>
>>>
>>> If XFS has a regularly updated online man page, we can just use that.
>>> (But unfortunately, not every fs user tools use asciidoc like btrfs,
>>> which
>>> can generate both man page and html).
>>>
>>>>
>>>>>
>>>>>> and how this works with BTRFS.
>>>>>
>>>>>
>>>>>
>>>>> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can
>>>>> use it
>>>>> to
>>>>> determine if two ranges are containing identical data.
>>>>>
>>>>> And if they are identical, we use FICLONERANGE or
>>>>> BTRFS_IOC_CLONE_RANGE
>>>>> ioctl to reflink one to another, freeing one of them.
>>>>>
>>>>> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS.
>>>>> file_operations structure now includes both clone_file_range() and
>>>>> dedupe_file_range() callbacks now.
>>>>
>>>> Yea. Understand that part. So going by description of "dedupe" and
>>>> "reflink", seems through these commands, one can do deduplication part
>>>> and NOT duplicate find part.
>>>
>>>
>>> Yes, one don't need to call "dedupe" ioctl if they already knows some
>>> data
>>> is identical and can go reflink straightforward.
>>>
>>>> That's still out of xfs_io command scope.
>>>
>>>
>>> Not sure what the scope here you mean, sorry for that.
>>>
>> By "scope", I meant duplicate find part but that contradicts statement
>> you just written below:
>>> Since xfs_io can be used to find duplication,
>>
>> Since "dedupe" command input only a "source file" and src and
>> dst_offset within that, so it can deduplicate the content within a
>> file where actual FS dedupe IOCTL can first ensure if two extents are
>> identical and if yes, then deduplicate them.
>
> By "deduplicate", if you mean "removing duplication" then xfs_io
> "dedupe" command itself doesn't do that.
>
> The old btrfs ioctl describe this better, FILE_EXTENT_SAME.
> "dedupe" command itself is only verifying if they have the same content.
>
> So to make it clear, "dedupe" command and ioctl only do the
> *verification* work.
Sorry, I just checked the code and tried the ioctl.
If they are the same, "dedupe" will do the "reflink" part as well.
The code also shows that:
---
    /* pass original length for comparison so we stay within i_size */
    ret = btrfs_cmp_data(olen, &cmp);
    if (ret == 0)
        ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
---
So the "dedupe" ioctl itself can do de-duplication.
And my previous answer is just totally wrong.
Sorry for that,
Qu
>
> "Reflink" will really remove the duplication (or even non-duplicated
> data if you really want).
>
>
> But please be careful, "reflink" is much like copy, so it can be
> executed on file ranges with different contents.
> In that case, reflink can free some space, but it also modifies the
> content.
>
> So for full de-duplication, one must go through the full *verify* then
> *reflink* circle.
> Although "dedupe"(FILE_EXTENT_SAME) ioctl provides one verification
> method, it's not the only solution.
>
> But anyway, "dedupe" and "reflink" command provided by xfs_io does
> provide every pieces to do de-duplication, so the wiki is still correct
> IMHO.
>
> Thanks,
> Qu
>
>>
>> Is that correct?
>>
>> Thanks
>> Shally
>>
>>> and can remove duplication, I
>>> don't find anything strange in that wiki page.
>>> (Especially considering how popular the tool is, you can't find any more
>>> handy tool than xfs_io)
>>>
>>> Thanks,
>>> Qu
>>>
>>>
>>>> Is that understanding correct?
>>>> Thanks
>>>> Shally
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>>
>>>>>>
>>>>>>
>>>>>> So, can anyone help here and point me what am I missing here.
>>>>>>
>>>>>> Thanks
>>>>>> Shally
* Re: BTRFS Deduplication
2017-09-11 9:25 ` Qu Wenruo
@ 2017-09-11 9:27 ` shally verma
0 siblings, 0 replies; 10+ messages in thread
From: shally verma @ 2017-09-11 9:27 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs, Verma, Shally
On Mon, Sep 11, 2017 at 2:55 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2017-09-11 17:14, Qu Wenruo wrote:
>>
>>
>>
>> On 2017-09-11 16:57, shally verma wrote:
>>>
>>> On Mon, Sep 11, 2017 at 1:42 PM, Qu Wenruo <quwenruo.btrfs@gmx.com>
>>> wrote:
>>>>
>>>>
>>>>
>>>> On 2017-09-11 15:54, shally verma wrote:
>>>>>
>>>>>
>>>>> On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2017-09-11 14:05, shally verma wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I was going through BTRFS Deduplication page
>>>>>>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
>>>>>>>
>>>>>>> "As such, xfs_io, is able to perform deduplication on a BTRFS file
>>>>>>> system," ..
>>>>>>>
>>>>>>> following this, I followed on to xfs_io link
>>>>>>> https://linux.die.net/man/8/xfs_io
>>>>>>>
>>>>>>> As I understand, these are set of commands allow us to do different
>>>>>>> operations on "xfs" filesystem.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nope, it's just a tool triggering different read/write or ioctls.
>>>>>> In fact most of its command is fs independent.
>>>>>> Only a limited number of operations are only supported by XFS.
>>>>>>
>>>>>> It's just due to historical reasons it's still named as xfs_io.
>>>>>>
>>>>>> I won't be surprised if one day it's split as an independent tool.
>>>>>>
>>>>>>> and command set mentioned here, couldn't see which is command to
>>>>>>> invoke dedupe task.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> "dedupe" and "reflink" command.
>>>>>
>>>>>
>>>>> Oh. That means page link referred on BTRFS Wiki page is not updated
>>>>> with this. I googled another page that has reference of these two
>>>>> command in xfs_io here
>>>>> https://www.systutorials.com/docs/linux/man/8-xfs_io/
>>>>> May be Wiki need an update here.
>>>>
>>>>
>>>>
>>>> If XFS has a regularly updated online man page, we can just use that.
>>>> (But unfortunately, not every fs user tools use asciidoc like btrfs,
>>>> which
>>>> can generate both man page and html).
>>>>
>>>>>
>>>>>>
>>>>>>> and how this works with BTRFS.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use
>>>>>> it
>>>>>> to
>>>>>> determine if two ranges are containing identical data.
>>>>>>
>>>>>> And if they are identical, we use FICLONERANGE or
>>>>>> BTRFS_IOC_CLONE_RANGE
>>>>>> ioctl to reflink one to another, freeing one of them.
>>>>>>
>>>>>> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS.
>>>>>> file_operations structure now includes both clone_file_range() and
>>>>>> dedupe_file_range() callbacks now.
>>>>>
>>>>>
>>>>> Yea. Understand that part. So going by description of "dedupe" and
>>>>> "reflink", seems through these commands, one can do deduplication part
>>>>> and NOT duplicate find part.
>>>>
>>>>
>>>>
>>>> Yes, one don't need to call "dedupe" ioctl if they already knows some
>>>> data
>>>> is identical and can go reflink straightforward.
>>>>
>>>>> That's still out of xfs_io command scope.
>>>>
>>>>
>>>>
>>>> Not sure what the scope here you mean, sorry for that.
>>>>
>>> By "scope", I meant duplicate find part but that contradicts statement
>>> you just written below:
>>>>
>>>> Since xfs_io can be used to find duplication,
>>>
>>>
>>> Since "dedupe" command input only a "source file" and src and
>>> dst_offset within that, so it can deduplicate the content within a
>>> file where actual FS dedupe IOCTL can first ensure if two extents are
>>> identical and if yes, then deduplicate them.
>>
>>
>> By "deduplicate", if you mean "removing duplication" then xfs_io "dedupe"
>> command itself doesn't do that.
>>
>> The old btrfs ioctl describe this better, FILE_EXTENT_SAME.
>> "dedupe" command itself is only verifying if they have the same content.
>>
>> So to make it clear, "dedupe" command and ioctl only do the *verification*
>> work.
>
>
> Sorry, I just checked the code and tried the ioctl.
>
> If they are the same, "dedupe" will do "reflink" part also.
>
> Code also shows that:
> ---
> /* pass original length for comparison so we stay within i_size */
> ret = btrfs_cmp_data(olen, &cmp);
> if (ret == 0)
> ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
> ---
>
> So "dedupe" ioctl itself can do de-duplication.
> And my previous answer is just totally wrong.
>
Yea. That corroborates my findings too. Thanks for confirming that :).
Thanks
Shally
> Sorry for that,
> Qu
>
>
>>
>> "Reflink" will really remove the duplication (or even non-duplicated data
>> if you really want).
>>
>>
>> But please be careful, "reflink" is much like copy, so it can be executed
>> on file ranges with different contents.
>> In that case, reflink can free some space, but it also modifies the
>> content.
>>
>> So for full de-duplication, one must go through the full *verify* then
>> *reflink* circle.
>> Although "dedupe"(FILE_EXTENT_SAME) ioctl provides one verification
>> method, it's not the only solution.
>>
>> But anyway, "dedupe" and "reflink" command provided by xfs_io does provide
>> every pieces to do de-duplication, so the wiki is still correct IMHO.
>>
>> Thanks,
>> Qu
>>
>>>
>>> Is that correct?
>>>
>>> Thanks
>>> Shally
>>>
>>>> and can remove duplication, I
>>>> don't find anything strange in that wiki page.
>>>> (Especially considering how popular the tool is, you can't find any more
>>>> handy tool than xfs_io)
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>
>>>>> Is that understanding correct?
>>>>> Thanks
>>>>> Shally
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> So, can anyone help here and point me what am I missing here.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Shally
Thread overview: 10+ messages
2011-05-12 5:52 BTRFS deduplication Swâmi Petaramesh
2011-05-12 15:41 ` Josef Bacik
-- strict thread matches above, loose matches on Subject: below --
2017-09-11 6:05 BTRFS Deduplication shally verma
2017-09-11 6:46 ` Qu Wenruo
2017-09-11 7:54 ` shally verma
2017-09-11 8:12 ` Qu Wenruo
2017-09-11 8:57 ` shally verma
2017-09-11 9:14 ` Qu Wenruo
2017-09-11 9:25 ` Qu Wenruo
2017-09-11 9:27 ` shally verma