* BTRFS deduplication
@ 2011-05-12 5:52 Swâmi Petaramesh
  2011-05-12 15:41 ` Josef Bacik
  0 siblings, 1 reply; 10+ messages in thread
From: Swâmi Petaramesh @ 2011-05-12 5:52 UTC (permalink / raw)
  To: Linux BTRFS

Hi again list,

I've seen in a message dating back to January that offline deduplication
has been implemented in BTRFS, but I can't find it in my btrfs-tools
0.19+20100601-3ubuntu2.

Has it reached release, or not yet? How could I give it a try?

I've seen a discussion about whether deduplication should be done
offline or online. My use case is to back up a number of laptops, all
having about the same software and many files in common, to a single
backup server using rsync. I would be very much interested in online
deduplication, because I don't have the "n" times the storage space that
offline dedup might temporarily need, and because performance isn't
crucial for this application, as backups can be done overnight...

Thanks in advance :-)

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: BTRFS deduplication
  2011-05-12 5:52 BTRFS deduplication Swâmi Petaramesh
@ 2011-05-12 15:41 ` Josef Bacik
  0 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2011-05-12 15:41 UTC (permalink / raw)
  To: Swâmi Petaramesh; +Cc: Linux BTRFS

On Thu, May 12, 2011 at 07:52:20AM +0200, Swâmi Petaramesh wrote:
> Hi again list,
>
> I've seen in a message dating back to january that offline deduplication
> has been implemented in BTRFS, but I can't find it in my btrfs-tools
> 0.19+20100601-3ubuntu2
>
> Has it reached release, or not yet ? How could I give it a try ?
>
> I've seen a discussion about whether deduplication should be made
> offline or online ; my usage case it to backup a number of laptops
> having all about the same software and many files in common, to a single
> backup server using rsync, I would be very much interested in online
> deduplication - because I don't have "n" times the storage space, that
> offline dedup might temporarily need, and because performance isn't
> crucial for this application, as backups can be done overnight...
>
> Thanks in advance :-)
>

So the btrfs-progs patch only exists on the mailing list and the kernel patch
is sitting in my git tree. This was more of a weekend project and less of a
serious attempt at an actual solution. It could be cleaned up and actually
used, but I'm not at all interested in doing that :). Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

^ permalink raw reply [flat|nested] 10+ messages in thread
* BTRFS Deduplication
@ 2017-09-11 6:05 shally verma
  2017-09-11 6:46 ` Qu Wenruo
  0 siblings, 1 reply; 10+ messages in thread
From: shally verma @ 2017-09-11 6:05 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Verma, Shally

I was going through the BTRFS Deduplication page
(https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read

"As such, xfs_io, is able to perform deduplication on a BTRFS file system," ..

Following this, I went on to the xfs_io link https://linux.die.net/man/8/xfs_io

As I understand it, this is a set of commands that allows us to do different
operations on an "xfs" filesystem. In the command set listed there, I couldn't
see which command invokes the dedupe task, or how this works with BTRFS.

So, can anyone help here and point out what I am missing?

Thanks
Shally

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: BTRFS Deduplication
  2017-09-11 6:05 BTRFS Deduplication shally verma
@ 2017-09-11 6:46 ` Qu Wenruo
  2017-09-11 7:54 ` shally verma
  0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2017-09-11 6:46 UTC (permalink / raw)
  To: shally verma, linux-btrfs; +Cc: Verma, Shally

On 2017-09-11 14:05, shally verma wrote:
> I was going through BTRFS Deduplication page
> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
>
> "As such, xfs_io, is able to perform deduplication on a BTRFS file system," ..
>
> following this, I followed on to xfs_io link https://linux.die.net/man/8/xfs_io
>
> As I understand, these are set of commands allow us to do different
> operations on "xfs" filesystem.

Nope, it's just a tool for triggering different read/write operations or ioctls.
In fact, most of its commands are fs independent.
Only a limited number of operations are supported by XFS alone.

It's only for historical reasons that it's still named xfs_io.

I won't be surprised if one day it's split off as an independent tool.

> and command set mentioned here, couldn't see which is command to
> invoke dedupe task.

The "dedupe" and "reflink" commands.

> and how this works with BTRFS.

On a filesystem that supports the FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME
ioctl, it can be used to determine whether two ranges contain identical data.

And if they are identical, we use the FICLONERANGE or BTRFS_IOC_CLONE_RANGE
ioctl to reflink one to the other, freeing one of them.

BTW nowadays, these dedupe and reflink ioctls are generic in the VFS.
The file_operations structure now includes both clone_file_range() and
dedupe_file_range() callbacks.

Thanks,
Qu

>
> So, can anyone help here and point me what am I missing here.
>
> Thanks
> Shally

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: BTRFS Deduplication
  2017-09-11 6:46 ` Qu Wenruo
@ 2017-09-11 7:54 ` shally verma
  2017-09-11 8:12 ` Qu Wenruo
  0 siblings, 1 reply; 10+ messages in thread
From: shally verma @ 2017-09-11 7:54 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Verma, Shally

On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2017-09-11 14:05, shally verma wrote:
>>
>> I was going through BTRFS Deduplication page
>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
>>
>> "As such, xfs_io, is able to perform deduplication on a BTRFS file
>> system," ..
>>
>> following this, I followed on to xfs_io link
>> https://linux.die.net/man/8/xfs_io
>>
>> As I understand, these are set of commands allow us to do different
>> operations on "xfs" filesystem.
>
>
> Nope, it's just a tool triggering different read/write or ioctls.
> In fact most of its command is fs independent.
> Only a limited number of operations are only supported by XFS.
>
> It's just due to historical reasons it's still named as xfs_io.
>
> I won't be surprised if one day it's split as an independent tool.
>
>> and command set mentioned here, couldn't see which is command to
>> invoke dedupe task.
>
>
> "dedupe" and "reflink" command.

Oh. That means the page linked from the BTRFS wiki is not up to date
with these commands. I googled another page that does document these two
xfs_io commands:
https://www.systutorials.com/docs/linux/man/8-xfs_io/
Maybe the wiki needs an update here.

>
>> and how this works with BTRFS.
>
>
> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use it to
> determine if two ranges are containing identical data.
>
> And if they are identical, we use FICLONERANGE or BTRFS_IOC_CLONE_RANGE
> ioctl to reflink one to another, freeing one of them.
>
> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS.
> file_operations structure now includes both clone_file_range() and
> dedupe_file_range() callbacks now.

Yes, I understand that part. So going by the descriptions of "dedupe" and
"reflink", it seems that these commands cover the deduplication part
and NOT the duplicate-finding part. That is still outside the scope of the
xfs_io commands.

Is that understanding correct?

Thanks
Shally

>
> Thanks,
> Qu
>>
>>
>> So, can anyone help here and point me what am I missing here.
>>
>> Thanks
>> Shally

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: BTRFS Deduplication 2017-09-11 7:54 ` shally verma @ 2017-09-11 8:12 ` Qu Wenruo 2017-09-11 8:57 ` shally verma 0 siblings, 1 reply; 10+ messages in thread From: Qu Wenruo @ 2017-09-11 8:12 UTC (permalink / raw) To: shally verma; +Cc: linux-btrfs, Verma, Shally On 2017年09月11日 15:54, shally verma wrote: > On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: >> >> >> On 2017年09月11日 14:05, shally verma wrote: >>> >>> I was going through BTRFS Deduplication page >>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read >>> >>> "As such, xfs_io, is able to perform deduplication on a BTRFS file >>> system," .. >>> >>> following this, I followed on to xfs_io link >>> https://linux.die.net/man/8/xfs_io >>> >>> As I understand, these are set of commands allow us to do different >>> operations on "xfs" filesystem. >> >> >> Nope, it's just a tool triggering different read/write or ioctls. >> In fact most of its command is fs independent. >> Only a limited number of operations are only supported by XFS. >> >> It's just due to historical reasons it's still named as xfs_io. >> >> I won't be surprised if one day it's split as an independent tool. >> >>> and command set mentioned here, couldn't see which is command to >>> invoke dedupe task. >> >> >> "dedupe" and "reflink" command. > Oh. That means page link referred on BTRFS Wiki page is not updated > with this. I googled another page that has reference of these two > command in xfs_io here > https://www.systutorials.com/docs/linux/man/8-xfs_io/ > May be Wiki need an update here. If XFS has a regularly updated online man page, we can just use that. (But unfortunately, not every fs user tools use asciidoc like btrfs, which can generate both man page and html). > >> >>> and how this works with BTRFS. >> >> >> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use it to >> determine if two ranges are containing identical data. 
>> >> And if they are identical, we use FICLONERANGE or BTRFS_IOC_CLONE_RANGE >> ioctl to reflink one to another, freeing one of them. >> >> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS. >> file_operations structure now includes both clone_file_range() and >> dedupe_file_range() callbacks now. > Yea. Understand that part. So going by description of "dedupe" and > "reflink", seems through these commands, one can do deduplication part > and NOT duplicate find part. Yes, one don't need to call "dedupe" ioctl if they already knows some data is identical and can go reflink straightforward. > That's still out of xfs_io command scope. Not sure what the scope here you mean, sorry for that. Since xfs_io can be used to find duplication, and can remove duplication, I don't find anything strange in that wiki page. (Especially considering how popular the tool is, you can't find any more handy tool than xfs_io) Thanks, Qu > Is that understanding correct? > Thanks > Shally >> >> Thanks, >> Qu >>> >>> >>> So, can anyone help here and point me what am I missing here. >>> >>> Thanks >>> Shally >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: BTRFS Deduplication 2017-09-11 8:12 ` Qu Wenruo @ 2017-09-11 8:57 ` shally verma 2017-09-11 9:14 ` Qu Wenruo 0 siblings, 1 reply; 10+ messages in thread From: shally verma @ 2017-09-11 8:57 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs, Verma, Shally On Mon, Sep 11, 2017 at 1:42 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > On 2017年09月11日 15:54, shally verma wrote: >> >> On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> >> wrote: >>> >>> >>> >>> On 2017年09月11日 14:05, shally verma wrote: >>>> >>>> >>>> I was going through BTRFS Deduplication page >>>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read >>>> >>>> "As such, xfs_io, is able to perform deduplication on a BTRFS file >>>> system," .. >>>> >>>> following this, I followed on to xfs_io link >>>> https://linux.die.net/man/8/xfs_io >>>> >>>> As I understand, these are set of commands allow us to do different >>>> operations on "xfs" filesystem. >>> >>> >>> >>> Nope, it's just a tool triggering different read/write or ioctls. >>> In fact most of its command is fs independent. >>> Only a limited number of operations are only supported by XFS. >>> >>> It's just due to historical reasons it's still named as xfs_io. >>> >>> I won't be surprised if one day it's split as an independent tool. >>> >>>> and command set mentioned here, couldn't see which is command to >>>> invoke dedupe task. >>> >>> >>> >>> "dedupe" and "reflink" command. >> >> Oh. That means page link referred on BTRFS Wiki page is not updated >> with this. I googled another page that has reference of these two >> command in xfs_io here >> https://www.systutorials.com/docs/linux/man/8-xfs_io/ >> May be Wiki need an update here. > > > If XFS has a regularly updated online man page, we can just use that. > (But unfortunately, not every fs user tools use asciidoc like btrfs, which > can generate both man page and html). > >> >>> >>>> and how this works with BTRFS. 
>>> >>> >>> >>> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use it >>> to >>> determine if two ranges are containing identical data. >>> >>> And if they are identical, we use FICLONERANGE or BTRFS_IOC_CLONE_RANGE >>> ioctl to reflink one to another, freeing one of them. >>> >>> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS. >>> file_operations structure now includes both clone_file_range() and >>> dedupe_file_range() callbacks now. >> >> Yea. Understand that part. So going by description of "dedupe" and >> "reflink", seems through these commands, one can do deduplication part >> and NOT duplicate find part. > > > Yes, one don't need to call "dedupe" ioctl if they already knows some data > is identical and can go reflink straightforward. > >> That's still out of xfs_io command scope. > > > Not sure what the scope here you mean, sorry for that. > By "scope", I meant duplicate find part but that contradicts statement you just written below: > Since xfs_io can be used to find duplication, Since "dedupe" command input only a "source file" and src and dst_offset within that, so it can deduplicate the content within a file where actual FS dedupe IOCTL can first ensure if two extents are identical and if yes, then deduplicate them. Is that correct? Thanks Shally and can remove duplication, I > don't find anything strange in that wiki page. > (Especially considering how popular the tool is, you can't find any more > handy tool than xfs_io) > > Thanks, > Qu > > >> Is that understanding correct? >> Thanks >> Shally >>> >>> >>> Thanks, >>> Qu >>>> >>>> >>>> >>>> So, can anyone help here and point me what am I missing here. 
* Re: BTRFS Deduplication 2017-09-11 8:57 ` shally verma @ 2017-09-11 9:14 ` Qu Wenruo 2017-09-11 9:25 ` Qu Wenruo 0 siblings, 1 reply; 10+ messages in thread From: Qu Wenruo @ 2017-09-11 9:14 UTC (permalink / raw) To: shally verma; +Cc: linux-btrfs, Verma, Shally On 2017年09月11日 16:57, shally verma wrote: > On Mon, Sep 11, 2017 at 1:42 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: >> >> >> On 2017年09月11日 15:54, shally verma wrote: >>> >>> On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> >>> wrote: >>>> >>>> >>>> >>>> On 2017年09月11日 14:05, shally verma wrote: >>>>> >>>>> >>>>> I was going through BTRFS Deduplication page >>>>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read >>>>> >>>>> "As such, xfs_io, is able to perform deduplication on a BTRFS file >>>>> system," .. >>>>> >>>>> following this, I followed on to xfs_io link >>>>> https://linux.die.net/man/8/xfs_io >>>>> >>>>> As I understand, these are set of commands allow us to do different >>>>> operations on "xfs" filesystem. >>>> >>>> >>>> >>>> Nope, it's just a tool triggering different read/write or ioctls. >>>> In fact most of its command is fs independent. >>>> Only a limited number of operations are only supported by XFS. >>>> >>>> It's just due to historical reasons it's still named as xfs_io. >>>> >>>> I won't be surprised if one day it's split as an independent tool. >>>> >>>>> and command set mentioned here, couldn't see which is command to >>>>> invoke dedupe task. >>>> >>>> >>>> >>>> "dedupe" and "reflink" command. >>> >>> Oh. That means page link referred on BTRFS Wiki page is not updated >>> with this. I googled another page that has reference of these two >>> command in xfs_io here >>> https://www.systutorials.com/docs/linux/man/8-xfs_io/ >>> May be Wiki need an update here. >> >> >> If XFS has a regularly updated online man page, we can just use that. 
>> (But unfortunately, not every fs user tools use asciidoc like btrfs, which >> can generate both man page and html). >> >>> >>>> >>>>> and how this works with BTRFS. >>>> >>>> >>>> >>>> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use it >>>> to >>>> determine if two ranges are containing identical data. >>>> >>>> And if they are identical, we use FICLONERANGE or BTRFS_IOC_CLONE_RANGE >>>> ioctl to reflink one to another, freeing one of them. >>>> >>>> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS. >>>> file_operations structure now includes both clone_file_range() and >>>> dedupe_file_range() callbacks now. >>> >>> Yea. Understand that part. So going by description of "dedupe" and >>> "reflink", seems through these commands, one can do deduplication part >>> and NOT duplicate find part. >> >> >> Yes, one don't need to call "dedupe" ioctl if they already knows some data >> is identical and can go reflink straightforward. >> >>> That's still out of xfs_io command scope. >> >> >> Not sure what the scope here you mean, sorry for that. >> > By "scope", I meant duplicate find part but that contradicts statement > you just written below: >> Since xfs_io can be used to find duplication, > > Since "dedupe" command input only a "source file" and src and > dst_offset within that, so it can deduplicate the content within a > file where actual FS dedupe IOCTL can first ensure if two extents are > identical and if yes, then deduplicate them. By "deduplicate", if you mean "removing duplication" then xfs_io "dedupe" command itself doesn't do that. The old btrfs ioctl describe this better, FILE_EXTENT_SAME. "dedupe" command itself is only verifying if they have the same content. So to make it clear, "dedupe" command and ioctl only do the *verification* work. "Reflink" will really remove the duplication (or even non-duplicated data if you really want). 
But please be careful, "reflink" is much like copy, so it can be executed on file ranges with different contents. In that case, reflink can free some space, but it also modifies the content. So for full de-duplication, one must go through the full *verify* then *reflink* circle. Although "dedupe"(FILE_EXTENT_SAME) ioctl provides one verification method, it's not the only solution. But anyway, "dedupe" and "reflink" command provided by xfs_io does provide every pieces to do de-duplication, so the wiki is still correct IMHO. Thanks, Qu > > Is that correct? > > Thanks > Shally > > and can remove duplication, I >> don't find anything strange in that wiki page. >> (Especially considering how popular the tool is, you can't find any more >> handy tool than xfs_io) >> >> Thanks, >> Qu >> >> >>> Is that understanding correct? >>> Thanks >>> Shally >>>> >>>> >>>> Thanks, >>>> Qu >>>>> >>>>> >>>>> >>>>> So, can anyone help here and point me what am I missing here. >>>>> >>>>> Thanks >>>>> Shally >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" >>>>> in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> >>>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >> ^ permalink raw reply [flat|nested] 10+ messages in thread
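The caution in the message above, that "reflink" behaves like a copy and does not compare contents first, can be illustrated with a minimal C sketch. It assumes a Linux system whose <linux/fs.h> provides FICLONERANGE and struct file_clone_range; reflink_range() is a hypothetical helper name used only for illustration, not something taken from this thread or from xfs_io.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FICLONERANGE, struct file_clone_range */

/*
 * Hypothetical helper: make `len` bytes at dst_off in dst_fd share
 * storage with src_off in src_fd. No comparison is done: whatever was
 * at the destination range is replaced, as with a copy.
 * Returns 0 on success, -1 on error with errno set by ioctl().
 * Offsets and length generally need to be block aligned; the exact
 * rules depend on the filesystem.
 */
int reflink_range(int src_fd, uint64_t src_off,
                  int dst_fd, uint64_t dst_off, uint64_t len)
{
    struct file_clone_range arg = {
        .src_fd = src_fd,
        .src_offset = src_off,
        .src_length = len,
        .dest_offset = dst_off,
    };

    /* The ioctl is issued on the destination file descriptor. */
    return ioctl(dst_fd, FICLONERANGE, &arg);
}
```

Because nothing is verified, a caller that uses this for de-duplication has to be sure beforehand that the two ranges really hold the same data.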
* Re: BTRFS Deduplication 2017-09-11 9:14 ` Qu Wenruo @ 2017-09-11 9:25 ` Qu Wenruo 2017-09-11 9:27 ` shally verma 0 siblings, 1 reply; 10+ messages in thread From: Qu Wenruo @ 2017-09-11 9:25 UTC (permalink / raw) To: shally verma; +Cc: linux-btrfs, Verma, Shally On 2017年09月11日 17:14, Qu Wenruo wrote: > > > On 2017年09月11日 16:57, shally verma wrote: >> On Mon, Sep 11, 2017 at 1:42 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> >> wrote: >>> >>> >>> On 2017年09月11日 15:54, shally verma wrote: >>>> >>>> On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> >>>> wrote: >>>>> >>>>> >>>>> >>>>> On 2017年09月11日 14:05, shally verma wrote: >>>>>> >>>>>> >>>>>> I was going through BTRFS Deduplication page >>>>>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read >>>>>> >>>>>> "As such, xfs_io, is able to perform deduplication on a BTRFS file >>>>>> system," .. >>>>>> >>>>>> following this, I followed on to xfs_io link >>>>>> https://linux.die.net/man/8/xfs_io >>>>>> >>>>>> As I understand, these are set of commands allow us to do different >>>>>> operations on "xfs" filesystem. >>>>> >>>>> >>>>> >>>>> Nope, it's just a tool triggering different read/write or ioctls. >>>>> In fact most of its command is fs independent. >>>>> Only a limited number of operations are only supported by XFS. >>>>> >>>>> It's just due to historical reasons it's still named as xfs_io. >>>>> >>>>> I won't be surprised if one day it's split as an independent tool. >>>>> >>>>>> and command set mentioned here, couldn't see which is command to >>>>>> invoke dedupe task. >>>>> >>>>> >>>>> >>>>> "dedupe" and "reflink" command. >>>> >>>> Oh. That means page link referred on BTRFS Wiki page is not updated >>>> with this. I googled another page that has reference of these two >>>> command in xfs_io here >>>> https://www.systutorials.com/docs/linux/man/8-xfs_io/ >>>> May be Wiki need an update here. >>> >>> >>> If XFS has a regularly updated online man page, we can just use that. 
>>> (But unfortunately, not every fs user tools use asciidoc like btrfs, >>> which >>> can generate both man page and html). >>> >>>> >>>>> >>>>>> and how this works with BTRFS. >>>>> >>>>> >>>>> >>>>> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can >>>>> use it >>>>> to >>>>> determine if two ranges are containing identical data. >>>>> >>>>> And if they are identical, we use FICLONERANGE or >>>>> BTRFS_IOC_CLONE_RANGE >>>>> ioctl to reflink one to another, freeing one of them. >>>>> >>>>> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS. >>>>> file_operations structure now includes both clone_file_range() and >>>>> dedupe_file_range() callbacks now. >>>> >>>> Yea. Understand that part. So going by description of "dedupe" and >>>> "reflink", seems through these commands, one can do deduplication part >>>> and NOT duplicate find part. >>> >>> >>> Yes, one don't need to call "dedupe" ioctl if they already knows some >>> data >>> is identical and can go reflink straightforward. >>> >>>> That's still out of xfs_io command scope. >>> >>> >>> Not sure what the scope here you mean, sorry for that. >>> >> By "scope", I meant duplicate find part but that contradicts statement >> you just written below: >>> Since xfs_io can be used to find duplication, >> >> Since "dedupe" command input only a "source file" and src and >> dst_offset within that, so it can deduplicate the content within a >> file where actual FS dedupe IOCTL can first ensure if two extents are >> identical and if yes, then deduplicate them. > > By "deduplicate", if you mean "removing duplication" then xfs_io > "dedupe" command itself doesn't do that. > > The old btrfs ioctl describe this better, FILE_EXTENT_SAME. > "dedupe" command itself is only verifying if they have the same content. > > So to make it clear, "dedupe" command and ioctl only do the > *verification* work. Sorry, I just checked the code and tried the ioctl. 
If they are the same, "dedupe" will do "reflink" part also.

Code also shows that:
---
	/* pass original length for comparison so we stay within i_size */
	ret = btrfs_cmp_data(olen, &cmp);
	if (ret == 0)
		ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
---

So "dedupe" ioctl itself can do de-duplication.
And my previous answer is just totally wrong.

Sorry for that,
Qu

>
> "Reflink" will really remove the duplication (or even non-duplicated
> data if you really want).
>
>
> But please be careful, "reflink" is much like copy, so it can be
> executed on file ranges with different contents.
> In that case, reflink can free some space, but it also modifies the
> content.
>
> So for full de-duplication, one must go through the full *verify* then
> *reflink* circle.
> Although "dedupe"(FILE_EXTENT_SAME) ioctl provides one verification
> method, it's not the only solution.
>
> But anyway, "dedupe" and "reflink" command provided by xfs_io does
> provide every pieces to do de-duplication, so the wiki is still correct
> IMHO.
>
> Thanks,
> Qu
>
>>
>> Is that correct?
>>
>> Thanks
>> Shally
>>
>> and can remove duplication, I
>>> don't find anything strange in that wiki page.
>>> (Especially considering how popular the tool is, you can't find any more
>>> handy tool than xfs_io)
>>>
>>> Thanks,
>>> Qu
>>>
>>>
>>>> Is that understanding correct?
>>>> Thanks
>>>> Shally
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>>
>>>>>>
>>>>>>
>>>>>> So, can anyone help here and point me what am I missing here.
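As a rough illustration of the corrected behaviour described in the message above (the dedupe ioctl compares the two ranges and shares them only if they match), here is a minimal C sketch of a single FIDEDUPERANGE call from user space. It assumes a Linux system whose <linux/fs.h> provides FIDEDUPERANGE, struct file_dedupe_range and struct file_dedupe_range_info; dedupe_one_range() and its return convention are hypothetical and chosen only for this example.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FIDEDUPERANGE, struct file_dedupe_range */

/*
 * Hypothetical helper: ask the kernel to compare `len` bytes at src_off
 * in src_fd with dst_off in dst_fd and, only if they are identical,
 * make the destination share the source's storage.
 * Returns 0 if the ranges matched (bytes shared stored in *bytes_deduped),
 *         1 if the contents differ (nothing is changed),
 *        -1 on error with errno set.
 */
int dedupe_one_range(int src_fd, uint64_t src_off,
                     int dst_fd, uint64_t dst_off,
                     uint64_t len, uint64_t *bytes_deduped)
{
    struct file_dedupe_range *arg;
    int ret, saved_errno;

    /* Header plus room for exactly one destination entry. */
    arg = calloc(1, sizeof(*arg) + sizeof(struct file_dedupe_range_info));
    if (arg == NULL)
        return -1;

    arg->src_offset = src_off;
    arg->src_length = len;
    arg->dest_count = 1;
    arg->info[0].dest_fd = dst_fd;
    arg->info[0].dest_offset = dst_off;

    /* The ioctl is issued on the source file descriptor. */
    if (ioctl(src_fd, FIDEDUPERANGE, arg) < 0) {
        ret = -1;
    } else if (arg->info[0].status < 0) {
        /* Per-destination failure is reported as a negative errno. */
        errno = -arg->info[0].status;
        ret = -1;
    } else if (arg->info[0].status == FILE_DEDUPE_RANGE_DIFFERS) {
        ret = 1;
    } else {
        *bytes_deduped = arg->info[0].bytes_deduped;
        ret = 0;
    }

    saved_errno = errno;    /* keep errno intact across free() */
    free(arg);
    errno = saved_errno;
    return ret;
}
```

A caller would open both files (the destination typically needs to be open for writing) and pass block-aligned offsets. Finding candidate duplicate ranges in the first place is still up to the caller, which matches the question raised earlier in the thread.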
* Re: BTRFS Deduplication 2017-09-11 9:25 ` Qu Wenruo @ 2017-09-11 9:27 ` shally verma 0 siblings, 0 replies; 10+ messages in thread From: shally verma @ 2017-09-11 9:27 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs, Verma, Shally On Mon, Sep 11, 2017 at 2:55 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > On 2017年09月11日 17:14, Qu Wenruo wrote: >> >> >> >> On 2017年09月11日 16:57, shally verma wrote: >>> >>> On Mon, Sep 11, 2017 at 1:42 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> >>> wrote: >>>> >>>> >>>> >>>> On 2017年09月11日 15:54, shally verma wrote: >>>>> >>>>> >>>>> On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> >>>>> wrote: >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On 2017年09月11日 14:05, shally verma wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> I was going through BTRFS Deduplication page >>>>>>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read >>>>>>> >>>>>>> "As such, xfs_io, is able to perform deduplication on a BTRFS file >>>>>>> system," .. >>>>>>> >>>>>>> following this, I followed on to xfs_io link >>>>>>> https://linux.die.net/man/8/xfs_io >>>>>>> >>>>>>> As I understand, these are set of commands allow us to do different >>>>>>> operations on "xfs" filesystem. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Nope, it's just a tool triggering different read/write or ioctls. >>>>>> In fact most of its command is fs independent. >>>>>> Only a limited number of operations are only supported by XFS. >>>>>> >>>>>> It's just due to historical reasons it's still named as xfs_io. >>>>>> >>>>>> I won't be surprised if one day it's split as an independent tool. >>>>>> >>>>>>> and command set mentioned here, couldn't see which is command to >>>>>>> invoke dedupe task. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> "dedupe" and "reflink" command. >>>>> >>>>> >>>>> Oh. That means page link referred on BTRFS Wiki page is not updated >>>>> with this. 
I googled another page that has reference of these two >>>>> command in xfs_io here >>>>> https://www.systutorials.com/docs/linux/man/8-xfs_io/ >>>>> May be Wiki need an update here. >>>> >>>> >>>> >>>> If XFS has a regularly updated online man page, we can just use that. >>>> (But unfortunately, not every fs user tools use asciidoc like btrfs, >>>> which >>>> can generate both man page and html). >>>> >>>>> >>>>>> >>>>>>> and how this works with BTRFS. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use >>>>>> it >>>>>> to >>>>>> determine if two ranges are containing identical data. >>>>>> >>>>>> And if they are identical, we use FICLONERANGE or >>>>>> BTRFS_IOC_CLONE_RANGE >>>>>> ioctl to reflink one to another, freeing one of them. >>>>>> >>>>>> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS. >>>>>> file_operations structure now includes both clone_file_range() and >>>>>> dedupe_file_range() callbacks now. >>>>> >>>>> >>>>> Yea. Understand that part. So going by description of "dedupe" and >>>>> "reflink", seems through these commands, one can do deduplication part >>>>> and NOT duplicate find part. >>>> >>>> >>>> >>>> Yes, one don't need to call "dedupe" ioctl if they already knows some >>>> data >>>> is identical and can go reflink straightforward. >>>> >>>>> That's still out of xfs_io command scope. >>>> >>>> >>>> >>>> Not sure what the scope here you mean, sorry for that. >>>> >>> By "scope", I meant duplicate find part but that contradicts statement >>> you just written below: >>>> >>>> Since xfs_io can be used to find duplication, >>> >>> >>> Since "dedupe" command input only a "source file" and src and >>> dst_offset within that, so it can deduplicate the content within a >>> file where actual FS dedupe IOCTL can first ensure if two extents are >>> identical and if yes, then deduplicate them. 
>> >> >> By "deduplicate", if you mean "removing duplication" then xfs_io "dedupe" >> command itself doesn't do that. >> >> The old btrfs ioctl describe this better, FILE_EXTENT_SAME. >> "dedupe" command itself is only verifying if they have the same content. >> >> So to make it clear, "dedupe" command and ioctl only do the *verification* >> work. > > > Sorry, I just checked the code and tried the ioctl. > > If they are the same, "dedupe" will do "reflink" part also. > > Code also shows that: > --- > /* pass original length for comparison so we stay within i_size */ > ret = btrfs_cmp_data(olen, &cmp); > if (ret == 0) > ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1); > --- > > So "dedupe" ioctl itself can do de-duplication. > And my previous answer is just totally wrong. > Yea. That corroborate my findings too. Thanks for confirming that :). Thanks Shally > Sorry for that, > Qu > > >> >> "Reflink" will really remove the duplication (or even non-duplicated data >> if you really want). >> >> >> But please be careful, "reflink" is much like copy, so it can be executed >> on file ranges with different contents. >> In that case, reflink can free some space, but it also modifies the >> content. >> >> So for full de-duplication, one must go through the full *verify* then >> *reflink* circle. >> Although "dedupe"(FILE_EXTENT_SAME) ioctl provides one verification >> method, it's not the only solution. >> >> But anyway, "dedupe" and "reflink" command provided by xfs_io does provide >> every pieces to do de-duplication, so the wiki is still correct IMHO. >> >> Thanks, >> Qu >> >>> >>> Is that correct? >>> >>> Thanks >>> Shally >>> >>> and can remove duplication, I >>>> >>>> don't find anything strange in that wiki page. >>>> (Especially considering how popular the tool is, you can't find any more >>>> handy tool than xfs_io) >>>> >>>> Thanks, >>>> Qu >>>> >>>> >>>>> Is that understanding correct? 
end of thread, other threads:[~2017-09-11 9:28 UTC | newest]

Thread overview: 10+ messages
2011-05-12 5:52 BTRFS deduplication Swâmi Petaramesh
2011-05-12 15:41 ` Josef Bacik
-- strict thread matches above, loose matches on Subject: below --
2017-09-11 6:05 BTRFS Deduplication shally verma
2017-09-11 6:46 ` Qu Wenruo
2017-09-11 7:54   ` shally verma
2017-09-11 8:12     ` Qu Wenruo
2017-09-11 8:57       ` shally verma
2017-09-11 9:14         ` Qu Wenruo
2017-09-11 9:25           ` Qu Wenruo
2017-09-11 9:27             ` shally verma