* cross-subvolume cp --reflink
@ 2012-04-01 15:27 Norbert Scheibner
  2012-04-01 15:30 ` Konstantinos Skarlatos
  2012-04-01 15:42 ` Jérôme Poulin
  0 siblings, 2 replies; 20+ messages in thread
From: Norbert Scheibner @ 2012-04-01 15:27 UTC
To: linux-btrfs

Glück Auf!

I know it's been discussed more than once, but as a user I would really like to see the patch that allows this in the kernel.

Some users have tested this patch successfully for weeks or months on 2 or 3 kernel versions since then, true?

I'd say it is nothing other than what creating a snapshot does in the end: more than one file or tree sharing the same data on disc. Or am I wrong?

So why not?

Norbert
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: cross-subvolume cp --reflink
  2012-04-01 15:27 cross-subvolume cp --reflink Norbert Scheibner
@ 2012-04-01 15:30 ` Konstantinos Skarlatos
  2012-04-01 16:41   ` Norbert Scheibner
  2012-04-01 15:42 ` Jérôme Poulin
  1 sibling, 1 reply; 20+ messages in thread
From: Konstantinos Skarlatos @ 2012-04-01 15:30 UTC
To: Norbert Scheibner; +Cc: linux-btrfs

On Sunday, 1 April 2012 6:27:49 PM, Norbert Scheibner wrote:
> Glück Auf!
> I know it's been discussed more than once, but as a user I would really like to see the patch that allows this in the kernel.
>
> Some users have tested this patch successfully for weeks or months on 2 or 3 kernel versions since then, true?

+1 from me too. I would save enormous amounts of space with a patch like that, at least until dedupe is implemented. We could call it "poor man's dedupe".

> I'd say it is nothing other than what creating a snapshot does in the end: more than one file or tree sharing the same data on disc. Or am I wrong?
>
> So why not?
> Norbert
* Re: cross-subvolume cp --reflink
  2012-04-01 15:30 ` Konstantinos Skarlatos
@ 2012-04-01 16:41   ` Norbert Scheibner
  2012-04-01 16:45     ` Konstantinos Skarlatos
  0 siblings, 1 reply; 20+ messages in thread
From: Norbert Scheibner @ 2012-04-01 16:41 UTC
To: Konstantinos Skarlatos; +Cc: linux-btrfs

> On: Sun, 01 Apr 2012 18:30:13 +0300 Konstantinos Skarlatos wrote
> +1 from me too. I would save enormous amounts of space with a patch
> like that, at least until dedupe is implemented. We could call it "poor
> man's dedupe".

That's my point. This poor man's dedupe would solve my problems here very well. I don't need a ZFS-style dedupe. I can implement such a file-based dedupe with userland tools and would be happy.

It's there, it's tested, it doesn't break anything else. Use it!

Greetings
Norbert
* Re: cross-subvolume cp --reflink
  2012-04-01 16:41   ` Norbert Scheibner
@ 2012-04-01 16:45     ` Konstantinos Skarlatos
  2012-04-01 17:07       ` Norbert Scheibner
  0 siblings, 1 reply; 20+ messages in thread
From: Konstantinos Skarlatos @ 2012-04-01 16:45 UTC
To: Norbert Scheibner; +Cc: linux-btrfs

On Sunday, 1 April 2012 7:41:36 PM, Norbert Scheibner wrote:
>> On: Sun, 01 Apr 2012 18:30:13 +0300 Konstantinos Skarlatos wrote
>> +1 from me too. I would save enormous amounts of space with a patch
>> like that, at least until dedupe is implemented. We could call it "poor
>> man's dedupe".
>
> That's my point. This poor man's dedupe would solve my problems here very well. I don't need a ZFS-style dedupe. I can implement such a file-based dedupe with userland tools and would be happy.

Do you have any scripts that can search a btrfs filesystem for dupes and replace them with cp --reflink?

> It's there, it's tested, it doesn't break anything else. Use it!
>
> Greetings
> Norbert
* Re: cross-subvolume cp --reflink
  2012-04-01 16:45     ` Konstantinos Skarlatos
@ 2012-04-01 17:07       ` Norbert Scheibner
  2012-04-01 17:19         ` Konstantinos Skarlatos
       [not found]         ` <4F788EE2.4010105@univie.ac.at>
  0 siblings, 2 replies; 20+ messages in thread
From: Norbert Scheibner @ 2012-04-01 17:07 UTC
To: Konstantinos Skarlatos; +Cc: linux-btrfs

> On: Sun, 01 Apr 2012 19:45:13 +0300 Konstantinos Skarlatos wrote
> > That's my point. This poor man's dedupe would solve my problems here
> > very well. I don't need a ZFS-style dedupe. I can implement such a
> > file-based dedupe with userland tools and would be happy.
>
> Do you have any scripts that can search a btrfs filesystem for dupes
> and replace them with cp --reflink?

Nothing that really works or is well tested. After I learned that cross-subvolume cp --reflink is missing, I stopped developing the script any further.

I use btrfs for my backups. Once a day I rsync --delete --inplace the complete system to a subvolume, snapshot it, and delete some temp files in the snapshot.

In addition to that I wanted to shrink file duplicates.

What the script should do:
1. md5sum every file
2. If the checksums are identical, compare the files
3. If 2 or more files are really identical:
   - move one to a temp dir
   - cp --reflink the second to the position and name of the first
   - do a chown --reference, chmod --reference and touch --reference
     to copy owner, file mode bits and time from the original to the
     reflink copy, and then delete the original in the temp dir

Everything could be done with bash. A database for the md5sums is conceivable, and it could be used for other purposes in the future.
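The three steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than the bash script described in the message, and it is not anyone's tested tool: the function names and the ".dedupe-tmp" suffix are hypothetical, empty files are skipped as an extra assumption, and the reflink step simply shells out to cp --reflink=always, which only succeeds on a filesystem (and kernel) that supports the clone.

```python
import filecmp
import hashlib
import os
import shutil
import subprocess
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Step 1: checksum a file in chunks so large files are not loaded at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Steps 1 and 2: return sorted groups of byte-for-byte identical files."""
    by_sum = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            # Assumption: symlinks and empty files are not worth deduplicating.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getsize(path) > 0:
                by_sum[md5_of(path)].append(path)
    groups = []
    for paths in by_sum.values():
        remaining = sorted(paths)
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            # Equal md5sums are only a hint; confirm with a full comparison.
            same = [first] + [p for p in rest if filecmp.cmp(first, p, shallow=False)]
            if len(same) > 1:
                groups.append(same)
            remaining = [p for p in rest if p not in same]
    return sorted(groups)


def replace_with_reflink(keep, duplicate):
    """Step 3: put a reflink copy of keep where duplicate was, keeping its metadata."""
    parked = duplicate + ".dedupe-tmp"  # hypothetical temporary name
    os.rename(duplicate, parked)
    try:
        subprocess.run(["cp", "--reflink=always", "--", keep, duplicate], check=True)
        info = os.stat(parked)
        os.chown(duplicate, info.st_uid, info.st_gid)  # like chown --reference
        shutil.copystat(parked, duplicate)  # mode bits and timestamps
    except Exception:
        os.replace(parked, duplicate)  # roll back to the untouched original
        raise
    os.unlink(parked)
```

Calling find_duplicates() alone gives the dry run mentioned later in the thread: it shows how many duplicates exist before anything is written. Files that are already hard links to each other would show up as duplicates too, so a real script would probably want to compare inode numbers first.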
* Re: cross-subvolume cp --reflink
  2012-04-01 17:07       ` Norbert Scheibner
@ 2012-04-01 17:19         ` Konstantinos Skarlatos
  2012-04-01 18:11           ` Norbert Scheibner
       [not found]         ` <4F788EE2.4010105@univie.ac.at>
  1 sibling, 1 reply; 20+ messages in thread
From: Konstantinos Skarlatos @ 2012-04-01 17:19 UTC
To: Norbert Scheibner; +Cc: linux-btrfs

On Sunday, 1 April 2012 8:07:54 PM, Norbert Scheibner wrote:
>> On: Sun, 01 Apr 2012 19:45:13 +0300 Konstantinos Skarlatos wrote
>>> That's my point. This poor man's dedupe would solve my problems here
>>> very well. I don't need a ZFS-style dedupe. I can implement such a
>>> file-based dedupe with userland tools and would be happy.
>>
>> Do you have any scripts that can search a btrfs filesystem for dupes
>> and replace them with cp --reflink?
>
> Nothing that really works or is well tested. After I learned that cross-subvolume cp --reflink is missing, I stopped developing the script any further.
>
> I use btrfs for my backups. Once a day I rsync --delete --inplace the complete system to a subvolume, snapshot it, and delete some temp files in the snapshot.

In my setup I rsync --inplace many servers and workstations, 4-6 times a day, into a 12TB btrfs volume, each one into its own subvolume. After every backup a new read-only snapshot is created.

I have many cross-subvolume duplicate files (OS files, programs, many huge media files that are copied locally from the servers to the workstations, etc.), so a good "dedupe" script could save lots of space and allow me to keep snapshots for much longer.

> In addition to that I wanted to shrink file duplicates.
>
> What the script should do:
> 1. md5sum every file
> 2. If the checksums are identical, compare the files
> 3. If 2 or more files are really identical:
>    - move one to a temp dir
>    - cp --reflink the second to the position and name of the first
>    - do a chown --reference, chmod --reference and touch --reference
>      to copy owner, file mode bits and time from the original to the
>      reflink copy, and then delete the original in the temp dir
>
> Everything could be done with bash. A database for the md5sums is conceivable, and it could be used for other purposes in the future.
* Re: cross-subvolume cp --reflink
  2012-04-01 17:19         ` Konstantinos Skarlatos
@ 2012-04-01 18:11           ` Norbert Scheibner
  2012-04-01 19:42             ` Konstantinos Skarlatos
  0 siblings, 1 reply; 20+ messages in thread
From: Norbert Scheibner @ 2012-04-01 18:11 UTC
To: Konstantinos Skarlatos; +Cc: linux-btrfs

> On: Sun, 01 Apr 2012 20:19:24 +0300 Konstantinos Skarlatos wrote
> > I use btrfs for my backups. Once a day I rsync --delete --inplace the
> > complete system to a subvolume, snapshot it, and delete some temp files
> > in the snapshot.
>
> In my setup I rsync --inplace many servers and workstations, 4-6 times
> a day, into a 12TB btrfs volume, each one into its own subvolume. After
> every backup a new read-only snapshot is created.
>
> I have many cross-subvolume duplicate files (OS files, programs, many
> huge media files that are copied locally from the servers to the
> workstations, etc.), so a good "dedupe" script could save lots of space
> and allow me to keep snapshots for much longer.

So the script should be optimized not to deduplicate the whole fs every time, but only the newly written files. You could take such a file list from the rsync output or from the btrfs subvolume find-new command.

Even without the reflink patch, you could use such a bash script inside one subvolume, after the rsync and before the snapshot. I don't know how much space it would save for you in this situation, but it's worth a try and a good way to develop such a script, because before you write anything to disc you can see how many duplicates there are and how much space could be freed.

Kind regards
Norbert
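As a rough illustration of the find-new idea, the sketch below turns the command's output into a list of changed paths that a dedupe script could restrict itself to. The line layout matched by the two regular expressions ("inode ... gen N flags F path" and a final "transid marker was N") is an assumption about what btrfs-progs prints and should be checked against the installed version; the function names are hypothetical.

```python
import re
import subprocess

# Assumed shape of `btrfs subvolume find-new` output lines; verify against the
# installed btrfs-progs before relying on it.
EXTENT_LINE = re.compile(
    r"^inode \d+ file offset \d+ len \d+ .*? gen (\d+) flags (\S+) (.*)$"
)
MARKER_LINE = re.compile(r"^transid marker was (\d+)$")


def parse_find_new(output):
    """Return (sorted unique paths, transid marker or None) from find-new output."""
    paths, marker = set(), None
    for line in output.splitlines():
        extent = EXTENT_LINE.match(line)
        if extent:
            # One file can appear once per changed extent; keep each path once.
            paths.add(extent.group(3))
            continue
        found = MARKER_LINE.match(line)
        if found:
            marker = int(found.group(1))
    return sorted(paths), marker


def changed_since(subvolume, last_generation):
    """Run find-new on subvolume and return the files touched after last_generation."""
    result = subprocess.run(
        ["btrfs", "subvolume", "find-new", subvolume, str(last_generation)],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_find_new(result.stdout)
```

A periodic job could store the returned marker and pass it as last_generation on the next run, so each pass only looks at files written since the previous one.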
* Re: cross-subvolume cp --reflink
  2012-04-01 18:11           ` Norbert Scheibner
@ 2012-04-01 19:42             ` Konstantinos Skarlatos
  0 siblings, 0 replies; 20+ messages in thread
From: Konstantinos Skarlatos @ 2012-04-01 19:42 UTC
To: Norbert Scheibner; +Cc: linux-btrfs

On 1/4/2012 9:11 PM, Norbert Scheibner wrote:
>> On: Sun, 01 Apr 2012 20:19:24 +0300 Konstantinos Skarlatos wrote
>>> I use btrfs for my backups. Once a day I rsync --delete --inplace
>>> the complete system to a subvolume, snapshot it, and delete some
>>> temp files in the snapshot.
>>
>> In my setup I rsync --inplace many servers and workstations, 4-6
>> times a day, into a 12TB btrfs volume, each one into its own
>> subvolume. After every backup a new read-only snapshot is created.
>>
>> I have many cross-subvolume duplicate files (OS files, programs,
>> many huge media files that are copied locally from the servers to
>> the workstations, etc.), so a good "dedupe" script could save lots of
>> space and allow me to keep snapshots for much longer.
>
> So the script should be optimized not to deduplicate the whole fs
> every time, but only the newly written files. You could take such a
> file list from the rsync output or from the btrfs subvolume find-new
> command.

A cron task with btrfs subvolume find-new would be ideal, I think.

> Even without the reflink patch, you could use such a bash script inside
> one subvolume, after the rsync and before the snapshot. I don't know how
> much space it would save for you in this situation, but it's worth a try
> and a good way to develop such a script, because before you write
> anything to disc you can see how many duplicates there are and how
> much space could be freed.
>
> Kind regards
> Norbert
[parent not found: <4F788EE2.4010105@univie.ac.at>]
* Re: cross-subvolume cp --reflink
       [not found]         ` <4F788EE2.4010105@univie.ac.at>
@ 2012-04-01 18:39           ` Norbert Scheibner
  2012-04-01 19:27             ` Konstantinos Skarlatos
  0 siblings, 1 reply; 20+ messages in thread
From: Norbert Scheibner @ 2012-04-01 18:39 UTC
To: Klaus A. Kreil, k.skarlatos; +Cc: linux-btrfs

> On: Sun, 01 Apr 2012 19:22:42 +0200 "Klaus A. Kreil" wrote
> I am just an interested reader on the btrfs list and so far have never
> posted or sent a message to the list, but I do have a dedup bash script
> that searches for duplicates underneath a directory (provided as an
> argument) and hard links identical files.
>
> It works very well for an ext3 filesystem, but I guess the basics should
> be the same for a btrfs filesystem.

Everyone feel free to correct me here, but:

At the moment there is a little problem with the maximum number of hard links in a directory. So I would avoid them wherever possible, to stay clear of any conceivable problems in the near future.

Plus, hard linking 2 files means that if you change one file, you change the other one. That is either something you don't want to happen, or something that could be done in better ways. The cp --reflink method on a COW fs is a much smarter method.

Plus, hard links across subvolumes correspond to hard links across devices on a traditional fs, which is forbidden.

Plus, in my opinion hard links should really be replaced by soft links, because hard links are not transparent at first sight and cannot be copied as such.

So no, I'd rather have the patch that allows cross-subvolume cp --reflink in the kernel, and I will wait for that to happen.

Greetings
Norbert
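The point that changing one hard-linked file changes the other can be shown with a short sketch. It runs on any filesystem that supports hard links, so it uses an ordinary copy as a stand-in for a reflink copy: from the user's point of view both behave like an independent file, the reflink copy just shares the data blocks until one side is modified. The function and file names are made up for the illustration.

```python
import os
import shutil


def demo_link_semantics(workdir):
    """Write through a hard link and report what each name contains afterwards."""
    original = os.path.join(workdir, "original")
    linked = os.path.join(workdir, "hardlink")
    copied = os.path.join(workdir, "copy")

    with open(original, "w") as handle:
        handle.write("v1")
    os.link(original, linked)          # second name for the same inode
    shutil.copyfile(original, copied)  # stand-in for a reflink copy: its own inode

    with open(linked, "w") as handle:  # rewrite in place through the hard link
        handle.write("v2")

    result = {}
    for label, path in (("original", original), ("hardlink", linked), ("copy", copied)):
        with open(path) as handle:
            result[label] = handle.read()
    result["same_inode"] = os.stat(original).st_ino == os.stat(linked).st_ino
    return result
```

After the call, "original" and "hardlink" both read back the new content while "copy" still holds the old one, which is why a hard-link based dedupe is only safe for files that are never modified in place.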
* Re: cross-subvolume cp --reflink
  2012-04-01 18:39           ` Norbert Scheibner
@ 2012-04-01 19:27             ` Konstantinos Skarlatos
  2012-04-02  8:29               ` David Sterba
  0 siblings, 1 reply; 20+ messages in thread
From: Konstantinos Skarlatos @ 2012-04-01 19:27 UTC
To: Norbert Scheibner; +Cc: Klaus A. Kreil, linux-btrfs

On 1/4/2012 9:39 PM, Norbert Scheibner wrote:
>> On: Sun, 01 Apr 2012 19:22:42 +0200 "Klaus A. Kreil" wrote
>> I am just an interested reader on the btrfs list and so far have never
>> posted or sent a message to the list, but I do have a dedup bash script
>> that searches for duplicates underneath a directory (provided as an
>> argument) and hard links identical files.
>>
>> It works very well for an ext3 filesystem, but I guess the basics should
>> be the same for a btrfs filesystem.

Thanks for the nice script, it works fine here!
I just added a du -sh "$1" line at the beginning and end to see how much space it saves.

> Everyone feel free to correct me here, but:
> At the moment there is a little problem with the maximum number of hard links in a directory. So I would avoid them wherever possible, to stay clear of any conceivable problems in the near future.
>
> Plus, hard linking 2 files means that if you change one file, you change the other one. That is either something you don't want to happen, or something that could be done in better ways. The cp --reflink method on a COW fs is a much smarter method.

That's true, cp --reflink is much better. Also, am I wrong that btrfs has a limitation on the number of hard links that can only be fixed with a disk format change?

> Plus, hard links across subvolumes correspond to hard links across devices on a traditional fs, which is forbidden.
>
> Plus, in my opinion hard links should really be replaced by soft links, because hard links are not transparent at first sight and cannot be copied as such.
>
> So no, I'd rather have the patch that allows cross-subvolume cp --reflink in the kernel, and I will wait for that to happen.
>
> Greetings
> Norbert
* Re: cross-subvolume cp --reflink
  2012-04-01 19:27             ` Konstantinos Skarlatos
@ 2012-04-02  8:29               ` David Sterba
  0 siblings, 0 replies; 20+ messages in thread
From: David Sterba @ 2012-04-02 8:29 UTC
To: Konstantinos Skarlatos; +Cc: Norbert Scheibner, Klaus A. Kreil, linux-btrfs

On Sun, Apr 01, 2012 at 10:27:38PM +0300, Konstantinos Skarlatos wrote:
> That's true, cp --reflink is much better. Also, am I wrong that btrfs has
> a limitation on the number of hard links that can only be fixed with a
> disk format change?

There's a patch in development to lift the limit. It does not need a disk format change, but it adds an incompatibility bit.

david
* Re: cross-subvolume cp --reflink
  2012-04-01 15:27 cross-subvolume cp --reflink Norbert Scheibner
  2012-04-01 15:30 ` Konstantinos Skarlatos
@ 2012-04-01 15:42 ` Jérôme Poulin
  2012-04-28 23:53   ` Hubert Kario
  1 sibling, 1 reply; 20+ messages in thread
From: Jérôme Poulin @ 2012-04-01 15:42 UTC
To: linux-btrfs; +Cc: Norbert Scheibner

On Sun, Apr 1, 2012 at 11:27 AM, Norbert Scheibner <scno@gmx.net> wrote:
> Some users have tested this patch successfully for weeks or months on 2 or 3 kernel versions since then, true?

If this feature must be implemented in VFS in another patch, why not just activate what works and let the future patch disable it again?
* Re: cross-subvolume cp --reflink
  2012-04-01 15:42 ` Jérôme Poulin
@ 2012-04-28 23:53   ` Hubert Kario
  2012-04-29 20:05     ` Norbert Scheibner
  0 siblings, 1 reply; 20+ messages in thread
From: Hubert Kario @ 2012-04-28 23:53 UTC
To: Jérôme Poulin; +Cc: linux-btrfs, Norbert Scheibner

On Sunday 01 of April 2012 11:42:23 Jérôme Poulin wrote:
> On Sun, Apr 1, 2012 at 11:27 AM, Norbert Scheibner <scno@gmx.net> wrote:
> > Some users have tested this patch successfully for weeks or months on
> > 2 or 3 kernel versions since then, true?
> If this feature must be implemented in VFS in another patch, why not
> just activate what works and let the future patch disable it again?

Why would (should) it be implemented in VFS? A reflink copy is completely different from a normal copy and from a hard link.

Subvolumes in btrfs are barriers *only* in btrfs and are not visible in VFS. IMHO it's strictly btrfs business, and not supporting reflink copy between arbitrary directories is a bug.

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
* Re: cross-subvolume cp --reflink
  2012-04-28 23:53   ` Hubert Kario
@ 2012-04-29 20:05     ` Norbert Scheibner
  2012-08-17  4:20       ` james northrup
       [not found]       ` <CAPkEcwgSZ8umbFeuZ-fQAFAprBubL58eFSf4TQ=Z13ks8i9DOQ@mail.gmail.com>
  0 siblings, 2 replies; 20+ messages in thread
From: Norbert Scheibner @ 2012-04-29 20:05 UTC
To: Jérôme Poulin, Hubert Kario; +Cc: linux-btrfs

On 29.04.2012 at 01:53, Hubert Kario <hka@qbs.com.pl> wrote:

> On Sunday 01 of April 2012 11:42:23 Jérôme Poulin wrote:
>> On Sun, Apr 1, 2012 at 11:27 AM, Norbert Scheibner <scno@gmx.net> wrote:
>> > Some users have tested this patch successfully for weeks or months on
>> > 2 or 3 kernel versions since then, true?
>> If this feature must be implemented in VFS in another patch, why not
>> just activate what works and let the future patch disable it again?
>
> Why would (should) it be implemented in VFS? A reflink copy is completely
> different from a normal copy and from a hard link.

I wouldn't make a VFS issue out of that. That should be another discussion.

But:

> Subvolumes in btrfs are barriers *only* in btrfs and are not visible in VFS.

In my opinion that is just a bug, so it should work anyway. Looking at it from the VFS point of view only strengthens my wish to see the outstanding patches integrated, as this feature could be supported by VFS in the future.

> IMHO it's strictly btrfs business, and not supporting reflink copy between
> arbitrary directories is a bug.

I don't know exactly, but I think ZFS is another candidate for "cp --reflink". For some of the log-structured filesystems this could be useful too, but I don't know whether any of them already support it or plan to support it in the future.

Greetings
Norbert
* Re: cross-subvolume cp --reflink
  2012-04-29 20:05     ` Norbert Scheibner
@ 2012-08-17  4:20       ` james northrup
  2012-08-17  5:20         ` Marc MERLIN
       [not found]       ` <CAPkEcwgSZ8umbFeuZ-fQAFAprBubL58eFSf4TQ=Z13ks8i9DOQ@mail.gmail.com>
  1 sibling, 1 reply; 20+ messages in thread
From: james northrup @ 2012-08-17 4:20 UTC
To: Norbert Scheibner; +Cc: Jérôme Poulin, Hubert Kario, linux-btrfs

Dunno if this thread is dead, but I'm inclined to patch cp --reflink into the "fdupes" program. It currently does provide a poor man's dedupe via md5sum and hardlink, or delete.

All the better if the distro kernels can backport cross-snapshot reflinks sooner rather than later.
* Re: cross-subvolume cp --reflink
  2012-08-17  4:20       ` james northrup
@ 2012-08-17  5:20         ` Marc MERLIN
  2012-08-19  5:08           ` Mitch Harder
  0 siblings, 1 reply; 20+ messages in thread
From: Marc MERLIN @ 2012-08-17 5:20 UTC
To: james northrup; +Cc: Norbert Scheibner, Jérôme Poulin, Hubert Kario, linux-btrfs

On Thu, Aug 16, 2012 at 09:20:00PM -0700, james northrup wrote:
> Dunno if this thread is dead, but I'm inclined to patch cp --reflink
> into the "fdupes" program. It currently does provide a poor man's
> dedupe via md5sum and hardlink, or delete.
>
> All the better if the distro kernels can backport cross-snapshot
> reflinks sooner rather than later.

So, I'd love for cp --reflink to bring a deleted VM (huge file) back from a snapshot to trunk without duplicating it.
But how would fdupes help? I can't hardlink between two snapshots, can I?

gandalfthegreat:/mnt/btrfs_pool1# ln usr_weekly_20120812_00\:02\:01/svn-commit.tmp usr/test
ln: failed to create hard link `usr/test' => `usr_weekly_20120812_00:02:01/svn-commit.tmp': Invalid cross-device link

So, is there anything user space can do without kernel support?

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: cross-subvolume cp --reflink
  2012-08-17  5:20         ` Marc MERLIN
@ 2012-08-19  5:08           ` Mitch Harder
  2012-08-19  6:43             ` Marc MERLIN
  0 siblings, 1 reply; 20+ messages in thread
From: Mitch Harder @ 2012-08-19 5:08 UTC
To: Marc MERLIN; +Cc: james northrup, Norbert Scheibner, Jérôme Poulin, Hubert Kario, linux-btrfs

On Fri, Aug 17, 2012 at 12:20 AM, Marc MERLIN <marc@merlins.org> wrote:
> On Thu, Aug 16, 2012 at 09:20:00PM -0700, james northrup wrote:
>> Dunno if this thread is dead, but I'm inclined to patch cp --reflink
>> into the "fdupes" program. It currently does provide a poor man's
>> dedupe via md5sum and hardlink, or delete.
>>
>> All the better if the distro kernels can backport cross-snapshot
>> reflinks sooner rather than later.
>
> So, I'd love for cp --reflink to bring a deleted VM (huge file) back from a
> snapshot to trunk without duplicating it.
> But how would fdupes help? I can't hardlink between two snapshots, can I?
>
> gandalfthegreat:/mnt/btrfs_pool1# ln usr_weekly_20120812_00\:02\:01/svn-commit.tmp usr/test
> ln: failed to create hard link `usr/test' => `usr_weekly_20120812_00:02:01/svn-commit.tmp': Invalid cross-device link
>
> So, is there anything user space can do without kernel support?

A cross-subvolume copy patch has made it into 3.6-rc:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=362a20c5e27614739c4

This patch will allow cp --reflink across subvolumes, as long as the copy does not cross mount points.
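For the restore-from-snapshot case asked about above, a small user-space helper could look roughly like the sketch below. It is an illustration, not part of the kernel patch or of cp: it issues the Linux clone ioctl directly (FICLONE, which has the same request number as the older BTRFS_IOC_CLONE) and falls back to an ordinary copy when the kernel refuses, for example with EXDEV when source and destination sit on different mount points. The function name and the returned tuple are assumptions made for the example.

```python
import errno
import fcntl
import shutil

# _IOW(0x94, 9, int): the clone ioctl, known as FICLONE on current kernels.
FICLONE = 0x40049409


def reflink_or_copy(source, destination):
    """Clone source into destination if possible, otherwise copy the bytes.

    Returns ("reflink", None) or ("copy", errno_name).
    """
    with open(source, "rb") as src, open(destination, "wb") as dst:
        try:
            # Ask the filesystem to share the source's extents with destination.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return "reflink", None
        except OSError as error:
            # Typical refusals: EXDEV (crosses a mount point) or
            # EOPNOTSUPP/ENOTTY/EINVAL (filesystem cannot clone).
            reason = errno.errorcode.get(error.errno, str(error.errno))
    shutil.copyfile(source, destination)
    return "copy", reason
```

A caller restoring a large VM image would probably treat the "copy" result as a warning, since the fallback duplicates the data, which is exactly what the reflink was meant to avoid.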
* Re: cross-subvolume cp --reflink
  2012-08-19  5:08           ` Mitch Harder
@ 2012-08-19  6:43             ` Marc MERLIN
  0 siblings, 0 replies; 20+ messages in thread
From: Marc MERLIN @ 2012-08-19 6:43 UTC
To: Mitch Harder; +Cc: james northrup, Norbert Scheibner, Jérôme Poulin, Hubert Kario, linux-btrfs

On Sun, Aug 19, 2012 at 12:08:01AM -0500, Mitch Harder wrote:
> A cross-subvolume copy patch has made it into 3.6-rc:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=362a20c5e27614739c4
>
> This patch will allow cp --reflink across subvolumes, as long as the
> copy does not cross mount points.

I missed that. That's great news, thanks to all those involved in getting this in.

Marc
[parent not found: <CAPkEcwgSZ8umbFeuZ-fQAFAprBubL58eFSf4TQ=Z13ks8i9DOQ@mail.gmail.com>]
* Re: cross-subvolume cp --reflink
       [not found]       ` <CAPkEcwgSZ8umbFeuZ-fQAFAprBubL58eFSf4TQ=Z13ks8i9DOQ@mail.gmail.com>
@ 2012-08-20 18:08         ` Jérôme Poulin
  2012-08-21  0:20           ` james northrup
  0 siblings, 1 reply; 20+ messages in thread
From: Jérôme Poulin @ 2012-08-20 18:08 UTC
To: linux-btrfs, james northrup

On Thu, Aug 16, 2012 at 5:41 PM, james northrup <northrup.james@gmail.com> wrote:
> Dunno if this thread is dead, but I'm inclined to patch cp --reflink into the "fdupes" program.
> It currently does provide a poor man's dedupe via md5sum and hardlink, or delete.
>
> All the better if the distro kernels can backport cross-snapshot reflinks sooner rather than later.

I was also wondering whether a program like fdupes could use the BTRFS checksums to make searching for duplicates much faster. You wouldn't need to calculate a checksum yourself if BTRFS's own checksums already differ between two groups of checksum blocks.
* Re: cross-subvolume cp --reflink
  2012-08-20 18:08         ` Jérôme Poulin
@ 2012-08-21  0:20           ` james northrup
  0 siblings, 0 replies; 20+ messages in thread
From: james northrup @ 2012-08-21 0:20 UTC
To: Jérôme Poulin; +Cc: linux-btrfs

On Mon, Aug 20, 2012 at 11:08 AM, Jérôme Poulin <jeromepoulin@gmail.com> wrote:
> I was also wondering whether a program like fdupes could use the BTRFS
> checksums to make searching for duplicates much faster. You wouldn't
> need to calculate a checksum yourself if BTRFS's own checksums already
> differ between two groups of checksum blocks.

The source in question is here:

http://code.google.com/p/fdupes/source/browse/trunk/fdupes.c

I consider cp --reflink an option for this code. It doesn't strike me as scalable at first glance, and I wouldn't want to spend more than a few minutes adding a new option.

I like the idea of something a little more scalable that populates a radix tree with extent checksums and links them without regard to order or association. I don't have this project in mind myself.
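The checksum-index idea from the last two messages can be illustrated with a small sketch. It does not read the checksums btrfs stores on disk; as a stand-in it recomputes a CRC32 per 4 KiB block with zlib (btrfs uses crc32c, a different polynomial), and it uses a plain dictionary where the message suggests a radix tree. The function names and the block size parameter are assumptions for the example.

```python
import zlib
from collections import defaultdict


def block_checksums(path, block_size=4096):
    """Return the tuple of per-block CRC32 values for one file."""
    sums = []
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            sums.append(zlib.crc32(block))
    return tuple(sums)


def candidate_groups(paths, block_size=4096):
    """Group files whose per-block checksum sequences are identical.

    Files that differ in any block land under different keys and are never
    compared with each other; matching groups are only candidates and still
    need a byte-for-byte check before any reflink replacement.
    """
    index = defaultdict(list)
    for path in paths:
        index[block_checksums(path, block_size)].append(path)
    return sorted(sorted(group) for group in index.values() if len(group) > 1)
```

With stored filesystem checksums available, block_checksums() would become a lookup instead of a read of the whole file, which is where the speed-up hoped for in the thread would come from.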
end of thread

Thread overview: 20+ messages
2012-04-01 15:27 cross-subvolume cp --reflink Norbert Scheibner
2012-04-01 15:30 ` Konstantinos Skarlatos
2012-04-01 16:41   ` Norbert Scheibner
2012-04-01 16:45     ` Konstantinos Skarlatos
2012-04-01 17:07       ` Norbert Scheibner
2012-04-01 17:19         ` Konstantinos Skarlatos
2012-04-01 18:11           ` Norbert Scheibner
2012-04-01 19:42             ` Konstantinos Skarlatos
     [not found]         ` <4F788EE2.4010105@univie.ac.at>
2012-04-01 18:39           ` Norbert Scheibner
2012-04-01 19:27             ` Konstantinos Skarlatos
2012-04-02  8:29               ` David Sterba
2012-04-01 15:42 ` Jérôme Poulin
2012-04-28 23:53   ` Hubert Kario
2012-04-29 20:05     ` Norbert Scheibner
2012-08-17  4:20       ` james northrup
2012-08-17  5:20         ` Marc MERLIN
2012-08-19  5:08           ` Mitch Harder
2012-08-19  6:43             ` Marc MERLIN
     [not found]       ` <CAPkEcwgSZ8umbFeuZ-fQAFAprBubL58eFSf4TQ=Z13ks8i9DOQ@mail.gmail.com>
2012-08-20 18:08         ` Jérôme Poulin
2012-08-21  0:20           ` james northrup