* More problems... @ 2005-04-29 16:01 Russell King 2005-04-29 16:12 ` Russell King 2005-04-29 18:27 ` Petr Baudis 0 siblings, 2 replies; 24+ messages in thread From: Russell King @ 2005-04-29 16:01 UTC (permalink / raw) To: git Ok. cogito-0.8. rmk@dyn-67:[linux-2.6-rmk]:<1049> cg-update origin `../linux-2.6/.git/refs/heads/master' -> `.git/refs/heads/origin' `../linux-2.6/.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901' -> `.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901' cp: cannot create link `.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901': File exists `../linux-2.6/.git/objects/01/ca31cc7bfdd18d6f72288915021730442f386d' -> `.git/objects/01/ca31cc7bfdd18d6f72288915021730442f386d' cp: cannot create link `.git/objects/01/ca31cc7bfdd18d6f72288915021730442f386d': File exists `../linux-2.6/.git/objects/04/17820d15efac837a84f9ade46f56339016a282' -> `.git/objects/04/17820d15efac837a84f9ade46f56339016a282' cp: cannot create link `.git/objects/04/17820d15efac837a84f9ade46f56339016a282': File exists `../linux-2.6/.git/objects/07/da010b67d4d715b5e97dfb824eb70433776a20' -> `.git/objects/07/da010b67d4d715b5e97dfb824eb70433776a20' cp: cannot create link `.git/objects/07/da010b67d4d715b5e97dfb824eb70433776a20': File exists `../linux-2.6/.git/objects/07/5d3961a119e8f27294cd77193f8fee7908a521' -> `.git/objects/07/5d3961a119e8f27294cd77193f8fee7908a521' ... 
cp: cannot create link `.git/objects/fc/1428905472a61e8e51057a4237acab5d8594d8': File exists `../linux-2.6/.git/objects/fc/373c483e62dc1bbc5c3915f2d3c795fb316ec5' -> `.git/objects/fc/373c483e62dc1bbc5c3915f2d3c795fb316ec5' cp: cannot create link `.git/objects/fc/373c483e62dc1bbc5c3915f2d3c795fb316ec5': File exists `../linux-2.6/.git/objects/ff/c3be3dff7e20e2ad5367fa8d6d0d2f0baa8a24' -> `.git/objects/ff/c3be3dff7e20e2ad5367fa8d6d0d2f0baa8a24' cp: cannot create link `.git/objects/ff/c3be3dff7e20e2ad5367fa8d6d0d2f0baa8a24': File exists `../linux-2.6/.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783' -> `.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783' cp: cannot create link `.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783': File exists cg-pull: rsync error rmk@dyn-67:[linux-2.6-rmk]:<1052> md5sum .git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 ../linux-2.6/.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 194b70d0eed786807e14e97dd0a5ad8d .git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 194b70d0eed786807e14e97dd0a5ad8d ../linux-2.6/.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 rmk@dyn-67:[linux-2.6-rmk]:<1053> vdir .git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 ../linux-2.6/.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 -rw-rw-r-- 1 rmk rmk 3070 Apr 28 10:43 .git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 -rw-r--r-- 1 rmk rmk 3070 Apr 29 16:50 ../linux-2.6/.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 What seems to be happening is that I create changes in my tree, rsync them out to kernel.org. Linus pulls them into his tree. I pull them back into my reference tree, and then try and update my working tree. By that time, the object files in the reference tree appear to have a newer timestamp than the corresponding ones in my local tree, and cp -lua fails. Which means cogito fails to work for me... Help. -- Russell King ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: More problems... 2005-04-29 16:01 More problems Russell King @ 2005-04-29 16:12 ` Russell King 2005-04-29 17:51 ` Linus Torvalds 2005-04-29 18:27 ` Petr Baudis 1 sibling, 1 reply; 24+ messages in thread From: Russell King @ 2005-04-29 16:12 UTC (permalink / raw) To: git On Fri, Apr 29, 2005 at 05:01:27PM +0100, Russell King wrote: > Ok. cogito-0.8. > > rmk@dyn-67:[linux-2.6-rmk]:<1049> cg-update origin > `../linux-2.6/.git/refs/heads/master' -> `.git/refs/heads/origin' > `../linux-2.6/.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901' -> `.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901' > cp: cannot create link `.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901': File exists > `../linux-2.6/.git/objects/01/ca31cc7bfdd18d6f72288915021730442f386d' -> `.git/objects/01/ca31cc7bfdd18d6f72288915021730442f386d' > cp: cannot create link `.git/objects/01/ca31cc7bfdd18d6f72288915021730442f386d': File exists > `../linux-2.6/.git/objects/04/17820d15efac837a84f9ade46f56339016a282' -> `.git/objects/04/17820d15efac837a84f9ade46f56339016a282' > cp: cannot create link `.git/objects/04/17820d15efac837a84f9ade46f56339016a282': File exists > `../linux-2.6/.git/objects/07/da010b67d4d715b5e97dfb824eb70433776a20' -> `.git/objects/07/da010b67d4d715b5e97dfb824eb70433776a20' > cp: cannot create link `.git/objects/07/da010b67d4d715b5e97dfb824eb70433776a20': File exists > `../linux-2.6/.git/objects/07/5d3961a119e8f27294cd77193f8fee7908a521' -> `.git/objects/07/5d3961a119e8f27294cd77193f8fee7908a521' > ... 
> cp: cannot create link `.git/objects/fc/1428905472a61e8e51057a4237acab5d8594d8': File exists > `../linux-2.6/.git/objects/fc/373c483e62dc1bbc5c3915f2d3c795fb316ec5' -> `.git/objects/fc/373c483e62dc1bbc5c3915f2d3c795fb316ec5' > cp: cannot create link `.git/objects/fc/373c483e62dc1bbc5c3915f2d3c795fb316ec5': File exists > `../linux-2.6/.git/objects/ff/c3be3dff7e20e2ad5367fa8d6d0d2f0baa8a24' -> `.git/objects/ff/c3be3dff7e20e2ad5367fa8d6d0d2f0baa8a24' > cp: cannot create link `.git/objects/ff/c3be3dff7e20e2ad5367fa8d6d0d2f0baa8a24': File exists > `../linux-2.6/.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783' -> `.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783' > cp: cannot create link `.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783': File exists > cg-pull: rsync error > > rmk@dyn-67:[linux-2.6-rmk]:<1052> md5sum .git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 ../linux-2.6/.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 > 194b70d0eed786807e14e97dd0a5ad8d .git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 > 194b70d0eed786807e14e97dd0a5ad8d ../linux-2.6/.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 > rmk@dyn-67:[linux-2.6-rmk]:<1053> vdir .git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 ../linux-2.6/.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 > -rw-rw-r-- 1 rmk rmk 3070 Apr 28 10:43 .git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 > -rw-r--r-- 1 rmk rmk 3070 Apr 29 16:50 ../linux-2.6/.git/objects/ff/8b49966a9f6ed23f6489bb986de87a14d4b783 > > What seems to be happening is that I create changes in my tree, rsync > them out to kernel.org. Linus pulls them into his tree. I pull them > back into my reference tree, and then try and update my working tree. > > By that time, the object files in the reference tree appear to have > a newer timestamp than the corresponding ones in my local tree, and > cp -lua fails. > > Which means cogito fails to work for me... Help. 
I've worked around it by doing:

$ rm -rf .git/objects/*
$ cg-update origin

Not particularly nice, but very necessary, and apparently the only
way to get this to work. Grumble. Why am I seemingly the only one
running into all these blocking problems?

-- 
Russell King
* Re: More problems... 2005-04-29 16:12 ` Russell King @ 2005-04-29 17:51 ` Linus Torvalds 0 siblings, 0 replies; 24+ messages in thread From: Linus Torvalds @ 2005-04-29 17:51 UTC (permalink / raw) To: Russell King; +Cc: git

On Fri, 29 Apr 2005, Russell King wrote:
>
> Not particularly nice, but very necessary, and apparently the only
> way to get this to work. Grumble. Why am I seemingly the only one
> running into all these blocking problems?

Well, I suspect not a lot of people are actually using cogito for kernel
stuff.

I still use my old "git-pull-script" to update stuff. I know how it works,
and it does at least these things right (as long as "merge-base" works
right, of course - which so far it has for me since the re-write).

        Linus
* Re: More problems... 2005-04-29 16:01 More problems Russell King 2005-04-29 16:12 ` Russell King @ 2005-04-29 18:27 ` Petr Baudis 2005-04-29 19:50 ` Ryan Anderson 1 sibling, 1 reply; 24+ messages in thread From: Petr Baudis @ 2005-04-29 18:27 UTC (permalink / raw) To: Russell King; +Cc: git

Dear diary, on Fri, Apr 29, 2005 at 06:01:27PM CEST, I got a letter
where Russell King <rmk@arm.linux.org.uk> told me that...
> rmk@dyn-67:[linux-2.6-rmk]:<1049> cg-update origin
> `../linux-2.6/.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901' -> `.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901'
> cp: cannot create link `.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901': File exists
>
> By that time, the object files in the reference tree appear to have
> a newer timestamp than the corresponding ones in my local tree, and
> cp -lua fails.

I'm now away, unfortunately, and no immediate idea comes to mind on
how to fix it. Ideas welcome - I need to hardlink missing entries from
one tree to another; it would be enough to be able to just tell cp to
ignore already present files.

Could you please try to give cp the -f flag?

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
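The requirement stated in this message (hardlink only the missing entries and ignore files that are already present) can be illustrated without cp at all. What follows is a minimal sketch in Python, not what Cogito does: the function name `link_missing_objects` is hypothetical, and the only thing taken from the thread is the `objects/xx/<rest of hash>` directory layout visible in the log output. Because existing destination files are never touched, a newer timestamp on the source side cannot cause a failure.

```python
import os

def link_missing_objects(src_objects, dst_objects):
    """Hard-link every file under src_objects that is absent from dst_objects.

    Files that already exist in the destination are skipped rather than
    treated as an error. Returns the number of links created.
    (Hypothetical helper for illustration; not part of Cogito or git.)
    """
    created = 0
    for dirpath, _dirnames, filenames in os.walk(src_objects):
        rel = os.path.relpath(dirpath, src_objects)
        dst_dir = os.path.normpath(os.path.join(dst_objects, rel))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            dst = os.path.join(dst_dir, name)
            if os.path.lexists(dst):
                continue  # already present: skip instead of failing
            os.link(os.path.join(dirpath, name), dst)
            created += 1
    return created
```

Both directories have to live on the same filesystem for `os.link` to succeed; the sketch deliberately does not fall back to copying.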
* Re: More problems... 2005-04-29 18:27 ` Petr Baudis @ 2005-04-29 19:50 ` Ryan Anderson 2005-04-29 20:03 ` Thomas Glanzmann ` (2 more replies) 0 siblings, 3 replies; 24+ messages in thread From: Ryan Anderson @ 2005-04-29 19:50 UTC (permalink / raw) To: Petr Baudis; +Cc: Russell King, git

On Fri, Apr 29, 2005 at 08:27:08PM +0200, Petr Baudis wrote:
> Dear diary, on Fri, Apr 29, 2005 at 06:01:27PM CEST, I got a letter
> where Russell King <rmk@arm.linux.org.uk> told me that...
> > rmk@dyn-67:[linux-2.6-rmk]:<1049> cg-update origin
> > `../linux-2.6/.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901' -> `.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901'
> > cp: cannot create link `.git/objects/00/78aeb85737197a84af1eeb0353dbef74427901': File exists
> >
> > By that time, the object files in the reference tree appear to have
> > a newer timestamp than the corresponding ones in my local tree, and
> > cp -lua fails.
>
> I'm now away, unfortunately, and no immediate idea comes to mind on
> how to fix it. Ideas welcome - I need to hardlink missing entries from
> one tree to another; it would be enough to be able to just tell cp to
> ignore already present files.
>
> Could you please try to give cp the -f flag?

Why not just use "rsync" for both remote and local synchronization, and
provide a "relink" command to scan two .git/objects/ repositories and
hardlink matching files together?

With the SHA1 hash, you can even have a --unsafe option that just
compares the hash names and does a link based purely off of that and the
stat(2) results of both files. (I'd expect that a ... safer variant
would extract both files and compare them, but the --unsafe should be
sufficient, in practice, I would think.)

-- 

Ryan Anderson
  sometimes Pug Majere
* Re: More problems... 2005-04-29 19:50 ` Ryan Anderson @ 2005-04-29 20:03 ` Thomas Glanzmann 2005-04-29 20:21 ` Linus Torvalds 2005-05-02 21:13 ` More problems Petr Baudis 2 siblings, 0 replies; 24+ messages in thread From: Thomas Glanzmann @ 2005-04-29 20:03 UTC (permalink / raw) To: git Hello, > Why not just use "rsync" for both remote and local synchronization, and > provide a "relink" command to scan two .git/objects/ repositories and > hardlink matching files together? That came to my mind, too. And it is actually the only thing that makes sense. - In matters of KISS. :-) Thomas ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: More problems... 2005-04-29 19:50 ` Ryan Anderson 2005-04-29 20:03 ` Thomas Glanzmann @ 2005-04-29 20:21 ` Linus Torvalds 2005-04-29 21:07 ` Junio C Hamano 2005-05-04 5:54 ` [PATCH] Add git-relink-script, a tool to hardlink two existing repositories Ryan Anderson 2005-05-02 21:13 ` More problems Petr Baudis 2 siblings, 2 replies; 24+ messages in thread From: Linus Torvalds @ 2005-04-29 20:21 UTC (permalink / raw) To: Ryan Anderson; +Cc: Petr Baudis, Russell King, git

On Fri, 29 Apr 2005, Ryan Anderson wrote:
>
> Why not just use "rsync" for both remote and local synchronization, and
> provide a "relink" command to scan two .git/objects/ repositories and
> hardlink matching files together?

Absolutely. I use the same "git-pull-script" between two local directories
on disk. The only issue there is that you have to give the ".git"
directory, ie you should do

        git-pull-script ~/by/other/repository/.git

instead of pointing to the other repo's root.

Of course, I don't bother with the linking. But that's the trivial part.

> With the SHA1 hash, you can even have a --unsafe option that just
> compares the hash names and does a link based purely off of that and the
> stat(2) results of both files. (I'd expect that a ... safer variant
> would extract both files and compare them, but the --unsafe should be
> sufficient, in practice, I would think.)

I don't think there is any point to unsafe. The assumption is that if you
do things this way, the "unlinked" files will be the uncommon case, so what
you do is

 - remember the list of files you copied when you did the pull (you had to
   have this list at some point anyway). Sort by name,
 - create a list of names of both repositories, sorted by name
 - do the union of those three lists (cheap, thanks to the sorting)
 - stat each name to see if it's already linked (which it will be, most of
   the time), continue to the next one..
 - if they aren't linked, just do a "cmp" on them, and warn if they aren't
   the same, continue to the next one.
 - else link them.

And if you want to, you can skip the first stage, and just relink two trees
without looking at a list of "known new" files - it's going to be expensive
to link two big repositories the _first_ time, but hey, even the "expensive"
part is likely to be pretty cheap in the end. If it takes an hour or two to
relink some years of history, big deal. Do it overnight, you only need it
once.

        Linus
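The relink procedure described in this message (check whether two same-named objects are already linked, compare them if not, warn on a mismatch, otherwise link) could look roughly like the sketch below. It is an illustration under stated assumptions, not the eventual git-relink-script: the function name `relink` is hypothetical, the optional "list of files you copied" stage is skipped (the message says that stage may be omitted), set intersection stands in for the sorted-list merge, and only names present in both trees are considered.

```python
import filecmp
import os

def relink(objects_a, objects_b):
    """Replace separately stored, identical objects in B by hard links to A.

    Returns (number_of_files_relinked, names_whose_contents_differ).
    (Hypothetical helper for illustration only.)
    """
    def names(root):
        found = set()
        for dirpath, _dirs, files in os.walk(root):
            for f in files:
                found.add(os.path.relpath(os.path.join(dirpath, f), root))
        return found

    linked, mismatched = 0, []
    for rel in sorted(names(objects_a) & names(objects_b)):
        a = os.path.join(objects_a, rel)
        b = os.path.join(objects_b, rel)
        if os.path.samefile(a, b):
            continue  # same device and inode: already linked
        if not filecmp.cmp(a, b, shallow=False):
            mismatched.append(rel)  # the "cmp" failed: warn, do not link
            continue
        tmp = b + ".relink-tmp"  # assumed-unused temporary name
        os.link(a, tmp)
        os.replace(tmp, b)  # atomically swap the copy for the link
        linked += 1
    return linked, mismatched
```

Linking to a temporary name and renaming it over the old file means the object never disappears from tree B, even if the process is interrupted halfway through.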
* Re: More problems... 2005-04-29 20:21 ` Linus Torvalds @ 2005-04-29 21:07 ` Junio C Hamano 2005-04-29 21:19 ` Russell King 2005-04-29 21:27 ` Daniel Barkalow 2005-05-04 5:54 ` [PATCH] Add git-relink-script, a tool to hardlink two existing repositories Ryan Anderson 1 sibling, 2 replies; 24+ messages in thread From: Junio C Hamano @ 2005-04-29 21:07 UTC (permalink / raw) To: Linus Torvalds; +Cc: Ryan Anderson, Petr Baudis, Russell King, git >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes: LT> Absolutely. I use the same "git-pull-script" between two local directories LT> on disk... LT> Of course, I don't bother with the linking. But that's the trivial part. Would it be useful if somebody wrote local-pull.c similar to http-pull.c, which clones one local SHA_FILE_DIRECTORY to another, with an option to (1) try hardlink and if it fails fail; (2) try hardlink and if it fails try symlink and if it fails fail; (3) try hardlink and if it fails try copy and if it fails fail? Then from a source repository that contains good stuff plus throwaway experimental commits you can prepare pruned for-public tree. Of course you can do it today by copying and then running git-prune in the destination, though. ^ permalink raw reply [flat|nested] 24+ messages in thread
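The three options proposed in this message (hardlink or fail; hardlink, then symlink; hardlink, then copy) amount to one fallback parameter. A minimal sketch in Python follows, assuming nothing about how local-pull.c would actually be written; `place_object` and the `fallback` values are hypothetical names chosen for the illustration.

```python
import os
import shutil

FALLBACKS = (None, "symlink", "copy")

def place_object(src, dst, fallback=None):
    """Make dst refer to the object stored at src.

    Tries a hard link first. If that fails, either re-raises the error
    (fallback=None) or falls back to a symlink or a copy.
    Returns a string naming what was done.
    (Hypothetical helper for illustration only.)
    """
    if fallback not in FALLBACKS:
        raise ValueError(f"unknown fallback: {fallback!r}")
    if os.path.lexists(dst):
        return "exists"  # nothing to do, object already present
    try:
        os.link(src, dst)
        return "hardlink"
    except OSError:
        if fallback is None:
            raise  # option (1): hard link or fail
    if fallback == "symlink":
        os.symlink(os.path.abspath(src), dst)  # option (2)
        return "symlink"
    shutil.copy2(src, dst)  # option (3)
    return "copy"
```

A hard link typically fails when the two directories are on different filesystems or when the filesystem does not support links, which is exactly the case the symlink and copy fallbacks are meant to cover.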
* Re: More problems... 2005-04-29 21:07 ` Junio C Hamano @ 2005-04-29 21:19 ` Russell King 2005-04-29 21:57 ` Anton Altaparmakov 2005-04-29 21:27 ` Daniel Barkalow 1 sibling, 1 reply; 24+ messages in thread From: Russell King @ 2005-04-29 21:19 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Ryan Anderson, Petr Baudis, git On Fri, Apr 29, 2005 at 02:07:29PM -0700, Junio C Hamano wrote: > >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes: > > LT> Absolutely. I use the same "git-pull-script" between two local directories > LT> on disk... > LT> Of course, I don't bother with the linking. But that's the trivial part. > > Would it be useful if somebody wrote local-pull.c similar to > http-pull.c, which clones one local SHA_FILE_DIRECTORY to > another, with an option to (1) try hardlink and if it fails > fail; (2) try hardlink and if it fails try symlink and if it > fails fail; (3) try hardlink and if it fails try copy and if it > fails fail? What would be nice is if it finds an existing file for the one it's trying to hard link, it compares the contents (maybe - is this actually necessary?) and if identical, it removes the original file replacing it with a hard link. This means that you'll always be trying to maintain the hard linked structure between various working trees in the background. But maybe this should have an option to enable this behaviour. -- Russell King ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: More problems... 2005-04-29 21:19 ` Russell King @ 2005-04-29 21:57 ` Anton Altaparmakov 2005-05-02 19:33 ` Petr Baudis 0 siblings, 1 reply; 24+ messages in thread From: Anton Altaparmakov @ 2005-04-29 21:57 UTC (permalink / raw) To: Russell King Cc: Junio C Hamano, Linus Torvalds, Ryan Anderson, Petr Baudis, git On Fri, 29 Apr 2005, Russell King wrote: > On Fri, Apr 29, 2005 at 02:07:29PM -0700, Junio C Hamano wrote: > > >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes: > > LT> Absolutely. I use the same "git-pull-script" between two local directories > > LT> on disk... > > LT> Of course, I don't bother with the linking. But that's the trivial part. > > > > Would it be useful if somebody wrote local-pull.c similar to > > http-pull.c, which clones one local SHA_FILE_DIRECTORY to > > another, with an option to (1) try hardlink and if it fails > > fail; (2) try hardlink and if it fails try symlink and if it > > fails fail; (3) try hardlink and if it fails try copy and if it > > fails fail? > > What would be nice is if it finds an existing file for the one it's > trying to hard link, it compares the contents (maybe - is this actually > necessary?) and if identical, it removes the original file replacing > it with a hard link. Unless I have completely misunderstood things, you never need to compare the file contents. Just compare the file names. If they match, i.e. the SHA1 is the same, the contents must match by definition. So you only need a stat(), rather than read&decompress&compare. > This means that you'll always be trying to maintain the hard linked > structure between various working trees in the background. > > But maybe this should have an option to enable this behaviour. There should definitely be an option to either enable or disable this as there are legitimate cases for not wanting hard links or indeed using file systems which do not support them. 
Best regards, Anton -- Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @) Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: More problems... 2005-04-29 21:57 ` Anton Altaparmakov @ 2005-05-02 19:33 ` Petr Baudis 2005-05-02 19:44 ` Dave Kleikamp 2005-05-02 22:01 ` Anton Altaparmakov 0 siblings, 2 replies; 24+ messages in thread From: Petr Baudis @ 2005-05-02 19:33 UTC (permalink / raw) To: Anton Altaparmakov Cc: Russell King, Junio C Hamano, Linus Torvalds, Ryan Anderson, git Dear diary, on Fri, Apr 29, 2005 at 11:57:53PM CEST, I got a letter where Anton Altaparmakov <aia21@cam.ac.uk> told me that... > There should definitely be an option to either enable or disable this as > there are legitimate cases for not wanting hard links or indeed using > file systems which do not support them. Are there legitimate cases for not wanting hard links when you are able to create them? (Same filesystem, filesystem supports them...) -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: More problems... 2005-05-02 19:33 ` Petr Baudis @ 2005-05-02 19:44 ` Dave Kleikamp 2005-05-02 19:51 ` Thomas Glanzmann 2005-05-02 22:01 ` Anton Altaparmakov 1 sibling, 1 reply; 24+ messages in thread From: Dave Kleikamp @ 2005-05-02 19:44 UTC (permalink / raw) To: Petr Baudis Cc: Anton Altaparmakov, Russell King, Junio C Hamano, Linus Torvalds, Ryan Anderson, git On Mon, 2005-05-02 at 21:33 +0200, Petr Baudis wrote: > Dear diary, on Fri, Apr 29, 2005 at 11:57:53PM CEST, I got a letter > where Anton Altaparmakov <aia21@cam.ac.uk> told me that... > > There should definitely be an option to either enable or disable this as > > there are legitimate cases for not wanting hard links or indeed using > > file systems which do not support them. > > Are there legitimate cases for not wanting hard links when you are able > to create them? (Same filesystem, filesystem supports them...) Cloning a different user's repo? -- David Kleikamp IBM Linux Technology Center ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: More problems... 2005-05-02 19:44 ` Dave Kleikamp @ 2005-05-02 19:51 ` Thomas Glanzmann 0 siblings, 0 replies; 24+ messages in thread From: Thomas Glanzmann @ 2005-05-02 19:51 UTC (permalink / raw) To: git Hello, > Cloning a different user's repo? it isn't my quota. :-) So that's a feature. :-) Thomas ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: More problems... 2005-05-02 19:33 ` Petr Baudis 2005-05-02 19:44 ` Dave Kleikamp @ 2005-05-02 22:01 ` Anton Altaparmakov 2005-05-02 22:19 ` Linus Torvalds 1 sibling, 1 reply; 24+ messages in thread From: Anton Altaparmakov @ 2005-05-02 22:01 UTC (permalink / raw) To: Petr Baudis Cc: Russell King, Junio C Hamano, Linus Torvalds, Ryan Anderson, git On Mon, 2 May 2005, Petr Baudis wrote: > Dear diary, on Fri, Apr 29, 2005 at 11:57:53PM CEST, I got a letter > where Anton Altaparmakov <aia21@cam.ac.uk> told me that... > > There should definitely be an option to either enable or disable this as > > there are legitimate cases for not wanting hard links or indeed using > > file systems which do not support them. > > Are there legitimate cases for not wanting hard links when you are able > to create them? (Same filesystem, filesystem supports them...) I would say yes. For example, I want to update my git tools to the latest and greatest development version. Do I really want to let it loose on all the repositories? Probably not. So I would want to make a clone of the repository that is not connected in any way with the old one and then try the new tools. If there were hard links involved working on the cloned repository could potentially damage the original one. Yes, yes, I know all tools are perfect and never have bugs but I am paranoid. (-; Best regards, Anton -- Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @) Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: More problems... 2005-05-02 22:01 ` Anton Altaparmakov @ 2005-05-02 22:19 ` Linus Torvalds 2005-05-03 1:48 ` Petr Baudis 0 siblings, 1 reply; 24+ messages in thread From: Linus Torvalds @ 2005-05-02 22:19 UTC (permalink / raw) To: Anton Altaparmakov Cc: Petr Baudis, Russell King, Junio C Hamano, Ryan Anderson, git

On Mon, 2 May 2005, Anton Altaparmakov wrote:
>
> Yes, yes, I know all tools are perfect and never have bugs but I am
> paranoid. (-;

I do agree. I think hardlinks are wonderful for

 - "git farms" (ie something like what kernel.org does, but in a more
   controlled manner - right now kernel.org is really just a standard
   location for different people putting their own files in).

   In this environment, doing hard-linking should also imply
     - mounting the filesystem "noatime"
     - using a different UID for the hardlinked objects

   ie the "farm administrator" does the hardlinking automatically, and
   chown()'s them to himself, so that different git trees cannot screw
   each other up. The "noatime" thing is there because having different
   users means that git's internal "O_NOATIME" optimization no longer
   works, and you really want to avoid getting lots of write-backs just
   for "atime".

 - people who have lots of trees. I think Jeff Garzik has something like
   20+ BK trees. At that point, hardlinking just makes sense, and your
   work patterns are likely to be aware of the different trees anyway.

But for "normal" situations, where you have a tree or two, the hardlinking
win might not be big enough to warrant the maintenance headache. With
hardlinking, you _do_ need to "trust" the other trees to some degree.

        Linus
* Re: More problems... 2005-05-02 22:19 ` Linus Torvalds @ 2005-05-03 1:48 ` Petr Baudis 2005-05-03 2:56 ` Daniel Barkalow 2005-05-03 15:00 ` Andreas Gal 0 siblings, 2 replies; 24+ messages in thread From: Petr Baudis @ 2005-05-03 1:48 UTC (permalink / raw) To: Linus Torvalds Cc: Anton Altaparmakov, Russell King, Junio C Hamano, Ryan Anderson, git Dear diary, on Tue, May 03, 2005 at 12:19:16AM CEST, I got a letter where Linus Torvalds <torvalds@osdl.org> told me that... > But for "normal" situations, where you have a tree or two, the hardlinking > win might not be big enough to warrant the maintenance headache. With > hardlinking, you _do_ need to "trust" the other trees to some degree. As long as the trees aren't yours and you aren't doing something really horrible with them... $ time git-local-pull -a -l $(cat ~/git-devel/.git/HEAD) ~/git-devel/.git/ real 0m0.332s $ time git-local-pull -a $(cat ~/git-devel/.git/HEAD) ~/git-devel/.git/ real 0m4.306s And this is only 13M Cogito objects database. I think one of the important things is to encourage branching, therefore it must be fast enough; that's why I really wanted to do hardlinks. The disk space is important, but the speed hit probably equally (if not more) so. BTW, the object database files should have 0444 or such; they really _are_ read-only and making them so mode-wise could help against some mistakes too. It's clear that Cogito should have a way to choose whether to hardlink or copy; the question is which one should be the default one and how should it be specified. I thought about using file:// vs. just local path to differentiate between copy and hardlinking, but that'd be totally non-obvious, therefore bad UI-wise. BTW, I've just committed support for pulling from remote repositories over the HTTP and SSH protocols (http://your.git/repo, git+ssh://root@git.nasa.gov/srv/git/mars) (note that I was unable to test the SSH stuff properly now; success reports or patches welcome). 
Also, the local hardlinking access is now done over git-local-pull, therefore the cp errors should go away now. I'm not yet decided whether locations like kernel.org:/pub/scm/cogito/cogito.git should invoke rsync, rpull, throw an error or print a fortune cookie. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor ^ permalink raw reply [flat|nested] 24+ messages in thread
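The suggestion in this message that object database files should carry mode 0444 can be sketched as a small pass over an objects directory. This is an illustration only, with a hypothetical function name; it is not something Cogito is known to ship in this form.

```python
import os
import stat

READ_ONLY = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH  # 0o444

def make_objects_read_only(objects_dir):
    """chmod every regular file under objects_dir to 0444.

    Symlinks are skipped, because chmod would follow them.
    Returns the number of files whose mode was changed.
    (Hypothetical helper for illustration only.)
    """
    changed = 0
    for dirpath, _dirs, files in os.walk(objects_dir):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            if stat.S_IMODE(os.stat(path).st_mode) != READ_ONLY:
                os.chmod(path, READ_ONLY)
                changed += 1
    return changed
```

Permission bits belong to the inode rather than to the name, so running this on one tree also changes the mode seen through every other tree that is hardlinked to the same objects.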
* Re: More problems... 2005-05-03 1:48 ` Petr Baudis @ 2005-05-03 2:56 ` Daniel Barkalow 2005-05-03 15:00 ` Andreas Gal 1 sibling, 0 replies; 24+ messages in thread From: Daniel Barkalow @ 2005-05-03 2:56 UTC (permalink / raw) To: Petr Baudis Cc: Linus Torvalds, Anton Altaparmakov, Russell King, Junio C Hamano, Ryan Anderson, git On Tue, 3 May 2005, Petr Baudis wrote: > BTW, I've just committed support for pulling from remote repositories > over the HTTP and SSH protocols (http://your.git/repo, > git+ssh://root@git.nasa.gov/srv/git/mars) (note that I was unable to > test the SSH stuff properly now; success reports or patches welcome). > Also, the local hardlinking access is now done over git-local-pull, > therefore the cp errors should go away now. Before you get too far with the SSH version, I have some protocol changes which (1) allow transmission of things other than objects; (2) allow the pushing side to report that it doesn't have something without killing the connection; (3) send refs. (1) and (2) are needed to make the protocol extensible; (3) takes advantage of (1) to make it possible to maintain a remote repository without doing anything other than rpush to it. This goes with my patches from the weekend to enable git-*-pull to transfer refs/ files in the same process. > I'm not yet decided whether locations like > > kernel.org:/pub/scm/cogito/cogito.git > > should invoke rsync, rpull, throw an error or print a fortune cookie. Probably not rpull, which requires a login, at least not unless the others fail. I think that http-pull is going to be nicer in the long run than rsync, since the remote repository could have a bunch of mingled heads and http-pull will get exclusively the interesting stuff. If you're trying to push, then rpush, since that's the only push. Personally, I've been using http://... for http-pull, rsync://... for rsync, and //... for rpull/rpush (which is somewhat justified wrt the URI standard for using the program's default method). 
-Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 24+ messages in thread
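The personal convention described in this message (http://... for http-pull, rsync://... for rsync, //... for rpull/rpush) is a prefix dispatch. A minimal sketch, assuming only what the message states; the function name `choose_transport` is hypothetical, and scp-style locations such as kernel.org:/pub/scm/cogito/cogito.git are left undecided, as they are in the thread.

```python
def choose_transport(location):
    """Map a repository location to the name of the pull tool to use.

    Returns None when the location matches none of the three prefixes.
    (Hypothetical helper; encodes one poster's convention, not a standard.)
    """
    if location.startswith("http://"):
        return "http-pull"
    if location.startswith("rsync://"):
        return "rsync"
    if location.startswith("//"):
        return "rpull"  # scheme-less form: the program's default method
    return None
```

Returning None instead of guessing keeps the undecided case explicit, so a caller can raise an error or try several transports in turn.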
* Re: More problems... 2005-05-03 1:48 ` Petr Baudis 2005-05-03 2:56 ` Daniel Barkalow @ 2005-05-03 15:00 ` Andreas Gal 2005-05-03 19:18 ` Junio C Hamano 1 sibling, 1 reply; 24+ messages in thread From: Andreas Gal @ 2005-05-03 15:00 UTC (permalink / raw) To: Petr Baudis Cc: Linus Torvalds, Anton Altaparmakov, Russell King, Junio C Hamano, Ryan Anderson, git

I am just soft-linking objects/ in the branched tree. I can live with
dangling objects, branching is extremely fast, and diskspace is cheap
anyway. The only downside is that it doesn't work too well with rsync as
network protocol, but I use only http-pull and rpush anyway.

Andreas

On Tue, 3 May 2005, Petr Baudis wrote:

> Dear diary, on Tue, May 03, 2005 at 12:19:16AM CEST, I got a letter
> where Linus Torvalds <torvalds@osdl.org> told me that...
> > But for "normal" situations, where you have a tree or two, the hardlinking
> > win might not be big enough to warrant the maintenance headache. With
> > hardlinking, you _do_ need to "trust" the other trees to some degree.
>
> As long as the trees aren't yours and you aren't doing something really
> horrible with them...
>
> $ time git-local-pull -a -l $(cat ~/git-devel/.git/HEAD) ~/git-devel/.git/
> real 0m0.332s
>
> $ time git-local-pull -a $(cat ~/git-devel/.git/HEAD) ~/git-devel/.git/
> real 0m4.306s
>
> And this is only 13M Cogito objects database. I think one of the
> important things is to encourage branching, therefore it must be fast
> enough; that's why I really wanted to do hardlinks. The disk space is
> important, but the speed hit probably equally (if not more) so.
>
> BTW, the object database files should have 0444 or such; they really
> _are_ read-only and making them so mode-wise could help against some
> mistakes too.
>
> It's clear that Cogito should have a way to choose whether to hardlink
> or copy; the question is which one should be the default one and how
> should it be specified. I thought about using file:// vs. just local
> path to differentiate between copy and hardlinking, but that'd be
> totally non-obvious, therefore bad UI-wise.
>
> BTW, I've just committed support for pulling from remote repositories
> over the HTTP and SSH protocols (http://your.git/repo,
> git+ssh://root@git.nasa.gov/srv/git/mars) (note that I was unable to
> test the SSH stuff properly now; success reports or patches welcome).
> Also, the local hardlinking access is now done over git-local-pull,
> therefore the cp errors should go away now.
>
> I'm not yet decided whether locations like
>
> kernel.org:/pub/scm/cogito/cogito.git
>
> should invoke rsync, rpull, throw an error or print a fortune cookie.
>
> -- 
> Petr "Pasky" Baudis
> Stuff: http://pasky.or.cz/
> C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
* Re: More problems...
  2005-05-03 15:00 ` Andreas Gal
@ 2005-05-03 19:18 ` Junio C Hamano
  0 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2005-05-03 19:18 UTC
To: Andreas Gal
Cc: Petr Baudis, Linus Torvalds, Anton Altaparmakov, Russell King,
    Ryan Anderson, git

>>>>> "AG" == Andreas Gal <gal@uci.edu> writes:

AG> I am just soft-linking objects/ in the branched tree. I can live with
AG> dangling objects, branching is extremely fast, and diskspace is cheap
AG> anyway. The only downside is that it doesn't work too well with rsync as
AG> network protocol,...

I usually do not use symlinks myself, but doesn't "rsync -L" work for
you?
* Re: More problems...
  2005-04-29 21:07 ` Junio C Hamano
  2005-04-29 21:19 ` Russell King
@ 2005-04-29 21:27 ` Daniel Barkalow
  2005-04-29 22:01 ` Junio C Hamano
  1 sibling, 1 reply; 24+ messages in thread
From: Daniel Barkalow @ 2005-04-29 21:27 UTC
To: Junio C Hamano
Cc: Linus Torvalds, Ryan Anderson, Petr Baudis, Russell King, git

On Fri, 29 Apr 2005, Junio C Hamano wrote:

> >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
>
> LT> Absolutely. I use the same "git-pull-script" between two local directories
> LT> on disk...
> LT> Of course, I don't bother with the linking. But that's the trivial part.
>
> Would it be useful if somebody wrote local-pull.c similar to
> http-pull.c, which clones one local SHA_FILE_DIRECTORY to
> another, with an option to (1) try hardlink and if it fails
> fail; (2) try hardlink and if it fails try symlink and if it
> fails fail; (3) try hardlink and if it fails try copy and if it
> fails fail?

If someone does this, they should make a pull.c out of http-pull and
rpull; the logic for determining what you need to copy, given what you
have and what the user wants to have, should be shared. (Note that some
usage patterns only require the latest commit, or at least can deal with
fetching other stuff only when needed.)

	-Daniel
*This .sig left intentionally blank*
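The three fallback policies Junio lists could be sketched roughly as below. This is only an illustration under stated assumptions, not the local-pull.c that was eventually written: the names `link_policy`, `copy_file` and `transfer_object` are hypothetical, and treating an existing destination as success is an assumption that relies on object files being named after their content hash.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical policy names; they do not come from any git source file. */
enum link_policy {
	LINK_ONLY,		/* (1) hardlink, and if it fails, fail */
	LINK_THEN_SYMLINK,	/* (2) hardlink, then symlink, then fail */
	LINK_THEN_COPY		/* (3) hardlink, then copy, then fail */
};

/* Plain byte copy; the result is created read-only (0444). */
static int copy_file(const char *src, const char *dst)
{
	char buf[8192];
	ssize_t n;
	int in, out, ret = 0;

	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;
	out = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0444);
	if (out < 0) {
		close(in);
		return -1;
	}
	while ((n = read(in, buf, sizeof(buf))) > 0) {
		/* A short write is treated as a failure in this sketch. */
		if (write(out, buf, n) != n) {
			ret = -1;
			break;
		}
	}
	if (n < 0)
		ret = -1;
	close(in);
	if (close(out) < 0)
		ret = -1;
	if (ret)
		unlink(dst);	/* do not leave a partial object behind */
	return ret;
}

/* Returns 0 on success, -1 on failure. */
int transfer_object(const char *src, const char *dst, enum link_policy policy)
{
	assert(src && dst);

	if (!link(src, dst))
		return 0;
	/* Assumption: same name means same content, so "exists" is fine. */
	if (errno == EEXIST)
		return 0;

	switch (policy) {
	case LINK_THEN_SYMLINK:
		/* src should be an absolute path for the symlink to resolve. */
		return symlink(src, dst);
	case LINK_THEN_COPY:
		return copy_file(src, dst);
	case LINK_ONLY:
	default:
		return -1;
	}
}
```

A caller would invoke `transfer_object()` once per missing object file and pick the policy from a command-line option. Accepting `EEXIST` is also what avoids the "File exists" failures from `cp -lua` reported at the start of the thread.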
* Re: More problems...
  2005-04-29 21:27 ` Daniel Barkalow
@ 2005-04-29 22:01 ` Junio C Hamano
  2005-04-30  5:36 ` [PATCH] Split out "pull" from particular methods Daniel Barkalow
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2005-04-29 22:01 UTC
To: Daniel Barkalow
Cc: Linus Torvalds, Ryan Anderson, Petr Baudis, Russell King, git

>>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:

DB> If someone does this, they should make a pull.c out of http-pull and
DB> rpull; the logic for determining what you need to copy, given what you
DB> have and what the user wants to have, should be shared.

I agree with your analysis. I was hoping that that someone would be
you, knowing where http-pull originated ;-).
* [PATCH] Split out "pull" from particular methods 2005-04-29 22:01 ` Junio C Hamano @ 2005-04-30 5:36 ` Daniel Barkalow 0 siblings, 0 replies; 24+ messages in thread From: Daniel Barkalow @ 2005-04-30 5:36 UTC (permalink / raw) To: Junio C Hamano, Linus Torvalds Cc: Ryan Anderson, Petr Baudis, Russell King, git The method for deciding what to pull is useful separately from any of the ways of actually fetching the objects. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Split out "pull" functionality from http-pull and rpull Index: Makefile =================================================================== --- 8602fe7cb4bf668fd021ab3bfb2082ac7d535e57/Makefile (mode:100644 sha1:ef9a9fae88a1ac438c22beb50790f0f0e37ffc3c) +++ 41f4697d0ada8e79a2f262aa9b6357a45194f31d/Makefile (mode:100644 sha1:87fe8fef5ebd315f370af882bd3172632b850c02) @@ -82,9 +82,9 @@ git-export: export.c git-diff-cache: diff-cache.c git-convert-cache: convert-cache.c -git-http-pull: http-pull.c +git-http-pull: http-pull.c pull.c git-rpush: rsh.c -git-rpull: rsh.c +git-rpull: rsh.c pull.c git-rev-list: rev-list.c git-mktag: mktag.c git-diff-tree-helper: diff-tree-helper.c Index: http-pull.c =================================================================== --- 8602fe7cb4bf668fd021ab3bfb2082ac7d535e57/http-pull.c (mode:100644 sha1:192dcc370dee47c52c72915394bb6f2a79f64e12) +++ 41f4697d0ada8e79a2f262aa9b6357a45194f31d/http-pull.c (mode:100644 sha1:d877c4abe3ff7766d858bfeac5c9a0eaf1385b65) @@ -7,6 +7,8 @@ #include <errno.h> #include <stdio.h> +#include "pull.h" + #include <curl/curl.h> #include <curl/easy.h> @@ -14,10 +16,6 @@ static char *base; -static int tree = 0; -static int commits = 0; -static int all = 0; - static SHA_CTX c; static z_stream stream; @@ -47,7 +45,7 @@ return size; } -static int fetch(unsigned char *sha1) +int fetch(unsigned char *sha1) { char *hex = sha1_to_hex(sha1); char *filename = sha1_file_name(sha1); @@ -105,77 +103,21 @@ return 0; } -static int process_tree(unsigned 
char *sha1) -{ - struct tree *tree = lookup_tree(sha1); - struct tree_entry_list *entries; - - if (parse_tree(tree)) - return -1; - - for (entries = tree->entries; entries; entries = entries->next) { - if (fetch(entries->item.tree->object.sha1)) - return -1; - if (entries->directory) { - if (process_tree(entries->item.tree->object.sha1)) - return -1; - } - } - return 0; -} - -static int process_commit(unsigned char *sha1) -{ - struct commit *obj = lookup_commit(sha1); - - if (fetch(sha1)) - return -1; - - if (parse_commit(obj)) - return -1; - - if (tree) { - if (fetch(obj->tree->object.sha1)) - return -1; - if (process_tree(obj->tree->object.sha1)) - return -1; - if (!all) - tree = 0; - } - if (commits) { - struct commit_list *parents = obj->parents; - for (; parents; parents = parents->next) { - if (has_sha1_file(parents->item->object.sha1)) - continue; - if (fetch(parents->item->object.sha1)) { - /* The server might not have it, and - * we don't mind. - */ - continue; - } - if (process_commit(parents->item->object.sha1)) - return -1; - } - } - return 0; -} - int main(int argc, char **argv) { char *commit_id; char *url; int arg = 1; - unsigned char sha1[20]; while (arg < argc && argv[arg][0] == '-') { if (argv[arg][1] == 't') { - tree = 1; + get_tree = 1; } else if (argv[arg][1] == 'c') { - commits = 1; + get_history = 1; } else if (argv[arg][1] == 'a') { - all = 1; - tree = 1; - commits = 1; + get_all = 1; + get_tree = 1; + get_history = 1; } arg++; } @@ -186,17 +128,13 @@ commit_id = argv[arg]; url = argv[arg + 1]; - get_sha1_hex(commit_id, sha1); - curl_global_init(CURL_GLOBAL_ALL); curl = curl_easy_init(); base = url; - if (fetch(sha1)) - return 1; - if (process_commit(sha1)) + if (pull(commit_id)) return 1; curl_global_cleanup(); Index: pull.c =================================================================== --- /dev/null (tree:8602fe7cb4bf668fd021ab3bfb2082ac7d535e57) +++ 41f4697d0ada8e79a2f262aa9b6357a45194f31d/pull.c (mode:100644 
sha1:86a7b6901fe69a82c12c3470b456982ef52cebd0) @@ -0,0 +1,77 @@ +#include "pull.h" + +#include "cache.h" +#include "commit.h" +#include "tree.h" + +int get_tree = 0; +int get_history = 0; +int get_all = 0; + +static int process_tree(unsigned char *sha1) +{ + struct tree *tree = lookup_tree(sha1); + struct tree_entry_list *entries; + + if (parse_tree(tree)) + return -1; + + for (entries = tree->entries; entries; entries = entries->next) { + if (fetch(entries->item.tree->object.sha1)) + return -1; + if (entries->directory) { + if (process_tree(entries->item.tree->object.sha1)) + return -1; + } + } + return 0; +} + +static int process_commit(unsigned char *sha1) +{ + struct commit *obj = lookup_commit(sha1); + + if (fetch(sha1)) + return -1; + + if (parse_commit(obj)) + return -1; + + if (get_tree) { + if (fetch(obj->tree->object.sha1)) + return -1; + if (process_tree(obj->tree->object.sha1)) + return -1; + if (!get_all) + get_tree = 0; + } + if (get_history) { + struct commit_list *parents = obj->parents; + for (; parents; parents = parents->next) { + if (has_sha1_file(parents->item->object.sha1)) + continue; + if (fetch(parents->item->object.sha1)) { + /* The server might not have it, and + * we don't mind. + */ + continue; + } + if (process_commit(parents->item->object.sha1)) + return -1; + } + } + return 0; +} + +int pull(char *target) +{ + int retval; + unsigned char sha1[20]; + retval = get_sha1_hex(target, sha1); + if (retval) + return retval; + retval = fetch(sha1); + if (retval) + return retval; + return process_commit(sha1); +} Index: pull.h =================================================================== --- /dev/null (tree:8602fe7cb4bf668fd021ab3bfb2082ac7d535e57) +++ 41f4697d0ada8e79a2f262aa9b6357a45194f31d/pull.h (mode:100644 sha1:314bc7e95ab1a73634f6a96a8a3782fda91ea261) @@ -0,0 +1,18 @@ +#ifndef PULL_H +#define PULL_H + +/** To be provided by the particular implementation. 
**/ +extern int fetch(unsigned char *sha1); + +/** Set to fetch the target tree. */ +extern int get_tree; + +/** Set to fetch the commit history. */ +extern int get_history; + +/** Set to fetch the trees in the commit history. **/ +extern int get_all; + +extern int pull(char *target); + +#endif /* PULL_H */ Index: rpull.c =================================================================== --- 8602fe7cb4bf668fd021ab3bfb2082ac7d535e57/rpull.c (mode:100644 sha1:c27af2c2464de28732b8ad1fff3ed8a0804250d6) +++ 41f4697d0ada8e79a2f262aa9b6357a45194f31d/rpull.c (mode:100644 sha1:6624440d5ad24854e1bd1a8dff628427581198e0) @@ -7,15 +7,12 @@ #include <errno.h> #include <stdio.h> #include "rsh.h" - -static int tree = 0; -static int commits = 0; -static int all = 0; +#include "pull.h" static int fd_in; static int fd_out; -static int fetch(unsigned char *sha1) +int fetch(unsigned char *sha1) { if (has_sha1_file(sha1)) return 0; @@ -23,87 +20,21 @@ return write_sha1_from_fd(sha1, fd_in); } -static int process_tree(unsigned char *sha1) -{ - struct tree *tree = lookup_tree(sha1); - struct tree_entry_list *entries; - - if (parse_tree(tree)) - return -1; - - for (entries = tree->entries; entries; entries = entries->next) { - /* - fprintf(stderr, "Tree %s ", sha1_to_hex(sha1)); - fprintf(stderr, "needs %s\n", - sha1_to_hex(entries->item.tree->object.sha1)); - */ - if (fetch(entries->item.tree->object.sha1)) { - return error("Missing item %s", - sha1_to_hex(entries->item.tree->object.sha1)); - } - if (entries->directory) { - if (process_tree(entries->item.tree->object.sha1)) - return -1; - } - } - return 0; -} - -static int process_commit(unsigned char *sha1) -{ - struct commit *obj = lookup_commit(sha1); - - if (fetch(sha1)) { - return error("Fetching %s", sha1_to_hex(sha1)); - } - - if (parse_commit(obj)) - return -1; - - if (tree) { - if (fetch(obj->tree->object.sha1)) - return -1; - if (process_tree(obj->tree->object.sha1)) - return -1; - if (!all) - tree = 0; - } - if (commits) { - 
struct commit_list *parents = obj->parents; - for (; parents; parents = parents->next) { - if (has_sha1_file(parents->item->object.sha1)) - continue; - if (fetch(parents->item->object.sha1)) { - /* The server might not have it, and - * we don't mind. - */ - error("Missing tree %s; continuing", - sha1_to_hex(parents->item->object.sha1)); - continue; - } - if (process_commit(parents->item->object.sha1)) - return -1; - } - } - return 0; -} - int main(int argc, char **argv) { char *commit_id; char *url; int arg = 1; - unsigned char sha1[20]; while (arg < argc && argv[arg][0] == '-') { if (argv[arg][1] == 't') { - tree = 1; + get_tree = 1; } else if (argv[arg][1] == 'c') { - commits = 1; + get_history = 1; } else if (argv[arg][1] == 'a') { - all = 1; - tree = 1; - commits = 1; + get_all = 1; + get_tree = 1; + get_history = 1; } arg++; } @@ -117,11 +48,7 @@ if (setup_connection(&fd_in, &fd_out, "rpush", url, arg, argv + 1)) return 1; - get_sha1_hex(commit_id, sha1); - - if (fetch(sha1)) - return 1; - if (process_commit(sha1)) + if (pull(commit_id)) return 1; return 0; ^ permalink raw reply [flat|nested] 24+ messages in thread
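The shape of this split, a shared walker that decides what is needed plus a transport-specific fetch(), can be pictured with a toy model. The sketch below is not the patch's code. Small integers stand in for SHA-1 names, each commit has at most one parent, and `toy_commit`, `remote_side`, `have`, `fetch_calls` and `toy_reset` are hypothetical names introduced only for illustration.

```c
#include <assert.h>

#define MAX_OBJECTS 16

/* Toy stand-in for a commit: only its parent id (-1 means no parent). */
struct toy_commit {
	int parent;
};

struct toy_commit remote_side[MAX_OBJECTS];	/* what the other side has */
int have[MAX_OBJECTS];				/* 1 if present locally */
int fetch_calls;				/* number of real transfers */
int get_history;				/* set to walk the parents */

void toy_reset(void)
{
	int i;

	for (i = 0; i < MAX_OBJECTS; i++) {
		remote_side[i].parent = -1;
		have[i] = 0;
	}
	fetch_calls = 0;
	get_history = 0;
}

/* Transport-specific part: each backend would supply its own version. */
int fetch(int id)
{
	if (id < 0 || id >= MAX_OBJECTS)
		return -1;
	if (have[id])
		return 0;
	fetch_calls++;
	have[id] = 1;
	return 0;
}

/* Shared part: decides which objects are needed and asks fetch() for them. */
int pull(int target)
{
	int id;

	if (fetch(target))
		return -1;
	if (!get_history)
		return 0;
	for (id = remote_side[target].parent; id >= 0;
	     id = remote_side[id].parent) {
		if (have[id])
			break;	/* history from here on is already local */
		if (fetch(id))
			break;	/* the other side might not have it */
	}
	return 0;
}
```

With this layout an HTTP, rsh, or local backend only differs in fetch(), while the decision about which parents to skip lives in one place.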
* [PATCH] Add git-relink-script, a tool to hardlink two existing repositories.
  2005-04-29 20:21 ` Linus Torvalds
  2005-04-29 21:07 ` Junio C Hamano
@ 2005-05-04  5:54 ` Ryan Anderson
  1 sibling, 0 replies; 24+ messages in thread
From: Ryan Anderson @ 2005-05-04 5:54 UTC
To: Linus Torvalds; +Cc: Petr Baudis, git

Add git-relink-script, which will find common objects in two git
repositories and replace one copy with a hardlink.

Signed-Off-By: Ryan Anderson <ryan@michonline.com>

---
commit a3bcc763d71bdb91a3b48e9105fbaa5e79abb807
tree 2553e2d8befbe0cda3e413616fd4cc7bf04157ad
parent a31c6d022e2435a514fcc8ca57f9995c4376a986
author Ryan Anderson <ryan@mythryan2.(none)> 1115185675 -0400
committer Ryan Anderson <ryan@michonline.com> 1115185675 -0400

Index: Makefile
===================================================================
--- 51a882a2dc62e0d3cdc79e0badc61559fb723481/Makefile  (mode:100644 sha1:99b4753d34879842b972da9b68694c9d0485f216)
+++ 2553e2d8befbe0cda3e413616fd4cc7bf04157ad/Makefile  (mode:100644 sha1:a99665e252a2342caa84238e886a80a5f27ac3c8)
@@ -13,7 +13,7 @@
 AR=ar
 
 SCRIPTS=git-apply-patch-script git-merge-one-file-script git-prune-script \
-	git-pull-script git-tag-script
+	git-pull-script git-tag-script git-relink-script
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
 	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
Index: git-relink-script
===================================================================
--- /dev/null  (tree:51a882a2dc62e0d3cdc79e0badc61559fb723481)
+++ 2553e2d8befbe0cda3e413616fd4cc7bf04157ad/git-relink-script  (mode:100644 sha1:78c954edcc370d8be951c856bfbfd38975d08348)
@@ -0,0 +1,115 @@
+#!/usr/bin/env perl
+# Copyright 2005, Ryan Anderson <ryan@michonline.com>
+# Distribution permitted under the GPL v2, as distributed
+# by the Free Software Foundation.
+# Later versions of the GPL at the discretion of Linus Torvalds
+#
+# Scan two git object-trees, and hardlink any common objects between them.
+
+use 5.006;
+use strict;
+use warnings;
+
+sub get_canonical_form($);
+sub do_scan_directory($$$);
+sub compare_two_files($$);
+
+# stats
+my $linked = 0;
+my $already = 0;
+
+my ($dir1, $dir2) = @ARGV;
+
+if (!defined $dir1 || !defined $dir2) {
+	print("Usage: $0 <dir1> <dir2>\nBoth dir1 and dir2 should contain a .git/objects/ subdirectory.\n");
+	exit(1);
+}
+
+$dir1 = get_canonical_form($dir1);
+$dir2 = get_canonical_form($dir2);
+
+printf("Searching '%s' and '%s' for common objects and hardlinking them...\n",$dir1,$dir2);
+
+opendir(D,$dir1 . "objects/")
+	or die "Failed to open $dir1/objects/ : $!";
+
+my @hashdirs = grep !/^\.{1,2}$/, readdir(D);
+foreach my $hashdir (@hashdirs) {
+	do_scan_directory($dir1, $hashdir, $dir2);
+}
+
+printf("Linked %d files, %d were already linked.\n",$linked, $already);
+
+
+sub do_scan_directory($$$) {
+	my ($srcdir, $subdir, $dstdir) = @_;
+
+	my $sfulldir = sprintf("%sobjects/%s/",$srcdir,$subdir);
+	my $dfulldir = sprintf("%sobjects/%s/",$dstdir,$subdir);
+
+	opendir(S,$sfulldir)
+		or die "Failed to opendir $sfulldir: $!";
+
+	foreach my $file (grep(!/^\.{1,2}$/, readdir(S))) {
+		my $sfilename = $sfulldir . $file;
+		my $dfilename = $dfulldir . $file;
+
+		compare_two_files($sfilename,$dfilename);
+
+	}
+	closedir(S);
+}
+
+sub compare_two_files($$) {
+	my ($sfilename, $dfilename) = @_;
+
+	# Perl's stat returns relevant information as follows:
+	# 0 = dev number
+	# 1 = inode number
+	# 7 = size
+	my @sstatinfo = stat($sfilename);
+	my @dstatinfo = stat($dfilename);
+
+	if (@sstatinfo == 0 && @dstatinfo == 0) {
+		die sprintf("Stat of both %s and %s failed: %s\n",$sfilename, $dfilename, $!);
+
+	} elsif (@dstatinfo == 0) {
+		return;
+	}
+
+	if ( ($sstatinfo[0] == $dstatinfo[0]) &&
+	     ($sstatinfo[1] != $dstatinfo[1])) {
+		if ($sstatinfo[7] == $dstatinfo[7]) {
+			unlink($dfilename)
+				or die "Unlink of $dfilename failed: $!\n";
+
+			link($sfilename,$dfilename)
+				or die "Failed to link $sfilename to $dfilename: $!\n" .
+					"Git Repository containing $dfilename is probably corrupted, please copy '$sfilename' to '$dfilename' to fix.\n";
+
+			$linked++;
+
+		} else {
+			die sprintf("ERROR: File sizes are not the same, cannot relink %s to %s.\n",
+				$sfilename, $dfilename);
+		}
+
+	} elsif ( ($sstatinfo[0] == $dstatinfo[0]) &&
+		  ($sstatinfo[1] == $dstatinfo[1])) {
+		$already++;
+	}
+}
+
+sub get_canonical_form($) {
+	my $dir = shift;
+	my $original = $dir;
+
+	die "$dir is not a directory." unless -d $dir;
+
+	$dir .= "/" unless $dir =~ m#/$#;
+	$dir .= ".git/" unless $dir =~ m#\.git/$#;
+
+	die "$original does not have a .git/ subdirectory.\n" unless -d $dir;
+
+	return $dir;
+}

-- 

Ryan Anderson
  sometimes Pug Majere
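The per-file decision in compare_two_files (same device, different inode, same size) can be restated in C for comparison with the pull code earlier in the thread. This is a hedged sketch rather than a port of the script: `relink_pair` and the `relink_result` values are hypothetical names, and unlike the Perl version it reports a missing source as an error instead of only dying when both stat calls fail.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical result codes, introduced only for this illustration. */
enum relink_result {
	RELINK_ERROR = -1,
	RELINK_SKIPPED,		/* nothing to do for this pair */
	RELINK_ALREADY,		/* both names already share one inode */
	RELINK_LINKED		/* dst was replaced by a hardlink to src */
};

int relink_pair(const char *src, const char *dst)
{
	struct stat s, d;

	assert(src && dst);

	if (stat(src, &s) < 0)
		return RELINK_ERROR;
	if (stat(dst, &d) < 0)
		return RELINK_SKIPPED;	/* object only exists on one side */
	if (s.st_dev != d.st_dev)
		return RELINK_SKIPPED;	/* hardlinks cannot cross devices */
	if (s.st_ino == d.st_ino)
		return RELINK_ALREADY;
	if (s.st_size != d.st_size) {
		fprintf(stderr, "sizes differ, not relinking %s to %s\n",
			src, dst);
		return RELINK_ERROR;
	}
	/* Same order as the script: remove the copy, then link it back. */
	if (unlink(dst) < 0)
		return RELINK_ERROR;
	if (link(src, dst) < 0)
		return RELINK_ERROR;
	return RELINK_LINKED;
}
```

Like the script, this leaves a short window between unlink() and link() in which the destination object is missing. Linking to a temporary name in the same directory and then calling rename() over the destination would close that window, since rename() replaces the target atomically.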
* Re: More problems...
  2005-04-29 19:50 ` Ryan Anderson
  2005-04-29 20:03 ` Thomas Glanzmann
  2005-04-29 20:21 ` Linus Torvalds
@ 2005-05-02 21:13 ` Petr Baudis
  2 siblings, 0 replies; 24+ messages in thread
From: Petr Baudis @ 2005-05-02 21:13 UTC
To: Ryan Anderson; +Cc: Russell King, git

Dear diary, on Fri, Apr 29, 2005 at 09:50:55PM CEST, I got a letter
where Ryan Anderson <ryan@michonline.com> told me that...
> Why not just use "rsync" for both remote and local synchronization, and
> provide a "relink" command to scan two .git/objects/ repositories and
> hardlink matching files together?

No. This completely misses the point, which is to avoid useless I/O when
doing this local stuff; it also saves disk space to a degree, though
the saving fluctuates wildly. I like Junio's local-pull solution much
more (from the conceptual standpoint; I didn't look at the code yet).

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
end of thread, other threads:[~2005-05-04 5:48 UTC | newest]

Thread overview: 24+ messages
2005-04-29 16:01 More problems Russell King
2005-04-29 16:12 ` Russell King
2005-04-29 17:51 ` Linus Torvalds
2005-04-29 18:27 ` Petr Baudis
2005-04-29 19:50 ` Ryan Anderson
2005-04-29 20:03 ` Thomas Glanzmann
2005-04-29 20:21 ` Linus Torvalds
2005-04-29 21:07 ` Junio C Hamano
2005-04-29 21:19 ` Russell King
2005-04-29 21:57 ` Anton Altaparmakov
2005-05-02 19:33 ` Petr Baudis
2005-05-02 19:44 ` Dave Kleikamp
2005-05-02 19:51 ` Thomas Glanzmann
2005-05-02 22:01 ` Anton Altaparmakov
2005-05-02 22:19 ` Linus Torvalds
2005-05-03 1:48 ` Petr Baudis
2005-05-03 2:56 ` Daniel Barkalow
2005-05-03 15:00 ` Andreas Gal
2005-05-03 19:18 ` Junio C Hamano
2005-04-29 21:27 ` Daniel Barkalow
2005-04-29 22:01 ` Junio C Hamano
2005-04-30 5:36 ` [PATCH] Split out "pull" from particular methods Daniel Barkalow
2005-05-04 5:54 ` [PATCH] Add git-relink-script, a tool to hardlink two existing repositories Ryan Anderson
2005-05-02 21:13 ` More problems Petr Baudis