* git-relink status (or bug?)
From: Marc Zonzon @ 2008-06-21 10:36 UTC
To: git

When trying to use git-relink, I found it quite disappointing when it
goes over packs.

git-relink seems to assume that there is a unique mapping from object
name to object content. That is acceptable for loose objects, which are
named by their SHA-1, but false for .pack and .idx files: two packs with
the same name contain the same objects, but the objects may not be packed
in the same order or with the same compression. Moreover, a .idx file
cannot be considered alone; it depends on the associated .pack.

When you have two different packs with the same name but different sizes,
git-relink does not hard link the .pack files because the sizes differ,
but it does hard link the .idx files. Your repository is then corrupted.

This happens when you clone a repository, repack the clone, and relink
the clone to the original one.

I found very little information about git-relink, but as it appears in
the changelog of v1.5.4 I suppose it is not obsolete.

What about the use of this script?

Marc

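
The failure mode described above can be spotted before relinking. The following is only a rough Python sketch, not part of git-relink: the function name `half_linked_packs` is hypothetical, and it assumes both arguments are `objects/pack` directories. It reports pack names for which a size-only comparison would link the .idx but leave the .pack alone.

```python
import os

def half_linked_packs(src_pack_dir, dst_pack_dir):
    """List pack base names where a size-only check would hard link
    the .idx files but not the .pack files (hypothetical helper)."""
    risky = []
    for name in sorted(os.listdir(src_pack_dir)):
        base, ext = os.path.splitext(name)
        if ext != ".idx":
            continue
        sizes = {}
        try:
            for e in (".idx", ".pack"):
                sizes[e] = (
                    os.path.getsize(os.path.join(src_pack_dir, base + e)),
                    os.path.getsize(os.path.join(dst_pack_dir, base + e)),
                )
        except OSError:
            continue  # one of the four files is missing; nothing to compare
        idx_same_size = sizes[".idx"][0] == sizes[".idx"][1]
        pack_same_size = sizes[".pack"][0] == sizes[".pack"][1]
        if idx_same_size and not pack_same_size:
            risky.append(base)
    return risky
```

An empty result only means this particular pattern was not found; it says nothing about whether same-sized files are byte-identical.
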
* Re: git-relink status (or bug?)
From: Junio C Hamano @ 2008-06-21 19:22 UTC
To: Marc Zonzon; +Cc: git

Marc Zonzon <marc.zonzon+git@gmail.com> writes:

> I found very little information about git-relink, but as it appears in
> the changelog of v1.5.4 I suppose it is not obsolete.

I do not think anybody uses it these days.  Instead either they clone with
reference (or -s), or perhaps use new-workdir.

Here is a totally untested fix.

The "careful" part can be made much more clever and efficient by learning
implementation details about the .idx file (it has the checksum for itself
and the checksum for its .pack file at the end) but I did not bother.

I do not think this in its current shape is committable, without
improvements and success reports from the list.  Hint, hint...

 git-relink.perl |   26 ++++++++++++++++++--------
 1 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/git-relink.perl b/git-relink.perl
index 15fb932..68e0f0e 100755
--- a/git-relink.perl
+++ b/git-relink.perl
@@ -10,10 +10,11 @@ use 5.006;
 use strict;
 use warnings;
 use Getopt::Long;
+use File::Compare;
 
 sub get_canonical_form($);
 sub do_scan_directory($$$);
-sub compare_two_files($$);
+sub compare_and_link($$$);
 sub usage();
 sub link_two_files($$);
 
@@ -67,6 +68,7 @@ sub do_scan_directory($$$) {
 
 	my $sfulldir = sprintf("%sobjects/%s/",$srcdir,$subdir);
 	my $dfulldir = sprintf("%sobjects/%s/",$dstdir,$subdir);
+	my $careful = ($subdir eq 'pack');
 
 	opendir(S,$sfulldir)
 		or die "Failed to opendir $sfulldir: $!";
@@ -75,14 +77,14 @@ sub do_scan_directory($$$) {
 		my $sfilename = $sfulldir . $file;
 		my $dfilename = $dfulldir . $file;
 
-		compare_two_files($sfilename,$dfilename);
+		compare_and_link($sfilename, $dfilename, $careful);
 
 	}
 	closedir(S);
 }
 
-sub compare_two_files($$) {
-	my ($sfilename, $dfilename) = @_;
+sub compare_and_link($$$) {
+	my ($sfilename, $dfilename, $careful) = @_;
 
 	# Perl's stat returns relevant information as follows:
 	# 0 = dev number
@@ -100,12 +102,20 @@ sub compare_two_files($$) {
 
 	if ( ($sstatinfo[0] == $dstatinfo[0]) &&
 	     ($sstatinfo[1] != $dstatinfo[1])) {
-		if ($sstatinfo[7] == $dstatinfo[7]) {
+		my $differs = undef;
+		if ($sstatinfo[7] != $dstatinfo[7]) {
+			$differs = "size";
+		}
+		if (!$differs && $careful) {
+			if (File::Compare::compare($sfilename, $dfilename)) {
+				$differs = "contents";
+			}
+		}
+		if (!$differs) {
 			link_two_files($sfilename, $dfilename);
-
 		} else {
-			my $err = sprintf("ERROR: File sizes are not the same, cannot relink %s to %s.\n",
-				$sfilename, $dfilename);
+			my $err = sprintf("ERROR: File differs (%s), cannot relink %s to %s.\n",
+				$differs, $sfilename, $dfilename);
 			if ($fail_on_different_sizes) {
 				die $err;
 			} else {

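
The remark about the checksums at the end of the .idx file suggests a cheaper "careful" test than a full byte comparison. Below is a minimal Python sketch of that idea, not the patch's method. It assumes SHA-1 repositories and the trailer layout given in Git's pack format documentation: a .pack ends with a 20-byte checksum of its contents, and a .idx ends with a copy of that pack checksum followed by its own 20-byte checksum. The helper names are hypothetical.

```python
import os

SHA1_LEN = 20  # size in bytes of a raw SHA-1 digest

def read_tail(path, offset_from_end, length=SHA1_LEN):
    """Read `length` bytes starting `offset_from_end` bytes before EOF."""
    with open(path, "rb") as f:
        f.seek(-offset_from_end, os.SEEK_END)
        return f.read(length)

def pack_checksum(pack_path):
    """Checksum stored as the last 20 bytes of a .pack file."""
    return read_tail(pack_path, SHA1_LEN)

def idx_pack_checksum(idx_path):
    """Copy of the pack checksum stored just before the .idx's own checksum."""
    return read_tail(idx_path, 2 * SHA1_LEN)

def idx_matches_pack(idx_path, pack_path):
    """True if the .idx records the checksum of exactly this .pack."""
    return idx_pack_checksum(idx_path) == pack_checksum(pack_path)
```

With helpers like these, two packs could be treated as identical when `pack_checksum` agrees on both sides, and an .idx would only be linked together with the .pack it actually describes. The sketch trusts the stored trailers and does not recompute any checksum.
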
* Re: git-relink status (or bug?)
From: marc zonzon @ 2008-06-21 20:23 UTC
To: Junio C Hamano; +Cc: git

Thank you for your answer.

On Sat, Jun 21, 2008 at 9:22 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> I do not think anybody uses it these days.  Instead either they clone with
> reference (or -s), or perhaps use new-workdir.

The goal of git-relink is analogous to the default local clone. Hard
links can be safer than sharing, because you don't lose anything when the
origin directory resets a branch. I note that git-clone(1) warns about
the use of -s but not about --reference, although they seem identical in
this respect. In many cases you cannot assume that your alternate will
keep your objects forever.

I recently posted such a case study,
http://thread.gmane.org/gmane.comp.version-control.git/85407
and when trying hard links I found this bug. It turned out that sharing
was the better solution (but I could only set it up with the help of
Shawn's answer!).

new-workdir also seems to be a nice script, one that I had never looked
at before. (But why is there no documentation for these contrib scripts?)

>
> Here is a totally untested fix.
>
> The "careful" part can be made much more clever and efficient by learning
> implementation details about the .idx file (it has the checksum for itself
> and the checksum for its .pack file at the end) but I did not bother.

Thank you. I see that you take only the safe way: don't hard link if
something is different. But there would be a more efficient one: when the
packs have the same name, link them, and link the .idx files as well. If
they have the same name they contain the same objects (with a fair
probability!).

I cannot provide a patch for that, because I'm not a Perl programmer, and
I'm too lazy to rewrite it in C or Python!

> I do not think this in its current shape is committable, without
> improvements and success reports from the list.  Hint, hint...

Being "Perl challenged" I cannot proofread the script, but I can at least
test it, though only on the trivial test cases that make git-relink fail!
(I have tried only once to use it, to solve the previously cited problem.)

Marc

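
The alternative suggested above, linking a same-named .pack and its .idx together as one unit, could look roughly like the following Python sketch. It is an illustration only: `link_pack_pair` and the `.relink-tmp` suffix are hypothetical names, and the code relies on the stated assumption that two packs with the same name contain the same objects, which it does not verify.

```python
import os

def link_pack_pair(src_pack_dir, dst_pack_dir, base):
    """Replace dst's base.pack and base.idx with hard links to src's copies.

    Assumes (without checking) that same-named packs hold the same objects.
    """
    staged = []
    try:
        for ext in (".pack", ".idx"):
            src = os.path.join(src_pack_dir, base + ext)
            dst = os.path.join(dst_pack_dir, base + ext)
            tmp = dst + ".relink-tmp"  # hypothetical temporary name
            os.link(src, tmp)  # raises OSError if src is missing or on another filesystem
            staged.append((tmp, dst))
    except OSError:
        # Leave the destination untouched if either link could not be staged.
        for tmp, _ in staged:
            os.unlink(tmp)
        raise
    for tmp, dst in staged:
        os.replace(tmp, dst)  # swap each staged link into place
```

Both links are staged before either file is replaced, so a failure while staging leaves the destination pair as it was. The two final replacements are still separate steps, so a reader of the destination repository could briefly see the new .pack with the old .idx.
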