git.vger.kernel.org archive mirror
* git-svn: File added multiple times?
From: Florian Weimer @ 2006-12-02 21:38 UTC
  To: git

Is this expected?

$ sort /tmp/git-svn-output | uniq -c | sort -rn | head
      4         A       mlton/trunk/doc/web/papers/index.html
      4         A       mlton/trunk/doc/web/papers/01-icfp.ps.gz
      4         A       mlton/trunk/doc/web/papers/00-esop.ps.gz
      4         A       mlton/trunk/doc/examples/save-world/save-world.sml
      4         A       mlton/trunk/doc/examples/save-world/Makefile
      4         A       mlton/trunk/doc/examples/profiling/profiling.sml
      4         A       mlton/trunk/doc/examples/profiling/Makefile
      4         A       mlton/trunk/doc/examples/ffi/Makefile
      4         A       mlton/trunk/doc/examples/ffi/main.sml
      4         A       mlton/trunk/doc/examples/ffi/ffi.h
$ 

It's somewhat counter-intuitive, at least.  This is with Debian's
git-core 1.4.4.1-1 package, and the SVN:: Perl modules are installed.


* Re: git-svn: File added multiple times?
From: Eric Wong @ 2006-12-02 22:34 UTC
  To: Florian Weimer; +Cc: git

Florian Weimer <fw@deneb.enyo.de> wrote:
> Is this expected?
> 
> $ sort /tmp/git-svn-output | uniq -c | sort -rn | head
>       4         A       mlton/trunk/doc/web/papers/index.html
>       4         A       mlton/trunk/doc/web/papers/01-icfp.ps.gz
>       4         A       mlton/trunk/doc/web/papers/00-esop.ps.gz
>       4         A       mlton/trunk/doc/examples/save-world/save-world.sml
>       4         A       mlton/trunk/doc/examples/save-world/Makefile
>       4         A       mlton/trunk/doc/examples/profiling/profiling.sml
>       4         A       mlton/trunk/doc/examples/profiling/Makefile
>       4         A       mlton/trunk/doc/examples/ffi/Makefile
>       4         A       mlton/trunk/doc/examples/ffi/main.sml
>       4         A       mlton/trunk/doc/examples/ffi/ffi.h
> $ 
> 
> It's somewhat counter-intuitive, at least.  This is with Debian's
> git-core 1.4.4.1-1 package, and the SVN:: Perl modules are installed.

No it's not expected.  Is this on a public SVN repo I can look at?
Thanks.

git-svn 1.4.4.1 always cat-ed the entire file (this code was stolen from
git-svnimport; the version in master can transfer deltas).

This is (or only seems to be) a UI reporting error and the actual data
imported should be correct.

-- 


* Re: git-svn: File added multiple times?
From: Florian Weimer @ 2006-12-02 22:38 UTC
  To: Eric Wong; +Cc: git

* Eric Wong:

>> Is this expected?
>> It's somewhat counter-intuitive, at least.  This is with Debian's
>> git-core 1.4.4.1-1 package, and the SVN:: Perl modules are installed.
>
> No it's not expected.  Is this on a public SVN repo I can look at?
> Thanks.

This is the svn://mlton.org/mlton/trunk repository.  The second commit
shows this behavior, but it's a bit large.

> This is (or only seems to be) a UI reporting error and the actual data
> imported should be correct.

I think it might download the data multiple times as well (at least
the timing suggests that).  The generated repository seems to be fine,
though.  The import is still running, so I haven't done any


* Re: git-svn: File added multiple times?
From: Florian Weimer @ 2006-12-02 22:41 UTC
  To: Eric Wong; +Cc: git

* Florian Weimer:

> * Eric Wong:
>
>>> Is this expected?
>>> It's somewhat counter-intuitive, at least.  This is with Debian's
>>> git-core 1.4.4.1-1 package, and the SVN:: Perl modules are installed.
>>
>> No it's not expected.  Is this on a public SVN repo I can look at?
>> Thanks.
>
> This is the svn://mlton.org/mlton/trunk repository.  The second commit
> shows this behavior, but it's a bit large.

It also occurs with r2048, which is smaller:

[...]
        A       mlton/trunk/doc/examples/finalizable/cons.c
        A       mlton/trunk/doc/examples/finalizable/cons.c
[...]


* [PATCH] git-svn: avoid fetching files twice in the same revision
From: Eric Wong @ 2006-12-03  0:19 UTC
  To: Florian Weimer; +Cc: git

SVN is not entirely consistent in returning log information and
sometimes returns file information when adding subdirectories,
and sometimes it does not (only returning information about the
directory that was added).  This caused git-svn to occasionally
add a file to the list of files to be fetched twice.  Now we
change the data structure to be a hash to avoid repeated fetches.

As of now (in master), this only affects repositories fetched
without deltas enabled (file://, and when manually overridden
with GIT_SVN_DELTA_FETCH=0); so this bug mainly affects users of
1.4.4.1 and maint.

Thanks to Florian Weimer for reporting this bug.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
 git-svn.perl |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index 3891122..d0bd0bd 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -2982,7 +2982,7 @@ sub libsvn_fetch_delta {
 sub libsvn_fetch_full {
 	my ($last_commit, $paths, $rev, $author, $date, $msg) = @_;
 	open my $gui, '| git-update-index -z --index-info' or croak $!;
-	my @amr;
+	my %amr;
 	my $p = $SVN->{svn_path};
 	foreach my $f (keys %$paths) {
 		my $m = $paths->{$f}->action();
@@ -3001,7 +3001,7 @@ sub libsvn_fetch_full {
 		my $t = $SVN->check_path($f, $rev, $pool);
 		if ($t == $SVN::Node::file) {
 			if ($m =~ /^[AMR]$/) {
-				push @amr, [ $m, $f ];
+				$amr{$f} = $m;
 			} else {
 				die "Unrecognized action: $m, ($f r$rev)\n";
 			}
@@ -3009,13 +3009,13 @@ sub libsvn_fetch_full {
 			my @traversed = ();
 			libsvn_traverse($gui, '', $f, $rev, \@traversed);
 			foreach (@traversed) {
-				push @amr, [ $m, $_ ]
+				$amr{$_} = $m;
 			}
 		}
 		$pool->clear;
 	}
-	foreach (@amr) {
-		libsvn_get_file($gui, $_->[1], $rev, $_->[0]);
+	foreach (keys %amr) {
+		libsvn_get_file($gui, $_, $rev, $amr{$_});
 	}
 	close $gui or croak $?;
 	return libsvn_log_entry($rev, $author, $date, $msg, [$last_commit]);
-- 
1.4.4.1.gdf6b
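
The core of the patch above is swapping an array of [action, path] pairs for a hash keyed by path. A minimal standalone sketch of that idea follows; the path list and the fake_get_file() stub are made up for illustration and are not part of git-svn, and the real code calls libsvn_get_file() instead.

```perl
use strict;
use warnings;

# Hypothetical input: the same path reported twice for one revision,
# once directly and once via a directory traversal.
my @reported = (
    [ 'A', 'doc/examples/finalizable/cons.c' ],
    [ 'A', 'doc/examples/finalizable/Makefile' ],
    [ 'A', 'doc/examples/finalizable/cons.c' ],
);

# Old approach: an array keeps every entry, duplicates included.
my @amr;
push @amr, [ $_->[0], $_->[1] ] foreach @reported;

# New approach: a hash keyed by path keeps one entry per file.
my %amr;
$amr{ $_->[1] } = $_->[0] foreach @reported;

# Stand-in for libsvn_get_file(): only records what would be fetched.
my @fetched;
sub fake_get_file {
    my ($path, $action) = @_;
    push @fetched, "$action $path";
}

fake_get_file($_, $amr{$_}) foreach keys %amr;

printf "array entries: %d, hash entries: %d\n",
    scalar @amr, scalar keys %amr;
```

With this input the array holds three entries and the hash holds two, so each file is fetched once. One consequence of keying by path: if a path were reported with two different actions, only the last one assigned would survive.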


* Re: [PATCH] git-svn: avoid fetching files twice in the same revision
From: Florian Weimer @ 2006-12-03 16:42 UTC
  To: Eric Wong; +Cc: git

* Eric Wong:

> +	foreach (keys %amr) {
> +		libsvn_get_file($gui, $_, $rev, $amr{$_});

You could throw in a "sort".  Perhaps the improved locality helps the
server a bit.
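
A rough sketch of what adding the "sort" might look like, using an invented %amr with sample paths rather than real git-svn state. Perl returns hash keys in no particular order, so sorting them makes the fetch order deterministic and groups files from the same directory together. Whether that actually helps the server is an assumption that this sketch does not measure.

```perl
use strict;
use warnings;

# Hypothetical contents of %amr after the log entries were processed.
my %amr = (
    'doc/web/papers/index.html'       => 'A',
    'doc/examples/ffi/main.sml'       => 'A',
    'doc/examples/ffi/Makefile'       => 'A',
    'doc/examples/profiling/Makefile' => 'M',
);

# Plain string sort: paths sharing a directory prefix end up adjacent.
my @order = sort keys %amr;

# In git-svn this loop body would be the libsvn_get_file() call.
print "$amr{$_}\t$_\n" foreach @order;
```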

