git.vger.kernel.org archive mirror
* [PATCH] Teach git-svn how to catch up with its tracking branches
@ 2008-05-08  1:39 Steven Grimm
  2008-05-08  1:55 ` Junio C Hamano
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Steven Grimm @ 2008-05-08  1:39 UTC (permalink / raw)
  To: git

In environments where a lot of people are sharing an svn repository using
git-svn, everyone has identical, but individually maintained, tracking
branches. If the svn repository is very active, it can take a while to
run "git svn fetch" (which has to individually construct each revision
by querying the svn server). It's much faster to run "git fetch" against
another git-svn repository to grab the exact same git revisions you'd get
from "git svn fetch". But until now, git-svn was confused by this because
it didn't know how to incrementally rebuild its map of revision IDs.
The only choice was to completely remove the map file and rebuild it
from scratch, possibly a lengthy operation when there's a lot of history.

With this change, git-svn will try to do an incremental update of its
revision map if it sees that its tracking branch has svn revisions that
aren't in the map yet.

Signed-off-by: Steven Grimm <koreth@midwinter.com>
---
 git-svn.perl |   62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index e47b1ea..87b104b 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -1382,6 +1382,7 @@ sub fetch_all {
 				$base = $lr if ($lr < $base);
 			}
 			push @gs, $gs;
+			$gs->sync_rev_map_with_commits;
 		}
 	}
 
@@ -2114,6 +2115,44 @@ sub gc {
 	command_noisy('gc', '--auto');
 };
 
+# sync_rev_map_with_commits:
+# If there are commits on the tracking branch that aren't present in our revision
+# map (e.g., because the user has done a git fetch from another git-svn repo
+# rather than a git svn fetch), bring our revision map up to date. This is
+# a no-op if the revision map is already up to date.
+sub sync_rev_map_with_commits {
+	my ($self) = @_;
+	# If we can't pull metadata out of log messages, there's nothing
+	# to import.
+	return if $self->use_svm_props || $self->no_metadata;
+	# If there isn't a revision DB yet, we'll rebuild it from scratch
+	# elsewhere, so don't do anything here.
+	return if ! -e $self->map_path || -z $self->map_path;
+	# Look at the most recent commit with a git-svn-id line.
+	my ($log, $ctx) =
+	    command_output_pipe(qw/rev-list --pretty=raw --no-color /,
+				'--grep=^ *git-svn-id:',
+				'--max-count=1',
+				$self->refname, '--');
+	my ($url, $rev, $uuid, $c);
+	while (<$log>) {
+		if ( m{^commit ($::sha1)$} ) {
+			$c = $1;
+			next;
+		}
+		next unless s{^\s*(git-svn-id:)}{$1};
+		($url, $rev, $uuid) = ::extract_metadata($_);
+	}
+	my ($rev_commit) = $self->rev_map_get($rev, $uuid);
+	if (!$rev_commit) {
+		# The most recent commit in the branch isn't in our
+		# rev map. Pull in data from the revisions between the
+		# highest commit in our map and the head of the branch.
+		my ($max_rev, $max_commit) = $self->rev_map_max(1);
+		$self->rebuild($max_commit);
+	}
+}
+
 sub do_git_commit {
 	my ($self, $log_entry) = @_;
 	my $lr = $self->last_rev;
@@ -2489,6 +2528,7 @@ sub make_log_entry {
 sub fetch {
 	my ($self, $min_rev, $max_rev, @parents) = @_;
 	my ($last_rev, $last_commit) = $self->last_rev_commit;
+	$self->sync_rev_map_with_commits;
 	my ($base, $head) = $self->get_fetch_range($min_rev, $max_rev);
 	$self->ra->gs_fetch_loop_common($base, $head, [$self]);
 }
@@ -2535,10 +2575,17 @@ sub rebuild_from_rev_db {
 	unlink $path or croak "unlink: $!";
 }
 
+# rebuild:
+# Reconstructs a revision map from the available metadata. If $min_git_rev
+# is specified, this is an incremental rebuild that should stop when it hits
+# the revision in question.
+#
+# Incremental rebuilding is only supported when commits contain git-svn
+# metadata (the default) and not with use_svm_props or no_metadata.
 sub rebuild {
-	my ($self) = @_;
+	my ($self, $min_git_rev) = @_;
 	my $map_path = $self->map_path;
-	return if (-e $map_path && ! -z $map_path);
+	return if (!defined $min_git_rev && -e $map_path && ! -z $map_path);
 	return unless ::verify_ref($self->refname.'^0');
 	if ($self->use_svm_props || $self->no_metadata) {
 		my $rev_db = $self->rev_db_path;
@@ -2550,10 +2597,17 @@ sub rebuild {
 		$self->unlink_rev_db_symlink;
 		return;
 	}
-	print "Rebuilding $map_path ...\n";
+	my $revs_to_scan;
+	if (defined $min_git_rev) {
+		print "Updating $map_path ...\n";
+		$revs_to_scan = $min_git_rev . ".." . $self->refname;
+	} else {
+		print "Rebuilding $map_path ...\n";
+		$revs_to_scan = $self->refname;
+	}
 	my ($log, $ctx) =
 	    command_output_pipe(qw/rev-list --pretty=raw --no-color --reverse/,
-	                        $self->refname, '--');
+	                        $revs_to_scan, '--');
 	my $full_url = $self->full_url;
 	remove_username($full_url);
 	my $svn_uuid = $self->ra_uuid;
-- 
1.5.5.49.gf43e2
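
The incremental update in the patch above scans only the commits between the newest mapped commit and the tip of the tracking branch, and reads each commit's git-svn-id line. A rough shell sketch of that idea follows. It is not the Perl implementation: the function names are hypothetical, and the GIT variable is an assumption added so the command can be swapped out for a dry run.

```shell
# Hypothetical helper: read `git rev-list --pretty=raw` output on stdin and
# print one "<svn-rev> <git-sha>" pair per commit carrying a git-svn-id line.
svn_rev_pairs() {
    awk '
        /^commit / { sha = $2; next }       # unindented header starts a commit
        $1 == "git-svn-id:" && sha != "" {  # message lines are indented
            n = split($2, parts, "@")       # $2 looks like URL@REV
            print parts[n], sha             # REV is the text after the last @
            sha = ""                        # at most one pair per commit
        }
    '
}

# Hypothetical helper: list the pairs for commits that are on the tracking
# ref but newer than the last commit already recorded in the map.
# $1 = last mapped commit, $2 = tracking ref (e.g. refs/remotes/git-svn)
new_rev_pairs() {
    "${GIT:-git}" rev-list --pretty=raw --reverse "$1..$2" -- | svn_rev_pairs
}
```

Calling new_rev_pairs with the last mapped commit and refs/remotes/git-svn would print the pairs in oldest-first order, which is the order in which the map would be extended.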


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  1:39 [PATCH] Teach git-svn how to catch up with its tracking branches Steven Grimm
@ 2008-05-08  1:55 ` Junio C Hamano
  2008-05-08  2:17   ` Steven Grimm
  2008-05-08  1:58 ` Chris Shoemaker
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2008-05-08  1:55 UTC (permalink / raw)
  To: Steven Grimm; +Cc: git

Steven Grimm <koreth@midwinter.com> writes:

> In environments where a lot of people are sharing an svn repository using
> git-svn, everyone has identical, but individually maintained, tracking
> branches. If the svn repository is very active, it can take a while to
> run "git svn fetch" (which has to individually construct each revision
> by querying the svn server). It's much faster to run "git fetch" against
> another git-svn repository to grab the exact same git revisions you'd get
> from "git svn fetch". But until now, git-svn was confused by this because
> it didn't know how to incrementally rebuild its map of revision IDs.
> The only choice was to completely remove the map file and rebuild it
> from scratch, possibly a lengthy operation when there's a lot of history.
>
> With this change, git-svn will try to do an incremental update of its
> revision map if it sees that its tracking branch has svn revisions that
> aren't in the map yet.

Being able to have a shared git-svn managed git repository that mirrors
svn is something people have asked often enough, but the recommended
practice has always been for each to have his own copy.

Although I do not use git-svn heavily myself, I like this addition.  We
would probably want to update the in-tree doc to cover a recommended
pattern of interacting multiple git repositories with a single svn
repository on the other side?


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  1:39 [PATCH] Teach git-svn how to catch up with its tracking branches Steven Grimm
  2008-05-08  1:55 ` Junio C Hamano
@ 2008-05-08  1:58 ` Chris Shoemaker
  2008-05-08  2:08   ` Steven Grimm
  2008-05-08  4:19 ` [PATCH v2] " Steven Grimm
  2008-05-08  6:48 ` [PATCH] " Asheesh Laroia
  3 siblings, 1 reply; 16+ messages in thread
From: Chris Shoemaker @ 2008-05-08  1:58 UTC (permalink / raw)
  To: Steven Grimm; +Cc: git

On Wed, May 07, 2008 at 06:39:56PM -0700, Steven Grimm wrote:
> In environments where a lot of people are sharing an svn repository using
> git-svn, everyone has identical, but individually maintained, tracking
> branches. If the svn repository is very active, it can take a while to
> run "git svn fetch" (which has to individually construct each revision
> by querying the svn server). It's much faster to run "git fetch" against
> another git-svn repository to grab the exact same git revisions you'd get
> from "git svn fetch". But until now, git-svn was confused by this because
> it didn't know how to incrementally rebuild its map of revision IDs.
> The only choice was to completely remove the map file and rebuild it
> from scratch, possibly a lengthy operation when there's a lot of history.
> 
> With this change, git-svn will try to do an incremental update of its
> revision map if it sees that its tracking branch has svn revisions that
> aren't in the map yet.

Since I'm not qualified to review the patch technically, I'll just
offer encouragement, comment and question.  First, nice work, this
seems like a very helpful feature.  It might go quite a way toward
enabling a semi-distributed workflow with an authoritative svn
upstream.

Second, what will happen when different developers have svn URLs with
different schemes, e.g. http vs. svn+ssh?

Third, I think such a feature surely deserves a mention in
git-svn.txt.

-chris


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  1:58 ` Chris Shoemaker
@ 2008-05-08  2:08   ` Steven Grimm
  2008-05-08  2:25     ` Chris Shoemaker
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Grimm @ 2008-05-08  2:08 UTC (permalink / raw)
  To: Chris Shoemaker; +Cc: git

On May 7, 2008, at 6:58 PM, Chris Shoemaker wrote:
> Second, what will happen when different developers have svn URLs with
> different schemes, e.g. http vs. svn+ssh?

That will cause the commit messages to be different, which means you  
won't have the same commit hashes, so this pretty much won't be  
useful. (You'd end up fetching the remote repo's entire svn history if  
you tried to do git fetch.)

The assumption here is that you have exactly the same revision history  
in your tracking branches as the repo you're fetching from.

-Steve


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  1:55 ` Junio C Hamano
@ 2008-05-08  2:17   ` Steven Grimm
  0 siblings, 0 replies; 16+ messages in thread
From: Steven Grimm @ 2008-05-08  2:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On May 7, 2008, at 6:55 PM, Junio C Hamano wrote:
> Being able to have a shared git-svn managed git repository that mirrors
> svn is something people have asked often enough, but the recommended
> practice has always been for each to have his own copy.

Which is still sort of the case here -- it would be even better if you  
could share the revision map, but that's a bigger change. All this  
really does is let you skip talking to the svn server for update  
operations.

> Although I do not use git-svn heavily myself, I like this addition.  We
> would probably want to update the in-tree doc to cover a recommended
> pattern of interacting multiple git repositories with a single svn
> repository on the other side?

I'll write something up. This is still pretty new (it was an itch I  
had time to scratch today) so I honestly don't know yet what the  
optimal workflow is going to be. But at the very least I can document  
my setup as an example of something that works.

-Steve


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  2:08   ` Steven Grimm
@ 2008-05-08  2:25     ` Chris Shoemaker
  2008-05-08  7:38       ` Karl Hasselström
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Shoemaker @ 2008-05-08  2:25 UTC (permalink / raw)
  To: Steven Grimm; +Cc: git

On Wed, May 07, 2008 at 07:08:50PM -0700, Steven Grimm wrote:
> On May 7, 2008, at 6:58 PM, Chris Shoemaker wrote:
>> Second, what will happen when different developers have svn URLs with
>> different schemes, e.g. http vs. svn+ssh?
>
> That will cause the commit messages to be different, which means you won't 
> have the same commit hashes, so this pretty much won't be useful. (You'd 
> end up fetching the remote repo's entire svn history if you tried to do git 
> fetch.)

Yes, indeed.  But git fetching even the entire svn history is probably
often faster than git-svn fetching even one commit!  I guess the
question is really, if I replace my remote tracking branch with
someone else's remote tracking branch, and it invalidates old map
entries, what breaks?  Note, that even a slight difference in
svn.authorsfile would have the same effect.

> The assumption here is that you have exactly the same revision history in 
> your tracking branches as the repo you're fetching from.

In that case, it would be helpful to enumerate exactly how two
developers can ensure that they are creating the same revision
history.  At the very least, svn URL scheme and svnauthors file have
to be the same.

-chris
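
One way to check whether two clones really did end up with the same revision history is to ask each of them which git commit corresponds to a given SVN revision, using 'git svn find-rev', and compare the answers. The sketch below assumes both clones contain the chosen revision. The function name and the GIT override variable are hypothetical and only there for illustration.

```shell
# Hypothetical helper: succeed only if two git-svn clones map the given SVN
# revision to the same git commit.
# $1 = first repo path, $2 = second repo path, $3 = SVN revision number
same_svn_history() {
    a=$("${GIT:-git}" -C "$1" svn find-rev "r$3") || return 2
    b=$("${GIT:-git}" -C "$2" svn find-rev "r$3") || return 2
    # An empty answer means the revision is unknown to that clone.
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

A matching commit for one recent revision suggests that the URL and authors file agree, since either difference would change the commit messages or authorship and therefore the commit IDs.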


* [PATCH v2] Teach git-svn how to catch up with its tracking branches
  2008-05-08  1:39 [PATCH] Teach git-svn how to catch up with its tracking branches Steven Grimm
  2008-05-08  1:55 ` Junio C Hamano
  2008-05-08  1:58 ` Chris Shoemaker
@ 2008-05-08  4:19 ` Steven Grimm
  2008-05-11  8:27   ` Eric Wong
  2008-05-08  6:48 ` [PATCH] " Asheesh Laroia
  3 siblings, 1 reply; 16+ messages in thread
From: Steven Grimm @ 2008-05-08  4:19 UTC (permalink / raw)
  To: git

In environments where a lot of people are sharing an svn repository using
git-svn, everyone has identical, but individually maintained, tracking
branches. If the svn repository is very active, it can take a while to
run "git svn fetch" (which has to individually construct each revision
by querying the svn server). It's much faster to run "git fetch" against
another git-svn repository to grab the exact same git revisions you'd get
from "git svn fetch". But until now, git-svn was confused by this because
it didn't know how to incrementally rebuild its map of revision IDs.
The only choice was to completely remove the map file and rebuild it
from scratch, possibly a lengthy operation when there's a lot of history.

With this change, git-svn will try to do an incremental update of its
revision map if it sees that its tracking branch has svn revisions that
aren't in the map yet.

Signed-off-by: Steven Grimm <koreth@midwinter.com>
---

	Added some documentation that I hope is comprehensive enough
	to be useful and non-misleading.

 Documentation/git-svn.txt |   72 +++++++++++++++++++++++++++++++++++++++++++-
 git-svn.perl              |   62 ++++++++++++++++++++++++++++++++++++--
 2 files changed, 128 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-svn.txt b/Documentation/git-svn.txt
index f4ba105..6bdbd51 100644
--- a/Documentation/git-svn.txt
+++ b/Documentation/git-svn.txt
@@ -511,8 +511,8 @@ inside git back upstream to SVN users.  Therefore it is advised that
 users keep history as linear as possible inside git to ease
 compatibility with SVN (see the CAVEATS section below).
 
-CAVEATS
--------
+SHARING REVISIONS WITH OTHER GIT-SVN REPOSITORIES
+-------------------------------------------------
 
 For the sake of simplicity and interoperating with a less-capable system
 (SVN), it is recommended that all git-svn users clone, fetch and dcommit
@@ -521,6 +521,74 @@ operations between git repositories and branches.  The recommended
 method of exchanging code between git branches and users is
 git-format-patch and git-am, or just dcommiting to the SVN repository.
 
+However, git-svn does have limited support for sharing the SVN history
+between git repositories. For this to work, both git repositories need
+to be using the same SVN repository via the same URL, such that the
+commits in the git-svn tracking branches (e.g., the default "git-svn"
+branch) have the same revision IDs in git.
+
+An easy way to test whether this is true is to use 'git svn find-rev'
+in both repositories and pass it a recent SVN revision number that is
+present in both repositories' history. For example, if both repositories
+contain SVN revision 97446:
+
+------------------------------------------------------------------------
+	git svn find-rev r97446
+------------------------------------------------------------------------
+
+If the 40-digit hexadecimal value shown by that command is the same in
+both repositories, their histories are compatible and the rest of this
+section applies to them. There is currently no easy way to share history
+between git repositories whose SVN tracking branches aren't identical.
+
+Assuming your two repositories match, you can use 'git fetch' in place
+of 'git svn fetch' to fetch new SVN revisions. This is often
+significantly faster than directly fetching from the svn server,
+especially if large numbers of revisions are being fetched. Note that
+this only applies to pulling changes *from* the SVN repository; you must
+still use 'git svn dcommit' to push changes *to* SVN.
+
+Here's an example of how this works.
+
+------------------------------------------------------------------------
+# Create the first git-svn clone; call it "localmirror"
+	git svn clone svn+ssh://host/svn-repo localmirror
+# Create an empty git-svn repository called "work" pointing to the same SVN repo
+# (you could also use another 'git svn clone' if preferred)
+	mkdir work
+	cd work
+	git svn init svn+ssh://host/svn-repo
+# Set up a remote so 'git fetch' in work will fetch from localmirror
+	git config remote.origin.url file://`pwd`/../localmirror
+	git config remote.origin.fetch refs/remotes/git-svn:refs/remotes/git-svn
+# Get a copy of the SVN history from localmirror
+	git fetch
+	git rebase git-svn
+
+# Some time passes...
+
+# Update localmirror with latest changes from SVN
+	cd ../localmirror
+	git svn fetch
+# And fetch those changes incrementally into work
+	cd ../work
+	git fetch
+# Incorporate the latest revisions into the current branch in work
+	git rebase git-svn
+# You can also fetch directly from SVN in work
+	git svn rebase
+# If you have local changes, you must dcommit directly to SVN
+	git svn dcommit
+------------------------------------------------------------------------
+
+In the above example, the "localmirror" repository might be set up to
+run 'git svn fetch' periodically from a cron job, so that local developers
+can always fetch recent SVN revisions without having to connect directly
+to the SVN repository.
+
+CAVEATS
+-------
+
 Running 'git-merge' or 'git-pull' is NOT recommended on a branch you
 plan to dcommit from.  Subversion does not represent merges in any
 reasonable or useful fashion; so users using Subversion cannot see any
diff --git a/git-svn.perl b/git-svn.perl
index e47b1ea..87b104b 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -1382,6 +1382,7 @@ sub fetch_all {
 				$base = $lr if ($lr < $base);
 			}
 			push @gs, $gs;
+			$gs->sync_rev_map_with_commits;
 		}
 	}
 
@@ -2114,6 +2115,44 @@ sub gc {
 	command_noisy('gc', '--auto');
 };
 
+# sync_rev_map_with_commits:
+# If there are commits on the tracking branch that aren't present in our revision
+# map (e.g., because the user has done a git fetch from another git-svn repo
+# rather than a git svn fetch), bring our revision map up to date. This is
+# a no-op if the revision map is already up to date.
+sub sync_rev_map_with_commits {
+	my ($self) = @_;
+	# If we can't pull metadata out of log messages, there's nothing
+	# to import.
+	return if $self->use_svm_props || $self->no_metadata;
+	# If there isn't a revision DB yet, we'll rebuild it from scratch
+	# elsewhere, so don't do anything here.
+	return if ! -e $self->map_path || -z $self->map_path;
+	# Look at the most recent commit with a git-svn-id line.
+	my ($log, $ctx) =
+	    command_output_pipe(qw/rev-list --pretty=raw --no-color /,
+				'--grep=^ *git-svn-id:',
+				'--max-count=1',
+				$self->refname, '--');
+	my ($url, $rev, $uuid, $c);
+	while (<$log>) {
+		if ( m{^commit ($::sha1)$} ) {
+			$c = $1;
+			next;
+		}
+		next unless s{^\s*(git-svn-id:)}{$1};
+		($url, $rev, $uuid) = ::extract_metadata($_);
+	}
+	my ($rev_commit) = $self->rev_map_get($rev, $uuid);
+	if (!$rev_commit) {
+		# The most recent commit in the branch isn't in our
+		# rev map. Pull in data from the revisions between the
+		# highest commit in our map and the head of the branch.
+		my ($max_rev, $max_commit) = $self->rev_map_max(1);
+		$self->rebuild($max_commit);
+	}
+}
+
 sub do_git_commit {
 	my ($self, $log_entry) = @_;
 	my $lr = $self->last_rev;
@@ -2489,6 +2528,7 @@ sub make_log_entry {
 sub fetch {
 	my ($self, $min_rev, $max_rev, @parents) = @_;
 	my ($last_rev, $last_commit) = $self->last_rev_commit;
+	$self->sync_rev_map_with_commits;
 	my ($base, $head) = $self->get_fetch_range($min_rev, $max_rev);
 	$self->ra->gs_fetch_loop_common($base, $head, [$self]);
 }
@@ -2535,10 +2575,17 @@ sub rebuild_from_rev_db {
 	unlink $path or croak "unlink: $!";
 }
 
+# rebuild:
+# Reconstructs a revision map from the available metadata. If $min_git_rev
+# is specified, this is an incremental rebuild that should stop when it hits
+# the revision in question.
+#
+# Incremental rebuilding is only supported when commits contain git-svn
+# metadata (the default) and not with use_svm_props or no_metadata.
 sub rebuild {
-	my ($self) = @_;
+	my ($self, $min_git_rev) = @_;
 	my $map_path = $self->map_path;
-	return if (-e $map_path && ! -z $map_path);
+	return if (!defined $min_git_rev && -e $map_path && ! -z $map_path);
 	return unless ::verify_ref($self->refname.'^0');
 	if ($self->use_svm_props || $self->no_metadata) {
 		my $rev_db = $self->rev_db_path;
@@ -2550,10 +2597,17 @@ sub rebuild {
 		$self->unlink_rev_db_symlink;
 		return;
 	}
-	print "Rebuilding $map_path ...\n";
+	my $revs_to_scan;
+	if (defined $min_git_rev) {
+		print "Updating $map_path ...\n";
+		$revs_to_scan = $min_git_rev . ".." . $self->refname;
+	} else {
+		print "Rebuilding $map_path ...\n";
+		$revs_to_scan = $self->refname;
+	}
 	my ($log, $ctx) =
 	    command_output_pipe(qw/rev-list --pretty=raw --no-color --reverse/,
-	                        $self->refname, '--');
+	                        $revs_to_scan, '--');
 	my $full_url = $self->full_url;
 	remove_username($full_url);
 	my $svn_uuid = $self->ra_uuid;
-- 
1.5.5.49.gf43e2


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  1:39 [PATCH] Teach git-svn how to catch up with its tracking branches Steven Grimm
                   ` (2 preceding siblings ...)
  2008-05-08  4:19 ` [PATCH v2] " Steven Grimm
@ 2008-05-08  6:48 ` Asheesh Laroia
  2008-05-08  7:33   ` Steven Grimm
                     ` (2 more replies)
  3 siblings, 3 replies; 16+ messages in thread
From: Asheesh Laroia @ 2008-05-08  6:48 UTC (permalink / raw)
  To: Steven Grimm; +Cc: git

On Wed, 7 May 2008, Steven Grimm wrote:

> In environments where a lot of people are sharing an svn repository using
> git-svn, everyone has identical, but individually maintained, tracking
> branches.

To further muddy the waters, let me talk about my setup, also one with a 
"central git repository" from which all developers clone, and also one 
based on a Subversion tree.

The way I handle it is that, hidden somewhere, I have an account with a 
cron job that does this:

$ git svn fetch
$ git push origin refs/remotes/*:refs/heads/*
$ git push origin refs/remotes/trunk:refs/heads/master

The first push synchronizes "origin" to have the same branches as this 
git-svn copy of the git repository, and the second updates "origin" so 
that it has a "master"; without that second step, "git clone" will error 
out when it gets to its checkout phase.

Note that in .git/config, the [remote "origin"] section has no "fetch" 
parameter.  If it did have one, a would end up creating the branch 
origin/master on the second push, and origin/origin/master on the third, 
and so on.

After the push, "origin" ends up being a git repository that looks just 
like the svn repository we're cloning.  When you "git clone" it, the 
remote has all the tags and branches of the upstream svn repository; and 
as the upstream svn repository updates its branches, the git branches get 
those updates.

I'm not saying this patch shouldn't be accepted; I have no comment on it. 
I just want to see what others think of my approach to this workflow.

-- Asheesh.

-- 
What happened last night can happen again.
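
The three commands in the message above could be wrapped in a small function for the cron job, roughly as follows. This is a sketch under assumptions: the function name is hypothetical, it expects to be run from inside the git-svn clone, and the GIT variable is only there so the commands can be previewed without running them.

```shell
# Hypothetical wrapper around the three commands from the cron job.
# GIT can be overridden: with GIT=echo the arguments are printed instead of
# being passed to git.
mirror_svn_to_origin() {
    "${GIT:-git}" svn fetch || return 1
    # Quote the refspec so the shell does not try to expand the asterisks.
    "${GIT:-git}" push origin 'refs/remotes/*:refs/heads/*' || return 1
    "${GIT:-git}" push origin refs/remotes/trunk:refs/heads/master
}
```

Stopping at the first failure keeps a failed fetch from being followed by a push of stale refs.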


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  6:48 ` [PATCH] " Asheesh Laroia
@ 2008-05-08  7:33   ` Steven Grimm
  2008-05-08  7:48   ` Chris Shoemaker
  2008-05-08  8:21   ` Chris Shoemaker
  2 siblings, 0 replies; 16+ messages in thread
From: Steven Grimm @ 2008-05-08  7:33 UTC (permalink / raw)
  To: Asheesh Laroia; +Cc: git

On May 7, 2008, at 11:48 PM, Asheesh Laroia wrote:
> The way I handle it is that, hidden somewhere, I have an account  
> with a cron job that does this:
>
> $ git svn fetch
> $ git push origin refs/remotes/*:refs/heads/*
> $ git push origin refs/remotes/trunk:refs/heads/master

That's a reasonable setup, and I think (without having tried it) that  
it will be compatible with my patch -- assuming the clones of your  
origin repository have appropriate svn-remote config entries, they  
should be able to mix and match fetching from your origin and the real  
svn repository, and dcommit stuff back to svn.

Though I'd try that out with a toy svn repo first...

-Steve


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  2:25     ` Chris Shoemaker
@ 2008-05-08  7:38       ` Karl Hasselström
  2008-05-08  7:43         ` Karl Hasselström
  0 siblings, 1 reply; 16+ messages in thread
From: Karl Hasselström @ 2008-05-08  7:38 UTC (permalink / raw)
  To: Chris Shoemaker; +Cc: Steven Grimm, git

On 2008-05-07 22:25:04 -0400, Chris Shoemaker wrote:

> On Wed, May 07, 2008 at 07:08:50PM -0700, Steven Grimm wrote:
>
> > The assumption here is that you have exactly the same revision
> > history in your tracking branches as the repo you're fetching
> > from.
>
> In that case, it would be helpful to enumerate exactly how two
> developers can ensure that they are creating the same revision
> history. At the very least, svn URL scheme and svnauthors file have
> to be the same.

Also, one mustn't use Subversion's ability to retroactively edit
commit messages. (Guess what we tend to do from time to time where I
work.)

What'd really be needed to get all of the corner cases right, I think,
is a single import point that everyone else pulls from. The patch that
started this thread would help that scenario, since it makes it
possible to pull from such a central import point, and then run the
dcommit locally. (There's still the problem that dcommit will do a
local import if necessary -- that would have to be fixed as well.
Maybe simply teach git-svn to always try to pull from a given git
repository before hitting the real svn repo.)

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  7:38       ` Karl Hasselström
@ 2008-05-08  7:43         ` Karl Hasselström
  2008-05-08  7:58           ` Steven Grimm
  0 siblings, 1 reply; 16+ messages in thread
From: Karl Hasselström @ 2008-05-08  7:43 UTC (permalink / raw)
  To: Chris Shoemaker; +Cc: Steven Grimm, git

On 2008-05-08 09:38:51 +0200, Karl Hasselström wrote:

> (There's still the problem that dcommit will do a local import if
> necessary -- that would have to be fixed as well. Maybe simply teach
> git-svn to always try to pull from a given git repository before
> hitting the real svn repo.)

Or even _only_ pull from the git repo. That repo would have to have
some kind of hook to make sure that it's always up-to-date, then --
just a cron job won't do -- but I'm sure that can be done.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  6:48 ` [PATCH] " Asheesh Laroia
  2008-05-08  7:33   ` Steven Grimm
@ 2008-05-08  7:48   ` Chris Shoemaker
  2008-05-08  8:21   ` Chris Shoemaker
  2 siblings, 0 replies; 16+ messages in thread
From: Chris Shoemaker @ 2008-05-08  7:48 UTC (permalink / raw)
  To: Asheesh Laroia; +Cc: Steven Grimm, git

On Wed, May 07, 2008 at 11:48:17PM -0700, Asheesh Laroia wrote:
> On Wed, 7 May 2008, Steven Grimm wrote:
>
>> In environments where a lot of people are sharing an svn repository using
>> git-svn, everyone has identical, but individually maintained, tracking
>> branches.
>
> To further muddy the waters, let me talk about my setup, also one with a 
> "central git repository" from which all developers clone, and also one 
> based on a Subversion tree.
>
> The way I handle it is that, hidden somewhere, I have an account with a 
> cron job that does this:
>
> $ git svn fetch
> $ git push origin refs/remotes/*:refs/heads/*
> $ git push origin refs/remotes/trunk:refs/heads/master
>
> The first push synchronizes "origin" to have the same branches as this 
> git-svn copy of the git repository, and the second updates "origin" so that 
> it has a "master"; without that second step, "git clone" will error out 
> when it gets to its checkout phase.
>
> Note that in .git/config, the [remote "origin"] section has no "fetch" 
> parameter.  If it did have one, a would end up creating the branch 
> origin/master on the second push, and origin/origin/master on the third, 
> and so on.
>
> After the push, "origin" ends up being a git repository that looks just 
> like the svn repository we're cloning.  When you "git clone" it, the remote 
> has all the tags and branches of the upstream svn repository; and as the 
> upstream svn repository updates its branches, the git branches get those 
> updates.
>
> I'm not saying this patch shouldn't be accepted; I have no comment on it. I 
> just want to see what others think of my approach to this workflow.

This workflow doesn't seem to provide a way for the developers who
clone the "origin" above, to dcommit to svn.  Presumably, with the
right initialization, Steve's patch would allow all those clones to
dcommit to svn directly.

I like your automated mirror setup, but IMO, it becomes a lot more
useful in conjunction with Steve's patch.

-chris


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  7:43         ` Karl Hasselström
@ 2008-05-08  7:58           ` Steven Grimm
  2008-05-08  8:13             ` Karl Hasselström
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Grimm @ 2008-05-08  7:58 UTC (permalink / raw)
  To: Karl Hasselström; +Cc: Chris Shoemaker, git

On May 8, 2008, at 12:43 AM, Karl Hasselström wrote:
> Or even _only_ pull from the git repo. That repo would have to have
> some kind of hook to make sure that it's always up-to-date, then --
> just a cron job won't do -- but I'm sure that can be done.

If you control both the svn repo and the git repo, you can get close  
to that with an svn commit trigger, but even then there'll be a race  
condition when you want to dcommit. You either have to be able to pull  
from the svn repo when you want to dcommit, or you have to live with  
the possibility of a dcommit failing because you don't actually have  
the most recent rev locally.

This ties a bit into the patch I sent a few months back to allow  
update hooks to change refs. I kind of ran out of spare time to  
iterate more on that back then, but the ultimate goal there was that  
you could interact only with the bridge repo, never directly with svn,  
and the bridge repo would dcommit for you when you pushed to it.

The approach in this thread's patch is maybe not as conceptually  
clean, but it's much simpler and (apparently) less controversial.

-Steve


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  7:58           ` Steven Grimm
@ 2008-05-08  8:13             ` Karl Hasselström
  0 siblings, 0 replies; 16+ messages in thread
From: Karl Hasselström @ 2008-05-08  8:13 UTC (permalink / raw)
  To: Steven Grimm; +Cc: Chris Shoemaker, git

On 2008-05-08 00:58:48 -0700, Steven Grimm wrote:

> On May 8, 2008, at 12:43 AM, Karl Hasselström wrote:
>
> > Or even _only_ pull from the git repo. That repo would have to
> > have some kind of hook to make sure that it's always up-to-date,
> > then -- just a cron job won't do -- but I'm sure that can be done.
>
> If you control both the svn repo and the git repo, you can get close
> to that with an svn commit trigger, but even then there'll be a race
> condition when you want to dcommit. You either have to be able to
> pull from the svn repo when you want to dcommit, or you have to live
> with the possibility of a dcommit failing because you don't actually
> have the most recent rev locally.

I don't see why this has to be the case. Surely, if the local git repo
can dcommit without races by importing new revisions from svn, the
local git repo could dcommit without races by importing new revisions
from svn via an intermediate git repo. This would require the
intermediate repo to import new revisions when it gets a pull request,
but surely that should be doable with a pre-pull hook?

> This ties a bit into the patch I sent a few months back to allow
> update hooks to change refs. I kind of ran out of spare time to
> iterate more on that back then, but the ultimate goal there was that
> you could interact only with the bridge repo, never directly with
> svn, and the bridge repo would dcommit for you when you pushed to
> it.

I guess that approach is what I'd really like to see, since that's the
only one that can guarantee that every git clone of the svn repository
is identical.

Furthermore, in this setup the clients wouldn't need to run git-svn at
all. Only the bridge would need it.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle


* Re: [PATCH] Teach git-svn how to catch up with its tracking branches
  2008-05-08  6:48 ` [PATCH] " Asheesh Laroia
  2008-05-08  7:33   ` Steven Grimm
  2008-05-08  7:48   ` Chris Shoemaker
@ 2008-05-08  8:21   ` Chris Shoemaker
  2 siblings, 0 replies; 16+ messages in thread
From: Chris Shoemaker @ 2008-05-08  8:21 UTC (permalink / raw)
  To: Asheesh Laroia; +Cc: Steven Grimm, git

On Wed, May 07, 2008 at 11:48:17PM -0700, Asheesh Laroia wrote:
> On Wed, 7 May 2008, Steven Grimm wrote:
>
>> In environments where a lot of people are sharing an svn repository using
>> git-svn, everyone has identical, but individually maintained, tracking
>> branches.
>
> To further muddy the waters, let me talk about my setup, also one with a 
> "central git repository" from which all developers clone, and also one 
> based on a Subversion tree.
>
> The way I handle it is that, hidden somewhere, I have an account with a 
> cron job that does this:
>
> $ git svn fetch
> $ git push origin refs/remotes/*:refs/heads/*
> $ git push origin refs/remotes/trunk:refs/heads/master
>
> The first push synchronizes "origin" to have the same branches as this 
> git-svn copy of the git repository, and the second updates "origin" so that 
> it has a "master"; without that second step, "git clone" will error out 
> when it gets to its checkout phase.

This got me thinking about a potential design for a git-svnserver.
[Warning: engineering hack ahead, proceed with caution.]

Instead of re-implementing any part of svn, just use a stock svn repo
+ server.  From the svn post-commit hook, update a git-svn repo as
above.  From the git post-commit, do a git-svn rebase.  Of course, you
need a shared lock between the two pairs of pre/post commit hooks.
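
A minimal sketch of such a shared lock, assuming both sets of hooks run on
the same machine and agree on a path. LOCK_DIR and the three function names
are hypothetical; the sketch relies only on mkdir being atomic, and it does
not handle stale locks left behind by a crashed hook.

```shell
#!/bin/sh
# LOCK_DIR is a hypothetical path that both the svn hooks and the git
# hooks would have to agree on.
LOCK_DIR=${LOCK_DIR:-/tmp/git-svn-bridge.lock}

acquire_lock() {
    # mkdir is atomic: only one caller can create the directory.
    tries=${1:-30}
    while ! mkdir "$LOCK_DIR" 2>/dev/null; do
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] || return 1
        sleep 1
    done
    return 0
}

release_lock() {
    rmdir "$LOCK_DIR"
}

with_lock() {
    # Run the given command while holding the lock, then release it
    # and pass the command's exit status back to the caller.
    acquire_lock || return 1
    status=0
    "$@" || status=$?
    release_lock
    return $status
}
```

The svn post-commit hook could then run something like
"with_lock git svn fetch", and the git side "with_lock git svn rebase".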

The problem of attribution in svn from git-svn is probably easier to
solve from within the context of a post-commit hook.  The problem of
having to round-trip git commits through svn in a way that changes
their ids remains.  Effectively, that means commits have to be
considered "unpublished" (for the purpose of not basing other work
upon them) until they are pushed to the git-half of the git+svn.

Still, this scenario is a pretty gentle migration path from svn to git
- one that allows regular git users to use only git-core, not git-svn,
and still allows svn clients to work.  Maybe some git-alias magic
could hide the fact that a git push has to really become a push +
fetch.


-chris


* Re: [PATCH v2] Teach git-svn how to catch up with its tracking branches
  2008-05-08  4:19 ` [PATCH v2] " Steven Grimm
@ 2008-05-11  8:27   ` Eric Wong
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Wong @ 2008-05-11  8:27 UTC (permalink / raw)
  To: Steven Grimm; +Cc: git

Steven Grimm <koreth@midwinter.com> wrote:
> In environments where a lot of people are sharing an svn repository using
> git-svn, everyone has identical, but individually maintained, tracking
> branches. If the svn repository is very active, it can take a while to
> run "git svn fetch" (which has to individually construct each revision
> by querying the svn server). It's much faster to run "git fetch" against
> another git-svn repository to grab the exact same git revisions you'd get
> from "git svn fetch". But until now, git-svn was confused by this because
> it didn't know how to incrementally rebuild its map of revision IDs.
> The only choice was to completely remove the map file and rebuild it
> from scratch, possibly a lengthy operation when there's a lot of history.
> 
> With this change, git-svn will try to do an incremental update of its
> revision map if it sees that its tracking branch has svn revisions that
> aren't in the map yet.

Cool.  I agree that this is a useful change for people wanting to save
time and bandwidth, although I've never been in a situation to need it
myself.

However, I'm kind of uncomfortable with this being on by default, as it
really means that users have to trust the git repository they're
fetching from to always be configured identically to what they're using
and not change configurations midway through a project.  External things
like authors files would need to be synced, too, etc...

> +sub sync_rev_map_with_commits {
> +	my ($self) = @_;
> +	# If we can't pull metadata out of log messages, there's nothing
> +	# to import.
> +	return if $self->use_svm_props || $self->no_metadata;
> +	# If there isn't a revision DB yet, we'll rebuild it from scratch
> +	# elsewhere, so don't do anything here.
> +	return if ! -e $self->map_path || -z $self->map_path;
> +	# Look at the most recent commit with a git-svn-id line.
> +	my ($log, $ctx) =
> +	    command_output_pipe(qw/rev-list --pretty=raw --no-color /,
> +				'--grep=^ *git-svn-id:',

Even though rev-list outputs the commit message prefixed with spaces,
the --grep itself does not need ' *' to match the leading spaces.
No version of git-svn outputs a space before "git-svn-id: "
in the commit message itself.

More importantly I'd also prefer to actually grep for the URL+path of
the ref we're tracking, too.  This can catch mistakes if people somehow
configured their remotes incorrectly.
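
As a rough illustration of the kind of pattern I mean, in shell rather than
in git-svn.perl: the function name below is made up, and it assumes the
usual "git-svn-id: <url>@<rev> <uuid>" line and that the URL written there
is the same one the ref is configured to track.

```shell
#!/bin/sh
# Hypothetical helper: build a basic-regex pattern that matches only
# git-svn-id lines for one URL+path.
svn_id_pattern() {
    # Escape characters that are special in a basic regular expression.
    escaped=$(printf '%s\n' "$1" | sed 's/[][\.*^$]/\\&/g')
    # The trailing space keeps "trunk@12" from matching "trunk@12x".
    printf '^git-svn-id: %s@[0-9][0-9]* ' "$escaped"
}
```

The result could be passed along the lines of
git rev-list --grep="$(svn_id_pattern "$url")", so that commits carrying a
git-svn-id for some other URL or path are simply not considered.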

-- 
Eric Wong


end of thread, other threads:[~2008-05-11  8:28 UTC | newest]

Thread overview: 16+ messages
2008-05-08  1:39 [PATCH] Teach git-svn how to catch up with its tracking branches Steven Grimm
2008-05-08  1:55 ` Junio C Hamano
2008-05-08  2:17   ` Steven Grimm
2008-05-08  1:58 ` Chris Shoemaker
2008-05-08  2:08   ` Steven Grimm
2008-05-08  2:25     ` Chris Shoemaker
2008-05-08  7:38       ` Karl Hasselström
2008-05-08  7:43         ` Karl Hasselström
2008-05-08  7:58           ` Steven Grimm
2008-05-08  8:13             ` Karl Hasselström
2008-05-08  4:19 ` [PATCH v2] " Steven Grimm
2008-05-11  8:27   ` Eric Wong
2008-05-08  6:48 ` [PATCH] " Asheesh Laroia
2008-05-08  7:33   ` Steven Grimm
2008-05-08  7:48   ` Chris Shoemaker
2008-05-08  8:21   ` Chris Shoemaker
