* Efficient cloning from svn (with multiple branches/tags subdirs)
@ 2009-10-13 18:13 Bruno Harbulot
2009-10-14 6:03 ` Eric Wong
0 siblings, 1 reply; 9+ messages in thread
From: Bruno Harbulot @ 2009-10-13 18:13 UTC (permalink / raw)
To: git
Hello,
I'm trying to clone an existing subversion repository (Restlet:
http://restlet.tigris.org/source/browse/). I'm using Git 1.6.5. The
layout of the project is like this:
trunk/
branches/1.0
branches/1.1
tags/1.0/1.0b1
tags/1.0/1.0b2
...
tags/1.0/1.0.1
...
tags/1.1/1.1.0
tags/1.1/1.1.1
...
Therefore, I've tried to use this (with and without '-T trunk', but
that's a separate problem):
git init
git svn init --prefix=svn/ -t tags/1.0 -t tags/1.1 -t tags/1.2 -t tags/2.0 \
  -b branches/1.0 -b branches/1.1 http://restlet.tigris.org/svn/restlet
git svn fetch
This takes a while (I had to interrupt it) and creates a
number of branches such as:
remotes/svn/tags/1.0b1
remotes/svn/tags/1.0b2
remotes/svn/tags/1.0b3
remotes/svn/tags/1.0b3@1883
remotes/svn/tags/1.0b3@323
What surprises me is that it looks like it's looping over and over,
since sometimes it starts back from SVN revision 1 when it's trying to
import a new tag.
It starts like this:
>
> Checked through r101
> Checked through r201
> Checked through r301
> A www/index.html
> r1 = 2ec77afc2e491e2b7c825cb685101e3bcbe7a8f7 (refs/remotes/svn/tags/1.0b1@312)
> A source/impl/License.txt
> A source/impl/Copyright.txt
> A source/impl/org/restlet/UniformInterface.java
> A source/impl/org/restlet/RestletException.java
> ...
Then, when it reaches r312, it starts again at r1:
> r312 = 5b40558b5bb2b4b04f9520f89b699ff6b0f50cdb (refs/remotes/svn/tags/1.0b1@312)
> r313 = 7ebcbd9da535cfdc23aacb612271e625445a7516 (refs/remotes/svn/tags/1.0b1@1881)
> r1882 = aed1582d4868a1be8ae8fcc0f15546822099f339 (refs/remotes/svn/tags/1.0b1)
> Checked through r101
> Checked through r201
> Checked through r301
> A www/index.html
> r1 = 2ec77afc2e491e2b7c825cb685101e3bcbe7a8f7 (refs/remotes/svn/tags/1.0b2@321)
> A source/impl/License.txt
> A source/impl/Copyright.txt
> A source/impl/org/restlet/UniformInterface.java
> A source/impl/org/restlet/RestletException.java
> A source/impl/org/restlet/AbstractRestlet.java
> A source/impl/org/restlet/connector/Resolver.java
(And so on for each tag).
This seems particularly inefficient and unfriendly for the resource
provider (I stopped as soon as I noticed). Is there a better way to do this?
Best wishes,
Bruno.
* Re: Efficient cloning from svn (with multiple branches/tags subdirs)
2009-10-13 18:13 Efficient cloning from svn (with multiple branches/tags subdirs) Bruno Harbulot
@ 2009-10-14 6:03 ` Eric Wong
2009-10-14 9:07 ` Bruno Harbulot
2009-10-14 16:28 ` Avery Pennarun
0 siblings, 2 replies; 9+ messages in thread
From: Eric Wong @ 2009-10-14 6:03 UTC (permalink / raw)
To: Bruno Harbulot; +Cc: git
Bruno Harbulot <Bruno.Harbulot@manchester.ac.uk> wrote:
> Hello,
>
> I'm trying to clone an existing subversion repository (Restlet:
> http://restlet.tigris.org/source/browse/). I'm using Git 1.6.5. The
> layout of the project is like this:
> trunk/
> branches/1.0
> branches/1.1
> tags/1.0/1.0b1
> tags/1.0/1.0b2
> ...
> tags/1.0/1.0.1
> ...
> tags/1.1/1.1.0
> tags/1.1/1.1.1
> ...
Hi Bruno,
That looks like there are two levels of tags. You should be able to do
this with your version of git in $GIT_CONFIG:
[svn-remote "svn"]
    url = http://restlet.tigris.org/svn/restlet
    fetch = trunk:refs/remotes/svn/trunk
    branches = branches/*:refs/remotes/svn/*
    tags = tags/*/*:refs/remotes/svn/tags/*/*
    ; note the */* to glob at multiple levels
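For anyone who would rather not edit the file by hand, the same keys can be written with `git config`. This is only a sketch: the scratch directory created with `mktemp` is an assumption for illustration, and the final `git svn fetch` is left commented out so that nothing touches the network.

```shell
# Scratch repository; the mktemp location is only for illustration.
repo=$(mktemp -d)
cd "$repo" && git init -q

# Same keys and values as the [svn-remote "svn"] section above.
# The globs are quoted so that the shell does not expand them.
git config svn-remote.svn.url http://restlet.tigris.org/svn/restlet
git config svn-remote.svn.fetch 'trunk:refs/remotes/svn/trunk'
git config svn-remote.svn.branches 'branches/*:refs/remotes/svn/*'
git config svn-remote.svn.tags 'tags/*/*:refs/remotes/svn/tags/*/*'

# Show what was written; run the fetch when ready.
git config --get-regexp '^svn-remote\.svn\.'
# git svn fetch
```

Listing the section back with `--get-regexp` is a quick way to check the globs before starting a long fetch.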
> Therefore, I've tried to use this (with and without '-T trunk', but
> that's a separate problem):
>
> git init
> git svn init --prefix=svn/ -t tags/1.0 -t tags/1.1 -t tags/1.2 -t tags/2.0 \
>   -b branches/1.0 -b branches/1.1 http://restlet.tigris.org/svn/restlet
> git svn fetch
>
>
> This takes a while (I've had to interrupt this) and this creates a
> number of branches such as:
> remotes/svn/tags/1.0b1
> remotes/svn/tags/1.0b2
> remotes/svn/tags/1.0b3
> remotes/svn/tags/1.0b3@1883
> remotes/svn/tags/1.0b3@323
>
>
> What surprises me is that it looks like it's looping over and over,
> since sometimes it starts back from SVN revision 1 when it's trying to
> import a new tag.
Yeah, that's an unfortunate thing about the flexibility of Subversion:
basically anything can be a "tag" or a directory, and it's extremely
hard for git svn to support uncommon tag/branch layouts out of the
box, so manual config editing is needed.
--
Eric Wong
* Re: Efficient cloning from svn (with multiple branches/tags subdirs)
2009-10-14 6:03 ` Eric Wong
@ 2009-10-14 9:07 ` Bruno Harbulot
2009-10-14 16:28 ` Avery Pennarun
1 sibling, 0 replies; 9+ messages in thread
From: Bruno Harbulot @ 2009-10-14 9:07 UTC (permalink / raw)
To: git
Hi Eric,
Eric Wong wrote:
> Hi Bruno,
>
> That looks like there are two levels of tags. You should be able to do
> this with your version of git in $GIT_CONFIG:
>
> [svn-remote "svn"]
> url = http://restlet.tigris.org/svn/restlet
> fetch = trunk:refs/remotes/svn/trunk
> branches = branches/*:refs/remotes/svn/*
> tags = tags/*/*:refs/remotes/svn/tags/*/*
> ; note the */* to glob at multiple levels
Thank you, here is what I had (with the multiple -t/-b):
[svn-remote "svn"]
    url = http://restlet.tigris.org/svn/restlet
    branches = branches/1.0/*:refs/remotes/svn/*
    branches = branches/1.1/*:refs/remotes/svn/*
    tags = tags/1.0/*:refs/remotes/svn/tags/*
    tags = tags/1.1/*:refs/remotes/svn/tags/*
    tags = tags/1.2/*:refs/remotes/svn/tags/*
    tags = tags/2.0/*:refs/remotes/svn/tags/*
I think the notation you suggest "*/*" is indeed better, since I don't
have to specify each tag sub-directory. However, they change so rarely
that it was only a minor issue.
>> What surprises me is that it looks like it's looping over and over,
>> since sometimes it starts back from SVN revision 1 when it's trying to
>> import a new tag.
>
> Yeah, that's an unfortunate thing about the flexibility of Subversion,
> basically anything can be a "tag" or a directory and it's extremely
> hard for git svn to support any uncommon cases for tags/branches
> out-of-the box, so the manual config editing is needed.
I must admit I don't fully understand how git-svn does the import, but
even with this manual configuration, it still tries to pull (almost)
every revision from revision 1 for each tag, a bit as if there was:
for each tag:
    for revision in 1 to tag.latest_revision:
        pull the revision
(This isn't even for each tag, but for each modification of each tag,
since tags aren't really tags in SVN).
What I'd like to be able to do (mainly for efficiency and more
importantly not to hammer tigris.org) is to pull each revision at most
once (even if it's for the directory at the top of trunk, branches and
tags).
Best wishes,
Bruno.
* Re: Efficient cloning from svn (with multiple branches/tags subdirs)
2009-10-14 6:03 ` Eric Wong
2009-10-14 9:07 ` Bruno Harbulot
@ 2009-10-14 16:28 ` Avery Pennarun
2009-10-14 18:00 ` Eric Wong
1 sibling, 1 reply; 9+ messages in thread
From: Avery Pennarun @ 2009-10-14 16:28 UTC (permalink / raw)
To: Eric Wong; +Cc: Bruno Harbulot, git
On Wed, Oct 14, 2009 at 2:03 AM, Eric Wong <normalperson@yhbt.net> wrote:
> Bruno Harbulot <Bruno.Harbulot@manchester.ac.uk> wrote:
>> What surprises me is that it looks like it's looping over and over,
>> since sometimes it starts back from SVN revision 1 when it's trying to
>> import a new tag.
>
> Yeah, that's an unfortunate thing about the flexibility of Subversion,
> basically anything can be a "tag" or a directory and it's extremely
> hard for git svn to support any uncommon cases for tags/branches
> out-of-the box, so the manual config editing is needed.
I've been thinking about this myself for some time. One option that
might be "interesting" would be to just grab the *entire* svn tree
(from the root), and then use git-subtree[1] to slice and dice it into
branches using your local copy of git (which is fast and uses no
bandwidth) instead of during the svn fetch (which is slow and uses
lots of bandwidth). I think it would also simplify the git-svn code
quite a lot, at least for fetching, since there would always be a
global view of the tree and SVN things like "copy branch A to tag B"
would just be exactly that.
Of course I have no time to code this up myself, so I apologize for
just dumping ideas on you without code behind them. If this inspires
anyone, I'd be happy to help with any missing features (or
documentation) this exposes in git-subtree, though.
Have fun,
Avery
[1] http://github.com/apenwarr/git-subtree
* Re: Efficient cloning from svn (with multiple branches/tags subdirs)
2009-10-14 16:28 ` Avery Pennarun
@ 2009-10-14 18:00 ` Eric Wong
2009-10-14 18:26 ` Avery Pennarun
0 siblings, 1 reply; 9+ messages in thread
From: Eric Wong @ 2009-10-14 18:00 UTC (permalink / raw)
To: Avery Pennarun; +Cc: Bruno Harbulot, git
Avery Pennarun <apenwarr@gmail.com> wrote:
> On Wed, Oct 14, 2009 at 2:03 AM, Eric Wong <normalperson@yhbt.net> wrote:
> > Bruno Harbulot <Bruno.Harbulot@manchester.ac.uk> wrote:
> >> What surprises me is that it looks like it's looping over and over,
> >> since sometimes it starts back from SVN revision 1 when it's trying to
> >> import a new tag.
> >
> > Yeah, that's an unfortunate thing about the flexibility of Subversion,
> > basically anything can be a "tag" or a directory and it's extremely
> > hard for git svn to support any uncommon cases for tags/branches
> > out-of-the box, so the manual config editing is needed.
>
> I've been thinking about this myself for some time. One option that
> might be "interesting" would be to just grab the *entire* svn tree
> (from the root), and then use git-subtree[1] to slice and dice it into
> branches using your local copy of git (which is fast and uses no
> bandwidth) instead of during the svn fetch (which is slow and uses
> lots of bandwidth). I think it would also simplify the git-svn code
> quite a lot, at least for fetching, since there would always be a
> global view of the tree and SVN things like "copy branch A to tag B"
> would just be exactly that.
>
> Of course I have no time to code this up myself, so I apologize for
> just dumping ideas on you without code behind them. If this inspires
> anyone, I'd be happy to help with any missing features (or
> documentation) this exposes in git-subtree, though.
This was actually the original use case of git svn back when I started.
git svn clone SVNREPO_ROOT (without --stdlayout)
It's still an option if you have the disk space for the working copies,
but I had to create the branches/tags support since the working copies
would become prohibitively large. If git-subtree could be
taught to work on a bare repo (git svn has a --no-checkout option)
it might be an option, too.
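As a rough illustration of the split step only (not something tested against a real git svn import), the sketch below builds a toy repository that mimics a root-level import and then carves `trunk/` out into its own branch. The file names and the branch name `split-trunk` are hypothetical, and the guard assumes git-subtree is installed in git's exec path, which is not the case everywhere.

```shell
# Toy stand-in for a root-level import: trunk/ and tags/ live side by side.
# File names and the branch name split-trunk are hypothetical.
repo=$(mktemp -d)
cd "$repo" && git init -q
mkdir -p trunk tags/1.0/1.0b1
echo 'current' > trunk/README.txt
echo 'beta 1'  > tags/1.0/1.0b1/README.txt
git add .
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m 'root import'

# git-subtree may not be installed; skip the split if it is missing.
if [ -x "$(git --exec-path)/git-subtree" ]; then
    # Rewrite history so that trunk/ becomes the top level of a new branch.
    git subtree split --prefix=trunk -b split-trunk >/dev/null
    # List the top-level entries of the split branch.
    git ls-tree --name-only split-trunk
fi
```

The split runs entirely against the local object store, which is the point of the proposal: the expensive SVN traffic happens once, for the root.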
> Have fun,
>
> Avery
>
> [1] http://github.com/apenwarr/git-subtree
--
Eric Wong
* Re: Efficient cloning from svn (with multiple branches/tags subdirs)
2009-10-14 18:00 ` Eric Wong
@ 2009-10-14 18:26 ` Avery Pennarun
2009-10-15 17:23 ` Bruno Harbulot
0 siblings, 1 reply; 9+ messages in thread
From: Avery Pennarun @ 2009-10-14 18:26 UTC (permalink / raw)
To: Eric Wong; +Cc: Bruno Harbulot, git
On Wed, Oct 14, 2009 at 2:00 PM, Eric Wong <normalperson@yhbt.net> wrote:
> Avery Pennarun <apenwarr@gmail.com> wrote:
>> I've been thinking about this myself for some time. One option that
>> might be "interesting" would be to just grab the *entire* svn tree
>> (from the root), and then use git-subtree[1] to slice and dice it into
>> branches using your local copy of git (which is fast and uses no
>> bandwidth) instead of during the svn fetch (which is slow and uses
>> lots of bandwidth). I think it would also simplify the git-svn code
>> quite a lot, at least for fetching, since there would always be a
>> global view of the tree and SVN things like "copy branch A to tag B"
>> would just be exactly that.
>
> This was actually the original use case of git svn back when I started.
>
> git svn clone SVNREPO_ROOT (without --stdlayout)
>
> It's still an option if you have the disk space for the working copies,
> but I had to create the branches/tags support since the working copies
> would become prohibitively large. If git-subtree could be
> taught to work on a bare repo (git svn has a --no-checkout option)
> it might be an option, too.
I've never tested git-subtree without a working tree, however, it
doesn't *use* the working tree for anything when splitting, so at
worst, there might be a minor bug or two. Thus, there ought never be
a need to check out the whole huge tree (which I agree would be both
slow and huge).
dcommit might be a little weirder. Though I guess if we fixed the
git-svn-id tags in the split branches, you could just commit directly
into a branch, then fetch the new commit back from the root, then
rebase the branch, as dcommit already does.
You know, maybe this is actually easier than I thought... I was
thinking committing back to svn would be complicated since it requires
a working tree, but if we let you commit straight from one of the
branches, it shouldn't actually be too bad at all. Hmm.
Have fun,
Avery
* Re: Efficient cloning from svn (with multiple branches/tags subdirs)
2009-10-14 18:26 ` Avery Pennarun
@ 2009-10-15 17:23 ` Bruno Harbulot
2009-10-15 17:29 ` B Smith-Mannschott
0 siblings, 1 reply; 9+ messages in thread
From: Bruno Harbulot @ 2009-10-15 17:23 UTC (permalink / raw)
To: git; +Cc: Eric Wong
Hello,
Avery Pennarun wrote:
> On Wed, Oct 14, 2009 at 2:00 PM, Eric Wong <normalperson@yhbt.net> wrote:
>> Avery Pennarun <apenwarr@gmail.com> wrote:
>>> I've been thinking about this myself for some time. One option that
>>> might be "interesting" would be to just grab the *entire* svn tree
>>> (from the root), and then use git-subtree[1] to slice and dice it into
>>> branches using your local copy of git (which is fast and uses no
>>> bandwidth) instead of during the svn fetch (which is slow and uses
>>> lots of bandwidth). I think it would also simplify the git-svn code
>>> quite a lot, at least for fetching, since there would always be a
>>> global view of the tree and SVN things like "copy branch A to tag B"
>>> would just be exactly that.
>> This was actually the original use case of git svn back when I started.
>>
>> git svn clone SVNREPO_ROOT (without --stdlayout)
>>
>> It's still an option if you have the disk space for the working copies,
>> but I had to create the branches/tags support since the working copies
>> would become prohibitively large. If git-subtree could be
>> taught to work on a bare repo (git svn has a --no-checkout option)
>> it might be an option, too.
Thank you for your suggestions. Unfortunately, I'm not really familiar
with git-subtree and how it could work with git-svn, sorry.
I've tried another workaround: using svnsync to pull the repository only
once, and only then using git-svn fetch, locally, so as to avoid too
much network traffic (I don't mind too much if it loops locally). I was
hoping to be able to change the URL of the repository to the original
one afterwards, but it doesn't seem to work so easily, because of the
commit IDs. I'm assuming that not having the same IDs will cause problems
for further fetches (this time directly from the original SVN repository)
and for potential dcommits.
When I do this:
git init
git svn init -s --prefix=svn/ file:///path/to/local/restlet-svnroot
git svn fetch -r 1:2
I get this ID, for example:
r2 = c69a0b98d288a6e4e8779b50962b7fc65c4622e8
If I do this using the original http://restlet.tigris.org/svn/restlet, I
get this:
r2 = ce3b82915e92fe1ccf6ddedacd9d74b30bd4de86
I've even tried to install an Apache-based subversion server locally and
make it believe it was restlet.tigris.org (by editing /etc/hosts and
creating the appropriate VirtualHost), but this generates another SHA1
ID. (That's of course not a solution that would be generalisable.)
I've had a quick look at the git-svn code to see how this ID was
generated, but couldn't find anything obvious.
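A likely explanation, offered as an assumption rather than something established in this thread: by default git svn appends a `git-svn-id:` line (URL, revision and repository UUID) to every commit message, so the same SVN revision fetched through a different URL produces a different commit object. The sketch below imitates that with plain git plumbing; the UUID, the `trunk` suffix and the log message are placeholders, not values taken from the real repository.

```shell
repo=$(mktemp -d)
cd "$repo" && git init -q
tree=$(git mktree </dev/null)    # empty tree, enough for the demo

# Fix author/committer data so that only the message differs.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
export GIT_AUTHOR_DATE='1255627380 +0000' GIT_COMMITTER_DATE='1255627380 +0000'

# Create a commit whose message ends with a git-svn-id style line.
commit_for() {
    printf 'import\n\ngit-svn-id: %s@2 %s\n' "$1" "$2" | git commit-tree "$tree"
}

uuid=00000000-0000-0000-0000-000000000000    # placeholder UUID
remote=$(commit_for http://restlet.tigris.org/svn/restlet/trunk "$uuid")
mirror=$(commit_for file:///path/to/local/restlet-svnroot/trunk "$uuid")

echo "remote: $remote"
echo "mirror: $mirror"
```

The two IDs differ even though tree, author and dates are identical, because the commit message is part of what gets hashed.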
I realise this isn't the cleanest approach possible, but any suggestion
would be appreciated.
Best wishes,
Bruno.
* Re: Efficient cloning from svn (with multiple branches/tags subdirs)
2009-10-15 17:23 ` Bruno Harbulot
@ 2009-10-15 17:29 ` B Smith-Mannschott
2009-10-16 11:20 ` Bruno Harbulot
0 siblings, 1 reply; 9+ messages in thread
From: B Smith-Mannschott @ 2009-10-15 17:29 UTC (permalink / raw)
To: Bruno Harbulot; +Cc: git, Eric Wong
On Thu, Oct 15, 2009 at 19:23, Bruno Harbulot
<Bruno.Harbulot@manchester.ac.uk> wrote:
> Hello,
>
> Avery Pennarun wrote:
>>
>> On Wed, Oct 14, 2009 at 2:00 PM, Eric Wong <normalperson@yhbt.net> wrote:
>>>
>>> Avery Pennarun <apenwarr@gmail.com> wrote:
>>>>
>>>> I've been thinking about this myself for some time. One option that
>>>> might be "interesting" would be to just grab the *entire* svn tree
>>>> (from the root), and then use git-subtree[1] to slice and dice it into
>>>> branches using your local copy of git (which is fast and uses no
>>>> bandwidth) instead of during the svn fetch (which is slow and uses
>>>> lots of bandwidth). I think it would also simplify the git-svn code
>>>> quite a lot, at least for fetching, since there would always be a
>>>> global view of the tree and SVN things like "copy branch A to tag B"
>>>> would just be exactly that.
>>>
>>> This was actually the original use case of git svn back when I started.
>>>
>>> git svn clone SVNREPO_ROOT (without --stdlayout)
>>>
>>> It's still an option if you have the disk space for the working copies,
>>> but I had to create the branches/tags support since the working copies
>>> would become prohibitively large. If git-subtree could be
>>> taught to work on a bare repo (git svn has a --no-checkout option)
>>> it might be an option, too.
>
> Thank you for your suggestions. Unfortunately, I'm not really familiar with
> git-subtree and how it could work with git-svn, sorry.
>
> I've tried another workaround: using svnsync to pull the repository only
> once, and only then using git-svn fetch, locally, so as to avoid too much
> network traffic (I don't mind too much if it loops locally). I was hoping to
> be able to change the URL of the repository to the original one afterwards,
> but it doesn't seem to work so easily, because of the commit IDs. I'm
> assuming not having the same will cause problems for further fetches (this
> time directly from the original SVN repository) and for potential dcommits.
>
> When I do this:
> git init
> git svn init -s --prefix=svn/ file:///path/to/local/restlet-svnroot
> git svn fetch -r 1:2
>
> I get this ID, for example:
> r2 = c69a0b98d288a6e4e8779b50962b7fc65c4622e8
>
> If I do this using the original http://restlet.tigris.org/svn/restlet, I get
> this:
> r2 = ce3b82915e92fe1ccf6ddedacd9d74b30bd4de86
>
>
> I've even tried to install an Apache-based subversion server locally and make
> it believe it was restlet.tigris.org (by editing /etc/hosts and creating the
> appropriate VirtualHost), but this generates another SHA1 ID. (That's of
> course not a solution that would be generalisable.)
>
> I've had a quick look at the git-svn code to see how this ID was generated,
> but couldn't find anything obvious.
> I realise this isn't the cleanest approach possible, but any suggestion
> would be appreciated.
When I 'git svn clone' from a svnsync mirror I pass
--use-svnsync-props. Have you tried that?
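Put together, the mirror-then-import workflow might look like the sketch below. The source URL is the one from this thread; `MIRROR`, `CLONE` and the helper names `run`, `allow_revprops` and `mirror_and_clone` are hypothetical. It defaults to a dry run that only prints the commands, and the `-s --prefix=svn/` layout flags are simply the ones from Bruno's test, so they would need adjusting for the two-level tags.

```shell
# Source URL is from the thread; MIRROR and CLONE are hypothetical local paths.
SRC_URL=http://restlet.tigris.org/svn/restlet
MIRROR=${MIRROR:-$HOME/restlet-svnroot}
CLONE=${CLONE:-$HOME/restlet-git}

# With DRY_RUN=1 (the default) each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

# svnsync must be allowed to set revision properties on the mirror,
# which requires a pre-revprop-change hook that exits successfully.
allow_revprops() {
    printf '#!/bin/sh\nexit 0\n' > "$1/hooks/pre-revprop-change"
    chmod +x "$1/hooks/pre-revprop-change"
}

mirror_and_clone() {
    run svnadmin create "$MIRROR"
    run allow_revprops "$MIRROR"
    run svnsync init "file://$MIRROR" "$SRC_URL"
    run svnsync sync "file://$MIRROR"    # the only step that reads all revisions remotely
    # Import from the local mirror; layout flags need adapting per repository.
    run git svn clone --use-svnsync-props -s --prefix=svn/ "file://$MIRROR" "$CLONE"
}

mirror_and_clone
```

Setting `DRY_RUN=0` would execute the steps for real. As far as the git-svn documentation describes it, `--use-svnsync-props` lets git svn re-map the mirror's URL and UUID to those of the original repository, which is why the resulting commit IDs can match a direct import.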
// Ben
* Re: Efficient cloning from svn (with multiple branches/tags subdirs)
2009-10-15 17:29 ` B Smith-Mannschott
@ 2009-10-16 11:20 ` Bruno Harbulot
0 siblings, 0 replies; 9+ messages in thread
From: Bruno Harbulot @ 2009-10-16 11:20 UTC (permalink / raw)
To: git; +Cc: Eric Wong
B Smith-Mannschott wrote:
>> I've had a quick look at the git-svn code to see how this ID was generated,
>> but couldn't find anything obvious.
>> I realise this isn't the cleanest approach possible, but any suggestion
>> would be appreciated.
>
> When I 'git svn clone' from a svnsync mirror I pass
> --use-svnsync-props. Have you tried that?
Thank you, I hadn't noticed this option, but it was the right one indeed.
Best wishes,
Bruno.