* Subversion integration with git
@ 2010-03-25 14:17 David Michael Barr
2010-03-25 14:31 ` Erik Faye-Lund
2010-03-25 18:23 ` Pascal Obry
0 siblings, 2 replies; 8+ messages in thread
From: David Michael Barr @ 2010-03-25 14:17 UTC
To: git
Hi folks,
As my first posting to the list, I'd like to start by giving a big thank you to all the git developers and maintainers for such a great tool.
Unfortunately, I still have to interact with lesser tools such as Subversion and that is what leads me to post.
I'm employed on a proprietary project that is supported by a large number of open source tools. The 'canonical' source repository is hosted on a Subversion server on the other side of a rather unreliable WAN link. To date I've been using a combination of git-svn, cron, and a handful of bash scripts to handle marshalling commits between our git repositories and the Subversion instance. However, whilst this solution works well for incremental commits, every time a branch is created on the remote repository it's a hassle to synchronise.
So I thought I'd use git-svn and standard layout - this resulted in blasting my link with so many HTTP requests that I got a stern warning from our sysadmin and I'm sure the firm on the other side of the link weren't impressed.
After exploring a few solutions I used SVK to create a local mirror of the repository.
When I pointed git-svn at the local mirror, it took 4 days, a whole lot of RAM and fell over at 90% completion with a checksum error.
When I pointed svn-all-fast-export at the repository it had to skip three commits or would indefinitely spew garbage.
When I pointed svn2git.py at a dump of the repository it successfully imported 50% of commits and then ran at a snail's pace, ETA next century.
I decided that I liked the idea of Subversion dump in, git fast-import out, but it had to scale well.
So I grabbed the git-fast-import documentation and the Subversion dump format documentation and tried to design a data structure that would map well between them and scale linearly with my repository.
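As a rough illustration of the "dump in" half, the sketch below reads records from a Subversion dump stream: each record is a block of `Key: value` headers ended by a blank line, followed by `Content-length` bytes of body. This is not the code from the project described here; `read_records` and `SAMPLE` are hypothetical names, and the sample is a hand-written fragment rather than real repository data.

```python
import io


def read_records(stream):
    """Yield (headers, body) pairs from a binary Subversion dump stream.

    Assumption: headers are UTF-8 "Key: value" lines; a real converter
    would need to cope with other encodings and with property parsing.
    """
    while True:
        line = stream.readline()
        # Skip the blank lines that separate one record from the next.
        while line == b"\n":
            line = stream.readline()
        if not line:
            return  # end of stream
        headers = {}
        # Collect header lines up to the blank line that ends the block.
        while line and line != b"\n":
            key, _, value = line.rstrip(b"\n").partition(b": ")
            headers[key.decode("utf-8")] = value.decode("utf-8")
            line = stream.readline()
        # The body length is given explicitly, so it is read by byte count
        # and never scanned, which keeps the work linear in the dump size.
        length = int(headers.get("Content-length", "0"))
        body = stream.read(length)
        yield headers, body


# Hand-written sample: a format header, one revision, one added file.
SAMPLE = (
    b"SVN-fs-dump-format-version: 2\n\n"
    b"Revision-number: 1\nProp-content-length: 10\nContent-length: 10\n\n"
    b"PROPS-END\n\n"
    b"Node-path: trunk/a.txt\nNode-kind: file\nNode-action: add\n"
    b"Text-content-length: 6\nContent-length: 6\n\n"
    b"hello\n\n\n"
)

records = list(read_records(io.BytesIO(SAMPLE)))
for headers, body in records:
    print(headers, len(body))
```

Reading bodies by length rather than by delimiter means file contents are never inspected line by line, which is one plausible way to keep a single pass over a multi-gigabyte dump cheap.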
I started a new project to implement my design and am curious as to how many git users actually care about this kind of problem. While conversion is a one-off for most projects, there are an awful lot of projects currently using Subversion. As the community and tool-chain build around git, many will want to make the transition. I hope to make it far less painful than it has been for me.
My project is still in the preview phase but has enough to import commit-tree structure bar symlinks and executable flags. It imports my 22000+ commit 2.8GB dump in 4 minutes. It is currently 840 non-comment lines of C. I aim to produce output that git-svn can take over from.
Is it worthwhile to start a new project - or would it be better to grok the internals of existing projects and try to make them scale?
Best regards,
David M Barr
* Re: Subversion integration with git
2010-03-25 14:17 Subversion integration with git David Michael Barr
@ 2010-03-25 14:31 ` Erik Faye-Lund
[not found] ` <E560EF9A-AF07-4316-9047-6D1A1802F743@cordelta.com>
2010-03-25 18:23 ` Pascal Obry
1 sibling, 1 reply; 8+ messages in thread
From: Erik Faye-Lund @ 2010-03-25 14:31 UTC
To: David Michael Barr; +Cc: git, artagnon, Sverre Rabbelier
On Thu, Mar 25, 2010 at 3:17 PM, David Michael Barr
<david.barr@cordelta.com> wrote:
> Hi folks,
>
> As my first posting to the list, I'd like to start by giving a big thank you to all the git developers and maintainers for such a great tool.
>
> Unfortunately, I still have to interact with lesser tools such as Subversion and that is what leads me to post.
>
> I'm employed on a proprietary project that is supported by a large number of open source tools. The 'canonical' source repository is hosted on a Subversion server on the other side of a rather unreliable WAN link. To date I've been using a combination of git-svn, cron, and a handful of bash scripts to handle marshalling commits between our git repositories and the Subversion instance. However, whilst this solution works well for incremental commits, every time a branch is created on the remote repository it's a hassle to synchronise.
> So I thought I'd use git-svn and standard layout - this resulted in blasting my link with so many HTTP requests that I got a stern warning from our sysadmin and I'm sure the firm on the other side of the link weren't impressed.
> After exploring a few solutions I used SVK to create a local mirror of the repository.
>
> When I pointed git-svn at the local mirror, it took 4 days, a whole lot of RAM and fell over at 90% completion with a checksum error.
>
> When I pointed svn-all-fast-export at the repository it had to skip three commits or would indefinitely spew garbage.
>
> When I pointed svn2git.py at a dump of the repository it successfully imported 50% of commits and then ran at a snail's pace, ETA next century.
>
> I decided that I liked the idea of Subversion dump in, git fast-import out, but it had to scale well.
>
> So I grabbed the git-fast-import documentation and the Subversion dump format documentation and tried to design a data structure that would map well between them and scale linearly with my repository.
>
> I started a new project to implement my design and am curious as to how many git users actually care about this kind of problem. While conversion is a one-off for most projects, there are an awful lot of projects currently using Subversion. As the community and tool-chain build around git, many will want to make the transition. I hope to make it far less painful than it has been for me.
>
> My project is still in the preview phase but has enough to import commit-tree structure bar symlinks and executable flags. It imports my 22000+ commit 2.8GB dump in 4 minutes. It is currently 840 non-comment lines of C. I aim to produce output that git-svn can take over from.
>
Wow, your figures sound very impressive. I'd love to have a look at
it! I've tried to convert similar-sized SVN repos before, but given up
due to the poor performance. So at work I'm currently using git-svn
with only parts of the history imported, and falling back to SVN when
having to dig far in the history (which is not much fun).
> Is it worthwhile to start a new project - or would it be better to grok the internals of existing projects and try to make them scale?
>
I think it falls very close to the native-git-svn Google SoC
project[1], and if you are able to share what you have I'm sure
Ramkumar (I hope you don't mind me CC'ing you, and that I spelled your
name right) would appreciate having a look.
[1]: https://git.wiki.kernel.org/index.php/SoC2010Ideas#A_remote_helper_for_svn
--
Erik "kusma" Faye-Lund
* Re: Subversion integration with git
2010-03-25 14:17 Subversion integration with git David Michael Barr
2010-03-25 14:31 ` Erik Faye-Lund
@ 2010-03-25 18:23 ` Pascal Obry
2010-03-28 12:03 ` David Michael Barr
1 sibling, 1 reply; 8+ messages in thread
From: Pascal Obry @ 2010-03-25 18:23 UTC
To: David Michael Barr; +Cc: git
David,
> My project is still in the preview phase but has enough to import
> commit-tree structure bar symlinks and executable flags. It imports
> my 22000+ commit 2.8GB dump in 4 minutes. It is currently 840
> non-comment lines of C. I aim to produce output that git-svn can
> take over from.
Impressive numbers! I've converted many projects using git-svn and yes
it is slow. Just curious, does it handle branches? Can it handle a
non-standard layout (not trunk/branches/tags)? When you have git-svn
compatible output I would be willing to test it on a project.
Pascal.
--
--|------------------------------------------------------
--| Pascal Obry Team-Ada Member
--| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE
--|------------------------------------------------------
--| http://www.obry.net - http://v2p.fr.eu.org
--| "The best way to travel is by means of imagination"
--|
--| gpg --keyserver keys.gnupg.net --recv-key F949BD3B
* Re: Subversion integration with git
2010-03-25 18:23 ` Pascal Obry
@ 2010-03-28 12:03 ` David Michael Barr
0 siblings, 0 replies; 8+ messages in thread
From: David Michael Barr @ 2010-03-28 12:03 UTC
To: pascal; +Cc: git
Pascal,
>> My project is still in the preview phase but has enough to import
>> commit-tree structure bar symlinks and executable flags. It imports
>> my 22000+ commit 2.8GB dump in 4 minutes. It is currently 840
>> non-comment lines of C. I aim to produce output that git-svn can
>> take over from.
>
> Impressive numbers! I've converted many projects using git-svn and yes
> it is slow. Just curious, does it handle branches? Can it handle a
> non-standard layout (not trunk/branches/tags)? When you have git-svn
> compatible output I would be willing to test it on a project.
My initial design target is a one-to-one translation of the Subversion history to a single linear git branch: each Subversion revision becomes one git commit, with trunk, branches and tags kept as ordinary directories. I'm working under the assumption that something like git filter-branch can then be used to transform the history to a more logical representation. This should allow any Subversion layout to be handled.
When I have git-svn compatible output, I'll proudly announce the first release.
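To make the "single linear branch" target concrete, here is a minimal sketch of writing one fast-import `commit` command per Subversion revision, all on the same ref. It is an illustration under stated assumptions, not the project's implementation: `emit_commit`, the author details and the file contents are all hypothetical, every file is written with mode 100644 (no symlinks or executable flags), and paths are assumed not to need quoting.

```python
import io


def emit_commit(out, ref, mark, name, email, when, message, files):
    """Write one git fast-import commit command to a binary stream.

    files maps a path to its new content (bytes), or to None to delete it.
    Assumption: timestamps are seconds since the epoch, in UTC.
    """
    msg = message.encode("utf-8")
    out.write(b"commit %s\n" % ref.encode("utf-8"))
    out.write(b"mark :%d\n" % mark)
    out.write(b"committer %s <%s> %d +0000\n"
              % (name.encode("utf-8"), email.encode("utf-8"), when))
    # "data" takes an exact byte count, then the payload.
    out.write(b"data %d\n%s\n" % (len(msg), msg))
    for path, content in files.items():
        p = path.encode("utf-8")
        if content is None:
            out.write(b"D %s\n" % p)  # file deleted in this revision
        else:
            out.write(b"M 100644 inline %s\n" % p)
            out.write(b"data %d\n%s\n" % (len(content), content))
    out.write(b"\n")


buf = io.BytesIO()
# No "from" line is written: on an existing branch, fast-import uses the
# branch's current tip as the parent, which yields a linear history.
emit_commit(buf, "refs/heads/master", 1, "Example Author",
            "author@example.com", 1269526620, "r1: add file",
            {"trunk/a.txt": b"hello\n"})
emit_commit(buf, "refs/heads/master", 2, "Example Author",
            "author@example.com", 1269526680, "r2: remove file",
            {"trunk/a.txt": None})
stream = buf.getvalue()
print(stream.decode("utf-8"))
```

Output of this shape could be piped into `git fast-import` inside a freshly initialised repository; splitting trunk, branches and tags into real git refs would then be a separate rewriting step.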
Thread overview: 8+ messages
2010-03-25 14:17 Subversion integration with git David Michael Barr
2010-03-25 14:31 ` Erik Faye-Lund
[not found] ` <E560EF9A-AF07-4316-9047-6D1A1802F743@cordelta.com>
2010-03-25 17:52 ` Ramkumar Ramachandra
2010-03-25 23:50 ` David Michael Barr
2010-03-30 14:05 ` David Michael Barr
2010-03-30 14:29 ` Ramkumar Ramachandra
2010-03-25 18:23 ` Pascal Obry
2010-03-28 12:03 ` David Michael Barr