From: Eric Wong <normalperson@yhbt.net>
To: Geert Bosch <bosch@adacore.com>
Cc: Steven Grimm <koreth@midwinter.com>,
    "git@vger.kernel.org List" <git@vger.kernel.org>
Subject: Re: Excruciatingly slow git-svn imports
Date: Mon, 5 May 2008 20:28:54 -0700
Message-ID: <20080506032846.GA15521@untitled>
In-Reply-To: <2B9E6C04-69F1-42BD-AE60-AFCE401E093E@adacore.com>

Geert Bosch <bosch@adacore.com> wrote:
> On Apr 29, 2008, at 03:11, Eric Wong wrote:
>
> >>I've found that git-svn gets slower as it runs. Try interrupting the
> >>clone and running "git svn fetch" -- it should pick up where it left
> >>off and will be MUCH faster if my experience is any indication.
> >>When I clone the big svn repository at work I usually restart it
> >>every 1000 revisions or so and it finishes in a fraction of the time
> >>it takes if I let it do everything in a single run.
> >
> >That's really strange to hear... The git-svn process itself does not
> >store much state other than the current revision and the log
> >information for the next 100 or so revisions it needs to import.
> >
> >Are you packing the repository? Which SVN protocol are you using?
> >Does memory usage of git-svn stay stable throughout the run?
>
> I found the same. After about 5 days (with maybe 10 break/restarts), I
> had a converted repository with all 135K commits and a total size of
> just under 1 GB. The last 100K commits took (much?) less than a day;
> almost all the time was spent in the earlier ones. These commits
> seemed all to have thousands of files, even though most were probably
> the same. I'm sure this repository, which covers 15 years of
> development of a multi-million line project, has a lot of tags and it
> seemed that it just had to chew through many copies of the complete
> set of files to find out that they're all the same.

Interesting. By "These commits seemed all to have thousands of files",
do you mean the first 35K that took up most of the time? If so, yes,
that's definitely a problem...

git-svn requests a log from SVN containing a list of all paths modified
in each revision. By default, git-svn only requests log entries for up
to 100 revisions at a time to reduce memory usage. However, having
thousands of files modified for each revision would still be
problematic, as would having insanely long commit messages.
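
To make the batching concrete, here is a rough Python sketch (not
git-svn's actual Perl code) of walking a revision range in windows of
100 and estimating how many changed-path entries a single window has to
hold. The function names and the per-revision path counts are made up
for illustration.

```python
from typing import Iterator, List, Tuple


def revision_windows(first: int, last: int, size: int = 100) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) revision ranges of at most `size` revisions."""
    if size < 1:
        raise ValueError("size must be positive")
    start = first
    while start <= last:
        end = min(start + size - 1, last)
        yield start, end
        start = end + 1


def peak_paths_in_window(paths_per_revision: List[int], size: int = 100) -> int:
    """Return the largest number of changed-path entries held for one window.

    paths_per_revision[i] is the (hypothetical) number of paths modified
    in revision i + 1.
    """
    peak = 0
    for start, end in revision_windows(1, len(paths_per_revision), size):
        # Revisions are 1-based; the list is 0-based.
        in_window = sum(paths_per_revision[start - 1:end])
        peak = max(peak, in_window)
    return peak
```

With 250 revisions touching 5 paths each, the largest window holds 500
entries; at 3000 paths each it holds 300000, so the window size alone
does not bound memory when every revision touches thousands of files.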

Is this repository public by any chance? I'd like to be able to take
a look at it in case I have time and have access to decent hardware.
Also, what command-line arguments did you use?

> It's great git-svn can be restarted so well and doesn't get confused
> by uncleanly terminated runs. My final repository is fast and small.
> I'm still struggling with how to properly synchronize branches, but
> that probably is mostly a matter of user education.
>
> Thanks all for these great tools.

You're welcome, thanks for the feedback! Restartability in git-svn
is one of the things I focused on from the beginning.

--
Eric Wong