From: Eric Wong <normalperson@yhbt.net>
To: Avery Pennarun <apenwarr@gmail.com>
Cc: Geert Bosch <bosch@adacore.com>,
Steven Grimm <koreth@midwinter.com>,
"git@vger.kernel.org List" <git@vger.kernel.org>
Subject: Re: Excruciatingly slow git-svn imports
Date: Mon, 5 May 2008 21:25:08 -0700
Message-ID: <20080506042508.GA23465@untitled>
In-Reply-To: <32541b130805052056g450b69cfg46693bc3c0c5a1ed@mail.gmail.com>
Avery Pennarun <apenwarr@gmail.com> wrote:
> On 5/5/08, Eric Wong <normalperson@yhbt.net> wrote:
> > Interesting. By "These commits seemed all to have thousands of files",
> > you mean the first 35K that took up most of the time? If so, yes,
> > that's definitely a problem...
> >
> > git-svn requests a log from SVN containing a list of all paths modified
> > in each revision. By default, git-svn only requests log entries for up
> > to 100 revisions at a time to reduce memory usage. However, having
> > thousands of files modified for each revision would still be
> > problematic, as would having insanely long commit messages.
>
> On my system, any branch that was created using "svn cp" of a toplevel
> directory seems to cause git-svn to (rather slowly) download every
> single file in the entire branch for the first commit on that branch,
> giving a symptom that sounds a lot like the above "commits with
> thousands of files". I assumed this was just an intentional design
> decision in git-svn, to be slow and safe instead of fast and loose.
> Is it actually supposed to do something smarter than that?
When using "svn cp" on a top-level directory, it *should*
just show up as a single file change in the log entry.

Something like:

  A /project/branch/my-new-branch (from /project/trunk:1234)

This would not take much memory at all.

However, I've also occasionally seen stuff like this:

  A /project/branch/my-new-branch
  A /project/branch/my-new-branch/file1 (from /project/trunk/file1:1234)
  A /project/branch/my-new-branch/file2 (from /project/trunk/file2:1234)
  A /project/branch/my-new-branch/file3 (from /project/trunk/file3:1234)
  .... many more files and directories along the same lines ...

This is what I suspect Geert is seeing in his repository and causing
problems. Perhaps something caused by cvs2svn importing those tags into
SVN originally?
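
To tell the two log shapes above apart, something like the following Python sketch could be run over the changed-path lines of a revision. It is only an illustration, not git-svn code: the function names and the labels "single-copy", "per-file-copy" and "other" are made up here, and the regex assumes paths without spaces, written in the same form as the examples above.

```python
import re

# One changed-path line in the style quoted above, e.g.
#   A /project/branch/my-new-branch (from /project/trunk:1234)
# Assumption: paths contain no whitespace.
CHANGED_PATH = re.compile(
    r"^\s*(?P<action>[AMDR])\s+(?P<path>/\S+)"
    r"(?:\s+\(from (?P<src>/\S+):(?P<rev>\d+)\))?\s*$"
)


def parse_changed_paths(lines):
    """Return a list of dicts (action, path, src, rev) for matching lines."""
    entries = []
    for line in lines:
        m = CHANGED_PATH.match(line)
        if m:
            entries.append(m.groupdict())
    return entries


def classify_branch_copy(lines, branch):
    """Classify how `branch` was created in one revision (hypothetical labels)."""
    entries = parse_changed_paths(lines)
    root = [e for e in entries if e["path"] == branch]
    children = [e for e in entries if e["path"].startswith(branch + "/")]
    # The cheap case: the branch directory itself carries the copy source.
    if root and root[0]["src"] and not children:
        return "single-copy"
    # The expensive case: every file below the branch has its own copy source.
    if any(e["src"] for e in children):
        return "per-file-copy"
    return "other"
```

Counting how many revisions fall into the "per-file-copy" case would give a rough idea of whether a repository is affected.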
But the symptom you're seeing with git-svn downloading every file seems
to be the result of using a pre-1.4.3 version of the Perl SVN bindings
which lacked a working do_switch() function. I fall back to using
do_update() and checking out a new tree for SVN 1.4.2 and before.
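
The version check described above amounts to roughly the following, shown as a Python sketch rather than the actual Perl in git-svn. The function names are hypothetical, and the returned strings merely name the call that would be used; nothing here talks to SVN.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from a version string such as '1.4.2'."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if not m:
        raise ValueError("unrecognized version string: %r" % text)
    return tuple(int(part) for part in m.groups())


def pick_fetch_strategy(bindings_version):
    """Name the call to use for a new branch, given the SVN bindings version."""
    # do_switch() is only usable from 1.4.3 on; older bindings get the
    # slow but safe do_update() path, which checks out a whole new tree.
    if parse_version(bindings_version) >= (1, 4, 3):
        return "do_switch"
    return "do_update"
```

With this, pick_fetch_strategy("1.4.2") selects the do_update() path and pick_fetch_strategy("1.4.3") selects do_switch().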
So yes, I'm definitely safe, slow and _lazy_ by falling back to
do_update() instead of doing something fancy to work around something
that's already fixed in SVN :)
--
Eric Wong