From: "Jon Smirl" <jonsmirl@gmail.com>
To: "Petr Baudis" <pasky@suse.cz>
Cc: "Andy Whitcroft" <apw@shadowen.org>,
"Git Mailing List" <git@vger.kernel.org>
Subject: Re: Mozilla, git and Windows
Date: Mon, 27 Nov 2006 20:35:05 -0500
Message-ID: <9e4733910611271735y14bed29bk70ae67b5d28eb055@mail.gmail.com>
In-Reply-To: <20061127221338.GP7201@pasky.or.cz>
On 11/27/06, Petr Baudis <pasky@suse.cz> wrote:
> On Mon, Nov 27, 2006 at 05:13:10PM CET, Jon Smirl wrote:
> > The SVN version of the Mozilla repository is about 3GB. It takes
> > around a week of CPU time for svnimport to process it.
>
> Is there a reason why a SVN importer would _have_ to take _longer_ than
> a CVS importer? I'd expect the opposite from an optimized importer since
> you don't have to guess the changesets...
These import programs take forever because they fork off git, SVN or
CVS millions of times. It really does take a week to fork a CVS
process that many times. It's not the application code that is taking
a week to run; it is the millions of forks.
As was mentioned in the thread about doing CVS to git import, the
trick is to write your own CVS file parser, parse the file once (not
once for each revision) and output all of the revisions to the git
database in a single pass. When the code is structured that way I can
import the whole Mozilla repository into git in two hours. The
fast-import back end also works without forking: it just listens for
commands on stdin and acts on them, and all of the commands are
implemented in a single binary.
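A minimal sketch of the single-pass idea, written against the stream
format documented for present-day git fast-import (the format in use
when this was written may have differed). The function names, the
tuple layout for a revision and the fixed +0000 timezone are
assumptions made for illustration, and the CVS parsing step itself is
left out: the caller is assumed to have already parsed the file once
and to hand over every revision, oldest first.

```python
import io


def write_data(out, payload):
    """Emit a fast-import 'data' block: exact byte count, then the raw bytes."""
    out.write(b"data %d\n" % len(payload))
    out.write(payload)
    out.write(b"\n")  # optional trailing LF, kept for readability


def export_revisions(out, path, revisions, ref="refs/heads/master"):
    """Write one commit per revision of a single file to a fast-import stream.

    `revisions` is an iterable of (name, email, unix_time, message, content)
    tuples, oldest first (a hypothetical layout). `path` is assumed to need
    no quoting. Later commits on the same ref carry no 'from' line, so
    fast-import takes the current branch tip as their parent.
    """
    for name, email, when, message, content in revisions:
        out.write(b"commit %s\n" % ref.encode())
        out.write(b"committer %s <%s> %d +0000\n"
                  % (name.encode(), email.encode(), when))
        write_data(out, message.encode())
        out.write(b"M 100644 inline %s\n" % path.encode())
        write_data(out, content)


def demo_stream():
    """Build a two-revision stream in memory and return it as bytes."""
    buf = io.BytesIO()
    export_revisions(buf, "hello.txt", [
        ("A U Thor", "author@example.com", 1164600000, "first", b"hi\n"),
        ("A U Thor", "author@example.com", 1164603600, "second", b"hello\n"),
    ])
    return buf.getvalue()
```

To import for real, the same function could write to the stdin pipe of
one long-running `git fast-import` process started inside an
initialized repository, so every revision goes through a single
process.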
The speed of fork in Linux is fine for most purposes, but it is not
fine if you are going to fork off good sized apps several million
times. When I measured those forks in oprofile, 60% of the CPU was
being consumed by the kernel.
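A rough way to see the effect on a small scale, as a sketch rather
than a reproduction of the oprofile measurement: the child here is a
hypothetical stand-in (a Python one-liner that upper-cases its input),
not cvs, svn or git, and the timing covers the whole cost of creating
a process, not fork alone.

```python
import subprocess
import sys
import time

# Hypothetical stand-ins for the tool an importer would run.
PER_ITEM_CHILD = "import sys; sys.stdout.write(sys.argv[1].upper())"
STREAMING_CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
)


def spawn_per_item(items):
    """Start a fresh child process for every item (the slow pattern)."""
    results = []
    for item in items:
        done = subprocess.run(
            [sys.executable, "-c", PER_ITEM_CHILD, item],
            capture_output=True, text=True, check=True)
        results.append(done.stdout)
    return results


def single_process(items):
    """Start one child and feed it every item over stdin."""
    done = subprocess.run(
        [sys.executable, "-c", STREAMING_CHILD],
        input="".join(item + "\n" for item in items),
        capture_output=True, text=True, check=True)
    return done.stdout.splitlines()


def timed(func, items):
    """Return the wall-clock seconds func(items) takes."""
    start = time.perf_counter()
    func(items)
    return time.perf_counter() - start
```

Comparing `timed(spawn_per_item, items)` with
`timed(single_process, items)` for a few hundred short strings gives a
feel for how the per-process overhead adds up.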
--
Jon Smirl