From: "Jon Smirl" <jonsmirl@gmail.com>
To: "Michael Haggerty" <mhagger@alum.mit.edu>
Cc: "Patwardhan, Rajesh" <rajesh.patwardhan@etrade.com>,
"Martin Langhoff" <martin.langhoff@gmail.com>,
"Guilhem Bonnefille" <guilhem.bonnefille@gmail.com>,
git@vger.kernel.org, users@cvs2svn.tigris.org
Subject: Re: cvs2svn conversion directly to git ready for experimentation
Date: Fri, 3 Aug 2007 16:27:49 -0400
Message-ID: <9e4733910708031327u7df2205ap56a7ad5430380fb@mail.gmail.com>
In-Reply-To: <9e4733910708031316x1b7d2a40n5d0298cedd6cf97c@mail.gmail.com>
On 8/3/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> Make a bulk importer for SVN like git-fast-import. I measured some SVN
> imports and the bulk of the time was spent forking off SVN. Before
> git-fast-import it would have taken git two weeks to import Mozilla
> CVS.
And add a CVS parser to cvs2svn. Use the one I posted or write it again.
Fork is not a fast operation: millions of forks take a week to run.

In the cvs2git code I wrote, one process ran cvs2svn and parsed the CVS
files internally. A second process ran git-fast-import. Nothing else was
forked.

When I first started, we were forking both git and cvs. When I ran
oprofile on it, 95% of the CPU time was being spent in the kernel.
Linus helped me figure out what was going on: the overhead of the
page table copies associated with millions of forks was what took so
long. The solution is to eliminate the forks.

My first try, with forks for both cvs and git, took about a week to
import Mozilla CVS. After all the forks were eliminated, I could import
Mozilla CVS in four hours.
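
To illustrate the two-process layout described above, here is a minimal
sketch in Python. It is not the actual cvs2git code. The function names,
the committer identity, the branch name, and the shape of the revision
tuples are all assumptions made up for the example. The point is only
that every revision is written down one pipe to a single long-lived
git fast-import process, so the number of forks stays constant no matter
how many revisions are imported.

```python
import subprocess


def write_revision(out, index, path, content, message, when):
    """Append one blob and one commit to a fast-import stream.

    `out` is any writable binary stream. Marks are derived from the
    1-based revision index (hypothetical numbering scheme).
    """
    blob_mark = 2 * index - 1
    commit_mark = 2 * index
    # The file content, stored as a blob and tagged with a mark.
    out.write(b"blob\nmark :%d\ndata %d\n" % (blob_mark, len(content)))
    out.write(content + b"\n")
    # A commit on one branch; within a single session fast-import uses
    # the current branch tip as the parent when no "from" is given.
    out.write(b"commit refs/heads/master\nmark :%d\n" % commit_mark)
    out.write(b"committer Example <example@example.com> %d +0000\n" % when)
    out.write(b"data %d\n" % len(message))
    out.write(message + b"\n")
    out.write(b"M 100644 :%d %s\n\n" % (blob_mark, path))


def stream_revisions(out, revisions):
    """Write every (path, content, message, when) tuple to `out`."""
    count = 0
    for count, (path, content, message, when) in enumerate(revisions, 1):
        write_revision(out, count, path, content, message, when)
    return count


def import_into_git(repo_dir, revisions):
    """Feed all revisions to ONE git fast-import process.

    Assumes `repo_dir` is an already initialized git repository.
    Only a single child process is started for the whole import.
    """
    proc = subprocess.Popen(
        ["git", "fast-import", "--quiet"],
        cwd=repo_dir,
        stdin=subprocess.PIPE,
    )
    count = stream_revisions(proc.stdin, revisions)
    proc.stdin.close()
    if proc.wait() != 0:
        raise RuntimeError("git fast-import failed")
    return count
```

Because stream_revisions only needs a writable binary stream, the
generated stream can be inspected by writing to an in-memory buffer
before pointing it at a real repository with import_into_git.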
--
Jon Smirl
jonsmirl@gmail.com