From: "Jon Smirl" <jonsmirl@gmail.com>
To: "Martin Langhoff" <martin.langhoff@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Importing Mozilla CVS into git
Date: Tue, 6 Jun 2006 20:40:50 -0400
Message-ID: <9e4733910606061740v797886baif6b1edd969dfab2a@mail.gmail.com>
In-Reply-To: <9e4733910606060813r41037467u74235f7a9386c1e0@mail.gmail.com>

On 6/6/06, Jon Smirl <jonsmirl@gmail.com> wrote:
> On 6/6/06, Martin Langhoff <martin.langhoff@gmail.com> wrote:
> > On 6/3/06, Jon Smirl <jonsmirl@gmail.com> wrote:
> > > On 6/1/06, Jon Smirl <jonsmirl@gmail.com> wrote:
> > > > With the attached patch you can parse the entire Mozilla tree. The
> > > > tree has over 100,000 files in it and about 300 branches.
> > >
> > > I was a little low with these counts, more like 110,000 files and some
> > > parts of the tree have 1,000 branches. Total tree size is 3GB.
> >
> > I don't think it really has that many branches. If I am to believe
> > cvsps (which took 3GB to walk the history), it has some branches with
> > recursive loops in their ancestry (MANG_MATH_BRANCH and
> > SpiderMonkey140_BRANCH have each other as ancestors!?), 197969 commits
> > and 796 branches.

My full import to svn just finished after a day and a half.
Here are the stats:

cvs2svn Statistics:
------------------
Total CVS Files:             99851
Total CVS Revisions:        948580
Total Unique Tags:            1505
Total Unique Branches:        1577
CVS Repos Size in KB:      2725843
Total SVN Commits:          205787
First Revision Date:    Fri Mar 27 21:13:08 1998
Last Revision Date:     Tue May 30 19:28:10 2006
------------------
Timings:
------------------
pass 1:  3602 seconds
pass 2:   227 seconds
pass 3:    66 seconds
pass 4:  1070 seconds
pass 8:124650 seconds
total: 124650 seconds
[jonsmirl@jonsmirl ~]$
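
For scale, the reported total can be converted into days and hours and
into a rough commit rate. A minimal Python sketch; the helper names are
made up for illustration, and only the two constants come from the
stats above:

```python
from datetime import timedelta

# Figures copied from the cvs2svn statistics above.
TOTAL_SECONDS = 124650
TOTAL_SVN_COMMITS = 205787


def describe_duration(seconds: int) -> str:
    """Format a duration as days/hours/minutes/seconds plus fractional hours."""
    td = timedelta(seconds=seconds)
    hours, rem = divmod(td.seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return (f"{td.days}d {hours:02d}h {minutes:02d}m {secs:02d}s "
            f"({seconds / 3600:.1f} hours)")


def commits_per_second(commits: int, seconds: int) -> float:
    """Average number of SVN commits created per second of run time."""
    return commits / seconds


print(describe_duration(TOTAL_SECONDS))  # 1d 10h 37m 30s (34.6 hours)
print(f"{commits_per_second(TOTAL_SVN_COMMITS, TOTAL_SECONDS):.2f} commits/s")
```

That is consistent with the "day and a half" mentioned above.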

[jonsmirl@jonsmirl svn]$ du -h
4.0K    ./svntest/dav
12K     ./svntest/locks
40K     ./svntest/hooks
16K     ./svntest/conf
7.4G    ./svntest/db/revs
808M    ./svntest/db/revprops
4.0K    ./svntest/db/transactions
8.2G    ./svntest/db
8.2G    ./svntest
8.2G    .

[jonsmirl@jonsmirl svn]$ find | wc
 411607  411607 10891057

Two directories each contain about 205K files. That many files in a
single directory is causing svn problems on ext3.
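
One way to spot such oversized directories is to count the entries
directly inside each directory of the repository. The following is only
a sketch in Python; the function names are hypothetical, and "svntest"
is simply the directory name from the du output above:

```python
import os
from typing import Dict, List, Tuple


def files_per_directory(root: str) -> Dict[str, int]:
    """Map each directory under root to the number of non-directory
    entries that sit directly inside it (subdirectories not included)."""
    counts: Dict[str, int] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        counts[dirpath] = len(filenames)
    return counts


def largest_directories(root: str, top: int = 2) -> List[Tuple[str, int]]:
    """Return the `top` directories holding the most files, largest first."""
    counts = files_per_directory(root)
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

Calling largest_directories("svntest") on a repository like the one
above would list the directories with the highest file counts.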

Bottom line: the cvs2svn import tool works quite well. The highest
memory consumption I saw was 100MB, and it used 6GB of extra disk while
running, plus the space needed by svn.

I don't know quite enough about git yet to replace the svn commands
cvs2svn uses with git equivalents, but if that were done I think most
of the cvs import problems would be solved. Obviously the svn team has
put a great deal of work into this program.

I don't think replacing the svn commands is very hard; I just haven't
figured out the right way to build branches with low-level git yet, and
I don't know Python. I'll bet someone already familiar with git and
cvs import could convert it in a couple of hours.
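
To illustrate what "building a branch" means at git's lowest level, here
is a hedged Python sketch of the object model: blobs, trees and commits
are named by the SHA-1 of a small header plus their content, and a
branch is just a ref under refs/heads/ that holds a commit id. This is
not how cvs2svn works and not a proposed converter. It keeps everything
in in-memory dictionaries, whereas real git stores objects
zlib-compressed under .git/objects. The branch name, file name and
identity string are invented for the example, and the tree sorting is
simplified to flat file names.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

objects: Dict[str, bytes] = {}  # object id -> raw, uncompressed object
refs: Dict[str, str] = {}       # ref name -> commit id


def store(kind: str, body: bytes) -> str:
    """Name an object as git does: SHA-1 over '<kind> <size>', a NUL, then body."""
    raw = kind.encode() + b" " + str(len(body)).encode() + b"\0" + body
    oid = hashlib.sha1(raw).hexdigest()
    objects[oid] = raw
    return oid


def make_tree(entries: List[Tuple[str, str, str]]) -> str:
    """Build a tree from (mode, name, object id) entries.

    Simplification: entries are sorted by name bytes, which is only
    correct for a flat list of files (no subdirectories)."""
    body = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1].encode()):
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(oid)
    return store("tree", body)


def make_commit(tree: str, parent: Optional[str], message: str,
                who: str = "Example <example@example.invalid> 0 +0000") -> str:
    """Build a commit pointing at a tree and, optionally, one parent."""
    lines = [f"tree {tree}"]
    if parent:
        lines.append(f"parent {parent}")
    lines += [f"author {who}", f"committer {who}", "", message]
    return store("commit", ("\n".join(lines) + "\n").encode())


# Initial commit on the main line.
blob = store("blob", b"hello\n")
root = make_commit(make_tree([("100644", "hello.txt", blob)]), None,
                   "initial import")
refs["refs/heads/master"] = root

# Creating a branch is writing a second ref that starts at an existing commit.
refs["refs/heads/EXAMPLE_BRANCH"] = root

# Committing on the branch moves only that ref forward.
branch_blob = store("blob", b"hello, branch\n")
tip = make_commit(make_tree([("100644", "hello.txt", branch_blob)]), root,
                  "change on branch")
refs["refs/heads/EXAMPLE_BRANCH"] = tip
```

In a real repository the corresponding plumbing commands are
git hash-object -w, git write-tree (or git mktree), git commit-tree and
git update-ref.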

-- 
Jon Smirl
jonsmirl@gmail.com
