git.vger.kernel.org archive mirror
* Mozilla, git and Windows
@ 2006-11-27 15:28 Jon Smirl
  2006-11-27 15:34 ` Andy Whitcroft
  2006-11-28  0:30 ` Sam Vilain
  0 siblings, 2 replies; 8+ messages in thread
From: Jon Smirl @ 2006-11-27 15:28 UTC (permalink / raw)
  To: Git Mailing List

In the other thread we are discussing the conversion of Mozilla CVS to
git format. This is something that has to be done but it is not the
only issue. Without a native Windows port they won't even consider
using git. There is also the risk that the features needed by Mozilla
will be completed after they choose to use a different SCM.

Even if we implement all of the needed features git still needs to win
the competition against the other possible choices. The last I heard
the leading candidate is SVN/SVK.

-- 
Jon Smirl

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Mozilla, git and Windows
  2006-11-27 15:28 Mozilla, git and Windows Jon Smirl
@ 2006-11-27 15:34 ` Andy Whitcroft
  2006-11-27 16:13   ` Jon Smirl
  2006-11-28  0:30 ` Sam Vilain
  1 sibling, 1 reply; 8+ messages in thread
From: Andy Whitcroft @ 2006-11-27 15:34 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Git Mailing List

Jon Smirl wrote:
> In the other thread we are discussing the conversion of Mozilla CVS to
> git format. This is something that has to be done but it is not the
> only issue. Without a native Windows port they won't even consider
> using git. There is also the risk that the features needed by Mozilla
> will be completed after they choose to use a different SCM.
> 
> Even if we implement all of the needed features git still needs to win
> the competition against the other possible choices. The last I heard
> the leading candidate is SVN/SVK.

Do we need to worry too much about taking over the world in one day?
Yes of course git is _the_ superior solution etc, but too many new users
at once is always painful.

I think you are more likely to win letting them convert over to SVN.
From there people naturally start using git mirrors from the SVN trunk.
Certainly I have two projects which do not use git, one in CVS and one
in SVN.  I just svnimport that and work in git.  I am confident with
time the project will migrate, but I am happy other git users are happy
all without it being the tool of choice.

-apw

* Re: Mozilla, git and Windows
  2006-11-27 15:34 ` Andy Whitcroft
@ 2006-11-27 16:13   ` Jon Smirl
  2006-11-27 16:37     ` Robin Rosenberg
  2006-11-27 22:13     ` Petr Baudis
  0 siblings, 2 replies; 8+ messages in thread
From: Jon Smirl @ 2006-11-27 16:13 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: Git Mailing List

On 11/27/06, Andy Whitcroft <apw@shadowen.org> wrote:
> Jon Smirl wrote:
> > In the other thread we are discussing the conversion of Mozilla CVS to
> > git format. This is something that has to be done but it is not the
> > only issue. Without a native Windows port they won't even consider
> > using git. There is also the risk that the features needed by Mozilla
> > will be completed after they choose to use a different SCM.
> >
> > Even if we implement all of the needed features git still needs to win
> > the competition against the other possible choices. The last I heard
> > the leading candidate is SVN/SVK.
>
> Do we need to worry too much about taking over the world in one day?
> Yes of course git is _the_ superior solution etc, but too many new users
> at once is always painful.
>
> I think you are more likely to win letting them convert over to SVN.
> From there people naturally start using git mirrors from the SVN trunk.
> Certainly I have two projects which do not use git, one in CVS and one
> in SVN.  I just svnimport that and work in git.  I am confident with
> time the project will migrate, but I am happy other git users are happy
> all without it being the tool of choice.

The SVN version of the Mozilla repository is about 3GB. It takes
around a week of CPU time for svnimport to process it.

-- 
Jon Smirl

* Re: Mozilla, git and Windows
  2006-11-27 16:13   ` Jon Smirl
@ 2006-11-27 16:37     ` Robin Rosenberg
  2006-11-27 22:13     ` Petr Baudis
  1 sibling, 0 replies; 8+ messages in thread
From: Robin Rosenberg @ 2006-11-27 16:37 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Andy Whitcroft, Git Mailing List

On Monday 27 November 2006 17:13, Jon Smirl wrote:
> On 11/27/06, Andy Whitcroft <apw@shadowen.org> wrote:
> > I think you are more likely to win letting them convert over to SVN.
> > From there people naturally start using git mirrors from the SVN trunk.
> > Certainly I have two projects which do not use git, one in CVS and one
> > in SVN.  I just svnimport that and work in git.  I am confident with
> > time the project will migrate, but I am happy other git users are happy
> > all without it being the tool of choice.
>
> The SVN version of the Mozilla repository is about 3GB. It takes
> around a week of CPU time for svnimport to process it.

You can track parts of an SVN repo using git-svn. You rarely need the
whole history to start working on a project.

In addition, some nice soul with too much hardware will probably make an
import, publish it, and track it so that not everybody has to. We already
see a lot of git-tracked SVN/CVS repos.


* Re: Mozilla, git and Windows
  2006-11-27 16:13   ` Jon Smirl
  2006-11-27 16:37     ` Robin Rosenberg
@ 2006-11-27 22:13     ` Petr Baudis
  2006-11-28  1:35       ` Jon Smirl
  1 sibling, 1 reply; 8+ messages in thread
From: Petr Baudis @ 2006-11-27 22:13 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Andy Whitcroft, Git Mailing List

On Mon, Nov 27, 2006 at 05:13:10PM CET, Jon Smirl wrote:
> The SVN version of the Mozilla repository is about 3GB. It takes
> around a week of CPU time for svnimport to process it.

Is there a reason why a SVN importer would _have_ to take _longer_ than
a CVS importer? I'd expect the opposite from an optimized importer since
you don't have to guess the changesets...

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
The meaning of Stonehenge in Tralfamadorian, when viewed from above, is:
"Replacement part being rushed with all possible speed."

* Re: Mozilla, git and Windows
  2006-11-27 15:28 Mozilla, git and Windows Jon Smirl
  2006-11-27 15:34 ` Andy Whitcroft
@ 2006-11-28  0:30 ` Sam Vilain
  1 sibling, 0 replies; 8+ messages in thread
From: Sam Vilain @ 2006-11-28  0:30 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Git Mailing List

Jon Smirl wrote:
> In the other thread we are discussing the conversion of Mozilla CVS to
> git format. This is something that has to be done but it is not the
> only issue. Without a native Windows port they won't even consider
> using git. There is also the risk that the features needed by Mozilla
> will be completed after they choose to use a different SCM.
>
> Even if we implement all of the needed features git still needs to win
> the competition against the other possible choices. The last I heard
> the leading candidate is SVN/SVK.

Jon,

When I met clkao in August to discuss the possibility of using git as a
depot for SVK, he seemed very open to the idea, and we worked on an
initial plan for this.  This should eventually allow svk to be used as a
porcelain, which might make it more palatable to the Windows crowd.

However that doesn't solve the Windows porting issue - it would still
need to access the repository.

I've been working on and off on an abstraction to git, in Moose (perl 6
objects on perl 5) - what's working so far is making core objects, and
producing correct checksums.

You can see this early implementation at
http://utsl.gen.nz/gitweb/?p=VCS-Git

What I've found is that I need a good abstraction of UNIX pipelines that
is as portable as Perl, so that I can prototype the code basing it on
setting up command pipelines on UNIX, and have it gracefully degrade to
either using temporary files on Windows, or threading if portions are
re-implemented using Perl (or a ported libgit).  I am currently working
on this and hope to have a release by the end of the week, though I will
not have tested the Windows portability by then.

I would love it if anyone interested in the project would like to help
me complete the OO-based API, provide tests, documentation, or any kind
of feedback really.


* Re: Mozilla, git and Windows
  2006-11-27 22:13     ` Petr Baudis
@ 2006-11-28  1:35       ` Jon Smirl
  2006-11-28 12:17         ` Michael Haggerty
  0 siblings, 1 reply; 8+ messages in thread
From: Jon Smirl @ 2006-11-28  1:35 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Andy Whitcroft, Git Mailing List

On 11/27/06, Petr Baudis <pasky@suse.cz> wrote:
> On Mon, Nov 27, 2006 at 05:13:10PM CET, Jon Smirl wrote:
> > The SVN version of the Mozilla repository is about 3GB. It takes
> > around a week of CPU time for svnimport to process it.
>
> Is there a reason why a SVN importer would _have_ to take _longer_ than
> a CVS importer? I'd expect the opposite from an optimized importer since
> you don't have to guess the changesets...

These import programs take forever because they fork off git, SVN or
CVS millions of times. It really does take a week to fork a CVS
process that many times. It's not the application code that is taking
a week to run, it is the millions of forks.

As was mentioned in the thread about doing CVS to git import, the
trick is to write your own CVS file parser, parse the file once (not
once for each revision) and output all of the revisions to the git
database in a single pass. When code is structured that way I can
import the whole Mozilla repository into git in two hours. The
fast-import back end also works without forking: it just listens for
commands on stdin and acts on them, and all of the commands are
implemented in a single binary.
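
The single-pass idea can be sketched as follows. This is an illustration
only, not the importer discussed in this thread: the helper names
(write_blob, write_commit, build_stream) and the sample committer are
hypothetical, and the stream layout assumes the command format documented
for current git fast-import. One process builds one stream holding every
revision, so nothing is forked per revision.

```python
import io


def write_blob(out, mark, content):
    """Emit one file revision as a blob, tagged with a numeric mark."""
    out.write(b"blob\nmark :%d\ndata %d\n" % (mark, len(content)))
    out.write(content)
    out.write(b"\n")  # optional LF after the raw data


def write_commit(out, ref, committer, when, message, files):
    """Emit a commit on ref; files is a list of (path, blob_mark) pairs."""
    msg = message.encode("utf-8")
    out.write(b"commit %s\n" % ref.encode("utf-8"))
    # 'when' is seconds since the epoch; the timezone is fixed to UTC here.
    out.write(b"committer %s %d +0000\n" % (committer.encode("utf-8"), when))
    out.write(b"data %d\n" % len(msg))
    out.write(msg)
    out.write(b"\n")
    for path, blob_mark in files:
        # Assumes simple paths that need no quoting in the stream.
        out.write(b"M 100644 :%d %s\n" % (blob_mark, path.encode("utf-8")))
    out.write(b"\n")


def build_stream(revisions,
                 committer="Example Importer <importer@example.invalid>"):
    """Turn (path, content, message, when) tuples into one import stream."""
    out = io.BytesIO()
    for mark, (path, content, message, when) in enumerate(revisions, start=1):
        write_blob(out, mark, content)
        write_commit(out, "refs/heads/master", committer, when, message,
                     [(path, mark)])
    return out.getvalue()
```

Feeding the result to a single git fast-import process inside an
initialized repository, for example with
subprocess.run(["git", "fast-import"], input=stream, check=True),
would create all of the blobs and commits from that one stream.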

The speed of fork in Linux is fine for most purposes, but it is not
fine if you are going to fork off good sized apps several million
times. When I measured those forks in oprofile, 60% of the CPU was
being consumed by the kernel.

-- 
Jon Smirl

* Re: Mozilla, git and Windows
  2006-11-28  1:35       ` Jon Smirl
@ 2006-11-28 12:17         ` Michael Haggerty
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Haggerty @ 2006-11-28 12:17 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Petr Baudis, Andy Whitcroft, Git Mailing List, dev

Jon Smirl wrote:
> As was mentioned in the thread about doing CVS to git import, the
> trick is to write your own CVS file parser, parse the file once (not
> once for each revision) and output all of the revisions to the git
> database in a single pass. When code is structured that way I can
> import the whole Mozilla repository into git in two hours. The
> fast-import back end also works without forking: it just listens for
> commands on stdin and acts on them, and all of the commands are
> implemented in a single binary.

Using cvs2svn, it is now possible to avoid having to invoke CVS/RCS
zillions of times.  Here is a brief description of how the new hooks work.

There is an interface called RevisionReader that is used to retrieve the
contents of a file.  The RevisionReader that should be used for a run of
cvs2svn can be set using the --options file method with a line like:

ctx.revision_reader = MyRevisionReader()

The RevisionReader interface includes a method get_revision_recorder(),
which should return an instance of RevisionRecorder.  The
RevisionRecorder has callback methods that are invoked as the CVS files
are parsed.  For example, RevisionRecorder.record_text() is passed the
log message and text (full text or delta) for each file revision.  The
record_text() method is allowed to return an arbitrary token (for
example, a content hash), and that token is stored into
CVSRevision.revision_recorder_token and carried along by cvs2svn.

The concrete RevisionReaders included with cvs2svn are RCSRevisionReader
and CVSRevisionReader, which have do-nothing RevisionRecorders and which
call rcs or cvs in OutputPass to get the file contents.  (This repeated
invocation of rcs/cvs is the most expensive part of the conversion.)

So what you would do to speed things up is write your own
RevisionRecorder, which constructs the file fulltext from the CVS deltas
and stores the contents in a git store, returning the file revision's
content hash as token.

Then write a RevisionReader that returns an instance of your
RevisionRecorder to be used in the CollectRevsPass of the conversion.
For OutputPass, the RevisionReader has to implement the method
get_content_stream(), which is passed a CVSRevision instance and has to
return a stream object that produces the file revision's contents.  In
your case, you wouldn't need the contents at all, but could just work
with CVSRevision.revision_recorder_token, which contains the hash that
was generated by your RevisionRecorder.
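
A minimal sketch of that arrangement follows. It does not import cvs2svn:
the two classes are stand-ins, their method signatures are assumptions (the
real interfaces may take more arguments), FakeCVSRevision and the in-memory
dict used as a "git store" are hypothetical, and rebuilding the fulltext
from the RCS deltas is left out. The only git-specific fact relied on is
that a blob's name is the SHA-1 of "blob <size>\0" followed by the content.

```python
import hashlib
import io


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object name git assigns to a blob with this content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


class HashingRevisionRecorder:
    """Stand-in for a cvs2svn RevisionRecorder (assumed signature)."""

    def __init__(self, store):
        self.store = store  # dict standing in for the git object store

    def record_text(self, log, fulltext):
        # Assumes the fulltext was already rebuilt from the RCS deltas.
        token = git_blob_id(fulltext)
        self.store[token] = fulltext
        return token  # cvs2svn would carry this as revision_recorder_token


class TokenRevisionReader:
    """Stand-in for a cvs2svn RevisionReader (assumed signature)."""

    def __init__(self):
        self.store = {}

    def get_revision_recorder(self):
        # Used while the CVS files are parsed (CollectRevsPass).
        return HashingRevisionRecorder(self.store)

    def get_content_stream(self, cvs_rev):
        # Used in OutputPass: look the contents up by token instead of
        # invoking rcs or cvs again.
        return io.BytesIO(self.store[cvs_rev.revision_recorder_token])


class FakeCVSRevision:
    """Minimal stand-in carrying only the token attribute used here."""

    def __init__(self, token):
        self.revision_recorder_token = token
```

A converter that writes git objects directly would not need the content
stream at all; it could build trees and commits from the tokens alone.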

How you actually cook these tokens together into a git repository is up
to you :-)

Michael


Thread overview: 8+ messages
2006-11-27 15:28 Mozilla, git and Windows Jon Smirl
2006-11-27 15:34 ` Andy Whitcroft
2006-11-27 16:13   ` Jon Smirl
2006-11-27 16:37     ` Robin Rosenberg
2006-11-27 22:13     ` Petr Baudis
2006-11-28  1:35       ` Jon Smirl
2006-11-28 12:17         ` Michael Haggerty
2006-11-28  0:30 ` Sam Vilain