From: "Shawn O. Pearce" <spearce@spearce.org>
To: Scott Lamb <slamb@slamb.org>
Cc: Simon Hausmann <simon@lst.de>, Junio C Hamano <junkio@cox.net>,
	git@vger.kernel.org
Subject: Re: git-p4import.py robustness changes
Date: Mon, 4 Jun 2007 01:54:33 -0400
Message-ID: <20070604055433.GD4507@spearce.org>
In-Reply-To: <839AEF71-ED29-4A79-BE97-C79EAFEDC466@slamb.org>

Scott Lamb <slamb@slamb.org> wrote:
> On Jun 3, 2007, at 6:11 AM, Simon Hausmann wrote:
> >On the topic of git integration with perforce, what are the chances of
> >getting git-p4 ( http://repo.or.cz/w/fast-export.git ) into git's
> >contrib/fast-export area? :)
> 
> I missed that one...I just saw Tailor and the Perl script someone  
> else had written.

Perhaps that's why it should be in contrib/fast-import?  ;-)
 
> As for performance...hmm. Looks like git-p4import.py runs these  
> commands for each Perforce revision:
> 
>     realtime  operation
>         3.4%  p4 describe -s N
>        66.6%  p4 sync ...@N
>    [*] 10.2%  git ls-files -m -d -o -z | git update-index --add --remove -z --stdin
>         2.6%  git rev-parse --verify HEAD
>         4.2%  git write-tree
>         2.8%  git commit-tree xxxxxx
>         7.5%  git tag -f p4/N xxxxxx
>         2.7%  git update-ref HEAD xxxxxx
...
> git-p4 seems to use "git fast-import". I guess the big performance  
> improvement there is removing the ls-files operation? So we're  
> talking about a 0-10% speedup, right? Plus some fork()/exec() overhead.

fast-import folds all of the git commands you list above behind
a single engine that is *fast*.  So it's actually a 0-30% gain
that is available by using the fast-import backend, with a single
fork()/exec() for the *entire import*.  The local object IO performed
by Git is also minimized, so large imports have much better IO
behavior from the Git perspective.  It's not something to sneeze at.
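
For reference, the 30% upper bound is just the sum of the git rows
in the quoted table, i.e. what you would save if fast-import made
every one of those steps free.  A quick check, with the percentages
copied from the rows shown above (the variable names are only for
illustration):

```python
# Share of wall-clock time per git step, copied from the quoted table.
git_steps = {
    "git ls-files | git update-index": 10.2,
    "git rev-parse --verify HEAD": 2.6,
    "git write-tree": 4.2,
    "git commit-tree": 2.8,
    "git tag -f": 7.5,
    "git update-ref HEAD": 2.7,
}

# Upper bound on the saving if every git step cost nothing.
git_total = round(sum(git_steps.values()), 1)
print(f"git commands: {git_total}% of realtime")
```

The remaining time in that table is p4 describe and p4 sync, which
no Git-side backend can touch.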

fast-import can also run in parallel with the frontend process,
letting you use both cores of a dual-core system, to the extent
that your disk(s) and network can keep up.  Generally p4 is going
to be the bottleneck.
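
A rough sketch of that arrangement, assuming the frontend is a
Python script that writes to the backend over a pipe.  The function
name is hypothetical, and the consumer here is a stand-in so the
sketch runs without a repository; a real frontend would pass
["git", "fast-import"] from inside the target repository:

```python
import subprocess
import sys


def stream_to_backend(chunks, backend_cmd):
    """Feed byte chunks to a child process as they are produced.

    The child reads its stdin while this process is still generating
    data, so the two can run on separate cores.
    """
    proc = subprocess.Popen(backend_cmd, stdin=subprocess.PIPE)
    written = 0
    try:
        for chunk in chunks:
            proc.stdin.write(chunk)
            written += len(chunk)
    finally:
        # Closing stdin signals end of input to the child.
        proc.stdin.close()
    return proc.wait(), written


# Stand-in consumer (assumption): reads and discards everything.
consumer = [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"]
status, written = stream_to_backend((b"x" * 1024 for _ in range(4)), consumer)
```

The pipe buffer is what couples the two processes: the frontend
only blocks when the backend falls behind.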

I think writing data to fast-import is much easier than running
the raw Git commands, especially when you are talking about an
import engine where you need to set all of the special environment
variables for git-commit-tree or git-tag to do its job properly.
It's a good tool that simply doesn't get enough use, partly because
nobody is using it...
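
To give a feel for what "writing data to fast-import" means, here
is a minimal sketch that builds one commit in the stream format,
with the author identity and date carried in the stream itself
rather than in variables such as GIT_AUTHOR_NAME or GIT_AUTHOR_DATE.
The helper name, ref, identity and file contents are made up for
illustration, and it assumes simple paths (no newlines or leading
quotes) and regular non-executable files:

```python
def commit_stream(ref, name, email, when, message, files):
    """Build one fast-import commit with inline file data.

    `when` is "<unix seconds> <tz offset>"; `files` maps path -> bytes.
    """
    ident = f"{name} <{email}> {when}\n".encode()
    msg = message.encode()
    out = [
        f"commit {ref}\n".encode(),
        b"author " + ident,
        b"committer " + ident,
        b"data %d\n" % len(msg), msg, b"\n",
    ]
    for path, body in sorted(files.items()):
        # "inline" means the file content follows as a data command.
        out.append(f"M 100644 inline {path}\n".encode())
        out.append(b"data %d\n" % len(body))
        out.append(body)
        out.append(b"\n")
    out.append(b"\n")  # blank line ends the commit
    return b"".join(out)


stream = commit_stream(
    "refs/heads/p4", "A U Thor", "author@example.com",
    "1180936473 -0400", "p4 change 1234\n", {"README": b"hello\n"},
)
```

The resulting bytes would be written to the stdin of a
git fast-import process; one such block per Perforce change replaces
the update-index / write-tree / commit-tree / update-ref sequence.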

-- 
Shawn.
