git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* git-p4import.py robustness changes
@ 2007-05-31 16:47 Scott Lamb
  2007-05-31 23:53 ` Junio C Hamano
  2007-06-03  3:58 ` [PATCH 1/4] git-p4import: fix subcommand error handling Scott Lamb
  0 siblings, 2 replies; 26+ messages in thread
From: Scott Lamb @ 2007-05-31 16:47 UTC (permalink / raw)
  To: git

I'm trying out git-p4import.py (and git itself) for the first time.  
I'm frustrated with its error behavior. For example, it's saying this:

     $ git-p4import.py //my/path/... master
     Setting perforce to  //my/path/...
     Already up to date...

when it should be saying this:

     $ git-p4import.py //my/path/... master
     Setting perforce to  //my/path/...
     git-p4import fatal error: p4 changes //my/path/...@1,#head:  
Request too large (over 150000); see 'p4 help maxresults'.

There's a logfile option, but that's a poor excuse for no error  
handling. I'd like to fix it. A couple questions, though:


First, is it acceptable to switch from os.popen to the subprocess  
module? I ask because the latter was only introduced with Python 2.4  
on. The subprocess module does work with earlier versions of Python  
(definitely 2.3) and is GPL-compatible, so maybe it could be thrown  
into the distribution if desired.

I could make do with popen2.Popen3, but subprocess is actually  
pleasant to use:

         git = subprocess.Popen(cmdlist,
                                stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
         stdout, stderr = git.communicate(stdin)
         if git.wait() != 0:
             raise GitException("'git %s' failed: %s" % (cmd, stderr))

vs. the popen2 way, which is longer and uglier. It'd probably involve  
tempfiles rather than reimplementing subprocess.Popen.communicate().


Second, this crowd seems to want sequences of tiny patches. How does  
this sound?

* patch 1 - use subprocess to make git_command.git() and p4_command.p4 
() throw properly-typed exceptions on error, fix caller exception  
handling to match.

* patch 2 - remove the use of the shell and pipelines (fix some  
escaping problems).

* patch 3 - use lists instead of space separation for the commandline  
arguments (fix more escaping problems).

* patch 4 - allow grabbing partial history (make my error go away).


Cheers,
Scott

-- 
Scott Lamb <http://www.slamb.org/>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-05-31 16:47 git-p4import.py robustness changes Scott Lamb
@ 2007-05-31 23:53 ` Junio C Hamano
  2007-06-02 20:41   ` Scott Lamb
  2007-06-03  3:58 ` [PATCH 1/4] git-p4import: fix subcommand error handling Scott Lamb
  1 sibling, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2007-05-31 23:53 UTC (permalink / raw)
  To: Scott Lamb; +Cc: git

Scott Lamb <slamb@slamb.org> writes:

> There's a logfile option, but that's a poor excuse for no error
> handling. I'd like to fix it. A couple questions, though:

Good to have somebody who has access to p4.

> First, is it acceptable to switch from os.popen to the subprocess
> module? I ask because the latter was only introduced with Python 2.4
> on. The subprocess module does work with earlier versions of Python
> (definitely 2.3) and is GPL-compatible, so maybe it could be thrown
> into the distribution if desired.

We actually did ship with our own copy after clearing the
licensing situation with Python people, although we removed it
when it lost the last script that used it.  I do not think
resurrecting it is a problem.

> Second, this crowd seems to want sequences of tiny patches. How does
> this sound?
>
> * patch 1 - use subprocess to make git_command.git() and p4_command.p4
> () throw properly-typed exceptions on error, fix caller exception
> handling to match.
>
> * patch 2 - remove the use of the shell and pipelines (fix some
> escaping problems).
>
> * patch 3 - use lists instead of space separation for the commandline
> arguments (fix more escaping problems).
>
> * patch 4 - allow grabbing partial history (make my error go away).

Actually, my preference is to have a "patch 0" before all of the
above, that demotes git-p4import to contrib/ hierarchy.  Having
no access to p4 managed repositories (nor much inclination to
get one), I can never test nor maintain it myself, so it is just
crazy for me to be the maintainer for it.

But I do read Python and speak it passably -- the above 4 step
outline sounds sane to me.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-05-31 23:53 ` Junio C Hamano
@ 2007-06-02 20:41   ` Scott Lamb
  2007-06-02 21:33     ` Junio C Hamano
  0 siblings, 1 reply; 26+ messages in thread
From: Scott Lamb @ 2007-06-02 20:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


On May 31, 2007, at 4:53 PM, Junio C Hamano wrote:

> Actually, my preference is to have a "patch 0" before all of the
> above, that demotes git-p4import to contrib/ hierarchy.  Having
> no access to p4 managed repositories (nor much inclination to
> get one), I can never test nor maintain it myself, so it is just
> crazy for me to be the maintainer for it.

Will do. What does that mean for Documentation/git-p4import.txt and  
the git-p4 rpm (defined in git.spec.in)? Should I move them with it?  
(Seems nothing else in the main tree references contrib.) If so,  
maybe I should set up a common "Documentation/asciidoc.mak" or  
something for building the man/html pages rather than duplicating all  
that Makefile logic.

-- 
Scott Lamb <http://www.slamb.org/>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-02 20:41   ` Scott Lamb
@ 2007-06-02 21:33     ` Junio C Hamano
  2007-06-02 23:21       ` Scott Lamb
  2007-06-03 13:11       ` Simon Hausmann
  0 siblings, 2 replies; 26+ messages in thread
From: Junio C Hamano @ 2007-06-02 21:33 UTC (permalink / raw)
  To: Scott Lamb; +Cc: git

Scott Lamb <slamb@slamb.org> writes:

> On May 31, 2007, at 4:53 PM, Junio C Hamano wrote:
>
>> Actually, my preference is to have a "patch 0" before all of the
>> above, that demotes git-p4import to contrib/ hierarchy.  Having
>> no access to p4 managed repositories (nor much inclination to
>> get one), I can never test nor maintain it myself, so it is just
>> crazy for me to be the maintainer for it.
>
> Will do. What does that mean for Documentation/git-p4import.txt and
> the git-p4 rpm (defined in git.spec.in)? Should I move them with it?
> (Seems nothing else in the main tree references contrib.) If so,
> maybe I should set up a common "Documentation/asciidoc.mak" or
> something for building the man/html pages rather than duplicating all
> that Makefile logic.

A much more preferable alternative is for you to say "Hey, don't
say you want to demote it.  I'll keep it maintained, I regularly
use p4 and have a strong incentive to keep it working".  Then we
do not have to do the "patch 0" ;-)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-02 21:33     ` Junio C Hamano
@ 2007-06-02 23:21       ` Scott Lamb
  2007-06-02 23:52         ` Junio C Hamano
  2007-06-03 13:11       ` Simon Hausmann
  1 sibling, 1 reply; 26+ messages in thread
From: Scott Lamb @ 2007-06-02 23:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


On Jun 2, 2007, at 2:33 PM, Junio C Hamano wrote:

> A much more preferable alternative is for you to say "Hey, don't
> say you want to demote it.  I'll keep it maintained, I regularly
> use p4 and have a strong incentive to keep it working".  Then we
> do not have to do the "patch 0" ;-)

Hmm. I'd like to say that, but keep in mind that I'd never even used  
git before Wednesday, and I'm not sure yet how well git-p4import.py  
will work out for me.

It'd be a huge leap from git-p4import.py to something that could  
remove my need to use p4 commands daily. First, I'd need something  
that could follow all upstream branches with merge history. Then I'd  
either need to convince my team to ditch p4 entirely (not easy, and  
then I wouldn't use/maintain git-p4import.py afterward anyway) or a  
way to robustly generate "p4 integrate", "p4 resolve", "p4 submit"  
command sequences to merge between upstream branches.

We're attempting to address more modest needs, like those of our off- 
site contractors who should only be sending patches anyway. (Another  
guy wrote a script that pulls changes into an svn mirror basically by  
"svn ci -m 'changed some stuff'" every half hour, but I made fun of  
it and now have to replace it. ;)

I'll at least finish up the other patches first and see how it goes.

-- 
Scott Lamb <http://www.slamb.org/>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-02 23:21       ` Scott Lamb
@ 2007-06-02 23:52         ` Junio C Hamano
  0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2007-06-02 23:52 UTC (permalink / raw)
  To: Scott Lamb; +Cc: git

Scott Lamb <slamb@slamb.org> writes:

> On Jun 2, 2007, at 2:33 PM, Junio C Hamano wrote:
>
>> A much more preferable alternative is for you to say "Hey, don't
>> say you want to demote it.  I'll keep it maintained, I regularly
>> use p4 and have a strong incentive to keep it working".  Then we
>> do not have to do the "patch 0" ;-)
>
> Hmm. I'd like to say that, but keep in mind that I'd never even used
> git before Wednesday, and I'm not sure yet how well git-p4import.py
> will work out for me.

Oh, you do not have to worry so much.  There certainly are
people who are familiar enough with git on this list to help
improving p4import from the git side.

The thing to me personally is that I am p4 illiterate.  However,
as you might be aware, a few people posted their own version of
"better than p4import" scripts to the list in the past few
months, so there should also be enough people with p4 expertise
and motivation to help you with it.

It _might_ turn out that one of their scripts might be better
than the p4import we have in-tree and we would end up replacing
it with it.  I won't be able to judge that myself, but if people
who need to interoperate with p4 on the list can join forces it
would be good.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/4] git-p4import: fix subcommand error handling
  2007-05-31 16:47 git-p4import.py robustness changes Scott Lamb
  2007-05-31 23:53 ` Junio C Hamano
@ 2007-06-03  3:58 ` Scott Lamb
  2007-06-03  3:58   ` [PATCH 2/4] git-p4import: use lists of subcommand arguments Scott Lamb
  1 sibling, 1 reply; 26+ messages in thread
From: Scott Lamb @ 2007-06-03  3:58 UTC (permalink / raw)
  To: git, Junio C Hamano; +Cc: Scott Lamb

Use the Python "subcommand" module to properly handle the subcommand
pipeline and raise exceptions on error.

Signed-off-by: Scott Lamb <slamb@slamb.org>
---
I folded the elimination of the shell pipelines into this patch - the error
handling couldn't work otherwise.

The difference between exceptions paths with "die" and ones with an exception
thrown to the top is somewhat arbitrary; I tried to use it to distinguish
between user error and bugs.

 git-p4import.py |  166 +++++++++++++++++++++++++++++++++----------------------
 1 files changed, 99 insertions(+), 67 deletions(-)

diff --git a/git-p4import.py b/git-p4import.py
index 60a758b..002f8d8 100644
--- a/git-p4import.py
+++ b/git-p4import.py
@@ -13,6 +13,9 @@ import os
 import sys
 import time
 import getopt
+import subprocess
+import re
+import errno
 
 from signal import signal, \
    SIGPIPE, SIGINT, SIG_DFL, \
@@ -34,7 +37,7 @@ def usage():
     sys.exit(1)
 
 verbosity = 1
-logfile = "/dev/null"
+logfile = file("/dev/null", "a")
 ignore_warnings = False
 stitch = 0
 tagall = True
@@ -44,59 +47,70 @@ def report(level, msg, *args):
     global logfile
     for a in args:
         msg = "%s %s" % (msg, a)
-    fd = open(logfile, "a")
-    fd.writelines(msg)
-    fd.close()
+    logfile.writelines(msg)
     if level <= verbosity:
         print msg
 
+class P4Exception(Exception):
+    def __init__(self, cmd, errmsg):
+        Exception.__init__(self, '%r: %s' % (cmd, errmsg))
+        self.cmd = cmd
+        self.errmsg = errmsg
+
 class p4_command:
     def __init__(self, _repopath):
-        try:
-            global logfile
-            self.userlist = {}
-            if _repopath[-1] == '/':
-                self.repopath = _repopath[:-1]
-            else:
-                self.repopath = _repopath
-            if self.repopath[-4:] != "/...":
-                self.repopath= "%s/..." % self.repopath
-            f=os.popen('p4 -V 2>>%s'%logfile, 'rb')
-            a = f.readlines()
-            if f.close():
-                raise
-        except:
-                die("Could not find the \"p4\" command")
+        self.userlist = {}
+        if _repopath[-1] == '/':
+            self.repopath = _repopath[:-1]
+        else:
+            self.repopath = _repopath
+        if self.repopath[-4:] != "/...":
+            self.repopath= "%s/..." % self.repopath
+        p4 = subprocess.Popen(['p4', '-V'],
+                              stdout=file('/dev/null', 'a'),
+                              stderr=subprocess.PIPE)
+        err = p4.stderr.read()
+        if p4.wait() != 0:
+            die("Could not run the \"p4\" command: %r" % (err,))
 
     def p4(self, cmd, *args):
         global logfile
         cmd = "%s %s" % (cmd, ' '.join(args))
+        cmdlist = ['p4', '-G'] + cmd.split(' ')
         report(2, "P4:", cmd)
-        f=os.popen('p4 -G %s 2>>%s' % (cmd,logfile), 'rb')
+        p4 = subprocess.Popen(cmdlist,
+                              stdout=subprocess.PIPE,
+                              stderr=logfile)
         list = []
         while 1:
            try:
-                list.append(marshal.load(f))
+                elem = marshal.load(p4.stdout)
            except EOFError:
                 break
-        self.ret = f.close()
+           if elem['code'] == 'error':
+                raise P4Exception(cmd, elem['data'])
+           list.append(elem)
+        if p4.wait() != 0:
+            raise Exception("'p4 %s' failed" % (cmd,))
         return list
 
     def sync(self, id, force=False, trick=False, test=False):
-        if force:
-            ret = self.p4("sync -f %s@%s"%(self.repopath, id))[0]
-        elif trick:
-            ret = self.p4("sync -k %s@%s"%(self.repopath, id))[0]
-        elif test:
-            ret = self.p4("sync -n %s@%s"%(self.repopath, id))[0]
-        else:
-            ret = self.p4("sync    %s@%s"%(self.repopath, id))[0]
-        if ret['code'] == "error":
-             data = ret['data'].upper()
-             if data.find('VIEW') > 0:
-                 die("Perforce reports %s is not in client view"% self.repopath)
-             elif data.find('UP-TO-DATE') < 0:
-                 die("Could not sync files from perforce", self.repopath)
+        try:
+            if force:
+                self.p4("sync -f %s@%s"%(self.repopath, id))
+            elif trick:
+                self.p4("sync -k %s@%s"%(self.repopath, id))
+            elif test:
+                self.p4("sync -n %s@%s"%(self.repopath, id))
+            else:
+                self.p4("sync    %s@%s"%(self.repopath, id))
+        except P4Exception, e:
+            data = e.errmsg.upper()
+            if data.find('VIEW') > 0:
+                die("Perforce reports %s is not in client view: %s"
+                    % (self.repopath, e))
+            elif data.find('UP-TO-DATE') < 0:
+                die(e)
 
     def changes(self, since=0):
         try:
@@ -105,8 +119,8 @@ class p4_command:
                 list.append(rec['change'])
             list.reverse()
             return list
-        except:
-            return []
+        except P4Exception, e:
+            die(e)
 
     def authors(self, filename):
         f=open(filename)
@@ -122,7 +136,8 @@ class p4_command:
             try:
                 user = self.p4("users", id)[0]
                 self.userlist[id] = (user['FullName'], user['Email'])
-            except:
+            except P4Exception, e:
+                report(2, "P4: missing user %s" % (id,))
                 self.userlist[id] = (id, "")
         return self.userlist[id]
 
@@ -143,7 +158,7 @@ class p4_command:
     def where(self):
         try:
             return self.p4("where %s" % self.repopath)[-1]['path']
-        except:
+        except P4Exception, e:
             return ""
 
     def describe(self, num):
@@ -153,16 +168,18 @@ class p4_command:
         self.date = self._format_date(time.localtime(long(desc['time'])))
         return self
 
+class GitException(Exception): pass
+
 class git_command:
     def __init__(self):
         try:
             self.version = self.git("--version")[0][12:].rstrip()
-        except:
-            die("Could not find the \"git\" command")
+        except GitException, e:
+            die(e)
         try:
             self.gitdir = self.get_single("rev-parse --git-dir")
             report(2, "gdir:", self.gitdir)
-        except:
+        except GitException, e:
             die("Not a git repository... did you forget to \"git init\" ?")
         try:
             self.cdup = self.get_single("rev-parse --show-cdup")
@@ -170,38 +187,47 @@ class git_command:
                 os.chdir(self.cdup)
             self.topdir = os.getcwd()
             report(2, "topdir:", self.topdir)
-        except:
+        except GitException, e:
             die("Could not find top git directory")
 
-    def git(self, cmd):
-        global logfile
+    def git(self, cmd, stdin=None):
         report(2, "GIT:", cmd)
-        f=os.popen('git %s 2>>%s' % (cmd,logfile), 'rb')
-        r=f.readlines()
-        self.ret = f.close()
-        return r
+        cmdlist = ['git'] + cmd.split(' ')
+        git = subprocess.Popen(cmdlist,
+                               stdin=subprocess.PIPE,
+                               stdout=subprocess.PIPE,
+                               stderr=subprocess.PIPE)
+        stdout, stderr = git.communicate(stdin)
+        if git.wait() != 0:
+            raise GitException("'git %s' failed: %s" % (cmd, stderr))
+        if stderr != '':
+            report(2, stderr)
+        return re.findall(r'.*\n', stdout)
 
     def get_single(self, cmd):
-        return self.git(cmd)[0].rstrip()
+        list = self.git(cmd)
+        if len(list) != 1:
+            raise GitException("%r returned %r" % (cmd, list))
+        return list[0].rstrip()
 
     def current_branch(self):
         try:
             testit = self.git("rev-parse --verify HEAD")[0]
             return self.git("symbolic-ref HEAD")[0][11:].rstrip()
-        except:
+        except GitException, e:
             return None
 
     def get_config(self, variable):
         try:
             return self.git("config --get %s" % variable)[0].rstrip()
-        except:
+        except GitException, e:
             return None
 
     def set_config(self, variable, value):
         try:
             self.git("config %s %s"%(variable, value) )
-        except:
-            die("Could not set %s to " % variable, value)
+        except GitException, e:
+            die(e)
 
     def make_tag(self, name, head):
         self.git("tag -f %s %s"%(name,head))
@@ -217,7 +243,8 @@ class git_command:
             return 0
 
     def update_index(self):
-        self.git("ls-files -m -d -o -z | git update-index --add --remove -z --stdin")
+        files = self.git("ls-files -m -d -o -z")
+        self.git("update-index --add --remove -z --stdin", stdin=files)
 
     def checkout(self, branch):
         self.git("checkout %s" % branch)
@@ -226,15 +253,21 @@ class git_command:
         self.git("symbolic-ref HEAD refs/heads/%s" % branch)
 
     def remove_files(self):
-        self.git("ls-files | xargs rm")
+        files = self.git("ls-files")
+        for file in files:
+            os.unlink(file)
 
     def clean_directories(self):
         self.git("clean -d")
 
     def fresh_branch(self, branch):
         report(1, "Creating new branch", branch)
-        self.git("ls-files | xargs rm")
-        os.remove(".git/index")
+        self.remove_files()
+        try:
+            os.remove(".git/index")
+        except OSError, e:
+            if e.errno != errno.ENOENT:
+                raise
         self.repoint_head(branch)
         self.git("clean -d")
 
@@ -243,21 +276,17 @@ class git_command:
 
     def commit(self, author, email, date, msg, id):
         self.update_index()
-        fd=open(".msg", "w")
-        fd.writelines(msg)
-        fd.close()
         try:
                 current = self.get_single("rev-parse --verify HEAD")
                 head = "-p HEAD"
-        except:
+        except GitException, e:
                 current = ""
                 head = ""
         tree = self.get_single("write-tree")
         for r,l in [('DATE',date),('NAME',author),('EMAIL',email)]:
             os.environ['GIT_AUTHOR_%s'%r] = l
             os.environ['GIT_COMMITTER_%s'%r] = l
-        commit = self.get_single("commit-tree %s %s < .msg" % (tree,head))
-        os.remove(".msg")
+        commit = self.get_single("commit-tree %s %s" % (tree,head), stdin=msg)
         self.make_tag("p4/%s"%id, commit)
         self.git("update-ref HEAD %s %s" % (commit, current) )
 
@@ -273,7 +302,7 @@ for o, a in opts:
     if o == "-v":
         verbosity += 1
     if o in ("--log"):
-        logfile = a
+        logfile = file(a, "a")
     if o in ("--notags"):
         tagall = False
     if o in ("-h", "--help"):
@@ -293,7 +322,10 @@ for o, a in opts:
 
 if len(args) == 2:
     branch = args[1]
-    git.checkout(branch)
+    try:
+        git.checkout(branch)
+    except GitException, e:
+        pass
     if branch == git.current_branch():
         die("Branch %s already exists!" % branch)
     report(1, "Setting perforce to ", args[0])
-- 
1.5.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/4] git-p4import: use lists of subcommand arguments
  2007-06-03  3:58 ` [PATCH 1/4] git-p4import: fix subcommand error handling Scott Lamb
@ 2007-06-03  3:58   ` Scott Lamb
  2007-06-03  3:58     ` [PATCH 3/4] git-p4import: resume on correct p4 changeset Scott Lamb
  0 siblings, 1 reply; 26+ messages in thread
From: Scott Lamb @ 2007-06-03  3:58 UTC (permalink / raw)
  To: git, Junio C Hamano; +Cc: Scott Lamb

This fixes problems with spaces in filenames.

Signed-off-by: Scott Lamb <slamb@slamb.org>
---
 git-p4import.py |   84 +++++++++++++++++++++++++++++-------------------------
 1 files changed, 45 insertions(+), 39 deletions(-)

diff --git a/git-p4import.py b/git-p4import.py
index 002f8d8..54e5e9e 100644
--- a/git-p4import.py
+++ b/git-p4import.py
@@ -73,11 +73,12 @@ class p4_command:
         if p4.wait() != 0:
             die("Could not run the \"p4\" command: %r" % (err,))
 
-    def p4(self, cmd, *args):
+    def p4(self, args):
         global logfile
-        cmd = "%s %s" % (cmd, ' '.join(args))
-        cmdlist = ['p4', '-G'] + cmd.split(' ')
+        cmd = ' '.join(args)
         report(2, "P4:", cmd)
+        cmdlist = ['p4', '-G']
+        cmdlist.extend(args)
         p4 = subprocess.Popen(cmdlist,
                               stdout=subprocess.PIPE,
                               stderr=logfile)
@@ -97,13 +98,14 @@ class p4_command:
     def sync(self, id, force=False, trick=False, test=False):
         try:
             if force:
-                self.p4("sync -f %s@%s"%(self.repopath, id))
+                extra = ["-f"]
             elif trick:
-                self.p4("sync -k %s@%s"%(self.repopath, id))
+                extra = ["-k"]
             elif test:
-                self.p4("sync -n %s@%s"%(self.repopath, id))
+                extra = ["-n"]
             else:
-                self.p4("sync    %s@%s"%(self.repopath, id))
+                extra = []
+            self.p4(["sync"] + extra + ["%s@%s"%(self.repopath, id)])
         except P4Exception, e:
             data = e.errmsg.upper()
             if data.find('VIEW') > 0:
@@ -115,7 +117,8 @@ class p4_command:
     def changes(self, since=0):
         try:
             list = []
-            for rec in self.p4("changes %s@%s,#head" % (self.repopath, since+1)):
+            for rec in self.p4(["changes", "%s@%s,#head"
+                               % (self.repopath, since+1)]):
                 list.append(rec['change'])
             list.reverse()
             return list
@@ -134,7 +137,7 @@ class p4_command:
     def _get_user(self, id):
         if not self.userlist.has_key(id):
             try:
-                user = self.p4("users", id)[0]
+                user = self.p4(["users", id])[0]
                 self.userlist[id] = (user['FullName'], user['Email'])
             except P4Exception, e:
                 report(2, "P4: missing user %s" % (id,))
@@ -157,12 +160,12 @@ class p4_command:
 
     def where(self):
         try:
-            return self.p4("where %s" % self.repopath)[-1]['path']
+            return self.p4(["where", self.repopath])[-1]['path']
         except P4Exception, e:
             return ""
 
     def describe(self, num):
-        desc = self.p4("describe -s", num)[0]
+        desc = self.p4(["describe", "-s", num])[0]
         self.msg = desc['desc']
         self.author, self.email = self._get_user(desc['user'])
         self.date = self._format_date(time.localtime(long(desc['time'])))
@@ -173,16 +176,16 @@ class GitException(Exception): pass
 class git_command:
     def __init__(self):
         try:
-            self.version = self.git("--version")[0][12:].rstrip()
+            self.version = self.git(["--version"])[0][12:].rstrip()
         except GitException, e:
             die(e)
         try:
-            self.gitdir = self.get_single("rev-parse --git-dir")
+            self.gitdir = self.get_single(["rev-parse", "--git-dir"])
             report(2, "gdir:", self.gitdir)
         except GitException, e:
             die("Not a git repository... did you forget to \"git init\" ?")
         try:
-            self.cdup = self.get_single("rev-parse --show-cdup")
+            self.cdup = self.get_single(["rev-parse", "--show-cdup"])
             if self.cdup != "":
                 os.chdir(self.cdup)
             self.topdir = os.getcwd()
@@ -190,9 +193,11 @@ class git_command:
         except GitException, e:
             die("Could not find top git directory")
 
-    def git(self, cmd, stdin=None):
+    def git(self, args, stdin=None):
+        cmd = ' '.join(args)
         report(2, "GIT:", cmd)
-        cmdlist = ['git'] + cmd.split(' ')
+        cmdlist = ['git']
+        cmdlist.extend(args)
         git = subprocess.Popen(cmdlist,
                                stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE,
@@ -204,37 +209,37 @@ class git_command:
             report(2, stderr)
         return re.findall(r'.*\n', stdout)
 
-    def get_single(self, cmd):
-        list = self.git(cmd)
+    def get_single(self, args, stdin=None):
+        list = self.git(args, stdin=stdin)
         if len(list) != 1:
-            raise GitException("%r returned %r" % (cmd, list))
+            raise GitException("%r returned %r" % (' '.join(args), list))
         return list[0].rstrip()
 
     def current_branch(self):
         try:
-            testit = self.git("rev-parse --verify HEAD")[0]
-            return self.git("symbolic-ref HEAD")[0][11:].rstrip()
+            testit = self.git(["rev-parse", "--verify", "HEAD"])[0]
+            return self.git(["symbolic-ref", "HEAD"])[0][11:].rstrip()
         except GitException, e:
             return None
 
     def get_config(self, variable):
         try:
-            return self.git("config --get %s" % variable)[0].rstrip()
+            return self.git(["config", "--get", variable])[0].rstrip()
         except GitException, e:
             return None
 
     def set_config(self, variable, value):
         try:
-            self.git("config %s %s"%(variable, value) )
+            self.git(["config", variable, value])
         except GitException, e:
             die(e)
 
     def make_tag(self, name, head):
-        self.git("tag -f %s %s"%(name,head))
+        self.git(["tag", "-f", name, head])
 
     def top_change(self, branch):
         try:
-            a=self.get_single("name-rev --tags refs/heads/%s" % branch)
+            a=self.get_single(["name-rev", "--tags", "refs/heads/%s" % branch])
             loc = a.find(' tags/') + 6
             if a[loc:loc+3] != "p4/":
                 raise
@@ -243,22 +248,23 @@ class git_command:
             return 0
 
     def update_index(self):
-        files = self.git("ls-files -m -d -o -z")
-        self.git("update-index --add --remove -z --stdin", stdin=files)
+        files = self.git("ls-files -m -d -o -z".split(" "))
+        self.git("update-index --add --remove -z --stdin".split(" "),
+                 stdin=files)
 
     def checkout(self, branch):
-        self.git("checkout %s" % branch)
+        self.git(["checkout", branch])
 
     def repoint_head(self, branch):
-        self.git("symbolic-ref HEAD refs/heads/%s" % branch)
+        self.git(["symbolic-ref", "HEAD", "refs/heads/%s" % branch])
 
     def remove_files(self):
-        files = self.git("ls-files")
+        files = self.git(["ls-files"])
         for file in files:
             os.unlink(file)
 
     def clean_directories(self):
-        self.git("clean -d")
+        self.git(["clean", "-d"])
 
     def fresh_branch(self, branch):
         report(1, "Creating new branch", branch)
@@ -269,7 +275,7 @@ class git_command:
             if e.errno != errno.ENOENT:
                 raise
         self.repoint_head(branch)
-        self.git("clean -d")
+        self.clean_directories()
 
     def basedir(self):
         return self.topdir
@@ -277,18 +283,18 @@ class git_command:
     def commit(self, author, email, date, msg, id):
         self.update_index()
         try:
-                current = self.get_single("rev-parse --verify HEAD")
-                head = "-p HEAD"
+                current = [self.get_single(["rev-parse", "--verify", "HEAD"])]
+                head = ["-p", "HEAD"]
         except GitException, e:
-                current = ""
-                head = ""
-        tree = self.get_single("write-tree")
+                current = []
+                head = []
+        tree = self.get_single(["write-tree"])
         for r,l in [('DATE',date),('NAME',author),('EMAIL',email)]:
             os.environ['GIT_AUTHOR_%s'%r] = l
             os.environ['GIT_COMMITTER_%s'%r] = l
-        commit = self.get_single("commit-tree %s %s" % (tree,head), stdin=msg)
+        commit = self.get_single(["commit-tree", tree] + head, stdin=msg)
         self.make_tag("p4/%s"%id, commit)
-        self.git("update-ref HEAD %s %s" % (commit, current) )
+        self.git(["update-ref", "HEAD", commit] + current)
 
 try:
     opts, args = getopt.getopt(sys.argv[1:], "qhvt:",
-- 
1.5.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/4] git-p4import: resume on correct p4 changeset
  2007-06-03  3:58   ` [PATCH 2/4] git-p4import: use lists of subcommand arguments Scott Lamb
@ 2007-06-03  3:58     ` Scott Lamb
  2007-06-03  3:58       ` [PATCH 4/4] git-p4import: partial history Scott Lamb
  0 siblings, 1 reply; 26+ messages in thread
From: Scott Lamb @ 2007-06-03  3:58 UTC (permalink / raw)
  To: git, Junio C Hamano; +Cc: Scott Lamb

This had been resuming on change 222 rather than 22283.

top_change's removal of the last two characters must have predated the use
of rstrip() in get_single(). A regexp should be less fragile, or at least
more obvious when it breaks.

Signed-off-by: Scott Lamb <slamb@slamb.org>
---
 git-p4import.py |   13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/git-p4import.py b/git-p4import.py
index 54e5e9e..e7a52b3 100644
--- a/git-p4import.py
+++ b/git-p4import.py
@@ -237,15 +237,16 @@ class git_command:
     def make_tag(self, name, head):
         self.git(["tag", "-f", name, head])
 
+    _tag_re = re.compile(r'tags/p4/(\d+)')
     def top_change(self, branch):
         try:
             a=self.get_single(["name-rev", "--tags", "refs/heads/%s" % branch])
-            loc = a.find(' tags/') + 6
-            if a[loc:loc+3] != "p4/":
-                raise
-            return int(a[loc+3:][:-2])
-        except:
-            return 0
+        except GitException, e:
+            return 0 # fresh repository
+        m = self._tag_re.search(a)
+        if m is None:
+            raise Exception('unable to parse: %r' % (a,))
+        return int(m.group(1))
 
     def update_index(self):
         files = self.git("ls-files -m -d -o -z".split(" "))
-- 
1.5.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/4] git-p4import: partial history
  2007-06-03  3:58     ` [PATCH 3/4] git-p4import: resume on correct p4 changeset Scott Lamb
@ 2007-06-03  3:58       ` Scott Lamb
  0 siblings, 0 replies; 26+ messages in thread
From: Scott Lamb @ 2007-06-03  3:58 UTC (permalink / raw)
  To: git, Junio C Hamano; +Cc: Scott Lamb

Allow importing partial history, which is quicker and may be necessary with
a low Perforce MaxScanRows limit.

Signed-off-by: Scott Lamb <slamb@slamb.org>
---
 Documentation/git-p4import.txt |    6 ++++++
 git-p4import.py                |   14 ++++++++++++--
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-p4import.txt b/Documentation/git-p4import.txt
index 714abbe..bf40b5a 100644
--- a/Documentation/git-p4import.txt
+++ b/Documentation/git-p4import.txt
@@ -10,6 +10,7 @@ SYNOPSIS
 --------
 [verse]
 `git-p4import` [-q|-v] [--notags] [--authors <file>] [-t <timezone>]
+               [--start-with <change>]
                <//p4repo/path> <branch>
 `git-p4import` --stitch <//p4repo/path>
 `git-p4import`
@@ -59,6 +60,11 @@ OPTIONS
 	etc.  You only need to specify this once, it will be saved in
 	the git config file for the repository.
 
+\--start-with::
+	Start the import with the given Perforce change. A partial history can
+	be much faster to generate and is possible even with a low MaxScanRows
+	limit.
+
 <//p4repo/path>::
 	The Perforce path that will be imported into the specified branch.
 
diff --git a/git-p4import.py b/git-p4import.py
index e7a52b3..c7a2033 100644
--- a/git-p4import.py
+++ b/git-p4import.py
@@ -33,7 +33,12 @@ def die(msg, *args):
     sys.exit(1)
 
 def usage():
-    print "USAGE: git-p4import [-q|-v]  [--authors=<file>]  [-t <timezone>]  [//p4repo/path <branch>]"
+    print "usage:"
+    print "  git-p4import [-q|-v] [--notags] [--authors <file>] [-t <timezone>]"
+    print "               [--start-with <change>]"
+    print "               <//p4repo/path> <branch>"
+    print "  git-p4import --stitch <//p4repo/path>"
+    print "  git-p4import"
     sys.exit(1)
 
 verbosity = 1
@@ -41,6 +46,7 @@ logfile = file("/dev/null", "a")
 ignore_warnings = False
 stitch = 0
 tagall = True
+start_with = 0
 
 def report(level, msg, *args):
     global verbosity
@@ -299,7 +305,8 @@ class git_command:
 
 try:
     opts, args = getopt.getopt(sys.argv[1:], "qhvt:",
-            ["authors=","help","stitch=","timezone=","log=","ignore","notags"])
+            ["authors=","help","stitch=","timezone=","log=","ignore","notags",
+             "start-with="])
 except getopt.GetoptError:
     usage()
 
@@ -316,6 +323,8 @@ for o, a in opts:
         usage()
     if o in ("--ignore"):
         ignore_warnings = True
+    if o in ("--start-with"):
+        start_with = int(a)
 
 git = git_command()
 branch=git.current_branch()
@@ -361,6 +370,7 @@ if stitch == 0:
     top = git.top_change(branch)
 else:
     top = 0
+top = max(top, start_with)
 changes = p4.changes(top)
 count = len(changes)
 if count == 0:
-- 
1.5.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-02 21:33     ` Junio C Hamano
  2007-06-02 23:21       ` Scott Lamb
@ 2007-06-03 13:11       ` Simon Hausmann
  2007-06-03 20:12         ` Scott Lamb
  2007-06-04  5:56         ` Shawn O. Pearce
  1 sibling, 2 replies; 26+ messages in thread
From: Simon Hausmann @ 2007-06-03 13:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1489 bytes --]

On Saturday 02 June 2007 23:33:25 Junio C Hamano wrote:
> Scott Lamb <slamb@slamb.org> writes:
> > On May 31, 2007, at 4:53 PM, Junio C Hamano wrote:
> >> Actually, my preference is to have a "patch 0" before all of the
> >> above, that demotes git-p4import to contrib/ hierarchy.  Having
> >> no access to p4 managed repositories (nor much inclination to
> >> get one), I can never test nor maintain it myself, so it is just
> >> crazy for me to be the maintainer for it.
> >
> > Will do. What does that mean for Documentation/git-p4import.txt and
> > the git-p4 rpm (defined in git.spec.in)? Should I move them with it?
> > (Seems nothing else in the main tree references contrib.) If so,
> > maybe I should set up a common "Documentation/asciidoc.mak" or
> > something for building the man/html pages rather than duplicating all
> > that Makefile logic.
>
> A much more preferable alternative is for you to say "Hey, don't
> say you want to demote it.  I'll keep it maintained, I regularly
> use p4 and have a strong incentive to keep it working".  Then we
> do not have to do the "patch 0" ;-)

On the topic of git integration with perforce, what are the chances of getting 
git-p4 ( http://repo.or.cz/w/fast-export.git ) into git's contrib/fast-export 
area? :)

git-p4 can do everything git-p4import can do plus a lot more (it can track 
multiple branches, it's a hell of a lot faster, it can export back to p4 and 
it also works on Windows!).


Simon

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-03 13:11       ` Simon Hausmann
@ 2007-06-03 20:12         ` Scott Lamb
  2007-06-04  5:54           ` Shawn O. Pearce
  2007-06-04  8:41           ` Marius Storm-Olsen
  2007-06-04  5:56         ` Shawn O. Pearce
  1 sibling, 2 replies; 26+ messages in thread
From: Scott Lamb @ 2007-06-03 20:12 UTC (permalink / raw)
  To: Simon Hausmann; +Cc: Junio C Hamano, git


On Jun 3, 2007, at 6:11 AM, Simon Hausmann wrote:

> On the topic of git integration with perforce, what are the chances  
> of getting
> git-p4 ( http://repo.or.cz/w/fast-export.git ) into git's contrib/ 
> fast-export
> area? :)
>
> git-p4 can do everything git-p4import can do plus a lot more (it  
> can track
> multiple branches, it's a hell of a lot faster, it can export back  
> to p4 and
> it also works on Windows!).

I missed that one...I just saw Tailor and the Perl script someone  
else had written.

Ergh. git-p4 imports both "subprocess" and "popen2" and also uses  
"system" and "os.popen". Why use four different modules to launch git  
and p4?

The branch support's interesting. Have you considered tracking  
integration history? I was pondering it and am not sure if it's  
feasible. Perforce doesn't seem to have an efficient way of  
displaying it (just "p4 integrates" that will fetch *all* revisions  
even if you want incremental results and "p4 filelog" which would  
have to be done on each file). Also, I think there's some mismatch  
between the Perforce and git models.

git-p4import.py should work fine on Windows, too - the binary mode on  
the pipe should be all handled by "subprocess", and git-p4's  
data.replace("\r\n", "\n") is not necessary if you use "LineEnd:  
unix" or "share" in the Perforce client specification.

As for performance...hmm. Looks like git-p4import.py runs these  
commands for each Perforce revision:

     realtime  operation
         3.4%  p4 describe -s N
        66.6%  p4 sync ...@N
    [*] 10.2%  git ls-files -m -d -o -z | git update-index --add -- 
remove -z --stdin
         2.6%  git rev-parse --verify HEAD
         4.2%  git write-tree
         2.8%  git commit-tree xxxxxx
         7.5%  git tag -f p4/N xxxxxx
         2.7%  git update-ref HEAD xxxxxx

That's with Perforce running over the network. Are you running locally?

git-p4 seems to use "git fast-import". I guess the big performance  
improvement there is removing the ls-files operation? So we're  
talking about a 0-10% speedup, right? Plus some fork()/exec() overhead.

[*] - Note that I just discovered a big performance regression in my  
patches. Reading the ls-files into Python, through a regexp, and back  
out through update-index was a horrible idea. The times above are  
with that fixed.

-- 
Scott Lamb <http://www.slamb.org/>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-03 20:12         ` Scott Lamb
@ 2007-06-04  5:54           ` Shawn O. Pearce
  2007-06-04  6:09             ` Dana How
  2007-06-04  7:19             ` Scott Lamb
  2007-06-04  8:41           ` Marius Storm-Olsen
  1 sibling, 2 replies; 26+ messages in thread
From: Shawn O. Pearce @ 2007-06-04  5:54 UTC (permalink / raw)
  To: Scott Lamb; +Cc: Simon Hausmann, Junio C Hamano, git

Scott Lamb <slamb@slamb.org> wrote:
> On Jun 3, 2007, at 6:11 AM, Simon Hausmann wrote:
> >On the topic of git integration with perforce, what are the chances  
> >of getting
> >git-p4 ( http://repo.or.cz/w/fast-export.git ) into git's contrib/ 
> >fast-export
> >area? :)
> 
> I missed that one...I just saw Tailor and the Perl script someone  
> else had written.

Perhaps why it should be in contrib/fast-import?  ;-)
 
> As for performance...hmm. Looks like git-p4import.py runs these  
> commands for each Perforce revision:
> 
>     realtime  operation
>         3.4%  p4 describe -s N
>        66.6%  p4 sync ...@N
>    [*] 10.2%  git ls-files -m -d -o -z | git update-index --add -- 
> remove -z --stdin
>         2.6%  git rev-parse --verify HEAD
>         4.2%  git write-tree
>         2.8%  git commit-tree xxxxxx
>         7.5%  git tag -f p4/N xxxxxx
>         2.7%  git update-ref HEAD xxxxxx
...
> git-p4 seems to use "git fast-import". I guess the big performance  
> improvement there is removing the ls-files operation? So we're  
> talking about a 0-10% speedup, right? Plus some fork()/exec() overhead.

fast-import folds all of the git commands you list above behind
a single engine that is *fast*.  So its actually a 0-30% gain
that is available by using the fast-import backend, with a single
fork()/exec() for the *entire import*.  The local object IO performed
by Git is also minimized, so large imports have much better IO
behavior from the Git perspective.  Its not something to sneeze at.

fast-import also can run in parallel with the frontend process,
allowing you to use a dual-core system, to the extent that your
disk(s) and network can keep up.  Generally p4 is going to be
the bottleneck.

I think writing data to fast-import is much easier than running
the raw Git commands, especially when you are talking about an
import engine where you need to set all of the special environment
variables for git-commit-tree or git-tag to do its job properly.
Its a good tool that simply doesn't get enough use, partly because
nobody is using it...

-- 
Shawn.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-03 13:11       ` Simon Hausmann
  2007-06-03 20:12         ` Scott Lamb
@ 2007-06-04  5:56         ` Shawn O. Pearce
  2007-06-12 21:46           ` Simon Hausmann
  1 sibling, 1 reply; 26+ messages in thread
From: Shawn O. Pearce @ 2007-06-04  5:56 UTC (permalink / raw)
  To: Simon Hausmann; +Cc: Junio C Hamano, git

Simon Hausmann <simon@lst.de> wrote:
> On the topic of git integration with perforce, what are the chances of getting 
> git-p4 ( http://repo.or.cz/w/fast-export.git ) into git's contrib/fast-export 
> area? :)
> 
> git-p4 can do everything git-p4import can do plus a lot more (it can track 
> multiple branches, it's a hell of a lot faster, it can export back to p4 and 
> it also works on Windows!).

I was sort of hoping we could fold the fast-export Git repository
on repo.or.cz into core Git at some point.  Right now the only
thing in contrib/fast-export is the import-tars.perl script that
I maintain in my fastimport repository...  ;-)

Like Junio I don't use Perforce, and can't test against it, but
if you can maintain git-p4 (and I think the history on repo.or.cz
shows that you do) then it may be a good idea to add it to core Git.

Send a patch to add it.  Worst that happens is both Junio and I
decide not to apply it.  Or I apply it, but Junio refuses to pull
from me afterwards.  ;-)

-- 
Shawn.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-04  5:54           ` Shawn O. Pearce
@ 2007-06-04  6:09             ` Dana How
  2007-06-04  6:18               ` Shawn O. Pearce
  2007-06-04  7:19             ` Scott Lamb
  1 sibling, 1 reply; 26+ messages in thread
From: Dana How @ 2007-06-04  6:09 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Scott Lamb, Simon Hausmann, Junio C Hamano, git, danahow

On 6/3/07, Shawn O. Pearce <spearce@spearce.org> wrote:
> I think writing data to fast-import is much easier than running
> the raw Git commands, especially when you are talking about an
> import engine where you need to set all of the special environment
> variables for git-commit-tree or git-tag to do its job properly.
> Its a good tool that simply doesn't get enough use, partly because
> nobody is using it...

Well,  perhaps they use it *once*,  in that they write a wrapper script for
it and then forget about it.  At least that's what I did.  And the _only_
annoyance was the trailing NL requirement on the delimited "data" statement,
so you don't get much noise/complaints when people use it.

Thanks,
-- 
Dana L. How  danahow@gmail.com  +1 650 804 5991 cell

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-04  6:09             ` Dana How
@ 2007-06-04  6:18               ` Shawn O. Pearce
  0 siblings, 0 replies; 26+ messages in thread
From: Shawn O. Pearce @ 2007-06-04  6:18 UTC (permalink / raw)
  To: Dana How; +Cc: Scott Lamb, Simon Hausmann, Junio C Hamano, git

Dana How <danahow@gmail.com> wrote:
> On 6/3/07, Shawn O. Pearce <spearce@spearce.org> wrote:
> >I think writing data to fast-import is much easier than running
> >the raw Git commands, especially when you are talking about an
> >import engine where you need to set all of the special environment
> >variables for git-commit-tree or git-tag to do its job properly.
> >Its a good tool that simply doesn't get enough use, partly because
> >nobody is using it...
> 
> Well,  perhaps they use it *once*,  in that they write a wrapper script for
> it and then forget about it.  At least that's what I did.  And the _only_
> annoyance was the trailing NL requirement on the delimited "data" statement,
> so you don't get much noise/complaints when people use it.

True.  I did try to make fast-import take a simple enough format
that you could write throwaway code against it, run it, and never
look back...

The trailing NL after data was because of cvs2svn.  The SVN dump
file format apparently does something like this, and the version
of cvs2svn that Jon Smirl was working on output that trailing NL.
Accepting it in fast-import was easier than fixing cvs2svn to not
create it.

I'll admit the error handling in fast-import could probably
be easier, and that NL after data probably could be optional.
I don't think the input stream parser needs it to understand what
is going on.  Its just sheer laziness on my part that the code
requires it there.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-04  5:54           ` Shawn O. Pearce
  2007-06-04  6:09             ` Dana How
@ 2007-06-04  7:19             ` Scott Lamb
  2007-06-05  7:21               ` Simon Hausmann
  1 sibling, 1 reply; 26+ messages in thread
From: Scott Lamb @ 2007-06-04  7:19 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Simon Hausmann, Junio C Hamano, git


On Jun 3, 2007, at 10:54 PM, Shawn O. Pearce wrote:

> I think writing data to fast-import is much easier than running
> the raw Git commands, especially when you are talking about an
> import engine where you need to set all of the special environment
> variables for git-commit-tree or git-tag to do its job properly.
> Its a good tool that simply doesn't get enough use, partly because
> nobody is using it...

Yeah, I'm sold. I read git-p4 more thoroughly and tried it out...it's  
pretty nice. The P4Sync command has a simpler, more trustworthy flow  
than git-p4import.py.

On the Perforce side, I particularly like the use of "p4 print" to  
grab the files instead of "p4 sync". It avoids playing weird games  
with the client - I think nothing good can come of git-p4import.py's  
"p4 sync -k" and symlinks to map multiple branches into the same  
directory, which is not the Perforce way. Makes me nervous that  
what's submitted to git won't be the same as what's in the Perforce  
depot.

I would have thought launching a "p4 print" on each file would be  
horribly slow with the network latency of each request, but...well,  
apparently not.

Maybe I'll work up git-p4 patches for subcommand error handling, like  
my git-p4import.py ones. And fix some style - seriously, who puts  
semicolons at the end of Python commands? *grumble*

Best regards,
Scott

-- 
Scott Lamb <http://www.slamb.org/>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-03 20:12         ` Scott Lamb
  2007-06-04  5:54           ` Shawn O. Pearce
@ 2007-06-04  8:41           ` Marius Storm-Olsen
  1 sibling, 0 replies; 26+ messages in thread
From: Marius Storm-Olsen @ 2007-06-04  8:41 UTC (permalink / raw)
  To: Scott Lamb; +Cc: Simon Hausmann, Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 811 bytes --]

> git-p4import.py should work fine on Windows, too - the binary mode on  
> the pipe should be all handled by "subprocess", and git-p4's  
> data.replace("\r\n", "\n") is not necessary if you use "LineEnd:  
> unix" or "share" in the Perforce client specification.

The problem is that you cannot set the LineEnd when using the 'p4 
print' command, since it doesn't use the client spec; so Perforce the 
uses the platform default when printing the file.

> git-p4 seems to use "git fast-import". I guess the big performance
> improvement there is removing the ls-files operation? So we're 
> talking about a 0-10% speedup, right? Plus some fork()/exec()
> overhead.

With git-p4 the performance bottleneck is from what we can see the 
Perforce server, on non-Windows machines.

-- 
.marius


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 187 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-04  7:19             ` Scott Lamb
@ 2007-06-05  7:21               ` Simon Hausmann
  0 siblings, 0 replies; 26+ messages in thread
From: Simon Hausmann @ 2007-06-05  7:21 UTC (permalink / raw)
  To: Scott Lamb; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 2977 bytes --]

On Monday 04 June 2007 09:19:56 Scott Lamb wrote:
> On Jun 3, 2007, at 10:54 PM, Shawn O. Pearce wrote:
> > I think writing data to fast-import is much easier than running
> > the raw Git commands, especially when you are talking about an
> > import engine where you need to set all of the special environment
> > variables for git-commit-tree or git-tag to do its job properly.
> > Its a good tool that simply doesn't get enough use, partly because
> > nobody is using it...
>
> Yeah, I'm sold. I read git-p4 more thoroughly and tried it out...it's
> pretty nice. The P4Sync command has a simpler, more trustworthy flow
> than git-p4import.py.
>
> On the Perforce side, I particularly like the use of "p4 print" to
> grab the files instead of "p4 sync". It avoids playing weird games
> with the client - I think nothing good can come of git-p4import.py's
> "p4 sync -k" and symlinks to map multiple branches into the same
> directory, which is not the Perforce way. Makes me nervous that
> what's submitted to git won't be the same as what's in the Perforce
> depot.
>
> I would have thought launching a "p4 print" on each file would be
> horribly slow with the network latency of each request, but...well,
> apparently not.

I've found it to be fast enough for "standard software development". When 
importing big changes like integrations of an entire branch then it naturally 
slows down. The workaround me and my colleague have come up with is to 
combine git-p4 usage with the regular git protocol:

For imports of simple projects from perforce the direct use of git-p4 clone 
and sync/rebase is good enough.

For big projects we have set up a dedicated (recycled old) machine that 
continuously imports from the perforce server. That makes the initial clone 
very fast thanks to the use of the git protocol, it still allows imports from 
perforce afterwards and when the developer syncs the chances are very high 
that the dedicated machine already imported the necessary changes/objects 
from the perforce server and the faster git protocol instead of "p4 print" on 
a lot of files can be used.

In order to avoid that machine constantly polling the p4 server we've come up 
with a neat little trick by adding a change-commit trigger on the p4 server 
that consists of a little perl script that just sends a single udp packet 
with the latest change number as notification to the git machine, which upon 
reception imports then.

That is why git-p4 sync/rebase call "git fetch" by default (configurable 
through config key) if there is an origin remote present.

> Maybe I'll work up git-p4 patches for subcommand error handling, like
> my git-p4import.py ones. And fix some style - seriously, who puts
> semicolons at the end of Python commands? *grumble*

I'd be more than happy to apply style patches. I'm not a very experienced 
python programmer and I admit that I certainly lack the style there :)

Simon

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-04  5:56         ` Shawn O. Pearce
@ 2007-06-12 21:46           ` Simon Hausmann
  2007-06-13 21:06             ` Scott Lamb
  2007-06-14  5:35             ` Shawn O. Pearce
  0 siblings, 2 replies; 26+ messages in thread
From: Simon Hausmann @ 2007-06-12 21:46 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 2032 bytes --]

On Monday 04 June 2007 07:56:00 Shawn O. Pearce wrote:
> Simon Hausmann <simon@lst.de> wrote:
> > On the topic of git integration with perforce, what are the chances of
> > getting git-p4 ( http://repo.or.cz/w/fast-export.git ) into git's
> > contrib/fast-export area? :)
> >
> > git-p4 can do everything git-p4import can do plus a lot more (it can
> > track multiple branches, it's a hell of a lot faster, it can export back
> > to p4 and it also works on Windows!).
>
> I was sort of hoping we could fold the fast-export Git repository
> on repo.or.cz into core Git at some point.  Right now the only
> thing in contrib/fast-export is the import-tars.perl script that
> I maintain in my fastimport repository...  ;-)
>
> Like Junio I don't use Perforce, and can't test against it, but
> if you can maintain git-p4 (and I think the history on repo.or.cz
> shows that you do) then it may be a good idea to add it to core Git.
>
> Send a patch to add it.  Worst that happens is both Junio and I
> decide not to apply it.  Or I apply it, but Junio refuses to pull
> from me afterwards.  ;-)

Ok, I'll give it a try :)

I've used git-filter-branch to rewrite the history in fast-export to include 
only changes relevant to git-p4 and at the same time move all files into 
contrib/fast-import. The result is available as separate branch at

	git://repo.or.cz/fast-export.git git-p4

and technically merges fine into git.git's contrib/fast-import directory with 
three files (git-p4, git-p4.txt and git-p4.bat for windows convenience).

Please let me know if there's anything missing or if you prefer a different 
format or so. I also realized that I haven't really used the 'Signed-off-by' 
tags in the past but I'd be happy to adopt it for git inclusion if you prefer 
that :)


_If_ one of you decides to pull then my plan is to discontinue the git-p4 
branch in the fast-export repository and instead work in a git.git fork on 
repo.or.cz (similar to the fastimport repository).


Simon

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-12 21:46           ` Simon Hausmann
@ 2007-06-13 21:06             ` Scott Lamb
  2007-06-13 22:34               ` Simon Hausmann
  2007-06-14  5:35             ` Shawn O. Pearce
  1 sibling, 1 reply; 26+ messages in thread
From: Scott Lamb @ 2007-06-13 21:06 UTC (permalink / raw)
  To: Simon Hausmann; +Cc: Shawn O. Pearce, Junio C Hamano, git

Simon Hausmann wrote:
> _If_ one of you decides to pull then my plan is to discontinue the git-p4 
> branch in the fast-export repository and instead work in a git.git fork on 
> repo.or.cz (similar to the fastimport repository).

So you'll continue maintain this code and others should submit changes 
through you? What is the best way to do so? (Not sure what it was 
before, or if it would change under this plan.) Email a format-patch To: 
you? Cc: this list? some other list? no list?

-- 
Scott Lamb <http://www.slamb.org/>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-13 21:06             ` Scott Lamb
@ 2007-06-13 22:34               ` Simon Hausmann
  0 siblings, 0 replies; 26+ messages in thread
From: Simon Hausmann @ 2007-06-13 22:34 UTC (permalink / raw)
  To: Scott Lamb; +Cc: Shawn O. Pearce, Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 1533 bytes --]

On Wednesday 13 June 2007 23:06:49 Scott Lamb wrote:
> Simon Hausmann wrote:
> > _If_ one of you decides to pull then my plan is to discontinue the git-p4
> > branch in the fast-export repository and instead work in a git.git fork
> > on repo.or.cz (similar to the fastimport repository).
>
> So you'll continue maintain this code and others should submit changes
> through you? What is the best way to do so? (Not sure what it was
> before, or if it would change under this plan.) Email a format-patch To:
> you? Cc: this list? some other list? no list?

I would say whichever you prefer :)

For the fast-export repository multiple people have access and for example 
after Han-Wen made a lot of patches I asked Chris Lee (owner of the module on 
repo.or.cz) to add Han-Wen to the list of people with push access and he 
pushed his changes directly. I actually like working that way for a project 
that is as simple as that, I don't mind if somebody pushes simple changes 
directly as much as I like discussing bigger plans if they potentially clash 
with somebody else's work or use-case.

So if git-p4 continues to live in fast-export I'll continue to encourage Chris 
Lee to give git-p4 contributors push access, if it's in a git.git fork I'd be 
happy to do so myself (give access).

If git-p4 also ends up in git/contrib/fastimport and somebody likes to send 
patches to Junio or somebody else and CC this list that's fine with me, too.

It's just a few lines of python code after all ;-)

Simon

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-12 21:46           ` Simon Hausmann
  2007-06-13 21:06             ` Scott Lamb
@ 2007-06-14  5:35             ` Shawn O. Pearce
  2007-06-14 21:44               ` Simon Hausmann
  1 sibling, 1 reply; 26+ messages in thread
From: Shawn O. Pearce @ 2007-06-14  5:35 UTC (permalink / raw)
  To: Simon Hausmann; +Cc: Junio C Hamano, git

Simon Hausmann <simon@lst.de> wrote:
> I've used git-filter-branch to rewrite the history in fast-export to include 
> only changes relevant to git-p4 and at the same time move all files into 
> contrib/fast-import. The result is available as separate branch at
> 
> 	git://repo.or.cz/fast-export.git git-p4
> 
> and technically merges fine into git.git's contrib/fast-import directory with 
> three files (git-p4, git-p4.txt and git-p4.bat for windows convenience).
> 
> Please let me know if there's anything missing or if you prefer a different 
> format or so. I also realized that I haven't really used the 'Signed-off-by' 
> tags in the past but I'd be happy to adopt it for git inclusion if you prefer 
> that :)

Yes.  The SBO line is your assertion that you own the rights to the
code and can release it under the license you are offering it under.
One of the issues I have with this git-p4 history you have built
is the lack of the SBO line on all 255 commits.

Of course an SBO line doesn't carry that much weight, its just a line
after all, but according to Git's project standards it should be there
if you are agreeing to release it.  See Documentation/SubmittingPatches
for details.

My other problem with this history is a commit like b79112 "a
little bit more convenience" (and there are many such commits).
This message is insanely short, doesn't really talk at all about
what a little bit is, how it is more convenient, or who it is more
convenient for.

Think about how that oneline (and the others) would look in Junio's
"What's new in git.git" emails, or in gitweb.  There is not enough
detail here to be of any value to the reader.  Expanding out to the
full message offers nothing additional either, because that is all
there is in the entire commit message body.

I do appreciate you taking the time to use filter-branch to try to
cleanup this history a bit.  I really had originally planned on
pulling your tree through to my fastimport tree and then talking
Junio into merging with me.  But after reading through this history I
don't want do that, because of the oneline summaries I just pointed
out above, and because of the missing SBO.
 
-- 
Shawn.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-14  5:35             ` Shawn O. Pearce
@ 2007-06-14 21:44               ` Simon Hausmann
  2007-06-15  3:13                 ` Shawn O. Pearce
  2007-06-15  5:30                 ` Marius Storm-Olsen, mstormo_git
  0 siblings, 2 replies; 26+ messages in thread
From: Simon Hausmann @ 2007-06-14 21:44 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 3009 bytes --]

On Thursday 14 June 2007 07:35:38 Shawn O. Pearce wrote:
> Simon Hausmann <simon@lst.de> wrote:
> > I've used git-filter-branch to rewrite the history in fast-export to
> > include only changes relevant to git-p4 and at the same time move all
> > files into contrib/fast-import. The result is available as separate
> > branch at
> >
> > 	git://repo.or.cz/fast-export.git git-p4
> >
> > and technically merges fine into git.git's contrib/fast-import directory
> > with three files (git-p4, git-p4.txt and git-p4.bat for windows
> > convenience).
> >
> > Please let me know if there's anything missing or if you prefer a
> > different format or so. I also realized that I haven't really used the
> > 'Signed-off-by' tags in the past but I'd be happy to adopt it for git
> > inclusion if you prefer that :)
>
> Yes.  The SBO line is your assertion that you own the rights to the
> code and can release it under the license you are offering it under.
> One of the issues I have with this git-p4 history you have built
> is the lack of the SBO line on all 255 commits.
>
> Of course an SBO line doesn't carry that much weight, its just a line
> after all, but according to Git's project standards it should be there
> if you are agreeing to release it.  See Documentation/SubmittingPatches
> for details.
>
> My other problem with this history is a commit like b79112 "a
> little bit more convenience" (and there are many such commits).
> This message is insanely short, doesn't really talk at all about
> what a little bit is, how it is more convenient, or who it is more
> convenient for.
>
> Think about how that oneline (and the others) would look in Junio's
> "What's new in git.git" emails, or in gitweb.  There is not enough
> detail here to be of any value to the reader.  Expanding out to the
> full message offers nothing additional either, because that is all
> there is in the entire commit message body.
>
> I do appreciate you taking the time to use filter-branch to try to
> cleanup this history a bit.  I really had originally planned on
> pulling your tree through to my fastimport tree and then talking
> Junio into merging with me.  But after reading through this history I
> don't want do that, because of the oneline summaries I just pointed
> out above, and because of the missing SBO.

First of all thanks for looking at the branch. I agree with your concerns and 
I do admit that I've been a bit too sloppy with the log messages.

I have started cleaning up the history even more by reworking the log messages 
of my commits (git-p4-enhanced-logs branch in fast-export, starting at the 
last page). Once that is done (I expect that to take a few days) I'll add the 
missing SOB lines with git-filter-branch and see if I can get an agreement 
from Han-Wen and Marius for doing the same with their commits (adding the 
missing lines).

Would you be willing to reevaluate the situation regarding a merge once that's 
done?


Simon

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-14 21:44               ` Simon Hausmann
@ 2007-06-15  3:13                 ` Shawn O. Pearce
  2007-06-15  5:30                 ` Marius Storm-Olsen, mstormo_git
  1 sibling, 0 replies; 26+ messages in thread
From: Shawn O. Pearce @ 2007-06-15  3:13 UTC (permalink / raw)
  To: Simon Hausmann; +Cc: Junio C Hamano, git

Simon Hausmann <simon@lst.de> wrote:
> On Thursday 14 June 2007 07:35:38 Shawn O. Pearce wrote:
> > I do appreciate you taking the time to use filter-branch to try to
> > cleanup this history a bit.  I really had originally planned on
> > pulling your tree through to my fastimport tree and then talking
> > Junio into merging with me.  But after reading through this history I
> > don't want do that, because of the oneline summaries I just pointed
> > out above, and because of the missing SBO.
...
> I have started cleaning up the history even more by reworking the log messages 
> of my commits (git-p4-enhanced-logs branch in fast-export, starting at the 
> last page). Once that is done (I expect that to take a few days) I'll add the 
> missing SOB lines with git-filter-branch and see if I can get an agreement 
> from Han-Wen and Marius for doing the same with their commits (adding the 
> missing lines).

OK.
 
> Would you be willing to reevaluate the situation regarding a merge once that's 
> done?

Absolutely.  I would like to see the git-p4 work in the main tree,
so it is more readily available to users, even though I'm not a p4
user myself.  ;-)

-- 
Shawn.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git-p4import.py robustness changes
  2007-06-14 21:44               ` Simon Hausmann
  2007-06-15  3:13                 ` Shawn O. Pearce
@ 2007-06-15  5:30                 ` Marius Storm-Olsen, mstormo_git
  1 sibling, 0 replies; 26+ messages in thread
From: Marius Storm-Olsen, mstormo_git @ 2007-06-15  5:30 UTC (permalink / raw)
  To: Simon Hausmann; +Cc: Shawn O. Pearce, Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 742 bytes --]

Simon Hausmann said the following on 14.06.2007 23:44:
> First of all thanks for looking at the branch. I agree with your
> concerns and I do admit that I've been a bit too sloppy with the
> log messages.
> 
> I have started cleaning up the history even more by reworking the
> log messages of my commits (git-p4-enhanced-logs branch in
> fast-export, starting at the last page). Once that is done (I
> expect that to take a few days) I'll add the missing SOB lines with
> git-filter-branch and see if I can get an agreement from Han-Wen
> and Marius for doing the same with their commits (adding the 
> missing lines).

Simon,

Of course! Go right ahead and add the SOB for my commits while you're 
at it.

-- 
.marius


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 187 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2007-06-15  6:50 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-31 16:47 git-p4import.py robustness changes Scott Lamb
2007-05-31 23:53 ` Junio C Hamano
2007-06-02 20:41   ` Scott Lamb
2007-06-02 21:33     ` Junio C Hamano
2007-06-02 23:21       ` Scott Lamb
2007-06-02 23:52         ` Junio C Hamano
2007-06-03 13:11       ` Simon Hausmann
2007-06-03 20:12         ` Scott Lamb
2007-06-04  5:54           ` Shawn O. Pearce
2007-06-04  6:09             ` Dana How
2007-06-04  6:18               ` Shawn O. Pearce
2007-06-04  7:19             ` Scott Lamb
2007-06-05  7:21               ` Simon Hausmann
2007-06-04  8:41           ` Marius Storm-Olsen
2007-06-04  5:56         ` Shawn O. Pearce
2007-06-12 21:46           ` Simon Hausmann
2007-06-13 21:06             ` Scott Lamb
2007-06-13 22:34               ` Simon Hausmann
2007-06-14  5:35             ` Shawn O. Pearce
2007-06-14 21:44               ` Simon Hausmann
2007-06-15  3:13                 ` Shawn O. Pearce
2007-06-15  5:30                 ` Marius Storm-Olsen, mstormo_git
2007-06-03  3:58 ` [PATCH 1/4] git-p4import: fix subcommand error handling Scott Lamb
2007-06-03  3:58   ` [PATCH 2/4] git-p4import: use lists of subcommand arguments Scott Lamb
2007-06-03  3:58     ` [PATCH 3/4] git-p4import: resume on correct p4 changeset Scott Lamb
2007-06-03  3:58       ` [PATCH 4/4] git-p4import: partial history Scott Lamb

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).