git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] Added support for purged files and also optimised memory usage.
@ 2008-11-08  3:22 John Chapman
  2008-11-08  3:22 ` [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows John Chapman
  0 siblings, 1 reply; 11+ messages in thread
From: John Chapman @ 2008-11-08  3:22 UTC (permalink / raw)
  To: git; +Cc: John Chapman

Purged files are handled as if they are merely deleted, which is not
entirely optimal, but I don't know of any other way to handle them.
File data is deleted from memory as early as they can, and they are more
efficiently handled, at (significant) cost to CPU usage.

Still need to handle p4 branches with spaces in their names.
Still need to make git-p4 clone more reliable.
 - Perhaps with a --continue option. (Sometimes the p4 server kills
 the connection)

Signed-off-by: John Chapman <thestar@fussycoder.id.au>
---
 contrib/fast-import/git-p4 |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/contrib/fast-import/git-p4 b/contrib/fast-import/git-p4
index 2216cac..38d1a17 100755
--- a/contrib/fast-import/git-p4
+++ b/contrib/fast-import/git-p4
@@ -946,7 +946,7 @@ class P4Sync(Command):
 
             if includeFile:
                 filesForCommit.append(f)
-                if f['action'] != 'delete':
+                if f['action'] not in ('delete', 'purge'):
                     filesToRead.append(f)
 
         filedata = []
@@ -965,11 +965,11 @@ class P4Sync(Command):
         while j < len(filedata):
             stat = filedata[j]
             j += 1
-            text = [];
+            text = ''
             while j < len(filedata) and filedata[j]['code'] in ('text', 'unicode', 'binary'):
-                text.append(filedata[j]['data'])
+                text += filedata[j]['data']
+                del filedata[j]['data']
                 j += 1
-            text = ''.join(text)
 
             if not stat.has_key('depotFile'):
                 sys.stderr.write("p4 print fails with: %s\n" % repr(stat))
@@ -1038,7 +1038,7 @@ class P4Sync(Command):
                 continue
 
             relPath = self.stripRepoPath(file['path'], branchPrefixes)
-            if file["action"] == "delete":
+            if file["action"] in ("delete", "purge"):
                 self.gitStream.write("D %s\n" % relPath)
             else:
                 data = file['data']
@@ -1077,7 +1077,7 @@ class P4Sync(Command):
 
                 cleanedFiles = {}
                 for info in files:
-                    if info["action"] == "delete":
+                    if info["action"] in ("delete", "purge"):
                         continue
                     cleanedFiles[info["depotFile"]] = info["rev"]
 
@@ -1400,7 +1400,7 @@ class P4Sync(Command):
             if change > newestRevision:
                 newestRevision = change
 
-            if info["action"] == "delete":
+            if info["action"] in ("delete", "purge"):
                 # don't increase the file cnt, otherwise details["depotFile123"] will have gaps!
                 #fileCnt = fileCnt + 1
                 continue
-- 
1.6.0.3.643.g233db

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
  2008-11-08  3:22 [PATCH 1/2] Added support for purged files and also optimised memory usage John Chapman
@ 2008-11-08  3:22 ` John Chapman
  2008-11-08  5:19   ` David Symonds
  2008-11-09 18:33   ` Junio C Hamano
  0 siblings, 2 replies; 11+ messages in thread
From: John Chapman @ 2008-11-08  3:22 UTC (permalink / raw)
  To: git; +Cc: John Chapman


Signed-off-by: John Chapman <thestar@fussycoder.id.au>
---
 contrib/fast-import/git-p4 |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/contrib/fast-import/git-p4 b/contrib/fast-import/git-p4
index 38d1a17..9f0a5f9 100755
--- a/contrib/fast-import/git-p4
+++ b/contrib/fast-import/git-p4
@@ -316,8 +316,11 @@ def gitBranchExists(branch):
                             stderr=subprocess.PIPE, stdout=subprocess.PIPE);
     return proc.wait() == 0;
 
+_gitConfig = {}
 def gitConfig(key):
-    return read_pipe("git config %s" % key, ignore_error=True).strip()
+    if not _gitConfig.has_key(key):
+        _gitConfig[key] = read_pipe("git config %s" % key, ignore_error=True).strip()
+    return _gitConfig[key]
 
 def p4BranchesInGit(branchesAreInRemotes = True):
     branches = {}
-- 
1.6.0.3.643.g233db

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
  2008-11-08  3:22 ` [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows John Chapman
@ 2008-11-08  5:19   ` David Symonds
  2008-11-08  6:52     ` Arafangion
  2008-11-08 10:13     ` Jakub Narebski
  2008-11-09 18:33   ` Junio C Hamano
  1 sibling, 2 replies; 11+ messages in thread
From: David Symonds @ 2008-11-08  5:19 UTC (permalink / raw)
  To: John Chapman; +Cc: git

On Fri, Nov 7, 2008 at 7:22 PM, John Chapman <thestar@fussycoder.id.au> wrote:

> +_gitConfig = {}
>  def gitConfig(key):
> -    return read_pipe("git config %s" % key, ignore_error=True).strip()
> +    if not _gitConfig.has_key(key):
> +        _gitConfig[key] = read_pipe("git config %s" % key, ignore_error=True).strip()
> +    return _gitConfig[key]

If this is truly a noticeable bottleneck on Windows, something like
the following might be even better:  (completely untested!)

_gitConfig = None
def gitConfig(key):
  if _gitConfig is None:
    lines = read_pipe("git config -l", ignore_error=True).readlines():
    _gitConfig = dict([l.strip().split('=', 1) for l in lines])
  return _gitConfig.get(key, None)



Dave.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
  2008-11-08  5:19   ` David Symonds
@ 2008-11-08  6:52     ` Arafangion
  2008-11-10  8:38       ` Steve Frécinaux
  2008-11-08 10:13     ` Jakub Narebski
  1 sibling, 1 reply; 11+ messages in thread
From: Arafangion @ 2008-11-08  6:52 UTC (permalink / raw)
  To: David Symonds; +Cc: git

On Fri, 2008-11-07 at 21:19 -0800, David Symonds wrote:
<snip>
> _gitConfig = None
> def gitConfig(key):
>   if _gitConfig is None:
>     lines = read_pipe("git config -l", ignore_error=True).readlines():
>     _gitConfig = dict([l.strip().split('=', 1) for l in lines])
>   return _gitConfig.get(key, None)

That certainly is better, if one can assume that git's configuration is
small. (And relative to the memory usage of the script, it will
definetly be small).

I shall give that a go, although the change won't make it even faster -
I suspect that much of the performance penalty in windows is the
pathetic fork() performance, particularly as the memory usage of the
script increases. (If subprocess does fork() and exec() in order to open
another process, in cygwin).

Thankyou.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
  2008-11-08  5:19   ` David Symonds
  2008-11-08  6:52     ` Arafangion
@ 2008-11-08 10:13     ` Jakub Narebski
  1 sibling, 0 replies; 11+ messages in thread
From: Jakub Narebski @ 2008-11-08 10:13 UTC (permalink / raw)
  To: David Symonds; +Cc: John Chapman, git

"David Symonds" <dsymonds@gmail.com> writes:

> On Fri, Nov 7, 2008 at 7:22 PM, John Chapman <thestar@fussycoder.id.au> wrote:
> 
> > +_gitConfig = {}
> >  def gitConfig(key):
> > -    return read_pipe("git config %s" % key, ignore_error=True).strip()
> > +    if not _gitConfig.has_key(key):
> > +        _gitConfig[key] = read_pipe("git config %s" % key, ignore_error=True).strip()
> > +    return _gitConfig[key]
> 
> If this is truly a noticeable bottleneck on Windows, something like
> the following might be even better:  (completely untested!)
> 
> _gitConfig = None
> def gitConfig(key):
>   if _gitConfig is None:
>     lines = read_pipe("git config -l", ignore_error=True).readlines():
>     _gitConfig = dict([l.strip().split('=', 1) for l in lines])
>   return _gitConfig.get(key, None)

Wouldn't it be better to use "git config -l -z", split lines at "\0"
(NUL), and split key from value at first "\N" (CR)? This format was
meant for scripts.

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
  2008-11-08  3:22 ` [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows John Chapman
  2008-11-08  5:19   ` David Symonds
@ 2008-11-09 18:33   ` Junio C Hamano
  2008-11-10  3:50     ` Han-Wen Nienhuys
  2008-11-10  9:46     ` Simon Hausmann
  1 sibling, 2 replies; 11+ messages in thread
From: Junio C Hamano @ 2008-11-09 18:33 UTC (permalink / raw)
  To: Simon Hausmann, Han-Wen Nienhuys; +Cc: John Chapman, git

These are patches to fast-import/git-p4, which you two seem to in charge
of.

    From:	John Chapman <thestar@fussycoder.id.au>
    Subject: [PATCH 1/2] Added support for purged files and also optimised memory usage.
    Date:	Sat,  8 Nov 2008 14:22:48 +1100
    Message-Id: <1226114569-8506-1-git-send-email-thestar@fussycoder.id.au>

    From:	John Chapman <thestar@fussycoder.id.au>
    Subject: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
    Date:	Sat,  8 Nov 2008 14:22:49 +1100
    Message-Id: <1226114569-8506-2-git-send-email-thestar@fussycoder.id.au>

It was unfortunately not immediately obvious from the Subject: line what
these patches are about, and I am guessing you missed them because of that.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
  2008-11-09 18:33   ` Junio C Hamano
@ 2008-11-10  3:50     ` Han-Wen Nienhuys
  2008-11-10  9:46     ` Simon Hausmann
  1 sibling, 0 replies; 11+ messages in thread
From: Han-Wen Nienhuys @ 2008-11-10  3:50 UTC (permalink / raw)
  To: git

Hi Junio,

I haven't been involved with git-p4 for a long time.  I'm not really fit 
for judging these patches.


Junio C Hamano escreveu:
> These are patches to fast-import/git-p4, which you two seem to in charge
> of.
> 
>     From:	John Chapman <thestar@fussycoder.id.au>
>     Subject: [PATCH 1/2] Added support for purged files and also optimised memory usage.
>     Date:	Sat,  8 Nov 2008 14:22:48 +1100
>     Message-Id: <1226114569-8506-1-git-send-email-thestar@fussycoder.id.au>
> 
>     From:	John Chapman <thestar@fussycoder.id.au>
>     Subject: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
>     Date:	Sat,  8 Nov 2008 14:22:49 +1100
>     Message-Id: <1226114569-8506-2-git-send-email-thestar@fussycoder.id.au>
> 
> It was unfortunately not immediately obvious from the Subject: line what
> these patches are about, and I am guessing you missed them because of that.


-- 
 Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
  2008-11-08  6:52     ` Arafangion
@ 2008-11-10  8:38       ` Steve Frécinaux
  0 siblings, 0 replies; 11+ messages in thread
From: Steve Frécinaux @ 2008-11-10  8:38 UTC (permalink / raw)
  To: Arafangion; +Cc: David Symonds, git

Arafangion wrote:
> On Fri, 2008-11-07 at 21:19 -0800, David Symonds wrote:
> <snip>
>> _gitConfig = None
>> def gitConfig(key):
>>   if _gitConfig is None:
>>     lines = read_pipe("git config -l", ignore_error=True).readlines():
>>     _gitConfig = dict([l.strip().split('=', 1) for l in lines])
>>   return _gitConfig.get(key, None)
> 
> That certainly is better, if one can assume that git's configuration is
> small. (And relative to the memory usage of the script, it will
> definetly be small).

What about using git config --get-regexp to only get the p4-related 
settings ?

I don't really know the options used by git-p4, but something like this 
seems like a good candidate to address every trade-off concerns?

git-config --get-regexp '^(p4|user)\.'

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
  2008-11-09 18:33   ` Junio C Hamano
  2008-11-10  3:50     ` Han-Wen Nienhuys
@ 2008-11-10  9:46     ` Simon Hausmann
  2008-11-12  0:50       ` Junio C Hamano
  1 sibling, 1 reply; 11+ messages in thread
From: Simon Hausmann @ 2008-11-10  9:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Han-Wen Nienhuys, John Chapman, git

On Sunday 09 November 2008 Junio C Hamano, wrote:
> These are patches to fast-import/git-p4, which you two seem to in charge
> of.
>
>     From:	John Chapman <thestar@fussycoder.id.au>
>     Subject: [PATCH 1/2] Added support for purged files and also optimised
> memory usage. Date:	Sat,  8 Nov 2008 14:22:48 +1100
>     Message-Id: <1226114569-8506-1-git-send-email-thestar@fussycoder.id.au>
>
>     From:	John Chapman <thestar@fussycoder.id.au>
>     Subject: [PATCH 2/2] Cached the git configuration, which is now
> noticibly faster on windows. Date:	Sat,  8 Nov 2008 14:22:49 +1100
>     Message-Id: <1226114569-8506-2-git-send-email-thestar@fussycoder.id.au>
>
> It was unfortunately not immediately obvious from the Subject: line what
> these patches are about, and I am guessing you missed them because of that.

Ack on both patches. The second one could be done better, as suggested in the 
follow-ups, but both are clearly an improvement :)


Simon

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
  2008-11-10  9:46     ` Simon Hausmann
@ 2008-11-12  0:50       ` Junio C Hamano
  2008-11-12 11:54         ` Arafangion
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2008-11-12  0:50 UTC (permalink / raw)
  To: Simon Hausmann; +Cc: Han-Wen Nienhuys, John Chapman, git

Simon Hausmann <simon@lst.de> writes:

> On Sunday 09 November 2008 Junio C Hamano, wrote:
>> These are patches to fast-import/git-p4, which you two seem to in charge
>> of.
>>
>>     From:	John Chapman <thestar@fussycoder.id.au>
>>     Subject: [PATCH 1/2] Added support for purged files and also optimised
>> memory usage. Date:	Sat,  8 Nov 2008 14:22:48 +1100
>>     Message-Id: <1226114569-8506-1-git-send-email-thestar@fussycoder.id.au>
>>
>>     From:	John Chapman <thestar@fussycoder.id.au>
>>     Subject: [PATCH 2/2] Cached the git configuration, which is now
>> noticibly faster on windows. Date:	Sat,  8 Nov 2008 14:22:49 +1100
>>     Message-Id: <1226114569-8506-2-git-send-email-thestar@fussycoder.id.au>
>>
>> It was unfortunately not immediately obvious from the Subject: line what
>> these patches are about, and I am guessing you missed them because of that.
>
> Ack on both patches. The second one could be done better, as suggested
> in the follow-ups, but both are clearly an improvement :)

Thanks, both of you.  Will apply.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows.
  2008-11-12  0:50       ` Junio C Hamano
@ 2008-11-12 11:54         ` Arafangion
  0 siblings, 0 replies; 11+ messages in thread
From: Arafangion @ 2008-11-12 11:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Simon Hausmann, Han-Wen Nienhuys, git

Thanks for making those patches more visible, however I do feel the need
to mention one thing I expected to have been raised during patch review:
1) The memory optimisation may cause significant slowdown, which wasn't
a big issue on my machine, perhaps because I'm _still_ trying to get it
to work on my particular repo. (It's still using too much memory), and I
have a very fast machine.  It switches git-p4 from a RAM-intensive app
to a somewhat-less-RAM-intensive app at the cost of also becomming much
more CPU-intensive.



On Tue, 2008-11-11 at 16:50 -0800, Junio C Hamano wrote:
> Simon Hausmann <simon@lst.de> writes:
> 
> > On Sunday 09 November 2008 Junio C Hamano, wrote:
> >> These are patches to fast-import/git-p4, which you two seem to in charge
> >> of.
> >>
> >>     From:	John Chapman <thestar@fussycoder.id.au>
> >>     Subject: [PATCH 1/2] Added support for purged files and also optimised
> >> memory usage. Date:	Sat,  8 Nov 2008 14:22:48 +1100
> >>     Message-Id: <1226114569-8506-1-git-send-email-thestar@fussycoder.id.au>
> >>
> >>     From:	John Chapman <thestar@fussycoder.id.au>
> >>     Subject: [PATCH 2/2] Cached the git configuration, which is now
> >> noticibly faster on windows. Date:	Sat,  8 Nov 2008 14:22:49 +1100
> >>     Message-Id: <1226114569-8506-2-git-send-email-thestar@fussycoder.id.au>
> >>
> >> It was unfortunately not immediately obvious from the Subject: line what
> >> these patches are about, and I am guessing you missed them because of that.
> >
> > Ack on both patches. The second one could be done better, as suggested
> > in the follow-ups, but both are clearly an improvement :)
> 
> Thanks, both of you.  Will apply.
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-11-12 12:59 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-08  3:22 [PATCH 1/2] Added support for purged files and also optimised memory usage John Chapman
2008-11-08  3:22 ` [PATCH 2/2] Cached the git configuration, which is now noticibly faster on windows John Chapman
2008-11-08  5:19   ` David Symonds
2008-11-08  6:52     ` Arafangion
2008-11-10  8:38       ` Steve Frécinaux
2008-11-08 10:13     ` Jakub Narebski
2008-11-09 18:33   ` Junio C Hamano
2008-11-10  3:50     ` Han-Wen Nienhuys
2008-11-10  9:46     ` Simon Hausmann
2008-11-12  0:50       ` Junio C Hamano
2008-11-12 11:54         ` Arafangion

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).