* git-p4 fails when cloning a p4 depo.
@ 2007-06-08 16:41 Benjamin Sergeant
2007-06-08 18:13 ` Benjamin Sergeant
0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Sergeant @ 2007-06-08 16:41 UTC (permalink / raw)
To: git
I attached a lame patch that dies without showing the Python traceback,
but I'd rather have the clone succeed :)
Maybe there is a different mailing list for git-p4; if there is, tell
me and I'll post there.
Benjamin.
[bsergean@flanders sandbox]$ rm -rf dev ; git-p4 clone
//Work/Users/Capture3D/A3D810/pdfl/Common/a3d/dev
Importing from //Work/Users/Capture3D/A3D810/pdfl/Common/a3d/dev into dev
Initialized empty Git repository in .git/
Doing initial import of
//Work/Users/Capture3D/A3D810/pdfl/Common/a3d/dev/ from revision #head
[{'p4ExitCode': 32512}]
Traceback (most recent call last):
File "/home/bsergean/src/fast-export/git-p4", line 1489, in <module>
main()
File "/home/bsergean/src/fast-export/git-p4", line 1484, in main
if not cmd.run(args):
File "/home/bsergean/src/fast-export/git-p4", line 1395, in run
if not P4Sync.run(self, depotPaths):
File "/home/bsergean/src/fast-export/git-p4", line 1203, in run
self.commit(details, self.extractFilesFromCommit(details),
self.branch, self.depotPaths)
File "/home/bsergean/src/fast-export/git-p4", line 744, in commit
self.readP4Files(files)
File "/home/bsergean/src/fast-export/git-p4", line 722, in readP4Files
contents[stat['depotFile']] = text
KeyError: 'depotFile'
[-- Attachment #2: git-p4.diff --]
[-- Type: text/x-patch, Size: 429 bytes --]
diff --git a/git-p4 b/git-p4
index 36fe69a..3e1a878 100755
--- a/git-p4
+++ b/git-p4
@@ -707,6 +707,9 @@ class P4Sync(Command):
f['rev'])
for f in files]))
+ if "p4ExitCode" in filedata[0]:
+ die("Problems executing p4");
+
j = 0;
contents = {}
while j < len(filedata):
* Re: git-p4 fails when cloning a p4 depo.
2007-06-08 16:41 git-p4 fails when cloning a p4 depo Benjamin Sergeant
@ 2007-06-08 18:13 ` Benjamin Sergeant
2007-06-08 21:31 ` Scott Lamb
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Benjamin Sergeant @ 2007-06-08 18:13 UTC (permalink / raw)
To: git
A single perforce command listing all the files in the repo is generated
to fetch all the file content.
Here is a patch that breaks it into multiple successive perforce commands,
each using at most about 4K of parameters, and collects the output for later.
It works, but not for big depots, because the whole perforce depot
content is stored in memory in P4Sync.run(), and it looks like mine is
bigger than 2 GB, so I had to kill the process.
diff --git a/git-p4 b/git-p4
index 36fe69a..906b193 100755
--- a/git-p4
+++ b/git-p4
@@ -703,9 +703,22 @@ class P4Sync(Command):
if not files:
return
- filedata = p4CmdList('print %s' % ' '.join(['"%s#%s"' % (f['path'],
- f['rev'])
- for f in files]))
+ # We cannot put all the files on the command line
+ # OSes have limits on the maximum length of arguments.
+ # POSIX says it's at least 4096 bytes; the default for Linux seems to be 130K,
+ # and all the OSes in the table below allow more than POSIX.
+ # See http://www.in-ulm.de/~mascheck/various/argmax/
+ chunk = ''
+ filedata = []
+ for i in xrange(len(files)):
+ f = files[i]
+ chunk += '"%s#%s" ' % (f['path'], f['rev'])
+ if len(chunk) > 4000 or i == len(files)-1:
+ data = p4CmdList('print %s' % chunk)
+ if "p4ExitCode" in data[0]:
+ die("Problems executing p4. Error: [%d]." %
(data[0]['p4ExitCode']));
+ filedata.extend(data)
+ chunk = ''
j = 0;
contents = {}
@@ -1486,3 +1499,5 @@ def main():
if __name__ == '__main__':
main()
+
+# vim: set filetype=python sts=4 sw=4 et si :
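As an aside, the chunking logic in the patch above can be sketched as a standalone generator, which would also address the memory concern if each batch were processed as it is produced instead of being accumulated into one big filedata list. This is an illustrative sketch, not git-p4 code; the function name and the 4000-byte default are assumptions:

```python
def chunked_print_args(files, max_len=4000):
    """Yield lists of "path#rev" arguments whose combined length
    (including separating spaces) stays under max_len."""
    batch, length = [], 0
    for f in files:
        arg = '"%s#%s"' % (f['path'], f['rev'])
        # Start a new batch before this argument would push the
        # current one over the limit.
        if batch and length + len(arg) + 1 > max_len:
            yield batch
            batch, length = [], 0
        batch.append(arg)
        length += len(arg) + 1
    if batch:
        yield batch

files = [{'path': '//depot/f%d.c' % i, 'rev': '1'} for i in range(500)]
batches = list(chunked_print_args(files))
# Every batch fits on one command line; together they cover all files.
assert all(len(' '.join(b)) <= 4000 for b in batches)
assert sum(len(b) for b in batches) == len(files)
```

Consuming the generator lazily would keep peak memory proportional to one batch rather than to the whole depot.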
On 6/8/07, Benjamin Sergeant <bsergean@gmail.com> wrote:
> I attached a lame patch that dies without showing the Python traceback,
> but I'd rather have the clone succeed :)
> Maybe there is a different mailing list for git-p4; if there is, tell
> me and I'll post there.
>
> Benjamin.
>
> [bsergean@flanders sandbox]$ rm -rf dev ; git-p4 clone
> //Work/Users/Capture3D/A3D810/pdfl/Common/a3d/dev
> Importing from //Work/Users/Capture3D/A3D810/pdfl/Common/a3d/dev into dev
> Initialized empty Git repository in .git/
> Doing initial import of
> //Work/Users/Capture3D/A3D810/pdfl/Common/a3d/dev/ from revision #head
> [{'p4ExitCode': 32512}]
> Traceback (most recent call last):
> File "/home/bsergean/src/fast-export/git-p4", line 1489, in <module>
> main()
> File "/home/bsergean/src/fast-export/git-p4", line 1484, in main
> if not cmd.run(args):
> File "/home/bsergean/src/fast-export/git-p4", line 1395, in run
> if not P4Sync.run(self, depotPaths):
> File "/home/bsergean/src/fast-export/git-p4", line 1203, in run
> self.commit(details, self.extractFilesFromCommit(details),
> self.branch, self.depotPaths)
> File "/home/bsergean/src/fast-export/git-p4", line 744, in commit
> self.readP4Files(files)
> File "/home/bsergean/src/fast-export/git-p4", line 722, in readP4Files
> contents[stat['depotFile']] = text
> KeyError: 'depotFile'
>
>
* Re: git-p4 fails when cloning a p4 depo.
2007-06-08 18:13 ` Benjamin Sergeant
@ 2007-06-08 21:31 ` Scott Lamb
2007-06-08 21:34 ` Scott Lamb
2007-06-08 22:38 ` Simon Hausmann
2007-06-12 1:13 ` Han-Wen Nienhuys
2 siblings, 1 reply; 14+ messages in thread
From: Scott Lamb @ 2007-06-08 21:31 UTC (permalink / raw)
To: Benjamin Sergeant; +Cc: git
Benjamin Sergeant wrote:
> A single perforce command listing all the files in the repo is generated
> to fetch all the file content.
> Here is a patch that breaks it into multiple successive perforce commands,
> each using at most about 4K of parameters, and collects the output for later.
>
> It works, but not for big depots, because the whole perforce depot
> content is stored in memory in P4Sync.run(), and it looks like mine is
> bigger than 2 GB, so I had to kill the process.
Hmm. I tried git-p4 out on Sunday, and it definitely didn't do this
then... this commit must have shown up since:
commit 6460cf12df4556f889888b5f0b49e07040747e6f
Author: Han-Wen Nienhuys <hanwen@google.com>
Date: Wed May 23 18:49:35 2007 -0300
Read p4 files in one batch.
I believe HEAD at the time was this:
commit 458e0545cb3dd03af9cd1a61480cbb764639043a
Author: Simon Hausmann <simon@lst.de>
Date: Mon May 28 19:24:57 2007 +0200
Fix typo in listExistingP4Branches that broke sync.
so you might try checking out an old version, and I'll go RTFM on
reading merge history, because I can't figure out when this happened.
>
>
> diff --git a/git-p4 b/git-p4
> index 36fe69a..906b193 100755
> --- a/git-p4
> +++ b/git-p4
> @@ -703,9 +703,22 @@ class P4Sync(Command):
> if not files:
> return
>
> - filedata = p4CmdList('print %s' % ' '.join(['"%s#%s"' %
> (f['path'],
> - f['rev'])
> - for f in files]))
> + # We cannot put all the files on the command line
> + # OSes have limits on the maximum length of arguments.
> + # POSIX says it's at least 4096 bytes; the default for Linux seems to be 130K,
> + # and all the OSes in the table below allow more than POSIX.
> + # See http://www.in-ulm.de/~mascheck/various/argmax/
No need to hardcode - from Python this is
os.sysconf(os.sysconf_names['SC_ARG_MAX'])
> + chunk = ''
> + filedata = []
> + for i in xrange(len(files)):
> + f = files[i]
> + chunk += '"%s#%s" ' % (f['path'], f['rev'])
> + if len(chunk) > 4000 or i == len(files)-1:
> + data = p4CmdList('print %s' % chunk)
> + if "p4ExitCode" in data[0]:
> + die("Problems executing p4. Error: [%d]." %
> (data[0]['p4ExitCode']));
> + filedata.extend(data)
> + chunk = ''
>
> j = 0;
> contents = {}
> @@ -1486,3 +1499,5 @@ def main():
>
> if __name__ == '__main__':
> main()
> +
> +# vim: set filetype=python sts=4 sw=4 et si :
* Re: git-p4 fails when cloning a p4 depo.
2007-06-08 21:31 ` Scott Lamb
@ 2007-06-08 21:34 ` Scott Lamb
2007-06-08 22:04 ` Benjamin Sergeant
0 siblings, 1 reply; 14+ messages in thread
From: Scott Lamb @ 2007-06-08 21:34 UTC (permalink / raw)
To: Benjamin Sergeant; +Cc: git
Scott Lamb wrote:
> No need to hardcode - from Python this is
> os.sysconf(os.sysconf_names['SC_ARG_MAX'])
In fact, just os.sysconf('SC_ARG_MAX') will do.
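For reference, querying the limit on a POSIX system looks like this; the 90% headroom factor is an assumption (mirroring a later revision of the patch), since the environment block is charged against the same limit:

```python
import os

# os.sysconf accepts the symbolic name directly and looks it up
# in os.sysconf_names internally.
arg_max = os.sysconf('SC_ARG_MAX')

# Leave some headroom rather than using the raw value, since the
# environment also counts against ARG_MAX.
safe_limit = max(int(arg_max * 0.9), 4000)

print(arg_max, safe_limit)
```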
* Re: git-p4 fails when cloning a p4 depo.
2007-06-08 21:34 ` Scott Lamb
@ 2007-06-08 22:04 ` Benjamin Sergeant
2007-06-08 22:25 ` Benjamin Sergeant
2007-06-08 23:33 ` Han-Wen Nienhuys
0 siblings, 2 replies; 14+ messages in thread
From: Benjamin Sergeant @ 2007-06-08 22:04 UTC (permalink / raw)
To: Scott Lamb; +Cc: git
On 6/8/07, Scott Lamb <slamb@slamb.org> wrote:
> Scott Lamb wrote:
> > No need to hardcode - from Python this is
> > os.sysconf(os.sysconf_names['SC_ARG_MAX'])
>
> In fact, just os.sysconf('SC_ARG_MAX') will do.
>
Magic numbers are a lot of fun; why would you want to use the clean method :)
So are you saying that in the old days, git-p4 was importing the p4
depot in small slices so as not to exhaust the process's memory (in case
the depot is big)?
BTW, my depot contains the whole universe, so by using -//Work/Users
in my client specification I usually manage to have fewer megabytes of
code on my disk after a sync.
That way the git-p4 clone would not use too much memory, but we would
have to change the way git-p4 works: it should be able to read a full
client view instead of just a single perforce path.
Could you give me the git command to fetch that older version, something like
git clone <git-url> --date <the good date> ?
Thanks,
Benjamin.
* Re: git-p4 fails when cloning a p4 depo.
2007-06-08 22:04 ` Benjamin Sergeant
@ 2007-06-08 22:25 ` Benjamin Sergeant
2007-06-08 23:33 ` Han-Wen Nienhuys
1 sibling, 0 replies; 14+ messages in thread
From: Benjamin Sergeant @ 2007-06-08 22:25 UTC (permalink / raw)
To: Scott Lamb; +Cc: git
I got my git-p4 from here, BTW:
http://repo.or.cz/w/fast-export.git
On 6/8/07, Benjamin Sergeant <bsergean@gmail.com> wrote:
> On 6/8/07, Scott Lamb <slamb@slamb.org> wrote:
> > Scott Lamb wrote:
> > > No need to hardcode - from Python this is
> > > os.sysconf(os.sysconf_names['SC_ARG_MAX'])
> >
> > In fact, just os.sysconf('SC_ARG_MAX') will do.
> >
>
> Magic numbers are a lot of fun; why would you want to use the clean method :)
>
> So are you saying that in the old days, git-p4 was importing the p4
> depot in small slices so as not to exhaust the process's memory (in case
> the depot is big)?
>
> BTW, my depot contains the whole universe, so by using -//Work/Users
> in my client specification I usually manage to have fewer megabytes of
> code on my disk after a sync.
> That way the git-p4 clone would not use too much memory, but we would
> have to change the way git-p4 works: it should be able to read a full
> client view instead of just a single perforce path.
>
> Could you give me the git command to fetch that older version, something like
> git clone <git-url> --date <the good date> ?
>
> Thanks,
> Benjamin.
>
* Re: git-p4 fails when cloning a p4 depo.
2007-06-08 18:13 ` Benjamin Sergeant
2007-06-08 21:31 ` Scott Lamb
@ 2007-06-08 22:38 ` Simon Hausmann
2007-06-12 1:07 ` Han-Wen Nienhuys
2007-06-12 1:08 ` Han-Wen Nienhuys
2007-06-12 1:13 ` Han-Wen Nienhuys
2 siblings, 2 replies; 14+ messages in thread
From: Simon Hausmann @ 2007-06-08 22:38 UTC (permalink / raw)
To: Han-Wen Nienhuys; +Cc: Benjamin Sergeant, git
On Friday 08 June 2007 20:13:55 Benjamin Sergeant wrote:
> A single perforce command listing all the files in the repo is generated
> to fetch all the file content.
> Here is a patch that breaks it into multiple successive perforce commands,
> each using at most about 4K of parameters, and collects the output for later.
>
> It works, but not for big depots, because the whole perforce depot
> content is stored in memory in P4Sync.run(), and it looks like mine is
> bigger than 2 GB, so I had to kill the process.
I'd be generally fine with splitting up the "p4 print ..." calls into chunks
but you have a good point with the memory usage. The old approach of calling
print per file did not have any of those limitations. Han-Wen, what do you
think? How much of a performance improvement is the batched print?
(I didn't notice any immediate difference, but then I have a very fast
connection to the perforce server and usually small changesets)
Simon
* Re: git-p4 fails when cloning a p4 depo.
2007-06-08 22:04 ` Benjamin Sergeant
2007-06-08 22:25 ` Benjamin Sergeant
@ 2007-06-08 23:33 ` Han-Wen Nienhuys
2007-06-09 0:32 ` Benjamin Sergeant
1 sibling, 1 reply; 14+ messages in thread
From: Han-Wen Nienhuys @ 2007-06-08 23:33 UTC (permalink / raw)
To: git; +Cc: Scott Lamb
Benjamin Sergeant wrote:
> So are you saying that in the old days, git-p4 was importing the p4
> depot in small slices so as not to exhaust the process's memory (in case
> the depot is big)?
no, in the "old days" git-p4 used a separate p4 invocation for each file.
--
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen
* Re: git-p4 fails when cloning a p4 depo.
2007-06-08 23:33 ` Han-Wen Nienhuys
@ 2007-06-09 0:32 ` Benjamin Sergeant
0 siblings, 0 replies; 14+ messages in thread
From: Benjamin Sergeant @ 2007-06-09 0:32 UTC (permalink / raw)
To: hanwen; +Cc: git, Scott Lamb
On 6/8/07, Han-Wen Nienhuys <hanwen@xs4all.nl> wrote:
> Benjamin Sergeant wrote:
>
> > So are you saying that in the old days, git-p4 was importing the p4
> > depot in small slices so as not to exhaust the process's memory (in case
> > the depot is big)?
>
> no, in the "old days" git-p4 used a separate p4 invocation for each file.
>
Anyway, in case you hit the command-line length limit, here it is. That
might be interesting for the "next days" :)
Benjamin.
[bsergean@flanders fast-export]$ git format-patch -k -m --stdout origin
From 45f2dbdb9a8c0b3beb007ae892613cdc4afab80a Mon Sep 17 00:00:00 2001
From: Benjamin Sergeant <bsergean@flanders.(none)>
Date: Fri, 8 Jun 2007 09:58:57 -0700
Subject: Split the p4 print call into multiple calls so as not to exceed
the maximum command-line length
---
git-p4 | 21 ++++++++++++++++++---
1 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/git-p4 b/git-p4
index 36fe69a..906b193 100755
--- a/git-p4
+++ b/git-p4
@@ -703,9 +703,22 @@ class P4Sync(Command):
if not files:
return
- filedata = p4CmdList('print %s' % ' '.join(['"%s#%s"' % (f['path'],
- f['rev'])
- for f in files]))
+ # We cannot put all the files on the command line
+ # OSes have limits on the maximum length of arguments.
+ # POSIX says it's at least 4096 bytes; the default for Linux seems to be 130K,
+ # and all the OSes in the table below allow more than POSIX.
+ # See http://www.in-ulm.de/~mascheck/various/argmax/
+ chunk = ''
+ filedata = []
+ for i in xrange(len(files)):
+ f = files[i]
+ chunk += '"%s#%s" ' % (f['path'], f['rev'])
+ if len(chunk) > 4000 or i == len(files)-1:
+ data = p4CmdList('print %s' % chunk)
+ if "p4ExitCode" in data[0]:
+ die("Problems executing p4. Error: [%d]." %
(data[0]['p4ExitCode']));
+ filedata.extend(data)
+ chunk = ''
j = 0;
contents = {}
@@ -1486,3 +1499,5 @@ def main():
if __name__ == '__main__':
main()
+
+# vim: set filetype=python sts=4 sw=4 et si :
--
1.5.0.4
>From dd9975708433efeec37b608755f54fbeaedf0f3f Mon Sep 17 00:00:00 2001
From: Benjamin Sergeant <bsergean@flanders.(none)>
Date: Fri, 8 Jun 2007 10:20:39 -0700
Subject: Use os.sysconf('SC_ARG_MAX') to retrieve the max value, and
build the string using join (faster)
---
git-p4 | 17 ++++++++++-------
1 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/git-p4 b/git-p4
index 906b193..8dc1963 100755
--- a/git-p4
+++ b/git-p4
@@ -705,20 +705,23 @@ class P4Sync(Command):
# We cannot put all the files on the command line
# OSes have limits on the maximum length of arguments.
- # POSIX says it's at least 4096 bytes; the default for Linux seems to be 130K,
- # and all the OSes in the table below allow more than POSIX.
# See http://www.in-ulm.de/~mascheck/various/argmax/
- chunk = ''
+ chunks = []
+ chunkLength = 0
filedata = []
+ maxLength = max(int(os.sysconf('SC_ARG_MAX') * 0.90), 4000)
+ print maxLength
for i in xrange(len(files)):
f = files[i]
- chunk += '"%s#%s" ' % (f['path'], f['rev'])
- if len(chunk) > 4000 or i == len(files)-1:
- data = p4CmdList('print %s' % chunk)
+ chunkLength += len(f['path']) + len(f['rev'])
+ chunks.append('"%s#%s" ' % (f['path'], f['rev']))
+ if chunkLength > maxLength or i == len(files)-1:
+ data = p4CmdList('print %s' % ' '.join(chunks))
if "p4ExitCode" in data[0]:
die("Problems executing p4. Error: [%d]." %
(data[0]['p4ExitCode']));
filedata.extend(data)
- chunk = ''
+ chunks = []
+ chunkLength = 0
j = 0;
contents = {}
--
1.5.0.4
* Re: git-p4 fails when cloning a p4 depo.
2007-06-08 22:38 ` Simon Hausmann
@ 2007-06-12 1:07 ` Han-Wen Nienhuys
2007-06-12 1:08 ` Han-Wen Nienhuys
1 sibling, 0 replies; 14+ messages in thread
From: Han-Wen Nienhuys @ 2007-06-12 1:07 UTC (permalink / raw)
To: git; +Cc: Han-Wen Nienhuys, Benjamin Sergeant, git
Simon Hausmann wrote:
> On Friday 08 June 2007 20:13:55 Benjamin Sergeant wrote:
>> A single perforce command listing all the files in the repo is generated
>> to fetch all the file content.
>> Here is a patch that breaks it into multiple successive perforce commands,
>> each using at most about 4K of parameters, and collects the output for later.
>>
>> It works, but not for big depots, because the whole perforce depot
>> content is stored in memory in P4Sync.run(), and it looks like mine is
>> bigger than 2 GB, so I had to kill the process.
>
> I'd be generally fine with splitting up the "p4 print ..." calls into chunks
> but you have a good point with the memory usage. The old approach of calling
> print per file did not have any of those limitations. Han-Wen, what do you
> think? How much of a performance improvement is the batched print?
One unscientific measurement (getting 1 file vs. 30 files)
indicates that this is about 5x faster.
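A speedup of that order is consistent with per-process startup cost dominating. A rough, p4-free illustration of the effect (timings vary by machine; the use of the Python interpreter as the spawned command is just for portability):

```python
import subprocess
import sys
import time

N = 20
cmd = [sys.executable, '-c', 'pass']

start = time.time()
for _ in range(N):
    subprocess.call(cmd)            # one process per item
per_item = time.time() - start

start = time.time()
subprocess.call(cmd + ['x'] * N)    # one process for the whole batch
batched = time.time() - start

# Spawning one process per item pays the startup cost N times;
# the batched call pays it once.
print('per-item: %.3fs  batched: %.3fs' % (per_item, batched))
```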
--
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen
* Re: git-p4 fails when cloning a p4 depo.
2007-06-08 18:13 ` Benjamin Sergeant
2007-06-08 21:31 ` Scott Lamb
2007-06-08 22:38 ` Simon Hausmann
@ 2007-06-12 1:13 ` Han-Wen Nienhuys
2007-06-17 8:11 ` Simon Hausmann
2 siblings, 1 reply; 14+ messages in thread
From: Han-Wen Nienhuys @ 2007-06-12 1:13 UTC (permalink / raw)
To: git
Benjamin Sergeant wrote:
> A single perforce command listing all the files in the repo is generated
> to fetch all the file content.
> Here is a patch that breaks it into multiple successive perforce commands,
> each using at most about 4K of parameters, and collects the output for later.
>
> It works, but not for big depots, because the whole perforce depot
> content is stored in memory in P4Sync.run(), and it looks like mine is
> bigger than 2 GB, so I had to kill the process.
The general idea of the patch is OK; some nits:
> + chunk = ''
> + filedata = []
> + for i in xrange(len(files)):
why not
for f in files:
?
> + f = files[i]
> + chunk += '"%s#%s" ' % (f['path'], f['rev'])
> + if len(chunk) > 4000 or i == len(files)-1:
4k seems reasonable enough, but can you take the min() with
os.sysconf('SC_ARG_MAX') ?
Can you address this and resend so we can apply the patch?
Thanks.
--
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen
* Re: git-p4 fails when cloning a p4 depo.
2007-06-12 1:13 ` Han-Wen Nienhuys
@ 2007-06-17 8:11 ` Simon Hausmann
2007-06-17 16:09 ` Benjamin Sergeant
0 siblings, 1 reply; 14+ messages in thread
From: Simon Hausmann @ 2007-06-17 8:11 UTC (permalink / raw)
To: git
On Tuesday 12 June 2007 03:13:17 Han-Wen Nienhuys wrote:
> Benjamin Sergeant wrote:
> > A single perforce command listing all the files in the repo is generated
> > to fetch all the file content.
> > Here is a patch that breaks it into multiple successive perforce commands,
> > each using at most about 4K of parameters, and collects the output for later.
> >
> > It works, but not for big depots, because the whole perforce depot
> > content is stored in memory in P4Sync.run(), and it looks like mine is
> > bigger than 2 GB, so I had to kill the process.
>
> General idea of the patch is ok. some nits:
> > + chunk = ''
> > + filedata = []
> > + for i in xrange(len(files)):
>
> why not
>
> for f in files:
>
> ?
It seems 'i' is used a bit later. Is there a nicer way to express this in
Python?
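For the record, enumerate() is the usual idiom when both the element and its index are needed. A small sketch with made-up file entries:

```python
files = [{'path': '//depot/a.c', 'rev': '3'},
         {'path': '//depot/b.c', 'rev': '7'},
         {'path': '//depot/c.c', 'rev': '1'}]

args = []
for i, f in enumerate(files):
    args.append('"%s#%s"' % (f['path'], f['rev']))
    # 'i' is still available for the end-of-list check,
    # without the manual files[i] lookup.
    if i == len(files) - 1:
        print('flushing last chunk of %d args' % len(args))
```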
> > + f = files[i]
> > + chunk += '"%s#%s" ' % (f['path'], f['rev'])
> > + if len(chunk) > 4000 or i == len(files)-1:
>
> 4k seems reasonable enough, but can you take the min() with
> os.sysconf('SC_ARG_MAX') ?
>
> Can you address this and resend so we can apply the patch?
> Thanks.
Since I ran into this very too-long-command-line problem myself yesterday, I
took the liberty of adding the SC_ARG_MAX bit to Benjamin's patch and
committing it.
Simon
* Re: git-p4 fails when cloning a p4 depo.
2007-06-17 8:11 ` Simon Hausmann
@ 2007-06-17 16:09 ` Benjamin Sergeant
0 siblings, 0 replies; 14+ messages in thread
From: Benjamin Sergeant @ 2007-06-17 16:09 UTC (permalink / raw)
To: Simon Hausmann; +Cc: git
On 6/17/07, Simon Hausmann <simon@lst.de> wrote:
> On Tuesday 12 June 2007 03:13:17 Han-Wen Nienhuys wrote:
> > Benjamin Sergeant wrote:
> > > A single perforce command listing all the files in the repo is generated
> > > to fetch all the file content.
> > > Here is a patch that breaks it into multiple successive perforce commands,
> > > each using at most about 4K of parameters, and collects the output for later.
> > >
> > > It works, but not for big depots, because the whole perforce depot
> > > content is stored in memory in P4Sync.run(), and it looks like mine is
> > > bigger than 2 GB, so I had to kill the process.
> >
> > General idea of the patch is ok. some nits:
> > > + chunk = ''
> > > + filedata = []
> > > + for i in xrange(len(files)):
> >
> > why not
> >
> > for f in files:
> >
> > ?
>
> It seems 'i' is used a bit later. Is there a nicer way to express this in
> python?
>
> > > + f = files[i]
> > > + chunk += '"%s#%s" ' % (f['path'], f['rev'])
> > > + if len(chunk) > 4000 or i == len(files)-1:
> >
> > 4k seems reasonable enough, but can you take the min() with
> > os.sysconf('SC_ARG_MAX') ?
> >
> > Can you address this and resend so we can apply the patch?
> > Thanks.
>
> Since I ran into the very problem of a too long commandline myself yesterday I
> took the liberty of adding the SC_ARG_MAX bit to Benjamin's patch and
> comitting it then.
>
Cool.
For what it's worth (probably useless, but): here is a tar file with two patches:
- The original one
- The second one, which adds the SC_ARG_MAX handling
BTW, doing += on a string is not supposed to be fast; appending elements
to a list and then using ' '.join on them to build the big string
is said to be faster (I did not time it, though). That is in the second
patch as well.
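The claim above is easy to check with timeit. A minimal comparison (results depend on the interpreter: modern CPython optimizes the += case when the string is uniquely referenced, so the gap may be small):

```python
import timeit

pieces = ['"//depot/file%d#1"' % i for i in range(1000)]

def concat():
    s = ''
    for p in pieces:
        s += p + ' '
    return s

def joined():
    return ' '.join(pieces) + ' '

# Both build the same string; join does it in a single pass.
t_concat = timeit.timeit(concat, number=200)
t_join = timeit.timeit(joined, number=200)
print('+=: %.4fs  join: %.4fs' % (t_concat, t_join))
```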
Thanks,
Benjamin.
>
> Simon
>
>
[-- Attachment #2: p4-sync-chunks.tar --]
[-- Type: application/x-tar, Size: 10240 bytes --]
Thread overview: 14+ messages
2007-06-08 16:41 git-p4 fails when cloning a p4 depo Benjamin Sergeant
2007-06-08 18:13 ` Benjamin Sergeant
2007-06-08 21:31 ` Scott Lamb
2007-06-08 21:34 ` Scott Lamb
2007-06-08 22:04 ` Benjamin Sergeant
2007-06-08 22:25 ` Benjamin Sergeant
2007-06-08 23:33 ` Han-Wen Nienhuys
2007-06-09 0:32 ` Benjamin Sergeant
2007-06-08 22:38 ` Simon Hausmann
2007-06-12 1:07 ` Han-Wen Nienhuys
2007-06-12 1:08 ` Han-Wen Nienhuys
2007-06-12 1:13 ` Han-Wen Nienhuys
2007-06-17 8:11 ` Simon Hausmann
2007-06-17 16:09 ` Benjamin Sergeant