From: Luke Diamand <luke@diamand.org>
To: Lex Spoon <lex@lexspoon.org>,
git@vger.kernel.org, Pete Wyckoff <pw@padd.com>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3] git-p4: Use -m when running p4 changes
Date: Mon, 20 Apr 2015 10:53:07 +0100 [thread overview]
Message-ID: <5534CC83.2000304@diamand.org> (raw)
In-Reply-To: <1429312285-13552-1-git-send-email-lex@lexspoon.org>
On 18/04/15 00:11, Lex Spoon wrote:
> Simply running "p4 changes" on a large branch can
> result in a "too many rows scanned" error from the
> Perforce server. It is better to use a sequence
> of smaller calls to "p4 changes", using the "-m"
> option to limit the size of each call.
>
> Signed-off-by: Lex Spoon <lex@lexspoon.org>
> Reviewed-by: Junio C Hamano <gitster@pobox.com>
> Reviewed-by: Luke Diamand <luke@diamand.org>
I could be wrong about this, but it looks like importNewBranches() is
taking an extra argument, but that isn't reflected in the place where it
gets called. I think it just got missed.
As a result, t9801-git-p4-branch.sh fails with this error:
Importing revision 3 (37%)
Importing new branch depot/branch1
Traceback (most recent call last):
File "/home/lgd/git/git/git-p4", line 3327, in <module>
main()
File "/home/lgd/git/git/git-p4", line 3321, in main
if not cmd.run(args):
File "/home/lgd/git/git/git-p4", line 3195, in run
if not P4Sync.run(self, depotPaths):
File "/home/lgd/git/git/git-p4", line 3057, in run
self.importChanges(changes)
File "/home/lgd/git/git/git-p4", line 2692, in importChanges
if self.importNewBranch(branch, change - 1):
TypeError: importNewBranch() takes exactly 4 arguments (3 given)
rm: cannot remove `/home/lgd/git/git/t/trash
directory.t9801-git-p4-branch/git/.git/objects/pack': Directory not empty
not ok 8 - import depot, branch detection, branchList branch definition
Thanks!
Luke
> ---
> Updated as suggested:
> - documentation added
> - avoided touch(1)
> - used test_seq
> - used || exit for test commands inside for loops
> - more tabs
> - fewer line breaks
> - expanded commit message
>
> Documentation/git-p4.txt | 17 ++++++++++---
> git-p4.py | 54 +++++++++++++++++++++++++++++++---------
> t/t9818-git-p4-block.sh | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 120 insertions(+), 15 deletions(-)
> create mode 100755 t/t9818-git-p4-block.sh
>
> diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
> index a1664b9..82aa5d6 100644
> --- a/Documentation/git-p4.txt
> +++ b/Documentation/git-p4.txt
> @@ -225,9 +225,20 @@ Git repository:
> they can find the p4 branches in refs/heads.
>
> --max-changes <n>::
> - Limit the number of imported changes to 'n'. Useful to
> - limit the amount of history when using the '@all' p4 revision
> - specifier.
> + Import at most 'n' changes, rather than the entire range of
> + changes included in the given revision specifier. A typical
> + usage would be use '@all' as the revision specifier, but then
> + to use '--max-changes 1000' to import only the last 1000
> + revisions rather than the entire revision history.
> +
> +--changes-block-size <n>::
> + The internal block size to use when converting a revision
> + specifier such as '@all' into a list of specific change
> + numbers. Instead of using a single call to 'p4 changes' to
> + find the full list of changes for the conversion, there are a
> + sequence of calls to 'p4 changes -m', each of which requests
> + one block of changes of the given size. The default block size
> + is 500, which should usually be suitable.
>
> --keep-path::
> The mapping of file names from the p4 depot path to Git, by
> diff --git a/git-p4.py b/git-p4.py
> index 549022e..1fba3aa 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -740,17 +740,43 @@ def createOrUpdateBranchesFromOrigin(localRefPrefix = "refs/remotes/p4/", silent
> def originP4BranchesExist():
> return gitBranchExists("origin") or gitBranchExists("origin/p4") or gitBranchExists("origin/p4/master")
>
> -def p4ChangesForPaths(depotPaths, changeRange):
> +def p4ChangesForPaths(depotPaths, changeRange, block_size):
> assert depotPaths
> - cmd = ['changes']
> - for p in depotPaths:
> - cmd += ["%s...%s" % (p, changeRange)]
> - output = p4_read_pipe_lines(cmd)
> + assert block_size
> +
> + # Parse the change range into start and end
> + if changeRange is None or changeRange == '':
> + changeStart = '@1'
> + changeEnd = '#head'
> + else:
> + parts = changeRange.split(',')
> + assert len(parts) == 2
> + changeStart = parts[0]
> + changeEnd = parts[1]
>
> + # Accumulate change numbers in a dictionary to avoid duplicates
> changes = {}
> - for line in output:
> - changeNum = int(line.split(" ")[1])
> - changes[changeNum] = True
> +
> + for p in depotPaths:
> + # Retrieve changes a block at a time, to prevent running
> + # into a MaxScanRows error from the server.
> + start = changeStart
> + end = changeEnd
> + get_another_block = True
> + while get_another_block:
> + new_changes = []
> + cmd = ['changes']
> + cmd += ['-m', str(block_size)]
> + cmd += ["%s...%s,%s" % (p, start, end)]
> + for line in p4_read_pipe_lines(cmd):
> + changeNum = int(line.split(" ")[1])
> + new_changes.append(changeNum)
> + changes[changeNum] = True
> + if len(new_changes) == block_size:
> + get_another_block = True
> + end = '@' + str(min(new_changes))
> + else:
> + get_another_block = False
>
> changelist = changes.keys()
> changelist.sort()
> @@ -1911,7 +1937,10 @@ class P4Sync(Command, P4UserMap):
> optparse.make_option("--import-labels", dest="importLabels", action="store_true"),
> optparse.make_option("--import-local", dest="importIntoRemotes", action="store_false",
> help="Import into refs/heads/ , not refs/remotes"),
> - optparse.make_option("--max-changes", dest="maxChanges"),
> + optparse.make_option("--max-changes", dest="maxChanges",
> + help="Maximum number of changes to import"),
> + optparse.make_option("--changes-block-size", dest="changes_block_size", type="int",
> + help="Internal block size to use when iteratively calling p4 changes"),
> optparse.make_option("--keep-path", dest="keepRepoPath", action='store_true',
> help="Keep entire BRANCH/DIR/SUBDIR prefix during import"),
> optparse.make_option("--use-client-spec", dest="useClientSpec", action='store_true',
> @@ -1940,6 +1969,7 @@ class P4Sync(Command, P4UserMap):
> self.syncWithOrigin = True
> self.importIntoRemotes = True
> self.maxChanges = ""
> + self.changes_block_size = 500
> self.keepRepoPath = False
> self.depotPaths = None
> self.p4BranchesInGit = []
> @@ -2578,7 +2608,7 @@ class P4Sync(Command, P4UserMap):
>
> return ""
>
> - def importNewBranch(self, branch, maxChange):
> + def importNewBranch(self, branch, maxChange, changes_block_size):
> # make fast-import flush all changes to disk and update the refs using the checkpoint
> # command so that we can try to find the branch parent in the git history
> self.gitStream.write("checkpoint\n\n");
> @@ -2586,7 +2616,7 @@ class P4Sync(Command, P4UserMap):
> branchPrefix = self.depotPaths[0] + branch + "/"
> range = "@1,%s" % maxChange
> #print "prefix" + branchPrefix
> - changes = p4ChangesForPaths([branchPrefix], range)
> + changes = p4ChangesForPaths([branchPrefix], range, changes_block_size)
> if len(changes) <= 0:
> return False
> firstChange = changes[0]
> @@ -3002,7 +3032,7 @@ class P4Sync(Command, P4UserMap):
> if self.verbose:
> print "Getting p4 changes for %s...%s" % (', '.join(self.depotPaths),
> self.changeRange)
> - changes = p4ChangesForPaths(self.depotPaths, self.changeRange)
> + changes = p4ChangesForPaths(self.depotPaths, self.changeRange, self.changes_block_size)
>
> if len(self.maxChanges) > 0:
> changes = changes[:min(int(self.maxChanges), len(changes))]
> diff --git a/t/t9818-git-p4-block.sh b/t/t9818-git-p4-block.sh
> new file mode 100755
> index 0000000..153b20a
> --- /dev/null
> +++ b/t/t9818-git-p4-block.sh
> @@ -0,0 +1,64 @@
> +#!/bin/sh
> +
> +test_description='git p4 fetching changes in multiple blocks'
> +
> +. ./lib-git-p4.sh
> +
> +test_expect_success 'start p4d' '
> + start_p4d
> +'
> +
> +test_expect_success 'Create a repo with ~100 changes' '
> + (
> + cd "$cli" &&
> + >file.txt &&
> + p4 add file.txt &&
> + p4 submit -d "Add file.txt" &&
> + for i in $(test_seq 0 9)
> + do
> + >outer$i.txt &&
> + p4 add outer$i.txt &&
> + p4 submit -d "Adding outer$i.txt" &&
> + for j in $(test_seq 0 9)
> + do
> + p4 edit file.txt &&
> + echo $i$j >file.txt &&
> + p4 submit -d "Commit $i$j" || exit
> + done || exit
> + done
> + )
> +'
> +
> +test_expect_success 'Clone the repo' '
> + git p4 clone --dest="$git" --changes-block-size=10 --verbose //depot@all
> +'
> +
> +test_expect_success 'All files are present' '
> + echo file.txt >expected &&
> + test_write_lines outer0.txt outer1.txt outer2.txt outer3.txt outer4.txt >>expected &&
> + test_write_lines outer5.txt outer6.txt outer7.txt outer8.txt outer9.txt >>expected &&
> + ls "$git" >current &&
> + test_cmp expected current
> +'
> +
> +test_expect_success 'file.txt is correct' '
> + echo 99 >expected &&
> + test_cmp expected "$git/file.txt"
> +'
> +
> +test_expect_success 'Correct number of commits' '
> + (cd "$git" && git log --oneline) >log &&
> + test_line_count = 111 log
> +'
> +
> +test_expect_success 'Previous version of file.txt is correct' '
> + (cd "$git" && git checkout HEAD^^) &&
> + echo 97 >expected &&
> + test_cmp expected "$git/file.txt"
> +'
> +
> +test_expect_success 'kill p4d' '
> + kill_p4d
> +'
> +
> +test_done
>
next prev parent reply other threads:[~2015-04-20 9:53 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CALM2SnY62u3OXJOMSqSfghH_NYwZhzSedm3-wcde-dQCX6eB9Q@mail.gmail.com>
2015-04-14 11:25 ` [PATCH] git-p4: Use -m when running p4 changes Luke Diamand
[not found] ` <CALM2SnY=ZcSMSXk6Ks0uU65gPX5vC8QKG+iSrQxd3X7N=sw+Ww@mail.gmail.com>
2015-04-14 19:48 ` Lex Spoon
2015-04-15 3:47 ` Lex Spoon
2015-04-16 18:56 ` Junio C Hamano
2015-04-16 23:15 ` Luke Diamand
2015-04-17 13:20 ` Lex Spoon
2015-04-17 23:11 ` [PATCH v3] " Lex Spoon
2015-04-20 9:53 ` Luke Diamand [this message]
2015-04-20 14:30 ` Lex Spoon
2015-04-20 15:00 ` [PATCH v4] " Lex Spoon
2015-04-20 15:15 ` Luke Diamand
2015-04-20 15:25 ` Lex Spoon
2015-04-20 19:17 ` Luke Diamand
2015-04-20 17:54 ` Junio C Hamano
2015-04-20 18:04 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5534CC83.2000304@diamand.org \
--to=luke@diamand.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=lex@lexspoon.org \
--cc=pw@padd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).