From: Michael J Gruber <git@drmicha.warpmail.net>
To: Git Mailing List <git@vger.kernel.org>
Subject: [RFH] git cherry vs. git rev-list --cherry, or: Why does "..." suck?
Date: Tue, 22 Mar 2011 13:07:53 +0100
Message-ID: <4D889119.3020009@drmicha.warpmail.net>
In the process of converting "git cherry" and "git format-patch" to use
the new rev-list options (the saner way according to d7a17ca (git-log
--cherry-pick A...B, 2007-04-09) already!), I have a simple question and
a hard one, and I would like help with both:
run_command
===========
I could use either run_command_v_opt(args, RUN_GIT_CMD) or set up the
walker, call it etc. For the former I have to check how to treat the
third argument to "git cherry"; the latter seems to be more code (and I
would need to call the rev-list/log output loop somehow).
Is there a general preference for using or avoiding run_command?
(There's also the question of what details of git cherry's output I need
to preserve.)
Performance
===========
I don't get this:
git cherry A B: 0.4s
git rev-list --cherry A...B: 1.7s
(more details below)
This makes "rev-list --cherry" almost unacceptable as a replacement. But
I'd like to understand this difference (and maybe do something about
it). I'm lost with gprof, but here are more details on the timing:
A is pu at 0f169fc
B is next at 5ddab49 plus three commits which are not upstream
rev-list --count 5ddab49..A is 166 (117 without merges), for B it is 3
Now the timings (rev-list done with --count):
cherry A B: 0.4s
cherry B A: 0.4s
rev-list --cherry A...B: 1.7s
The latter computes merge bases (there are 25), the former does not. How
much is it:
merge-base A B: 0.95s
merge-base --all A B: 0.95s
rev-parse A...B: 0.95s
So this accounts for much of the difference (and we need to do something
about get_merge_bases()), but not all. How much is the patch-id computation:
rev-list --no-merges --right-only --cherry-pick A...B: 1.7s
(the above is --cherry)
rev-list --no-merges --right-only A...B: 1.0s
rev-list --no-merges --left-right A...B: 1.0s
Why does it take rev-list 0.7s to do the same patch-id computations that
cherry does in less than 0.4s? (More details on what they do below.)
rev-list --no-merges A..B: 0.04s (counting to 3, yeah)
rev-list --no-merges B..A: 0.6s (counting to 117)
The latter involves neither patch-id nor merge-base computation. Should
this really take 0.6s?
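For anyone who wants to reproduce the comparison, here is a minimal Python
sketch of a timing harness. The helper names (time_command, commands_for,
compare) are made up for illustration; the git invocations are just the ones
quoted above, and A and B are whatever revisions you pass in.

```python
import subprocess
import time


def time_command(argv, repeat=3):
    """Return the best wall-clock time (seconds) over `repeat` runs of argv."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        best = min(best, time.perf_counter() - start)
    return best


def commands_for(a, b):
    """Build the invocations compared in this message for revisions a and b."""
    sym = f"{a}...{b}"
    return [
        ["git", "cherry", a, b],
        ["git", "rev-list", "--count", "--cherry", sym],
        ["git", "merge-base", "--all", a, b],
        ["git", "rev-list", "--count", "--no-merges", "--right-only", sym],
        ["git", "rev-list", "--count", "--no-merges", f"{a}..{b}"],
        ["git", "rev-list", "--count", "--no-merges", f"{b}..{a}"],
    ]


def compare(commands, repeat=3):
    """Map each command line to its best observed time."""
    return {" ".join(argv): time_command(argv, repeat) for argv in commands}
```

Run inside a clone, something like compare(commands_for("pu", "next")) would
give one number per command. Taking the best of a few runs is only a crude
way to reduce cache noise, and the numbers will of course depend on the
repository and machine.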
I'm stumped. Help, please!
Michael
What the commands roughly do:
cherry A B [limit]:
===================
add pending B ^A
walk B..A (on temp rev_info) and
add_commit_patch_id() on these
clear_commit_marks()
add pending ^limit if specified
walk A..B and
reverse that list and
has_commit_patch_id() on these
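To make the steps above concrete, here is a toy Python model of them. It is
a sketch under loud assumptions, not git's code: commits are plain dict
entries, the "patch_id" field stands in for the real diff hash, merges and
the optional limit are ignored, and the function names are hypothetical.

```python
def ancestors(graph, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph[commit]["parents"])
    return seen


def rev_range(graph, exclude, include):
    """Model of exclude..include: reachable from include but not from exclude."""
    return ancestors(graph, include) - ancestors(graph, exclude)


def cherry(graph, upstream, head):
    """Mark commits in upstream..head: '-' if upstream has the patch, else '+'."""
    # First walk: collect patch ids of commits only on the upstream side.
    upstream_ids = {graph[c]["patch_id"]
                    for c in rev_range(graph, head, upstream)}
    # Second walk: look up each head-only commit in that set.
    # sorted() is used only to get a deterministic order in this toy model.
    result = []
    for c in sorted(rev_range(graph, upstream, head)):
        sign = "-" if graph[c]["patch_id"] in upstream_ids else "+"
        result.append((sign, c))
    return result


# Hypothetical history: u1, u2 on upstream; h1, h2 on head; h1 duplicates u2.
graph = {
    "base": {"parents": [], "patch_id": "p0"},
    "u1": {"parents": ["base"], "patch_id": "p1"},
    "u2": {"parents": ["u1"], "patch_id": "p2"},
    "h1": {"parents": ["base"], "patch_id": "p2"},
    "h2": {"parents": ["h1"], "patch_id": "p3"},
}

print(cherry(graph, "u2", "h2"))  # [('-', 'h1'), ('+', 'h2')]
```

The point of the model is only that each side is enumerated once and each
commit gets one patch-id computation and one set lookup.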
rev-list --cherry A...B:
========================
get_merge_bases for A,B
add pending --not merge bases
add pending A B
add_commit_patch_id() on smaller side
has_commit_patch_id() on other side (&& mark id seen)
recheck smaller side (based on id->seen)
This seems to enumerate A..B and B..A more often, but is iterating
through a commit list that time consuming? The number of patch-id
computations is the same as far as I can see.
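For comparison, the rev-list side can be modelled the same way. Again this
is only a sketch with hypothetical names: it uses full ancestor sets instead
of computing merge bases (which should select the same commits in this toy
setting), takes "patch_id" as given, and filters the way --cherry-pick is
described above.

```python
def ancestors(graph, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph[commit]["parents"])
    return seen


def symmetric_difference(graph, a, b):
    """Model of a...b: (commits only in a's history, commits only in b's)."""
    in_a, in_b = ancestors(graph, a), ancestors(graph, b)
    return in_a - in_b, in_b - in_a


def cherry_pick_filter(graph, a, b):
    """Drop commits whose patch id appears on both sides of a...b."""
    left, right = symmetric_difference(graph, a, b)
    left_is_small = len(left) <= len(right)
    small, large = (left, right) if left_is_small else (right, left)
    # Index the smaller side by patch id, with a "seen" flag per id.
    seen = {graph[c]["patch_id"]: False for c in small}
    kept_large = set()
    for c in large:
        pid = graph[c]["patch_id"]
        if pid in seen:
            seen[pid] = True      # duplicate: drop it and remember the id
        else:
            kept_large.add(c)
    # Recheck the smaller side based on the flags set above.
    kept_small = {c for c in small if not seen[graph[c]["patch_id"]]}
    return (kept_small, kept_large) if left_is_small else (kept_large, kept_small)


# Hypothetical history: u1, u2 on side A; h1, h2 on side B; h1 duplicates u2.
graph = {
    "base": {"parents": [], "patch_id": "p0"},
    "u1": {"parents": ["base"], "patch_id": "p1"},
    "u2": {"parents": ["u1"], "patch_id": "p2"},
    "h1": {"parents": ["base"], "patch_id": "p2"},
    "h2": {"parents": ["h1"], "patch_id": "p3"},
}

left_kept, right_kept = cherry_pick_filter(graph, "u2", "h2")
print(sorted(right_kept))  # ['h2'], the --right-only part of the result
```

In this model the smaller side is iterated twice and the larger side once,
which matches the impression that the extra cost over "cherry" is list
traversal rather than additional patch-id computations.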