git.vger.kernel.org archive mirror
* [RFH] git cherry vs. git rev-list --cherry, or: Why does "..." suck?
@ 2011-03-22 12:07 Michael J Gruber
  2011-03-23 16:46 ` Michael J Gruber
  2011-03-23 17:19 ` Jeff King
  0 siblings, 2 replies; 5+ messages in thread
From: Michael J Gruber @ 2011-03-22 12:07 UTC
  To: Git Mailing List

In the process of converting "git cherry" and "git format-patch" to use
the new rev-list options (the saner way, according to d7a17ca (git-log
--cherry-pick A...B, 2007-04-09) already!), I have a simple question and
a hard one, and I'd like help with both:

run_command
===========

I could either use run_command_v_opt(args, RUN_GIT_CMD) or set up the
walker and call it myself. For the former I have to check how to treat
the third argument to "git cherry"; the latter seems to be more code
(and I would need to call the rev-list/log output loop somehow).

Is there a general preference for using or avoiding run_command?

(There's also the question of what details of git cherry's output I need
to preserve.)


Performance
===========

I don't get this:

git cherry A B: 0.4s
git rev-list --cherry A...B: 1.7s
(more details below)

This makes "rev-list --cherry" almost unacceptable as a replacement. But
I'd like to understand this difference (and maybe do something about
it). I'm lost with gprof, but here are more details on the timing:

A is pu at 0f169fc
B is next at 5ddab49 plus three commits which are not upstream

rev-list --count 5ddab49..A is 166 (117 without merges); for 5ddab49..B it is 3

Now the timings (rev-list done with --count):

cherry A B: 0.4s
cherry B A: 0.4s
rev-list --cherry A...B: 1.7s

The latter computes merge bases (there are 25); the former two do not.
How much does that cost?

merge-base A B: 0.95s
merge-base --all A B: 0.95s
rev-parse A...B: 0.95s

So this accounts for much of the difference (and we need to do something
about get_merge_bases()), but not all of it. How much does the patch-id
computation cost?

rev-list --no-merges --right-only --cherry-pick A...B: 1.7s
(the above is --cherry)
rev-list --no-merges --right-only A...B: 1.0s
rev-list --no-merges --left-right A...B: 1.0s

Why does it take rev-list 0.7s to do the same patch-id computations that
cherry does in less than 0.4s? (More details on what they do below.)

rev-list --no-merges A..B: 0.04s (counting to 3, yeah)
rev-list --no-merges B..A: 0.6s (counting to 117)

The latter involves neither patch-id nor merge-base computation. Should
this really take 0.6s?
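
For reference, here is how the range notations in the timings above relate
to each other, as a minimal Python sketch on a made-up five-commit graph.
The graph, the commit names and the helper functions are all assumptions
for illustration; this is not how git implements the walk, it only shows
that A...B is the union of A..B and B..A and that it needs the merge
bases, which the two-dot forms do not.

```python
# Toy commit graph: each commit maps to its list of parents.
# Graph and names are invented for illustration only.
PARENTS = {
    "root": [],
    "m": ["root"],   # common ancestor of both tips
    "a1": ["m"],
    "a2": ["a1"],    # tip A
    "b1": ["m"],     # tip B
}

def ancestors(tip):
    """Return the set of commits reachable from tip, including tip."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(PARENTS[c])
    return seen

def two_dot(exclude, include):
    """exclude..include: reachable from include but not from exclude."""
    return ancestors(include) - ancestors(exclude)

def three_dot(a, b):
    """a...b: reachable from exactly one of the two tips."""
    return ancestors(a) ^ ancestors(b)

def merge_bases(a, b):
    """Common ancestors that are not proper ancestors of another one."""
    common = ancestors(a) & ancestors(b)
    return {c for c in common
            if not any(c != d and c in ancestors(d) for d in common)}

A, B = "a2", "b1"
print(sorted(two_dot(A, B)))    # commits only in B
print(sorted(two_dot(B, A)))    # commits only in A
print(sorted(three_dot(A, B)))  # union of the two
print(sorted(merge_bases(A, B)))
```

In this toy model the symmetric difference can be computed without the
merge bases at all; the sketch computes them separately only to show what
the extra step asks for.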

I'm stumped. Help, please!

Michael

What the commands roughly do:

cherry A B [limit]:
===================
add pending B ^A
walk B..A (on temp rev_info) and
add_commit_patch_id() on these
clear_commit_marks()
add pending ^limit if specified
walk A..B and
reverse that list and
has_commit_patch_id() on these
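
The steps above can be sketched on toy data as follows. This is an
illustrative Python model, not git's code: the commit graph is made up,
the patch id is just a string stored with each commit (a real patch id is
a hash of the diff), and walk() and cherry() are hypothetical helpers.
The limit is applied as a filter after the walk, which gives the same
result here as adding ^limit to the walk.

```python
# Toy data: commit -> (parents, patch_id). All names are invented.
COMMITS = {
    "m":  ([], "p0"),
    "a1": (["m"], "fix-typo"),
    "a2": (["a1"], "add-feature"),  # upstream tip A
    "b1": (["m"], "fix-typo"),      # same change as a1
    "b2": (["b1"], "new-work"),     # head tip B
}

def ancestors(tip):
    """Set of commits reachable from tip, including tip."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(COMMITS[c][0])
    return seen

def walk(exclude, include):
    """Commits in exclude..include in visit order, merges skipped."""
    hidden = ancestors(exclude)
    out, seen, stack = [], set(), [include]
    while stack:
        c = stack.pop()
        if c in seen or c in hidden:
            continue
        seen.add(c)
        parents = COMMITS[c][0]
        if len(parents) <= 1:  # ignore merge commits
            out.append(c)
        stack.extend(parents)
    return out

def cherry(upstream, head, limit=None):
    # Collect the patch ids of everything in head..upstream.
    upstream_ids = {COMMITS[c][1] for c in walk(head, upstream)}
    # Walk upstream..head, dropping anything reachable from limit.
    hidden = ancestors(limit) if limit else set()
    mine = [c for c in walk(upstream, head) if c not in hidden]
    # Oldest first: "-" if upstream has an equivalent patch, else "+".
    return [("-" if COMMITS[c][1] in upstream_ids else "+", c)
            for c in reversed(mine)]

for sign, commit in cherry("a2", "b2"):
    print(sign, commit)
```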

rev-list --cherry A...B:
========================
get_merge_bases for A,B
add pending --not merge bases
add pending A B
add_commit_patch_id() on smaller side
has_commit_patch_id() on other side (&& mark id seen)
recheck smaller side (based on id->seen)
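
A corresponding toy sketch of the steps above, again in Python and again
under loud assumptions: the graph and patch-id strings are invented,
reachable(), merge_bases() and cherry_mark() are hypothetical helpers, and
the output imitates --cherry (right side only, "=" for equivalent
commits, "+" for the rest). It is meant to show the order of the steps,
not git's actual data structures.

```python
# Toy data: commit -> (parents, patch_id). All names are invented.
COMMITS = {
    "m":  ([], "p0"),
    "a1": (["m"], "fix-typo"),
    "a2": (["a1"], "add-feature"),  # left tip A
    "b1": (["m"], "fix-typo"),      # same change as a1
    "b2": (["b1"], "new-work"),     # right tip B
}

def reachable(tip):
    """Commits reachable from tip, in parent-following visit order."""
    order, seen, stack = [], set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            order.append(c)
            stack.extend(COMMITS[c][0])
    return order

def merge_bases(a, b):
    """Common ancestors that are not proper ancestors of another one."""
    common = set(reachable(a)) & set(reachable(b))
    return {c for c in common
            if not any(c != d and c in reachable(d) for d in common)}

def cherry_mark(left, right):
    # Step 1: hide everything reachable from the merge bases.
    hidden = set()
    for base in merge_bases(left, right):
        hidden.update(reachable(base))

    def side(tip):
        # Non-merge commits on one side of left...right.
        return [c for c in reachable(tip)
                if c not in hidden and len(COMMITS[c][0]) <= 1]

    left_side, right_side = side(left), side(right)
    # Step 2: index the patch ids of the smaller side.
    if len(left_side) <= len(right_side):
        small, large = left_side, right_side
    else:
        small, large = right_side, left_side
    seen_flag = {COMMITS[c][1]: False for c in small}
    # Step 3: look up the other side and remember which ids were hit.
    same = set()
    for c in large:
        pid = COMMITS[c][1]
        if pid in seen_flag:
            seen_flag[pid] = True
            same.add(c)
    # Step 4: recheck the smaller side based on the "seen" flags.
    same.update(c for c in small if seen_flag[COMMITS[c][1]])
    # Report the right side only, as --cherry does.
    return [("=" if c in same else "+", c) for c in right_side]

for mark, commit in cherry_mark("a2", "b2"):
    print(mark, commit)
```

Both sketches compute one patch id per non-merge commit; the second one
additionally needs the merge bases and passes over the smaller side twice.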

This seems to enumerate A..B and B..A more often, but is iterating
through a commit list really that time-consuming? The number of patch-id
computations is the same as far as I can see.


