From: Michael J Gruber <git@drmicha.warpmail.net>
To: Git Mailing List <git@vger.kernel.org>
Subject: [RFH] git cherry vs. git rev-list --cherry, or: Why does "..." suck?
Date: Tue, 22 Mar 2011 13:07:53 +0100
Message-ID: <4D889119.3020009@drmicha.warpmail.net>

In the process of converting "git cherry" and "git format-patch" to use
the new rev-list options (the saner way according to d7a17ca (git-log
--cherry-pick A...B, 2007-04-09) already!), I have a simple question and
a hard one, and I am asking for help with both:

run_command
===========

I could either use run_command_v_opt(args, RUN_GIT_CMD) or set up the
walker, call it, etc. For the former I have to check how to treat the
third argument to "git cherry"; the latter seems to be more code (and I
would need to call the rev-list/log output loop somehow).

Is there a general preference for using or avoiding run_command?

(There's also the question of what details of git cherry's output I need
to preserve.)


Performance
===========

I don't get this:

git cherry A B: 0.4s
git rev-list --cherry A...B: 1.7s
(more details below)

This makes "rev-list --cherry" almost unacceptable as a replacement. But
I'd like to understand this difference (and maybe do something about
it). I'm lost with gprof, but here are more details on the timing:

A is pu at 0f169fc
B is next at 5ddab49 plus three commits which are not upstream

rev-list --count 5ddab49..A is 166 (117 without merges), for B it is 3

Now the timings (rev-list done with --count):

cherry A B: 0.4s
cherry B A: 0.4s
rev-list --cherry A...B: 1.7s

The latter computes merge bases (there are 25), the former does not. How
much is it:

merge-base A B: 0.95s
merge-base --all A B: 0.95s
rev-parse A...B: 0.95s

So this accounts for much of the difference (and we need to do something
about get_merge_bases()), but not all. How much is the patch-id computation:

rev-list --no-merges --right-only --cherry-pick A...B: 1.7s
(the above is --cherry)
rev-list --no-merges --right-only A...B: 1.0s
rev-list --no-merges --left-right A...B: 1.0s

Why does it take rev-list 0.7s to do the same patch-id computations that
cherry does in less than 0.4s? (More details on what they do below.)

rev-list --no-merges A..B: 0.04s (counting to 3, yeah)
rev-list --no-merges B..A: 0.6s (counting to 117)

The latter involves neither patch-id nor merge-base computation. Should
this really take 0.6s?
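
For anyone who wants to reproduce timings like the ones above, here is a minimal sketch of a harness. It is not how the numbers in this message were taken; the helper names time_command and compare are made up for illustration, and it takes the best of a few runs to reduce cache noise.

```python
import subprocess
import sys
import time


def time_command(argv, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` runs of argv."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        # Discard output so that terminal I/O does not distort the timing.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        best = min(best, time.perf_counter() - start)
    return best


def compare(commands, repeat=3):
    """Time each argv in `commands`; return (argv, seconds) pairs, slowest first."""
    results = [(argv, time_command(argv, repeat)) for argv in commands]
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Inside a git repository, one would call compare() with argument lists such as ["git", "cherry", "A", "B"] and ["git", "rev-list", "--count", "--cherry", "A...B"], where A and B stand for the two revisions being compared.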

I'm stumped. Help, please!

Michael

What the commands roughly do:

cherry A B [limit]:
===================
add pending B ^A
walk B..A (on temp rev_info) and
add_commit_patch_id() on these
clear_commit_marks()
add pending ^limit if specified
walk A..B and
reverse that list and
has_commit_patch_id() on these

rev-list --cherry A...B:
========================
get_merge_bases for A,B
add pending --not merge bases
add pending A B
add_commit_patch_id() on smaller side
has_commit_patch_id() on other side (&& mark id seen)
recheck smaller side (based on id->seen)
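
The first three steps of the outline above (merge bases, then "A B --not bases") can be modelled on a toy graph. This is only a sketch in Python over a dict of parent lists, not git's revision walker; the function names and the example graph are invented for illustration. It shows that the set selected this way is the union of A..B and B..A.

```python
def reachable(parents, tips):
    """Return the set of commits reachable from `tips` in the toy DAG."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not proper ancestors
    of any other common ancestor."""
    common = reachable(parents, [a]) & reachable(parents, [b])
    return {c for c in common
            if not any(c in reachable(parents, parents.get(d, ()))
                       for d in common)}


def symmetric_difference(parents, a, b):
    """Model of A...B as: A B --not <merge bases of A and B>."""
    hidden = reachable(parents, merge_bases(parents, a, b))
    return reachable(parents, [a, b]) - hidden


def two_dot(parents, a, b):
    """Model of A..B: commits reachable from b but not from a."""
    return reachable(parents, [b]) - reachable(parents, [a])


# Hypothetical example graph: two short branches forking from "base".
example = {"root": [], "base": ["root"],
           "a1": ["base"], "a2": ["a1"], "b1": ["base"]}
both_sides = symmetric_difference(example, "a2", "b1")
```

In this model the merge-base step is pure overhead compared with walking A..B and B..A separately, which matches the timing split observed earlier.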

This seems to enumerate A..B and B..A more often, but is iterating
through a commit list that time consuming? The number of patch-id
computations is the same as far as I can see.
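
To make the "same number of patch-id computations" claim concrete, here is a toy model of the two strategies as outlined above. It is a sketch under assumptions: commits are plain strings, patch_id is a caller-supplied function standing in for the real diff hashing, and none of this is git's actual code.

```python
def cherry_style(left, right, patch_id):
    """Model of 'cherry': index every left commit, then probe each right one.

    Returns (right commits without an equivalent on the left,
             number of patch-id computations).
    """
    computed = 0
    ids = set()
    for commit in left:              # add_commit_patch_id() on B..A
        ids.add(patch_id(commit))
        computed += 1
    unmatched = []
    for commit in right:             # has_commit_patch_id() on A..B
        computed += 1
        if patch_id(commit) not in ids:
            unmatched.append(commit)
    return unmatched, computed


def rev_list_cherry_style(left, right, patch_id):
    """Model of 'rev-list --cherry': index the smaller side, probe the
    larger side while marking ids as seen, then recheck the smaller side."""
    computed = 0
    swapped = len(right) < len(left)
    small, large = (right, left) if swapped else (left, right)
    seen = {}                        # patch-id -> matched by the other side?
    id_of = {}
    for commit in small:
        pid = patch_id(commit)
        computed += 1
        id_of[commit] = pid
        seen.setdefault(pid, False)
    large_unmatched = []
    for commit in large:
        pid = patch_id(commit)
        computed += 1
        if pid in seen:
            seen[pid] = True
        else:
            large_unmatched.append(commit)
    # The recheck uses only the stored marks; no new patch-ids are computed.
    small_unmatched = [c for c in small if not seen[id_of[c]]]
    return (small_unmatched if swapped else large_unmatched), computed
```

In this model both functions compute exactly len(left) + len(right) patch-ids and report the same right-side commits, so any timing difference would have to come from somewhere other than the count of patch-id computations.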
