From: Michael J Gruber <git@drmicha.warpmail.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [RFH] git cherry vs. git rev-list --cherry, or: Why does "..." suck?
Date: Thu, 24 Mar 2011 08:38:48 +0100 [thread overview]
Message-ID: <4D8AF508.7070709@drmicha.warpmail.net> (raw)
In-Reply-To: <7vfwqdem0p.fsf@alter.siamese.dyndns.org>
Junio C Hamano venit, vidit, dixit 23.03.2011 19:20:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
>
>> Adding some recent insight:
>>
>> Michael J Gruber venit, vidit, dixit 22.03.2011 13:07:
>>> Performance
>>> ===========
>>>
>>> I don't get this:
>>>
>>> git cherry A B: 0.4s
>>> git rev-list --cherry A...B: 1.7s
>>> (more details below)
>>
>> I can get the latter down to 0.95s and this
>>
>>> merge-base A B: 0.95s
>>> merge-base --all A B: 0.95s
>>> rev-parse A...B: 0.95s
>>
>> to 0.16s each. The downside is that merge-base may give a few
>> unneccessary candidates (commits which are ancestors of other commits it
>> returns), but this does not change the results for rev-list, of course.
>>
>> I get this dramatic speedup by removing the check for duplicates from
>> get_merge_bases_many() in commit.c. After a first merge_bases_many()
>> run, returning N commits, that check calls merge_bases_many() again for
>> each pair (N choose 2) to check whether one is contained in the other.
>> Quite a bottleneck. Removing it works great. But can we live with a few
>> additional merge bases?
>
> When we run merge-base as the top-level command (this includes
> reduce_heads() that is used by "git merge"), we have to cull unnecessary
> phantom bases that can be reached by other bases, so you are not allowed
> to make such a change unconditionally.
>
> Passing down a parameter from a caller that is prepared to handle phantom
> merge bases correctly is probably the right approach. Existing callers
> can make "safer" calls for now; you can later examine them and turn them
> into "faster" calls if they operate correctly given a result that contain
> phantom bases.
Yes, I was thinking of having thorough vs. fast mode, but I'll dig more
into merge_bases_many().
My current impression is that those phantom merge bases appear only
(mainly?) when there are severe date ordering problems in the dag.
merge_bases_many() uses commit_list_insert_by_date(), and given the way
it walks, later merge bases which are ancestors of another one get
marked STALE automatically.
For callers which are interested in one base only, the check makes a
difference only if there are date ordering problems.
In my A B example, merge_bases_many() comes up with the correct 25 ones
during its first run, and then gets called 300 times again (25*24/2) to
check each pair of them, without any reduction. (Clearly, it's a non
issue in the case of unique merge base.)
What I'm mainly interested in is the A...B case. And for a revision walk
with one A...B range and revs->limited, one could even embed the whole
mb-logic into the walk! But I don't see how to do this for multiple
symmetric ranges.
Michael
next prev parent reply other threads:[~2011-03-24 7:42 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-22 12:07 [RFH] git cherry vs. git rev-list --cherry, or: Why does "..." suck? Michael J Gruber
2011-03-23 16:46 ` Michael J Gruber
2011-03-23 18:20 ` Junio C Hamano
2011-03-24 7:38 ` Michael J Gruber [this message]
2011-03-23 17:19 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4D8AF508.7070709@drmicha.warpmail.net \
--to=git@drmicha.warpmail.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).