From: Junio C Hamano <gitster@pobox.com>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] bisect: loosen halfway() check for a large number of commits
Date: Thu, 22 Oct 2020 10:18:46 -0700 [thread overview]
Message-ID: <xmqqv9f2mb61.fsf@gitster.c.googlers.com> (raw)
In-Reply-To: <20201022103806.26680-1-szeder.dev@gmail.com> ("SZEDER Gábor"'s message of "Thu, 22 Oct 2020 12:38:06 +0200")
SZEDER Gábor <szeder.dev@gmail.com> writes:
> However, when we have thousands of commits it's not all that important
> to find the _exact_ halfway point, a few commits more or less doesn't
> make any real difference for the bisection.
Cute idea.
> So I ran some tests to see how often that happens: picked random good
> and bad starting revisions at least 50k commits apart and a random
> first bad commit in between in git.git, and used 'git bisect run git
> merge-base --is-ancestor HEAD $first_bad_commit' to check the number
> of necessary bisection steps. After repeating all this 1000 times
> both with and without this patch I found that:
>
> - 146 cases needed one more bisection step than before, 149 cases
> needed one less step, while in the remaining 705 cases the number
> of steps didn't change. So the number of bisection steps does
> indeed change in a non-negligible number of cases, but it seems
> that the average number of steps doesn't change in the long run.
It somehow is a bit surprising that there are cases that need fewer
steps, but I guess that is how rounding-error cuts both ways?
> - The first 'git bisect start' command got over 3x faster in 456
> cases, so this "no commit at the exact halfway point" case seems
> to be common enough to care about.
In any case, I like the re-realization that the counting reachable
commits in a mergy history is costly (see comments before the
count_distance() function that we already knew it from the
beginning, though), and the general idea of speeding up the entire
thing by avoiding the cost we need to pay in the count_distance()
function, which my earlier 1c4fea3a (git-rev-list --bisect:
optimization, 2007-03-21) also did.
Side note. I've been waiting for all these years to see
somebody new comes up and makes a fundamental change to
count_distance() such that it no longer is costly---alas,
that hasn't happened yet.
Mildly (only because such a bisection session over a long span is
rarer) excited to see this RFC completed ;-)
next prev parent reply other threads:[~2020-10-22 17:18 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-22 10:38 [PATCH] bisect: loosen halfway() check for a large number of commits SZEDER Gábor
2020-10-22 10:40 ` SZEDER Gábor
2020-10-22 17:18 ` Junio C Hamano [this message]
2020-10-24 7:41 ` Christian Couder
2020-10-25 18:01 ` SZEDER Gábor
2020-11-12 16:19 ` [PATCH v2] " SZEDER Gábor
2020-11-12 18:23 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqv9f2mb61.fsf@gitster.c.googlers.com \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=szeder.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).