From: Ingo Molnar <mingo@elte.hu>
To: david@lang.hm
Cc: Christian Couder <chriscool@tuxfamily.org>,
git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
Andreas Ericsson <ae@op5.se>, Jeff Garzik <jeff@garzik.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Bill Lear <rael@zopyra.com>, Jon Seymour <jon.seymour@gmail.com>,
Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: Article about "git bisect run" on LWN
Date: Fri, 6 Feb 2009 02:46:55 +0100
Message-ID: <20090206014655.GA26807@elte.hu>
In-Reply-To: <alpine.DEB.1.10.0902051838180.5340@asgard.lang.hm>
* david@lang.hm <david@lang.hm> wrote:
> On Thu, 5 Feb 2009, Ingo Molnar wrote:
>
>> * Christian Couder <chriscool@tuxfamily.org> wrote:
>>
>>> Hi,
>>>
>>> For information, an article from me, 'Fully automated bisecting with "git
>>> bisect run"' has been published in today's edition of LWN on the
>>> development page:
>>>
>>> http://lwn.net/Articles/317154/
>>
>> Nice article!
>>
>> In terms of possible future enhancements of git bisect, here's a couple of
>> random ideas that would help my auto-bisection efforts:
>>
>> - Feature: support "Bisection Redundancy"
>>
>> This feature helps developers realize if a bug is sporadic. This happens
>> quite often in the kernel space: a bug looks deterministic, but down the
>> line it becomes sporadic. Sometimes a boot crash only occurs with a 75%
>> probability - and if one is unlucky it can cause a _lot_ of wasted
>> bisection time. The wrong commit gets blamed and the wrong set of
>> developers start scratching their heads. It's a recurring theme on lkml.
>>
>> What git could do here is to allow testers to inject a bit of extra
>> "redundancy" automatically, and use the redundant test-points to detect
>> conflicts in good/bad constraints.
>>
>> It would work like this:
>>
>> git bisect start --redundancy=33%
>>
>> It would mean that for every third bisection point, Git would
>> _not_ choose the ideal (estimated) 'middle point' from the set of "unknown
>> quality" changes that are still outstanding - but would intentionally
>> "veer outside" and select one commit from the _known_ set of commits.
>>
>> If such a redundant re-test of the known-good or known-bad set yields a
>> nonsensical result then Git aborts the bisection with a "logic
>> inconsistency detected" kind of message - and people could at this point
>> realize the non-determinism of the test.
>>
>> ( Git can do this when a "redundant" test point is marked as 'bad' -
>> despite an earlier bisection already categorizing that test point as
>> 'good' - or if it's the other way around. Git will only continue with
>> the bisection if the test point has the expected quality. )
>>
>> This essentially means an automated re-test - but it's much better than
>> just a repeated bisection - i've often met non-deterministic bugs that
>> yield the _exact same_ nonsensical commit even on repeat bisections. That
>> happens when a timing bug depends on the exact kernel layout, or a
>> miscompilation or linker bug depends on the exact kernel layout, etc.
>>
>> It's also faster than a re-done bisection: 33% more testpoints is better
>> than twice as many test-points. Also, auto-bisection can deal with
>> redundancy just fine - it does not really matter whether i have to wait
>> 20 or 30 minutes for a test result since there's no manual intervention
>> needed - but it _very_ much matters whether i can trust the validity of
>> the bisection result.
>
> when you gave this the title of redundancy and described the problem I
> assumed that you would then propose running the test multiple times (so
> "git bisect run X --redundancy 5" would run each test 5 times, and it would
> pass iff it passed the test all 5 times). that would seem to be a better
> match for the name, as well as being a better test.
Yeah, but using 100%, 200%, 300%, etc. redundancy is a bit wasteful and not
granular enough for my purposes.
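
For comparison, the repeat-every-test approach from the quoted reply needs no new Git option: it can be approximated today with a wrapper passed to "git bisect run". What follows is a minimal sketch, not an existing Git feature. The function name run_repeated and the file name repeat_test.py are hypothetical; the only Git behaviour relied on is the documented "git bisect run" exit-code convention (0 means good, 125 means skip, other codes from 1 to 127 mean bad).

```python
import subprocess
import sys


def run_repeated(cmd, repeats=5):
    """Run cmd up to `repeats` times; report good only if every run passes.

    Returns an exit code suitable for "git bisect run":
    0 (good), 125 (skip this commit) or 1 (bad).
    """
    for _ in range(repeats):
        code = subprocess.run(cmd).returncode
        if code == 125:
            return 125  # the test asked Git to skip this commit
        if code != 0:
            return 1    # any failure (or crash) marks the commit bad
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: git bisect run python3 repeat_test.py ./boot-test.sh
    sys.exit(run_repeated(sys.argv[1:]))
```

Here ./boot-test.sh stands for whatever test script is already in use. Every bisection step now costs up to five test runs, which is the cost being weighed in this message.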
Here's the math:
A typical kernel bisection takes 15 test steps. 30% redundancy means that
it takes only about 30% longer, and for that we get +5 tests (30% of 15 is
4.5, rounded up). Five extra test points are usually enough to establish
whether a test method shows sporadic tendencies or not, with an ~90%
confidence factor.
Repeating the test 5 times would bring a 15-step kernel bisection from 30
minutes [it's about 60 seconds to build a kernel, 60 seconds to boot it] to
about 2.5 hours - that's very long. The confidence factor only goes from
~90% to 99% - that extra 9% is not worth the cost.
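
The timing figures above can be checked with a few lines of arithmetic. This is only a sketch: the helper names are made up for illustration, and the confidence percentages are not modelled because the message does not say how they were derived.

```python
MINUTES_PER_TEST = 2   # about 60 s to build a kernel plus 60 s to boot it
STEPS = 15             # typical kernel bisection length


def extra_tests(steps, redundancy_percent):
    """Number of redundant re-tests, rounded up (30% of 15 -> 5)."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-steps * redundancy_percent // 100)


def total_minutes(steps, repeats=1, redundancy_percent=0):
    """Wall-clock minutes for a bisection under the given scheme."""
    tests = steps * repeats + extra_tests(steps, redundancy_percent)
    return tests * MINUTES_PER_TEST


print(total_minutes(STEPS))                         # 30  (plain bisection)
print(total_minutes(STEPS, redundancy_percent=30))  # 40  (5 extra tests)
print(total_minutes(STEPS, repeats=5))              # 150 (2.5 hours)
```

Because 4.5 extra tests round up to 5, the redundant run comes out at 40 minutes, slightly more than 30% over the 30-minute baseline.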
The idea would be to insert 30% redundancy into my bisections automatically -
so that i could trust _all_ bisections more - not just the ones i suspect to
be non-deterministic. Hence the suggestion to enable lower levels of
redundancy like 30%. (but even 10% or 20% might be enough to weed out the
most obvious cases)
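
To make the proposed behaviour concrete, here is a toy model of it in Python. It is a sketch under stated assumptions, not how Git works or would implement this: history is a linear list of commits numbered 0..n-1, commit 0 is known good and the last one known bad, and every third step re-tests one of the current known-good or known-bad boundary commits instead of narrowing the range. All names (bisect_with_redundancy, InconsistentBisection, the `every` parameter) are hypothetical.

```python
class InconsistentBisection(Exception):
    """A re-tested commit contradicted its earlier good/bad verdict."""


def bisect_with_redundancy(n_commits, is_bad, every=3):
    """Return the first bad commit in a linear history 0..n_commits-1.

    is_bad(i) runs the test on commit i and returns True if it fails.
    Every `every`-th step is a redundant re-test of a known commit.
    """
    good, bad = 0, n_commits - 1   # current known-good / known-bad commits
    step = 0
    while bad - good > 1:
        step += 1
        if step % every == 0:
            # Redundant step: alternate between the known-bad and the
            # known-good boundary and check the verdict still holds.
            if (step // every) % 2:
                commit, expected = bad, True
            else:
                commit, expected = good, False
            if is_bad(commit) != expected:
                raise InconsistentBisection(
                    f"logic inconsistency detected at commit {commit}")
            continue
        mid = (good + bad) // 2    # normal bisection step
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad
```

With a deterministic test such as `lambda i: i >= 37` over 100 commits this returns 37. If the test wrongly reports a bad commit as good once, a later re-test of that commit can raise InconsistentBisection instead of silently blaming the wrong commit.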
Ingo