Re: Article about "git bisect run" on LWN

From: Ingo Molnar <mingo@elte.hu>
To: david@lang.hm
Cc: Christian Couder <chriscool@tuxfamily.org>,
	git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Andreas Ericsson <ae@op5.se>, Jeff Garzik <jeff@garzik.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Bill Lear <rael@zopyra.com>, Jon Seymour <jon.seymour@gmail.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: Article about "git bisect run" on LWN
Date: Fri, 6 Feb 2009 02:46:55 +0100
Message-ID: <20090206014655.GA26807@elte.hu>
In-Reply-To: <alpine.DEB.1.10.0902051838180.5340@asgard.lang.hm>


* david@lang.hm <david@lang.hm> wrote:

> On Thu, 5 Feb 2009, Ingo Molnar wrote:
>
>> * Christian Couder <chriscool@tuxfamily.org> wrote:
>>
>>> Hi,
>>>
>>> For information, an article from me, 'Fully automated bisecting with "git
>>> bisect run"' has been published in today's edition of LWN on the
>>> development page:
>>>
>>> http://lwn.net/Articles/317154/
>>
>> Nice article!
>>
>> In terms of possible future enhancements of git bisect, here's a couple of
>> random ideas that would help my auto-bisection efforts:
>>
>> - Feature: support "Bisection Redundancy"
>>
>>   This feature helps developers realize if a bug is sporadic. This happens
>>   quite often in the kernel space: a bug looks deterministic, but down the
>>   line it becomes sporadic. Sometimes a boot crash only occurs with a 75%
>>   probability - and if one is unlucky it can cause a _lot_ of wasted
>>   bisection time. The wrong commit gets blamed and the wrong set of
>>   developers start scratching their heads. It's a recurring theme on lkml.
>>
>>   What git could do here is to allow testers to inject a bit of extra
>>   "redundancy" automatically, and use the redundant test-points to detect
>>   conflicts in good/bad constraints.
>>
>>   It would work like this:
>>
>>      git bisect start --redundancy=33%
>>
>>   It would mean that for every third bisection point, Git would
>>   _not_ choose the ideal (estimated) 'middle point' from the set of "unknown
>>   quality" changes that are still outstanding - but would intentionally
>>   "veer outside" and select one commit from the _known_ set of commits.
>>
>>   If such a redundant re-test of the known-good or known-bad set yields a
>>   nonsensical result then Git aborts the bisection with a "logic
>>   inconsistency detected" kind of message - and people could at this point
>>   realize the non-determinism of the test.
>>
>>   ( Git can do this when a "redundant" test point is marked as 'bad' -
>>     despite an earlier bisection already categorizing that test point as
>>     'good' - or if it's the other way around. Git will only continue with
>>     the bisection if the test point has the expected quality. )
>>
>>   This essentially means an automated re-test - but it's much better than
>>   just a repeated bisection - i've often met non-deterministic bugs that
>>   yield the _exact same_ nonsensical commit even on repeat bisections. That
>>   happens when a timing bug depends on the exact kernel layout, or a
>>   miscompilation or linker bug depends on the exact kernel layout, etc.
>>
>>   It's also faster than a re-done bisection: 33% more testpoints is better
>>   than twice as many test-points. Also, auto-bisection can deal with
>>   redundancy just fine - it does not really matter whether i have to wait
>>   20 or 30 minutes for a test result since there's no manual intervention
>>   needed - but it _very_ much matters whether i can trust the validity of
>>   the bisection result.
>
> when you gave this the title of redundancy and described the problem I
> assumed that you would then propose running the test multiple times (so
> "git bisect run X --redundancy 5" would run each test 5 times, and it would
> pass IFF it passed the test all 5 times). that would seem to be a better
> match for the name, as well as being a better test

Yeah, but using 100%, 200%, 300%, etc. redundancy is a bit wasteful and not 
granular enough for my purposes.

Here's the math:

A typical kernel bisection takes 15 test steps. 30% redundancy means that
it takes only about 30% longer, and for that we get +5 tests. Five extra test
points are usually enough to establish whether a test method shows sporadic
tendencies or not, with an ~90% confidence factor.

Repeating the test 5 times would bring a 15-step kernel bisection from 30
minutes [it's about 60 seconds to build a kernel, 60 seconds to boot it] to
about 2.5 hours - that's very long. The confidence factor only goes from
~90% to 99% - that extra 9% is not worth the cost.
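
The cost side of this arithmetic can be sketched in a few lines of Python. The
step count and the two-minute build+boot time come from the text above; the
function names, and rounding 4.5 extra tests up to 5, are assumptions made for
illustration. `detection_chance` is a generic independence formula with a
hypothetical per-re-test flip probability; it is not claimed to be how the ~90%
and ~99% estimates above were obtained.

```python
import math

STEPS = 15             # typical kernel bisection, per the text
MINUTES_PER_STEP = 2   # ~60 s to build plus ~60 s to boot, per the text


def partial_redundancy(steps, fraction):
    """Return (extra re-tests, total minutes) when `fraction` extra test points are added."""
    extra = math.ceil(steps * fraction)  # assumption: round partial tests up
    return extra, (steps + extra) * MINUTES_PER_STEP


def full_repeat(steps, repeats):
    """Return total minutes when every bisection step is tested `repeats` times."""
    return steps * repeats * MINUTES_PER_STEP


def detection_chance(flip_probability, retests):
    """Chance that at least one of `retests` independent re-tests contradicts
    an earlier verdict, given a hypothetical per-re-test flip probability."""
    return 1 - (1 - flip_probability) ** retests


extra, minutes = partial_redundancy(STEPS, 0.30)
print(extra, minutes)          # 5 extra tests, 40 minutes in total
print(full_repeat(STEPS, 5))   # 150 minutes, i.e. 2.5 hours
```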

The idea would be to insert 30% redundancy into my bisections automatically - 
so that i could trust _all_ bisections more - not just the ones i suspect to 
be non-deterministic. Hence the suggestion to enable lower levels of 
redundancy like 30%. (but even 10% or 20% might be enough to weed out the 
most obvious cases)
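
A minimal sketch of the proposed behaviour, assuming a linear history numbered
0..n-1 with commit 0 known good and the last commit known bad. Nothing here is
an existing git feature: `bisect_with_redundancy`, its `every` parameter and
the `test` callback are hypothetical names. Every `every`-th step re-tests an
already-classified commit instead of narrowing the range, and the run aborts
as soon as a re-test contradicts the recorded verdict.

```python
import random


def bisect_with_redundancy(n_commits, test, every=3, rng=None):
    """Return the index of the first bad commit in 0..n_commits-1.

    `test(i)` returns True when commit i is bad. Commit 0 is assumed good
    and commit n_commits-1 is assumed bad. Raises RuntimeError when a
    redundant re-test disagrees with an earlier verdict.
    """
    if every < 2:
        raise ValueError("every must be >= 2, or the bisection never advances")
    rng = rng or random.Random(0)
    verdicts = {0: False, n_commits - 1: True}   # commit -> "is bad"
    good, bad = 0, n_commits - 1
    step = 0
    while bad - good > 1:
        step += 1
        if step % every == 0:
            # Redundant step: re-test a commit whose quality is already known.
            commit = rng.choice(sorted(verdicts))
            if test(commit) != verdicts[commit]:
                raise RuntimeError(
                    f"logic inconsistency detected at commit {commit}")
            continue
        # Normal step: test the middle of the still-unknown range.
        mid = (good + bad) // 2
        verdicts[mid] = test(mid)
        if verdicts[mid]:
            bad = mid
        else:
            good = mid
    return bad


# With a deterministic test the redundant steps always agree.
print(bisect_with_redundancy(100, lambda i: i >= 37))
```

Re-testing only explicitly tested commits (plus the two endpoints) is a
simplification; the "known" set in the proposal also includes every commit
whose quality is implied by those verdicts.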

	Ingo
