git.vger.kernel.org archive mirror
From: Andrew Lutomirski <luto@mit.edu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Couder <christian.couder@gmail.com>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	git@vger.kernel.org, Shuang He <shuang.he@intel.com>
Subject: Re: AAARGH bisection is hard (Re: [2.6.39 regression] X locks up hard right after logging in)
Date: Fri, 13 May 2011 13:24:02 -0400
Message-ID: <BANLkTinyzBnksHk_rt8K2pmg90q5WyZX3w@mail.gmail.com>
In-Reply-To: <BANLkTimE2GkkhcFZtNrYZASWp0LDhUx=GQ@mail.gmail.com>

On Fri, May 13, 2011 at 12:11 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, May 13, 2011 at 7:56 AM, Andrew Lutomirski <luto@mit.edu> wrote:
>>
>> So what I really want is a fancy version of git bisect that makes no
>> assumptions about the relationship of good and bad commits in the
>> graph and just finds me a commit that is bad but for which all parents
>> are good or vice versa.
>
> Ehh. That's the "non-fancy" way of testing, I'm afraid: if you cannot
> make assumption about the relationship between good and bad commits,
> then you have to test _every_ commit.

Actually, I disagree.  I suspect, although I haven't convinced myself
very well yet, that if you assume that the bug was caused one or more
times by some commit C that works but where all of C's parents don't
work (or vice versa), then there exists an algorithm that, at least
for most histories, will find such a commit in polylog tries given a
starting commit that works and another one that fails.  But I have to
do real work before I think too much more about that.
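
A minimal sketch of the criterion described above (a bad commit none of
whose parents are bad), assuming a toy history held in a dictionary.
The names find_boundary, parents and is_bad are illustrative only, and
the walk is a plain linear one, not the polylog search conjectured
above:

```python
from typing import Callable, Dict, List


def find_boundary(parents: Dict[str, List[str]],
                  start_bad: str,
                  is_bad: Callable[[str], bool]) -> str:
    """Walk from a known-bad commit toward its ancestors.

    Returns a bad commit that has no bad parent. The number of tests
    grows with the length of the walk, so this is only a baseline.
    """
    current = start_bad
    while True:
        bad_parents = [p for p in parents[current] if is_bad(p)]
        if not bad_parents:
            return current
        current = bad_parents[0]  # step to any bad parent and repeat


# Hypothetical history: r -> a -> c -> m, plus a side branch r -> b -> m.
parents = {"r": [], "a": ["r"], "b": ["r"], "c": ["a"], "m": ["c", "b"]}
bad = {"c", "m"}  # pretend commit "c" introduced the bug

print(find_boundary(parents, "m", lambda commit: commit in bad))
```

In this toy history the walk stops at "c", the only bad commit whose
parents are all good.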

That being said, even the fairly weak requirement I wanted wasn't really true...

[I said in a different email:]
>
> In conclusion, I found the problem.  It's a clusterfuck and I think
> there's no way that any bisection tool under any sane assumptions
> could have found it.  Patch coming in a couple seconds b/c I think it
> needs to go in to 2.6.39.

I should clarify what the problem was for people who don't want to dig
around the archives:

I have a Sandy Bridge box, which means that I need to run a recent
kernel for things to work decently.  The bug was introduced once way
back in the depths of time (i.e. before any kernel that I ever tried
since I got the machine).  It was fixed shortly before 2.6.38 by
commit A.  It was reintroduced in a merge B that was a little past A.
B went into 2.6.39-something via airlied's tree.  B's other parent
was bad because it didn't contain A.  It looks like this:

-------------------------------.
                                \
(bad pre-2.6.38-rc2)--.          \ (etc)
                       \          \
          .--(good)-----B--(bad)-. \
         /                        \ \
(bad)---A--(good)--v2.6.38---------x-x-v2.6.39-rc7


(A is a1656b9090f7008d2941c314f5a64724bea2ae37 and B is
47ae63e0c2e5fdb582d471dc906eb29be94c732f)


The offending commit is B, but the bisection is screwed, because the
series of nonworking commits dangling off B looks just like any other
series of nonworking commits like the top line that have nothing to do
with the problem.  Sure enough, my bisection ended up wandering into
dark corners (like the networking tree), which were innocent.
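
The graph above can be modelled in a few lines to show why the "bad
commit with all good parents" criterion has nothing to find here.  This
is a sketch under stated assumptions: apart from A and B every commit
name is made up, and a commit is taken to work exactly when it contains
the fix A and does not contain the merge B:

```python
from typing import Dict, List, Set

# Hypothetical commit names; only A and B correspond to commits named above.
parents: Dict[str, List[str]] = {
    "old": [],                  # ancient history, bug already present
    "A": ["old"],               # the fix
    "side": ["A"],              # good branch leading into B
    "v38": ["A"],               # stands in for v2.6.38
    "drm_old": ["old"],         # "bad pre-2.6.38-rc2" branch, lacks A
    "B": ["side", "drm_old"],   # merge that undid the fix
    "after_B": ["B"],           # bad commits dangling off B
    "net": ["old"],             # unrelated bad branch (the top line)
    "x1": ["v38", "after_B"],
    "x2": ["x1", "net"],
    "rc7": ["x2"],              # stands in for v2.6.39-rc7
}


def ancestors(commit: str) -> Set[str]:
    """Return the commit itself plus everything reachable via parents."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def is_good(commit: str) -> bool:
    """Assumed model: works only with fix A and without merge B."""
    reach = ancestors(commit)
    return "A" in reach and "B" not in reach


# Non-root bad commits whose parents are all good.
candidates = [c for c in parents
              if parents[c] and not is_good(c)
              and all(is_good(p) for p in parents[c])]

print([(p, is_good(p)) for p in parents["B"]])
print(candidates)
```

B has one good and one bad parent, so the candidate list comes out
empty: under this model no commit in the graph is bad with all parents
good.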

I found the problem by manually bisecting the --first-parent chain
from v2.6.39-rc7 to v2.6.38 to figure out that the problem came from a
drm merge and then noticing that something was screwed up when the
bisection pointed to a commit (in the right driver, even) that wasn't
the problem.  (I even tried reverting it to no avail.)  Bisection was
*sure* it was the problem, though, because its parent was in v2.6.38.
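
The manual --first-parent bisection amounts to a binary search over a
linear chain of mainline commits.  A minimal sketch, assuming the chain
is already available oldest first (something like
"git rev-list --first-parent --reverse v2.6.38..v2.6.39-rc7" should
produce it, minus v2.6.38 itself) and flips from good to bad exactly
once; the commit names m1..m4 and the function first_bad are made up:

```python
from typing import Callable, Sequence


def first_bad(chain: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search a linear chain ordered oldest to newest.

    Assumes chain[0] is good, chain[-1] is bad, and that the chain
    changes from good to bad exactly once.
    """
    lo, hi = 0, len(chain) - 1  # invariant: chain[lo] good, chain[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(chain[mid]):
            hi = mid
        else:
            lo = mid
    return chain[hi]


# Hypothetical first-parent chain; m3 plays the role of the drm merge.
chain = ["v2.6.38", "m1", "m2", "m3", "m4", "v2.6.39-rc7"]
bad = {"m3", "m4", "v2.6.39-rc7"}

print(first_bad(chain, lambda commit: commit in bad))
```

Newer Git releases also accept "git bisect start --first-parent", which
restricts the built-in bisection to the same chain.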

I thought that maybe the problem had been introduced more than once,
so I tried v2.6.38-rc5, and it *failed*.  (That's what caused a lot of
my confusion the first time around -- lots of commits that were "good"
(in the sense that they would work if merged correctly into the
v2.6.39 branch before B got there) failed instead.)

So I bisected between v2.6.38 and v2.6.38-rc5 to find the commit that
fixed the problem, since there had to be something.  Once I found it,
a bunch of confused calls to git blame found the merge that undid the
fix.

> Think of it as a compression method: it generates the smallest
> possible set of test points for you. But it's a "lossy" compression -
> you don't test everything. And it's extreme: it boils down 10k commit
> events to about 13 bisection events. If anything goes wrong (like the
> bug not being entirely repeatable, or the bug comes and goes), it will
> give the wrong answer.

As I just learned :)
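
The "10k commits to about 13 tests" figure quoted above is roughly
log2 of the number of commits.  A quick check, assuming an idealized
linear history where each test halves the candidates:

```python
import math

commits = 10_000

# Each test halves the remaining candidates in the ideal case.
exact = math.log2(commits)
steps = math.ceil(exact)

print(round(exact, 2), steps)
```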

--Andy
