public inbox for linux-kernel@vger.kernel.org
From: Jeff Garzik <jeff@garzik.org>
To: Torsten Kaiser <just.for.lkml@googlemail.com>
Cc: Tejun Heo <htejun@gmail.com>,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org
Subject: Re: sata_sil24 broken since 2.6.23-rc4-mm1
Date: Thu, 27 Sep 2007 02:24:51 -0400
Message-ID: <46FB4CB3.3090004@garzik.org>
In-Reply-To: <64bb37e0709262314x1b0100d8lfe34327db6b9bec8@mail.gmail.com>

Torsten Kaiser wrote:
> On 9/27/07, Tejun Heo <htejun@gmail.com> wrote:
>> Tejun Heo wrote:
>>> Torsten Kaiser wrote:
>>>> Comparing the driver/ata directory from rc3-mm1 and rc4-mm1 the
>>>> following change looked the most suspicious to me:
>>>> http://git.kernel.org/?p=linux/kernel/git/jgarzik/libata-dev.git;a=blobdiff;f=drivers/ata/sata_sil24.c;h=3dcb223117be9739ee04d70b6bfc776a4b839a3f;hp=e0cd31aa8002350add53ba6ff07493e503275244;hb=020bc1bd8d369a77bd9379cd9763ac0057651753;hpb=8d4bdf8087e682df98bdb856f6ad451bf6d597e7
>>>>
>>>> The fact that sata_sil24.c did not change again after rc4-mm1 also
>>>> matches the occurrence of the error.
>>>>
>>>> To confirm my theory I replaced the sata_sil24.c from rc8-mm1 with
>>>> the version from rc3-mm1.
>>>> I was able to boot the resulting kernel successfully 5 times, without
>>>> the error happening again.
>>> Thanks a lot for chasing down the problem.  The changed code is in the
>>> address initialization path, and it's weird that it causes intermittent
>>> failures rather than consistent ones.
>>>
>>> Anyways, does the attached patch fix the problem?
> 
> I'm starting to *really* hate that bug.
> My analysis was wrong: I booted the modified 2.6.23-rc8-mm1 this
> morning, and it failed too. (Same error messages as -rc7-mm1 from the
> first mail in this thread.)
> So it's not that change that causes the breakage.
> 
> And I'm not really finding a good pattern to which boots fail and which work.
> It seems to fail only if I completely power off the system for
> several hours (using the physical switch on the back of the
> power supply, not the normal soft-off).
> 
> For one of the five boots I tried yesterday, I also powered the system
> completely off that way, but leaving it off for only ~10..20 seconds
> did not seem to trigger the bug.
> 
> But I still think that this is not a hardware failure, as the -rc3-mm1
> kernel never showed that error, even when I used it several times
> after the first -rc4-mm1 failures.
> 
>> If not, can you add printk of iomap[SIL24_PORT_BAR], offset, initialized
>> cmd_addr and scr_addr in the loop and see whether anything is different
>> between when the driver works and fails.
> 
> Should I do this anyway?
> 
> I compared the dmesg from good and bad boots with -rc7-mm1 but could
> not see any difference, so do you think that these additional
> diagnostics could show a difference?
> Or could you suggest any other debugging options I should try?

Since it's a reproducible problem, I think it's easiest to get 
you straight to git-bisect.  In this case, that would be

1. Clone branch "upstream" of 
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git

2. Test.  If the bug persists, you have narrowed down the problem to the -mm 
changes from the SATA developers, which are to be sent for 2.6.24.  If 
the problem does not persist, then it's a problem added in the -mm 
patchset alone, which carries few ATA patches outside of libata-dev.git.

3. If the problem is in libata-dev.git#upstream (likely), you can now 
use git-bisect to find the specific commit that causes the problems. 
Read the git-bisect man page for full details, but the basics are

	a) start with a known good point (v2.6.22? v2.6.23?) and
	a known bad point (HEAD, aka the most recent commit in
	libata-dev.git#upstream)

	b) build and boot kernels, marking each as known-good or
	known-bad.

	c) This process will systematically narrow down the problem
	to a single git commit.
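
For illustration, steps a) through c) can be sketched on a throwaway
repository rather than the kernel tree. This is only a minimal sketch:
the eight commits, the "bug" marker file and the automated test command
are hypothetical stand-ins for real kernel builds and boots, which would
be marked by hand with "git bisect good" or "git bisect bad".

```shell
#!/bin/sh
# Toy git-bisect run on a throwaway repository (NOT the kernel tree).
# Assumptions: the 8 commits, the "bug" marker file and the test command
# are all hypothetical stand-ins for kernel builds and boots.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "bisect-demo@example.invalid"
git config user.name "bisect demo"

# Build a linear history of 8 commits; commit 6 introduces the "bug".
i=1
while [ "$i" -le 8 ]; do
    echo "$i" > version
    if [ "$i" -eq 6 ]; then
        echo broken > bug
    fi
    git add -A
    git commit -q -m "commit $i"
    i=$((i + 1))
done

# a) Known bad point: HEAD.  Known good point: the first commit.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null

# b) Mark each step automatically: exit 0 means good, exit 1 means bad.
#    With a kernel, this is the manual build/boot/mark step instead.
git bisect run sh -c 'test ! -e bug' >/dev/null

# c) refs/bisect/bad now names the first bad commit.
first_bad_subject=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $first_bad_subject"

git bisect reset >/dev/null 2>&1
```

The script should report "commit 6" as the first bad commit. On the real
tree, the same sequence would be "git bisect start", "git bisect bad" on
HEAD, "git bisect good" on the known good tag, then a build, a boot and
one good/bad mark per step, finishing with "git bisect reset".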

Regards,

	Jeff




