From: Tejun Heo <htejun@gmail.com>
To: Torsten Kaiser <just.for.lkml@googlemail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>, Jeff Garzik <jeff@garzik.org>,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org
Subject: Re: sata_sil24 broken since 2.6.23-rc4-mm1
Date: Thu, 11 Oct 2007 12:25:49 +0900
Message-ID: <470D97BD.4020300@gmail.com>
In-Reply-To: <64bb37e0710070739s67805d72x6d675cb2af2e8b24@mail.gmail.com>

Torsten Kaiser wrote:
> Looking closer at
> http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commitdiff;h=ec6fdded4d76aa54aa57341e5dfdd61c507b1dcd
> the change to libata.h seems bogus:
> 
> in ata_qc_first_sg:
> old                                new
> return qc->__sg                    return qc->__sg
> qc->__sg - qc->__sg == 0           qc->n_iter=0
> -> sg - qc->__sg corresponds to qc->n_iter
> 
> in ata_qc_next_sg:
> sg++;                              sg_next(sg); qc->n_iter++;
> sg - qc->__sg < qc->n_elem         qc->n_iter < qc->n_elem
> -> sg - qc->__sg corresponds to qc->n_iter
> 
> but in ata_sg_is_last:
> (sg - qc->__sg) +1 == qc->n_elem   qc->n_iter == qc->n_elem
> if sg - qc->__sg corresponds to qc->n_iter then shouldn't it be
> qc->n_iter+1 == qc->n_elem?
> 
> That missing +1 would explain why the SGE_TRM never gets set.

Thanks a lot for tracking this down.  Does changing the above code fix
your problem?

Jens, Torsten's analysis looks correct.  Also, depending on qc state
(n_iter) during iteration doesn't look like a good idea.  Those iterators
are not supposed to have side effects.  Would it be difficult to
implement an sg_last() test?
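
To illustrate the off-by-one, here is a minimal user-space sketch.  It is
not the libata code: struct sketch_qc and the helper names are hypothetical
stand-ins, and it assumes n_iter starts at 0 and is incremented once per
step, as described in the quoted analysis.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the iteration state discussed above;
 * this is not the real struct ata_queued_cmd. */
struct sketch_qc {
	unsigned int n_elem;	/* number of sg entries */
	unsigned int n_iter;	/* index of the entry currently visited */
};

/* Last-entry test as quoted for the new code: no +1. */
static bool is_last_without_plus_one(const struct sketch_qc *qc)
{
	return qc->n_iter == qc->n_elem;
}

/* Last-entry test with the +1 suggested in the quoted analysis. */
static bool is_last_with_plus_one(const struct sketch_qc *qc)
{
	return qc->n_iter + 1 == qc->n_elem;
}

/* Visit n_elem entries (n_iter runs from 0 to n_elem - 1) and count
 * how often the given test reports "last". */
static unsigned int count_last_hits(unsigned int n_elem,
				    bool (*is_last)(const struct sketch_qc *))
{
	struct sketch_qc qc = { .n_elem = n_elem, .n_iter = 0 };
	unsigned int hits = 0;

	for (qc.n_iter = 0; qc.n_iter < qc.n_elem; qc.n_iter++)
		if (is_last(&qc))
			hits++;

	return hits;
}
```

Inside the loop n_iter never reaches n_elem, so the test without the +1
cannot fire for any entry, which would match SGE_TRM never being set.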

> And it would fit the symptom that the boot would fail at random. If
> the "correct" garbage was in place where the sglist runs off, it
> hangs the drive.
> And that would even fit the two different errors that I only got one time each:
> * a completely illegal access (PCI master abort while fetching SGT)
> * wrong alignment of the SGT (SGT not on qword boundary)
> At those times the garbage seemed to point to invalid addresses.
> 
> But I still don't understand how the kernel could fail only
> sometimes at bootup, but after that work without any visible
> errors. Is the sil-chip rather intelligent about detecting corrupted
> sglists and silently ignoring them?

I have no idea why it fails only sometimes.

-- 
tejun
