From: Jens Axboe <axboe@suse.de>
To: Tejun <htejun@gmail.com>
Cc: jgarzik@pobox.com, linux-ide@vger.kernel.org
Subject: Re: [PATCH Linux 2.6.12 00/09] NCQ: generic NCQ completion/error-handling
Date: Fri, 1 Jul 2005 10:59:12 +0200
Message-ID: <20050701085912.GB2243@suse.de>
In-Reply-To: <20050701002035.GA24878@htj.dyndns.org>
On Fri, Jul 01 2005, Tejun wrote:
> On Thu, Jun 30, 2005 at 05:26:20PM +0200, Jens Axboe wrote:
> > On Thu, Jun 30 2005, Tejun Heo wrote:
> > > Jens Axboe wrote:
> > > >On Mon, Jun 27 2005, Jens Axboe wrote:
> > > >
> > > >>On Mon, Jun 27 2005, Tejun Heo wrote:
> > > >>
> > > >>>Hello, Jeff.
> > > >>>Hello, Jens.
> > > >>>
> > > >>>This patchset implements generic completion and error-handling for
> > > >>>NCQ commands. This patchset assumes that the previous six misc
> > > >>>patches to NCQ are applied.
> > > >>
> > > >>Excellent, much needed work in that area. I will give it a test spin
> > > >>here as well, I have one drive that likes to barf with ncq occasionally.
> > > >
> > > >
> > > >Ok, I've run with this for a few days and finally hit the
> > > >drive-stops-responding condition yesterday afternoon. Error recovery
> > > >worked a lot better than before, but eventually went down anyways. But
> > > >now I got a better look at the error, and it's the drive throwing an
> > > >ICRC (error 0x80). Very odd. I've never seen this happen with non-NCQ
> > > >operations, however I've seen it now a few times using NCQ. Any ideas?
> > > >
> > >
> > > Hello, Jens.
> > >
> > > Can you please describe how the drive went down in detail? If
> > > possible, log messages w/ the debug message patch applied would be
> > > great. As the EH now resets both the controller (on entry to EH) and
> > > the drive (on timeout), we should be able to recover unless something
> > > goes very strange.
> >
> > I'm pretty sure it wasn't the fault of the error handling, although I
> > cannot say for sure, of course. I don't have the log saved, but what
> > happened was that the drive threw a 0x80 ICRC error, the drive was
> > COMRESET, the I/O was errored, and then nothing happened after that. Access
> > to the drive hung.
> >
> > I will save the log the next time it occurs, I could not this time since
> > I was working on the machine remotely and needed it rebooted.
> >
> > > I'm currently trying to rewrite sil24 driver to make it look saner and
> > > support NCQ. Once I'm done with it (maybe one or two more days... I
> > > hope), I'll do the second take of generic NCQ patches including ATAPI EH
> > > fix and stuff and it would be great to have your failure log message
> > > before doing that.
> >
> > It should trigger again within a day or two, I will send it when it
> > does. Can you resend the debug patch?
> >
> > --
> > Jens Axboe
>
>
> Hi, Jens.
>
> I converted most of debug messages I've used during development into
> warning messages when posting the patchset and forgot about it, so
> I've never posted the debug patch. Sorry about that. Here's a small
> patch which adds some more messages though. The following patch also
> adds printk'ing of the FIS on each command issue in
> ahci.c:ahci_qc_issue(). If you think it would fill your log
> excessively, feel free to turn it off. It probably wouldn't matter anyway.

I will have to kill the issue part of the patch, since that would
generate insane amounts of printk traffic :-)

I'll boot the kernel and report what happens.

--
Jens Axboe
Thread overview: 26+ messages
2005-06-26 15:21 [PATCH Linux 2.6.12 00/09] NCQ: generic NCQ completion/error-handling Tejun Heo
2005-06-26 15:21 ` [PATCH Linux 2.6.12 01/09] NCQ: add ata_qc_complete_err() and @drv_err to functions Tejun Heo
2005-06-26 15:21 ` [PATCH Linux 2.6.12 02/09] NCQ: add timeout to ata_read_log_page() Tejun Heo
2005-06-26 15:21 ` [PATCH Linux 2.6.12 03/09] NCQ: add ap->sactive Tejun Heo
2005-06-26 15:21 ` [PATCH Linux 2.6.12 04/09] NCQ: export scsi_retry_command Tejun Heo
2005-06-26 15:21 ` [PATCH Linux 2.6.12 05/09] NCQ: implement NCQ helpers Tejun Heo
2005-06-26 15:21 ` [PATCH Linux 2.6.12 06/09] NCQ: convert ahci to use new " Tejun Heo
2005-06-26 15:21 ` [PATCH Linux 2.6.12 07/09] NCQ: stop dma before reset Tejun Heo
2005-07-26 21:12 ` Jeff Garzik
2005-07-27 6:25 ` Tejun Heo
2005-06-26 15:21 ` [PATCH Linux 2.6.12 08/09] NCQ: remove/unexport unused/unnecessary functions Tejun Heo
2005-06-26 15:21 ` [PATCH Linux 2.6.12 09/09] NCQ: causes error or timeout Tejun Heo
2005-06-26 15:34 ` test logs Tejun Heo
2005-06-27 14:33 ` [PATCH Linux 2.6.12 00/09] NCQ: generic NCQ completion/error-handling Jens Axboe
2005-06-30 7:36 ` Jens Axboe
2005-06-30 10:51 ` Tejun Heo
2005-06-30 15:26 ` Jens Axboe
2005-07-01 0:20 ` Tejun
2005-07-01 8:59 ` Jens Axboe [this message]
2005-07-04 5:53 ` Jens Axboe
2005-07-06 12:55 ` Jens Axboe
2005-07-06 13:00 ` Jens Axboe
2005-07-06 15:11 ` Tejun Heo
2005-07-08 8:03 ` Jens Axboe
2005-07-08 10:27 ` Tejun Heo
2005-07-08 13:54 ` Jens Axboe