From: Jens Axboe <axboe@suse.de>
To: Tejun <htejun@gmail.com>
Cc: jgarzik@pobox.com, linux-ide@vger.kernel.org
Subject: Re: [PATCH Linux 2.6.12 00/09] NCQ: generic NCQ completion/error-handling
Date: Fri, 1 Jul 2005 10:59:12 +0200
Message-ID: <20050701085912.GB2243@suse.de>
In-Reply-To: <20050701002035.GA24878@htj.dyndns.org>

On Fri, Jul 01 2005, Tejun wrote:
> On Thu, Jun 30, 2005 at 05:26:20PM +0200, Jens Axboe wrote:
> > On Thu, Jun 30 2005, Tejun Heo wrote:
> > > Jens Axboe wrote:
> > > >On Mon, Jun 27 2005, Jens Axboe wrote:
> > > >
> > > >>On Mon, Jun 27 2005, Tejun Heo wrote:
> > > >>
> > > >>>Hello, Jeff.
> > > >>>Hello, Jens.
> > > >>>
> > > >>>This patchset implements generic completion and error-handling for
> > > >>>NCQ commands.  This patchset assumes that the previous six misc
> > > >>>patches to NCQ are applied.
> > > >>
> > > >>Excellent, much needed work in that area. I will give it a test spin
> > > >>here as well, I have one drive that likes to barf with ncq occasionally.
> > > >
> > > >
> > > >Ok, I've run with this for a few days and finally hit the
> > > >drive-stops-responding condition yesterday afternoon. Error recovery
> > > >worked a lot better than before, but eventually went down anyways. But
> > > >now I got a better look at the error, and it's the drive throwing an
> > > >ICRC (error 0x80). Very odd. I've never seen this happen with non-NCQ
> > > >operations, however I've seen it now a few times using NCQ. Any ideas?
> > > >
> > > 
> > >  Hello, Jens.
> > > 
> > >  Can you please describe how the drive went down in detail?  If 
> > > possible, log messages w/ the debug message patch applied would be 
> > > great.  As the EH now resets both the controller (on entry to EH) and 
> > > the drive (on timeout), we should be able to recover unless something 
> > > > goes very wrong.
> > 
> > I'm pretty sure it wasn't the fault of the error handling, although I
> > > cannot say for sure of course. I don't have the log saved, but what
> > > happened was that the drive threw an 0x80 ICRC error, drive was
> > COMRESET, io was errored, and then nothing happened after that. Access
> > to the drive hung.
> > 
> > I will save the log the next time it occurs, I could not this time since
> > I was working on the machine remotely and needed it rebooted.
> > 
> > >  I'm currently trying to rewrite sil24 driver to make it look saner and 
> > > support NCQ.  Once I'm done with it (maybe one or two more days... I 
> > > hope), I'll do the second take of generic NCQ patches including ATAPI EH 
> > > fix and stuff and it would be great to have your failure log message 
> > > before doing that.
> > 
> > It should trigger again within a day or two, I will send it when it
> > does. Can you resend the debug patch?
> > 
> > -- 
> > Jens Axboe
> 
> 
>  Hi, Jens.
> 
>  I converted most of debug messages I've used during development into
> warning messages when posting the patchset and forgot about it, so
> I've never posted the debug patch.  Sorry about that.  Here's a small
> patch which adds some more messages though.  The following patch also
> prints the FIS on each command issue in ahci.c:ahci_qc_issue().
> If you think it would fill your log excessively, feel free to turn it
> off.  It probably wouldn't matter anyway.

I will have to kill the issue part of the patch, that would generate
insane amounts of printk traffic :-)

I'll boot the kernel and report what happens.

-- 
Jens Axboe


