From: Robert Hancock <hancockrwd@gmail.com>
To: Serguei Miridonov <mirsev@cicese.mx>
Cc: linux-kernel@vger.kernel.org, Jeff Garzik <jeff@garzik.org>,
	Tejun Heo <tj@kernel.org>
Subject: Re: Intel ICH9M/M-E SATA error-handling/reset problems
Date: Sun, 15 Feb 2009 14:15:01 -0600
Message-ID: <499877C5.6090205@gmail.com>
In-Reply-To: <200902151141.44367.mirsev@cicese.mx>

Serguei Miridonov wrote:
> On Sunday 15 February 2009, Robert Hancock wrote:
>> Serguei Miridonov wrote:
>>> On Saturday 14 February 2009, Robert Hancock wrote:
>>>> Serguei Miridonov wrote:
>>> ... something like 10
>>> errors per 2GB transfer can not be the reason to give up. Vista,
>>> at least, recovers and continues the data transfer. Linux simply
>>> can not return the interface or connected device into operating
>>> mode. Do you think it is normal?
>> Could be that Linux is being a bit more aggressive on error
>> handling. In your case, it looks like an error occurred, triggering
>> a hard reset of the device, and the controller seemed unable to
>> talk to the device afterwards. If the command had just been
>> retried, maybe it would have worked better. However, doing that in
>> general can cause issues since you don't know what the state of the
>> link may be.
> 
> Hmm... I was sure there are general recommendations from chipset 
> vendors regarding recovery procedures.
> 
> What is the behavior expected from a SATA-connected device if it 
> detects a parity error in received data? I'm not familiar with the 
> PATA/SATA protocols, but I suppose that it just doesn't send the data 
> to the physical disk for recording, asserts the error line and waits 
> for the next command from the controller. If the data block was too 
> big to keep in the drive's cache memory, it may also set the number of 
> successfully (physically) written bytes to prevent the software from 
> sending them again.

In the case of a CRC error the error flag gets set and the transfer is 
aborted by whichever side detects it. In this case the entire transfer 
gets retried.
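
As a rough illustration of the "retry the entire transfer" behaviour 
described above, here is a minimal sketch. It is not libata code: the 
names xfer_status, flaky_link, flaky_send and transfer_with_retry are 
all hypothetical, and the "link" is a stand-in that fails a fixed 
number of times before succeeding.

```c
#include <stddef.h>

/* Hypothetical result of one transfer attempt (not a libata type). */
enum xfer_status { XFER_OK, XFER_CRC_ERROR };

/* Hypothetical callback that sends the whole buffer once. */
typedef enum xfer_status (*xfer_fn)(const unsigned char *buf, size_t len,
                                    void *ctx);

/* Stand-in link that reports a CRC error failures_left times. */
struct flaky_link {
    unsigned failures_left;
};

enum xfer_status flaky_send(const unsigned char *buf, size_t len, void *ctx)
{
    struct flaky_link *link = ctx;

    (void)buf;
    (void)len;
    if (link->failures_left > 0) {
        link->failures_left--;
        return XFER_CRC_ERROR;  /* transfer aborted, nothing kept */
    }
    return XFER_OK;
}

/*
 * Resend the entire buffer after each CRC error; no partial progress
 * is assumed. Returns the attempt number that succeeded, or 0 if all
 * max_attempts attempts failed.
 */
unsigned transfer_with_retry(xfer_fn xfer, const unsigned char *buf,
                             size_t len, void *ctx, unsigned max_attempts)
{
    for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
        if (xfer(buf, len, ctx) == XFER_OK)
            return attempt;
    }
    return 0;
}
```

The point of the sketch is only that a retry restarts from the first 
byte of the buffer; there is no per-byte progress count to resume from.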

> 
> If the above is correct then the kernel should only log the error, do 
> some housekeeping work for the controller and attempt to send data 
> again. There is no need for hard reset right after first error.

Right now an interface CRC error is considered an ATA bus error, which 
always triggers a reset. It's possible this could be relaxed in some 
cases, but the issue is that if CRC errors are occurring, the link may 
be in an invalid state that simply retrying the command will not clear.
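
To make the trade-off concrete, here is a small sketch of the policy 
decision being discussed. It is an illustration under stated 
assumptions, not the actual libata error handler: eh_action, 
pick_action and the soft_retry_budget parameter are hypothetical 
names. A budget of 0 models the current behaviour (every CRC error 
resets the link); a non-zero budget models the "relaxed" variant.

```c
#include <stdbool.h>

/* Hypothetical recovery actions (not the real libata EH flags). */
enum eh_action {
    EH_NONE,        /* no error, nothing to do */
    EH_RETRY,       /* reissue the command without a reset */
    EH_HARD_RESET   /* reset the link before retrying */
};

/*
 * Decide what to do after a command completes.
 *
 * crc_error          this command hit an interface CRC error
 * prior_crc_errors   CRC errors already seen on this link
 * soft_retry_budget  how many CRC errors may be retried without a
 *                    reset (assumption; 0 means always reset)
 */
enum eh_action pick_action(bool crc_error, unsigned prior_crc_errors,
                           unsigned soft_retry_budget)
{
    if (!crc_error)
        return EH_NONE;
    if (prior_crc_errors < soft_retry_budget)
        return EH_RETRY;
    /* Budget exhausted: the link state is suspect, so reset it. */
    return EH_HARD_RESET;
}
```

For example, pick_action(true, 0, 0) yields EH_HARD_RESET, while 
pick_action(true, 0, 2) yields EH_RETRY. The sketch does not address 
the real concern above: a plain retry does nothing to clear a link 
that is already in an invalid state.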

Tejun, any thoughts?

> 
> Another question is how the drive reacts to hard reset... My error log 
> shows that both drives do not like it for some reason - they stop 
> responding sometimes, so maybe some additional programming of drives 
> is necessary after hard reset... Something which is done in BIOS after 
> power on... I don't know...

The same hard reset is done (and generally has to be done) on driver 
initialization and when a drive is hot plugged, so it should work. 
However, if the link is having problems (and it obviously is, from the 
CRC errors) the drive may not receive the reset either.

> 
> Well, it becomes interesting... I've got datasheet for ICH9 but don't 
> have a kernel driver source to check what messages in log file really 
> mean. Could you point me to a link to the uncompressed kernel tree where 
> I can see source files?
> 

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git is 
likely the easiest place to view it.

Thread overview: 11+ messages
2009-02-14 20:06 Intel ICH9M/M-E SATA error-handling/reset problems Serguei Miridonov
2009-02-14 20:53 ` Jeff Garzik
2009-02-14 22:01 ` Robert Hancock
2009-02-15 18:00   ` Serguei Miridonov
2009-02-15 18:04     ` Robert Hancock
2009-02-15 19:41       ` Serguei Miridonov
2009-02-15 20:15         ` Robert Hancock [this message]
2009-02-15 21:55           ` Serguei Miridonov
2009-02-16  2:11       ` Tejun Heo
2009-02-16 16:17         ` Serguei Miridonov
2009-02-19  6:29           ` Tejun Heo
