Re: Intel ICH9M/M-E SATA error-handling/reset problems

From: Tejun Heo <tj@kernel.org>
To: Serguei Miridonov <mirsev@cicese.mx>
Cc: Robert Hancock <hancockrwd@gmail.com>,
	linux-kernel@vger.kernel.org, Jeff Garzik <jeff@garzik.org>
Subject: Re: Intel ICH9M/M-E SATA error-handling/reset problems
Date: Thu, 19 Feb 2009 15:29:55 +0900
Message-ID: <499CFC63.2070608@kernel.org>
In-Reply-To: <200902160817.16614.mirsev@cicese.mx>

Hello, Serguei.

Serguei Miridonov wrote:
>>>> I agree with you completely. Nevertheless, something like 10
>>>> errors per 2GB transfer can not be the reason to give up. Vista,
>>>> at least, recovers and continues the data transfer. Linux simply
>>>> can not return the interface or connected device into operating
>>>> mode. Do you think it is normal?
>> Well, there isn't much point in keeping retrying if the same
>> command fails consecutively. 
> 
> I'm not talking about the _same_ transfer command. I mean intermittent 
> errors, average 10 parity errors per 2GB file. Let me repeat myself 
> from another post:
> 
> ... my very strong opinion based just on general physics is that 
> error rate on SATA can be (and will be) much higher than that one on 
> PATA. PATA operates at lower frequencies and cables are much shorter. 
> eSATA cables are longer and work at up to 3Gb/s. Moreover, consider 
> all these consumer-grade connectors, cables, etc. So, CRC errors could 
> be quite common and software needs to handle them properly to keep 
> transfers fast and maintain the communication with a device.

The kernel doesn't give up after intermittent errors.

> And, remember USB bulk transfer? Who is taking care of the CRC check and
> retries there?

What you're describing is already handled.  No need to worry about it.

>> The problem was the broken speed down
>> logic, so all the retries failed and FS eventually received IO
>> failure.  Should have been fixed with recent changes.
> 
> Slowing down may help reduce the number of errors, but it may turn out
> that they cannot be avoided completely.
> 
>> In the log, ata2.00 went down after a timeout.  The reset per se
>> isn't the problem; it is the right thing to do after a timeout, as
>> the controller and device states are unknown.  Situations like the
>> one in your log often happen because an ATAPI device shuts down
>> completely after certain transmission problems.  When this happens,
>> there's not much the driver can do, and a soft reboot wouldn't
>> recover the device either.
> 
> So, this is the kernel's job to keep things working, not break them :-)

Yeah, and other than the hardware quirkiness on your machine, it
already works fine.

>> But seeing you're on dv5, I think you might be experiencing
>> something else.  Please take a look at the following bz.
>>
>>   http://bugzilla.kernel.org/show_bug.cgi?id=12276
> 
> Yes, I tried to suspend to RAM and when the laptop woke up it failed
> to communicate with the hard drive. So, I use hibernate instead.

Can you please take a look at the kernel log after the kernel resumes
and check whether you're actually seeing the same problem?

Thanks.

-- 
tejun

Thread overview: 11+ messages
2009-02-14 20:06 Intel ICH9M/M-E SATA error-handling/reset problems Serguei Miridonov
2009-02-14 20:53 ` Jeff Garzik
2009-02-14 22:01 ` Robert Hancock
2009-02-15 18:00   ` Serguei Miridonov
2009-02-15 18:04     ` Robert Hancock
2009-02-15 19:41       ` Serguei Miridonov
2009-02-15 20:15         ` Robert Hancock
2009-02-15 21:55           ` Serguei Miridonov
2009-02-16  2:11       ` Tejun Heo
2009-02-16 16:17         ` Serguei Miridonov
2009-02-19  6:29           ` Tejun Heo [this message]
