From: Tejun Heo <tj@kernel.org>
To: Serguei Miridonov <mirsev@cicese.mx>
Cc: Robert Hancock <hancockrwd@gmail.com>,
	linux-kernel@vger.kernel.org, Jeff Garzik <jeff@garzik.org>
Subject: Re: Intel ICH9M/M-E SATA error-handling/reset problems
Date: Thu, 19 Feb 2009 15:29:55 +0900
Message-ID: <499CFC63.2070608@kernel.org>
In-Reply-To: <200902160817.16614.mirsev@cicese.mx>

Hello, Serguei.

Serguei Miridonov wrote:
>>>> I agree with you completely. Nevertheless, something like 10
>>>> errors per 2GB transfer can not be the reason to give up. Vista,
>>>> at least, recovers and continues the data transfer. Linux simply
>>>> can not return the interface or connected device into operating
>>>> mode. Do you think it is normal?
>> Well, there isn't much point in keeping retrying if the same
>> command fails consecutively. 
> 
> I'm not talking about the _same_ transfer command. I mean intermittent 
> errors, average 10 parity errors per 2GB file. Let me repeat myself 
> from another post:
> 
> ... my very strong opinion based just on general physics is that 
> error rate on SATA can be (and will be) much higher than that one on 
> PATA. PATA operates at lower frequencies and cables are much shorter. 
> eSATA cables are longer and work at up to 3Gb/s. Moreover, consider 
> all these consumer-grade connectors, cables, etc. So, CRC errors could 
> be quite common and software needs to handle them properly to keep 
> transfers fast and maintain the communication with a device.

The kernel doesn't give up after intermittent errors.

> And, remember USB bulk transfer? Who is taking care on CRC check and 
> retries there?

What you're describing is already handled.  No need to worry about it.

>> The problem was the broken speed down
>> logic, so all the retries failed and FS eventually received IO
>> failure.  Should have been fixed with recent changes.
> 
> Slow down may help to reduce amount of errors but it may happen that 
> they can not be avoided completely.
> 
>> In the log, ata2.00 went down after a timeout.  The reset per-se
>> isn't the problem and is the RTTD after a timeout as the controller
>> and device states are unknown.  The situations like yours in the
>> log often happens because an ATAPI device shuts down completely
>> after certain transmission problems.  When this happens, there's
>> nothing much the driver can do and soft reboot wouldn't recover the
>> device either.
> 
> So, this is the kernel job to keep things working, not break them :-)

Yeah, and other than the hardware quirkiness on your machine, it
already works fine.

>> But seeing you're on dv5, I think you might be experiencing
>> something else.  Please take a look at the following bz.
>>
>>   http://bugzilla.kernel.org/show_bug.cgi?id=12276
> 
> Yes, I tried to suspend to RAM and when the laptop waked up it failed 
> to communicate with the hard drive. So, I use hibernate instead.

Can you please check the kernel log after the machine resumes and
see whether you're actually hitting the same problem?

Thanks.

-- 
tejun
