linux-ide.vger.kernel.org archive mirror
From: Tejun Heo <tj@kernel.org>
To: Mark Lord <liml@rtr.ca>
Cc: "Bruno Prémont" <bonbons@linux-vserver.org>,
	"Linux Kernel" <linux-kernel@vger.kernel.org>,
	linux-ide@vger.kernel.org, "Jeff Garzik" <jgarzik@pobox.com>
Subject: Re: XFS shutting down due to IO timeout on SATA disk (pata_via for CX700)
Date: Mon, 15 Sep 2008 13:37:02 -0700
Message-ID: <48CEC76E.7020101@kernel.org>
In-Reply-To: <48CEC5FB.4040503@rtr.ca>

Mark Lord wrote:
>> Timeout on FLUSH_EXT.  That's a bad sign.  Patch to retry FLUSH is
>> pending but at any rate FLUSH failure is often accompanied by loss of
>> data and XFS is doing the right thing of giving up on it.
> ..
> 
> Tejun, are we *sure* that's really a timeout?
> The status shows 0x40 "drive ready" there, aka. "command complete".

Heh... on timeout, libata EH doesn't touch the status register, as some
controllers lock the whole machine up on that, so the 0x40 is just the
fill value libata used during qc initialization.  The error output
definitely needs to make that clearer.
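
To illustrate why 0x40 after a timeout says nothing about the drive,
here is a minimal sketch in C.  It is not libata code: struct
cmd_result and the cmd_*() helpers are hypothetical names made up for
this example, and only the status bit values are the standard ATA ones.
The record is pre-filled with DRDY (0x40) at init time and the timeout
path never reads the register, so the fill value is what gets reported.

```c
#include <stdint.h>
#include <string.h>

/* Standard ATA status register bits. */
#define ST_BSY  0x80u   /* device busy */
#define ST_DRDY 0x40u   /* device ready */
#define ST_DF   0x20u   /* device fault */
#define ST_ERR  0x01u   /* error */

/* Hypothetical per-command record, NOT the real libata structure. */
struct cmd_result {
    uint8_t status;      /* status byte that ends up in the log */
    int     timed_out;   /* set by the error handler on timeout */
    int     status_read; /* 1 only if the register was really read */
};

/* Pre-fill the result so that it looks like a normal completion. */
static void cmd_init(struct cmd_result *r)
{
    memset(r, 0, sizeof(*r));
    r->status = ST_DRDY;
}

/* Normal completion path: the status register is read and recorded. */
static void cmd_complete(struct cmd_result *r, uint8_t hw_status)
{
    r->status = hw_status;
    r->status_read = 1;
}

/* Timeout path: the register is deliberately left untouched. */
static void cmd_timeout(struct cmd_result *r)
{
    r->timed_out = 1;
}

/* Interpret a result without trusting a status that was never read. */
static const char *cmd_describe(const struct cmd_result *r)
{
    if (r->timed_out && !r->status_read)
        return "timeout (status is the init fill value)";
    if (r->status & (ST_ERR | ST_DF))
        return "device error";
    if (r->status & ST_BSY)
        return "device busy";
    return "completed";
}
```

With this model, calling cmd_init() and then cmd_timeout() leaves
status at 0x40 even though nothing was read from the device, which is
the situation in the log above.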

> I have a client who is also seeing this exact scenario on 750GB drives,
> using a patched SLES10 kernel (2.6.16 + libata from 2.6.18 or so).

Hmm.. most of the FLUSH timeouts I've seen are caused by either a dying
drive or a bad PSU.  There just isn't much that can go wrong on the
driver side.  IIRC, there was a problem when the unused part of the TF
was not cleared, but that was the only one.
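
As a rough illustration of the "unused part of TF" issue, the sketch
below zeroes a taskfile before issuing FLUSH CACHE EXT so that no stale
LBA or count bytes from an earlier command are sent along.  The struct
layout and the helper names are assumptions made for this example, not
the libata definitions, and the details of the historical bug are not
reproduced here.  Only the opcode (0xEA) is the standard ATA value.

```c
#include <stdint.h>
#include <string.h>

#define CMD_FLUSH_EXT 0xEAu  /* ATA FLUSH CACHE EXT opcode */

/* Hypothetical taskfile layout, simplified for illustration. */
struct taskfile {
    uint8_t command;
    uint8_t feature, nsect, lbal, lbam, lbah, device;
    uint8_t hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah;
};

/* Build a FLUSH CACHE EXT taskfile with every unused field cleared. */
static void build_flush_ext(struct taskfile *tf)
{
    memset(tf, 0, sizeof(*tf));  /* drop anything left from a prior command */
    tf->command = CMD_FLUSH_EXT;
}

/* Return 1 if every field other than the command byte is zero. */
static int unused_fields_clear(const struct taskfile *tf)
{
    struct taskfile copy = *tf;
    struct taskfile zero;

    memset(&zero, 0, sizeof(zero));
    copy.command = 0;
    return memcmp(&copy, &zero, sizeof(zero)) == 0;
}
```

A caller that reuses a taskfile from a previous read or write and only
overwrites the command byte would fail the unused_fields_clear() check,
which is the kind of mistake the memset() guards against.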

> Smartctl output is clean (no logged errors), and the drives themselves
> are fine after a reboot -- necessary since libata/scsi kicked the drive out
> of the RAID array.
>
> Something strange is going on here.

Any chance you can trick the client into hooking up the drive to a
separate PSU?

Thanks.

-- 
tejun
