From: Tejun Heo <teheo@suse.de>
To: Chris Webb <chris@arachsys.com>
Cc: linux-scsi@vger.kernel.org, Ric Wheeler <rwheeler@redhat.com>,
Andrei Tanas <andrei@tanas.ca>, NeilBrown <neilb@suse.de>,
linux-kernel@vger.kernel.org,
IDE/ATA development list <linux-ide@vger.kernel.org>,
Jeff Garzik <jgarzik@redhat.com>, Mark Lord <mlord@pobox.com>
Subject: Re: MD/RAID time out writing superblock
Date: Mon, 14 Sep 2009 16:44:49 +0900
Message-ID: <4AADF471.2020801@suse.de>
In-Reply-To: <4AADF3C4.5060004@kernel.org>
Tejun Heo wrote:
>> I wonder what's different about these two timeouts such that one causes an I/O
>> error and the other just causes a retry after reset? Presumably if the latter
>> was also just a retry, everything would be (closer to being) fine.
>
> Because this error is actually seen by the md layer and FLUSH in
> general can't be retried cleanly. On retry, the drive goes on and
> retries the sectors after the point of failure. I'm not sure whether
> FLUSH is actually failing here or it's a communication glitch. At any
> rate, if FLUSH is failing or timing out, the only right thing to do is
> to kick it out of the array as keeping it after retrying may lead to
> silent data corruption. Seriously, it's most likely a hardware
> malfunction although I can't tell where the problem is with the given
> data. Get the hardware fixed.
Oooh, another possibility is the continuous IDENTIFY attempts above.
Doing things like that generally isn't a good idea because vendors
don't expect IDENTIFY to be mixed regularly with normal IOs and
firmware isn't tested against that. Even SMART commands sometimes
cause problems. So, finding out what is obsessed with the identity
of the drive and stopping it might help.
Thanks.
--
tejun