From: Ric Wheeler <rwheeler@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Andrei Tanas <andrei@tanas.ca>, NeilBrown <neilb@suse.de>,
	linux-kernel@vger.kernel.org,
	IDE/ATA development list <linux-ide@vger.kernel.org>,
	linux-scsi@vger.kernel.org, Jeff Garzik <jgarzik@redhat.com>,
	Mark Lord <mlord@pobox.com>
Subject: Re: MD/RAID time out writing superblock
Date: Mon, 31 Aug 2009 08:04:26 -0400
Message-ID: <4A9BBC4A.6070708@redhat.com>
In-Reply-To: <4A9B8583.9050601@kernel.org>

On 08/31/2009 04:10 AM, Tejun Heo wrote:
> Ric Wheeler wrote:
>    
>> On 08/27/2009 05:22 PM, Andrei Tanas wrote:
>>      
>>> Hello,
>>>
>>> This is about the same problem that I wrote about two days ago (md gets an
>>> error while writing the superblock and fails a hard drive).
>>>
>>> I've tried to figure out what's really going on, and as far as I can tell,
>>> the disk doesn't really fail (as confirmed by multiple tests); it times out
>>> trying to execute the ATA_CMD_FLUSH_EXT command ("ata2.00 cmd ea..." in the
>>> log). The reason for this, I believe, is that md_super_write queues the
>>> write command with the BIO_RW_SYNCIO flag.
>>> As I wrote before, with a 32MB cache it is conceivable that it will take the
>>> drive longer than 30 seconds (defined by SD_TIMEOUT in scsi/sd.h) to flush
>>> its buffers.
>>>
>>> Changing safe_mode_delay to a more conservative 2 seconds should definitely
>>> help, but is it really necessary to write the superblock synchronously when
>>> the array changes status from active to active-idle?
>>>
>>> [90307.328266] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
>>> [90307.328275] ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
>>> [90307.328277]          res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>>> [90307.328280] ata2.00: status: { DRDY }
>>> [90307.328288] ata2: hard resetting link
>>> [90313.218511] ata2: link is slow to respond, please be patient (ready=0)
>>> [90317.377711] ata2: SRST failed (errno=-16)
>>> [90317.377720] ata2: hard resetting link
>>> [90318.251720] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>>> [90318.338026] ata2.00: configured for UDMA/133
>>> [90318.338062] ata2: EH complete
>>> [90318.370625] end_request: I/O error, dev sdb, sector 1953519935
>>> [90318.370632] md: super_written gets error=-5, uptodate=0
>>>
>>>
>>>        
>> 30 seconds is a very long time for a drive to respond, but I think that
>> your explanation fits the facts pretty well...
>>      
> Even with a 32MB cache, 30 seconds should be more than enough.  It's not
> like the drive is going to do random writes on those.  It's likely to make
> only a few strokes over the platter, and it really shouldn't take very
> long.  I have yet to see an actual case where a properly functioning drive
> timed out on a flush because the flush itself took that long.
>
>    

I agree - vendors put a lot of pressure on drive manufacturers to finish 
up (even during error recovery) in much less than 30 seconds. The push 
was always for something closer to 15 seconds iirc.
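
For anyone checking which timeout is in effect on their own system: the 30 seconds mentioned above is the default, and the per-device value is exposed through sysfs as /sys/block/<dev>/device/timeout. Below is a minimal Python sketch for reading it. The function name read_cmd_timeout and the sysfs_root parameter are made up for illustration, and whether a cache flush is actually bounded by this value may depend on the kernel version.

```python
from pathlib import Path
from typing import Optional, Union


def read_cmd_timeout(dev: str,
                     sysfs_root: Union[str, Path] = "/sys/block") -> Optional[int]:
    """Return the SCSI command timeout (in seconds) for block device `dev`.

    Returns None when the sysfs attribute is missing or unreadable,
    for example when the device does not exist on this machine.
    """
    path = Path(sysfs_root) / dev / "device" / "timeout"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # "sdb" is the device named in the quoted log; substitute your own.
    print(read_cmd_timeout("sdb"))
```

The sysfs_root argument is only there so the function can be pointed at a scratch directory for testing.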

>> The drive might take a longer time like this when doing error handling
>> (sector remapping, etc), but then I would expect to see your remapped
>> sector count grow.
>>      
> Yes, this is a possibility, and according to the spec libata EH should
> be retrying flushes a few times before giving up, but I'm not sure
> whether retrying for several minutes is a good idea either.
> Is it?
>
> Thanks.
>
>    

I don't think that retrying for minutes is a good idea. I wonder if this 
could be caused by power issues or cable issues to the drive?
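
To make the quoted log easier to read, here is a small Python sketch that pulls out the ATA opcode of the failed command and the time spent in error handling. The three log lines are copied from Andrei's report with the mail wrapping undone. The function name summarize and the two-entry opcode table are illustrative only; 0xE7 and 0xEA are the FLUSH CACHE and FLUSH CACHE EXT opcodes.

```python
import re

# Lines taken from the quoted report (mail line wrapping undone).
LOG = """\
[90307.328266] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[90307.328275] ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[90318.338062] ata2: EH complete
"""

# Only the two flush opcodes are listed; this is not a full ATA table.
ATA_OPCODES = {0xE7: "FLUSH CACHE", 0xEA: "FLUSH CACHE EXT"}

LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+ata\d+(?:\.\d+)?:\s+(.*)$")
CMD_RE = re.compile(r"^cmd ([0-9a-f]{2})/")


def summarize(log):
    """Return (command name, seconds from first exception to 'EH complete')."""
    start = end = command = None
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        if msg.startswith("exception") and start is None:
            start = ts
        elif msg.startswith("EH complete"):
            end = ts
        cmd = CMD_RE.match(msg)
        if cmd:
            opcode = int(cmd.group(1), 16)
            command = ATA_OPCODES.get(opcode, f"unknown (0x{opcode:02x})")
    elapsed = None if start is None or end is None else round(end - start, 3)
    return command, elapsed


if __name__ == "__main__":
    print(summarize(LOG))
```

On this log it reports FLUSH CACHE EXT and roughly 11 seconds between the exception and "EH complete". That interval is link recovery after the command timeout had already expired, and the "link is slow to respond" and "SRST failed" messages fall inside it.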

Ric


