From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tejun Heo
Subject: Re: hdaprm -Y /dev/sda /dev/sdb -> I/O error -> disk kicked out of RAID - is it normal?
Date: Fri, 17 Jul 2009 14:24:00 +0900
Message-ID: <4A600AF0.2070102@kernel.org>
References: <4A5FB1DD.3000904@wpkg.org>
 <4A5FFA46.1070203@kernel.org>
 <29d09dacb3dce47830a95bc6493d3b88.squirrel@neil.brown.name>
 <4A6002AC.9090807@kernel.org>
 <4A600A03.5040808@garzik.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <4A600A03.5040808@garzik.org>
Sender: linux-ide-owner@vger.kernel.org
To: Jeff Garzik
Cc: NeilBrown, Tomasz Chmielewski,
 linux-raid@vger.kernel.org, linux-ide@vger.kernel.org
List-Id: linux-raid.ids

Jeff Garzik wrote:
> Tejun Heo wrote:
>> NeilBrown wrote:
>>> I tried that and easily found cases where it fails way too fast.
>>> FAILFAST seems to mean different things on different devices, making
>>> it useless in general (it is still useful in some specific cases
>>> such as multipath on devices which are expected to be used under
>>> multipath and so treat FAILFAST appropriately).
>>
>> Yeap, FAILFAST flags seem geared pretty much toward multipathing.
>
> Yes :/
>
> I'm glad this area is getting some attention, because we ideally want to
> do two things in parallel:
>
> * send upper layer advisory message, when we first notice a failure
> * begin EH recovery
>
> Time passes, libata attempts recovery, and completes the command with
> success or failure many seconds later.
>
> Right now, failfast handling is inconsistent, and is not (I think...)
> always signalled as soon as we begin EH.

Heh.. yeah, it's notified on completion of EH, which BTW is pretty
dumb. :-)

--
tejun