From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ric Wheeler
Subject: Re: [PATCHSET #upstream] libata: improve FLUSH error handling
Date: Fri, 28 Mar 2008 10:53:53 -0400
Message-ID: <47ED0681.4090003@emc.com>
References: <12066128663306-git-send-email-htejun@gmail.com>
 <47EBAE2B.8070102@rtr.ca> <47EBB09F.9070607@rtr.ca>
 <47EC5079.5020105@gmail.com> <47EC58F6.3070601@rtr.ca>
 <47ECF47A.2040508@emc.com> <47ED061F.2070701@gmail.com>
Reply-To: ric@emc.com
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mexforward.lss.emc.com ([128.222.32.20]:50366
 "EHLO mexforward.lss.emc.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1753831AbYC1O4r (ORCPT );
 Fri, 28 Mar 2008 10:56:47 -0400
In-Reply-To: <47ED061F.2070701@gmail.com>
Sender: linux-ide-owner@vger.kernel.org
List-Id: linux-ide@vger.kernel.org
To: Tejun Heo
Cc: Mark Lord , jeff@garzik.org, linux-ide@vger.kernel.org, alan@lxorguk.ukuu.org.uk

Tejun Heo wrote:
> Ric Wheeler wrote:
>> I think that is a really important knob to have. Not just for RAID
>> systems, but we use the FLUSH_CACHE on systems without barriers mainly
>> when we power down & do the unmounts, etc.
>>
>> If you hit a bad block during power down of a laptop, I can imagine that
>> having a worst case of (30?) seconds is infinitely better than multiple
>> minutes ;-)
>
> Fully finishing FLUSH CACHE requires command repetition. Not fully
> finishing FLUSH CACHE on shutdown means sure data loss. Given that
> FLUSH CACHE failure is very rare and it's repeatedly retried if and only
> if the device actively indicates failure, I'm not too sure. Also note
> that if FLUSH CACHE fails, you cannot even trust the FS journal. Things
> can get silently corrupt.
>

I do agree with the above: we should try to get the FLUSH done according
to spec. I only meant to argue that we should bound the time spent.

If my laptop spends more than 30? 60? 120? seconds trying to flush a write
cache, I will probably be looking for a way to force it to power down ;-)

It is also worth noting that most users of ext3 run without barriers
enabled (and with the drive write cache enabled), which means that we test
this corruption path on any non-UPS power failure.

ric