From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1752804AbZHaMVu (ORCPT ); Mon, 31 Aug 2009 08:21:50 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
    id S1752513AbZHaMVt (ORCPT ); Mon, 31 Aug 2009 08:21:49 -0400
Received: from hera.kernel.org ([140.211.167.34]:39929 "EHLO hera.kernel.org"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1752470AbZHaMVr (ORCPT ); Mon, 31 Aug 2009 08:21:47 -0400
Message-ID: <4A9BC023.10903@kernel.org>
Date: Mon, 31 Aug 2009 21:20:51 +0900
From: Tejun Heo
User-Agent: Thunderbird 2.0.0.22 (X11/20090605)
MIME-Version: 1.0
To: Ric Wheeler
CC: Andrei Tanas, NeilBrown, linux-kernel@vger.kernel.org,
    IDE/ATA development list, linux-scsi@vger.kernel.org,
    Jeff Garzik, Mark Lord
Subject: Re: MD/RAID time out writing superblock
References: <004e01ca25e4$c11a54e0$434efea0$@ca>
    <9cfb6af689a7010df166fdebb1ef516b.squirrel@neil.brown.name>
    <4A948A82.4080901@redhat.com> <4A94905F.7050705@redhat.com>
    <005101ca25f4$09006830$1b013890$@ca> <4A94A0E6.4020401@redhat.com>
    <005401ca25ff$9ac91cc0$d05b5640$@ca> <4A950FA6.4020408@redhat.com>
    <92cb16daad8278b0aa98125b9e1d057a@localhost> <4A95573A.6090404@redhat.com>
    <1571f45804875514762f60c0097171e6@localhost> <4A970154.2020507@redhat.com>
    <4A9B8583.9050601@kernel.org> <4A9BBC4A.6070708@redhat.com>
In-Reply-To: <4A9BBC4A.6070708@redhat.com>
X-Enigmail-Version: 0.95.7
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.0
    (hera.kernel.org [127.0.0.1]); Mon, 31 Aug 2009 12:20:54 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Ric Wheeler wrote:
>>> The drive might take a longer time like this when doing error handling
>>> (sector remapping, etc), but then I would expect to see your remapped
>>> sector count grow.
>>>
>> Yes, this is a possibility and according to the spec, libata EH should
>> be retrying flushes a few times before giving up but I'm not sure
>> whether keeping retrying for several minutes is a good idea either.
>> Is it?
>
> I don't think that retrying for minutes is a good idea. I wonder if this
> could be caused by power issues or cable issues to the drive?

IIRC, there were two identified weird reasons for flush timeouts. The
first was quirky firmware, where using NCQ led to timeouts on FLUSH.
The second was flaky power. So, yeah, it can be caused by a power
issue. Not so sure about the cable tho.

Thanks.

-- 
tejun