From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pierre Beck
Subject: Tunables for timeout behaviour?
Date: Mon, 23 Apr 2012 23:12:00 +0200
Message-ID: <4F95C59A.7010706@pierre-beck.de>
To: linux-scsi@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org

Hi,

when I asked on the linux-raid list about implementing timeouts in mdraid, Neil Brown answered with a quite definitive "no" and said the next lower level should take care of timeouts. So here I am :-)

On a timeout, SCSI currently resets the device, then the bus, then the host adapter. I currently experience bus and/or host adapter resets as several minutes of zero I/O on HW RAID cards. I can't speak for simple HBAs, but HW RAID cards seem to search for logical volumes on all attached disks or something similar, and take ages to "reset" even in JBOD mode.

In other words, a bad sector may escalate into unnecessarily long unavailability of a whole array of disks, despite mdraid being in place. Yes, drives with TLER / SCTERC management should not spend much time on bad sectors, but not every drive supports that, and a *failing* drive with crashed firmware won't be cooperative anyway.

So is there any way to influence this behaviour? I think the mdraid layer could profit a lot from stopping retries at the device level: mdraid can simply fail the device and keep working.
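To make the escalation order described above concrete, here is a toy Python model of it. This is only a sketch, not kernel code: `ResetLevel`, `handle_timeout`, `reset_ok` and `max_level` are names made up for illustration, and `max_level` stands in for the kind of cap I am asking about rather than anything that exists today.

```python
from enum import IntEnum


class ResetLevel(IntEnum):
    """Escalation steps in the order described above (illustrative names)."""
    DEVICE = 1
    BUS = 2
    HOST = 3


def handle_timeout(reset_ok, max_level=ResetLevel.HOST):
    """Walk the escalation ladder until a reset succeeds or the cap is hit.

    reset_ok:  callable taking a ResetLevel and returning True if that
               reset recovered the device (a stand-in for real hardware).
    max_level: hypothetical cap on how far escalation may go.
    Returns a tuple (levels tried, outcome string).
    """
    tried = []
    for level in ResetLevel:  # iterates DEVICE, BUS, HOST in that order
        if level > max_level:
            break
        tried.append(level)
        if reset_ok(level):
            return tried, "recovered"
    # Nothing within the allowed levels helped: report the device as
    # failed so an upper layer such as mdraid can drop it and carry on.
    return tried, "failed"


if __name__ == "__main__":
    def dead_drive(level):
        # A drive with crashed firmware: no reset ever helps.
        return False

    # Without a cap, all three levels are tried before giving up.
    print(handle_timeout(dead_drive))
    # With the cap at DEVICE, the bus and host adapter are left alone.
    print(handle_timeout(dead_drive, max_level=ResetLevel.DEVICE))
```

With the cap at the device level, a dead drive costs one device reset instead of a bus and host adapter reset that stalls every other disk on the card.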

Neil mentioned "FAILFAST" flags that can be added to requests issued by mdraid, but after reading a few patches I suspect these are meant for multipath only (is there zero documentation on FAILFAST?). What I'd wish for is a tunable that goes along with /sys/bus/scsi/devices/.../timeout to limit escalation to a specific level, and/or some way for mdraid to flag requests.

Greetings,

Pierre Beck