From: John Ruemker
Subject: Re: multipath failover & rhcs
Date: Mon, 25 Apr 2011 13:29:21 -0400
Message-ID: <4DB5AF71.6060808@redhat.com>
References: <4DB5A901.9040102@redhat.com>
In-Reply-To: <4DB5A901.9040102@redhat.com>
Reply-To: device-mapper development
To: Dave Sullivan
Cc: lvm-team@redhat.com, dm-devel@redhat.com, Lon Hohberger
List-Id: dm-devel.ids

On 04/25/2011 01:01 PM, Dave Sullivan wrote:
> Hi Guys,
>
> It seems recently that we have just run into this problem where we
> don't fully understand the timeouts that drive multipath fail-over.
>
> We did thorough testing of pulling fibre/failing hbas manually and
> multipath handled things perfectly.
>
> Recently we encountered SCSI Block errors, where the multipath
> fail-over did not occur before the qdisk timeout.
>
> This was attributed to the scsi block errors and the scsi lun timeout
> of 60 seconds which is set by default.
>
> I added a comment to the first link below that discusses a situation
> that would cause this to occur. We think that this was due to a
> defective HBA under high I/O load.
>
> Once we get the HBA in question we will run some tests to validate
> that modifying the scsi block timeouts in fact allows multipath to
> fail-over in time to beat the qdisk timeout.
>
> I'm getting ready to take a look at the code to see if I can
> validate these theories. The area that is still somewhat gray is the
> true definition for multipath timings for failover.
>
> I don't think there is a true definition of a multipath timeout, per
> se.
> I see it as the following:
>
> multipath check = every 20 seconds for no failed paths
> multipath check (if failed paths) = every 5 seconds on failed paths only
>
> multipath failover occurs = driver timeout attribute met (Emulex
> lpfc_devloss_tmo value)
> --capture pulling fibre
> --capture disabling hba
>
> or (for other types of failures)
>
> multipath failover occurs = scsi block timeout + driver timeout (not
> sure if the driver timeout attribute is added)
>
> https://access.redhat.com/kb/docs/DOC-2881
> https://docspace.corp.redhat.com/docs/DOC-32822
>
> Hmm, just found out that there was a new fix in rhel5u5 for this, it
> looks like, from this case in salesforce 00085953.
>
> -Dave
>

Hi Dave,

These are issues we have recently been working to resolve with this and
other qdisk articles. The problem is as you described it: we don't have
an accurate definition of how long it will take multipath to fail a path
in all scenarios. The formula used in the article is basically wrong,
and we're working to fix it, but coming up with a formula for a path
timeout has been difficult. This calculation should not be based on
no_path_retry at all, as we are really only concerned with the amount of
time it takes for the SCSI layer to return an error, allowing qdisk's
I/O operation to be sent down an alternate path.

Regarding the formula you posted:

>> multipath check = every 20 seconds for no failed paths
>> multipath check (if failed paths) = every 5 seconds on failed paths only

Just to clarify, the polling interval doubles after each successful path
check, up to 4 times the original. So you're correct that for a healthy
path you should see it checking every 20s after the first few checks.
Likewise, your second statement is also accurate in that after a failed
check, it drops back to the configured polling interval until the path
returns to active status.

Regarding case 00085953, I was actually the owner of that one.
There was a change that went into 5.5 which lowered the default
tur/readsector0 SCSI I/O timeout from 300 to the checker_timeout value
(which defaults to the timeout value in /sys/block/sdX/device/timeout).

I am very interested in any information you come up with on the
calculation of how long a path failure will take. We will integrate that
into this article if you can come up with anything.

Let me know if you have any questions.

--
John Ruemker, RHCA
Technical Account Manager
Global Support Services
Red Hat, Inc.
Office: 919-754-4941
Cell: 919-793-8549
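
For reference, the path-check scheduling described in the message above
(the interval doubles after each successful check, up to 4 times the
configured value, and drops back to the configured value after a failed
check) can be sketched roughly as follows. This is only an illustration
of that description, not multipathd's actual code: the function name
check_intervals, its parameters, and the 5-second base interval (inferred
from the 5s/20s figures quoted above) are assumptions.

```python
def check_intervals(results, polling_interval=5, max_factor=4):
    """Return the delay in seconds scheduled after each path check.

    results: iterable of booleans, True for a successful check.
    polling_interval and max_factor are illustrative assumptions
    (5 seconds and 4x, matching the figures in the message).
    """
    interval = polling_interval
    delays = []
    for ok in results:
        if ok:
            # A successful check doubles the interval, capped at
            # max_factor times the configured polling interval.
            interval = min(interval * 2, polling_interval * max_factor)
        else:
            # A failed check drops back to the configured interval.
            interval = polling_interval
        delays.append(interval)
    return delays


if __name__ == "__main__":
    # Three successful checks, one failure, then one success.
    print(check_intervals([True, True, True, False, True]))
```

With the assumed 5-second base, three successful checks followed by a
failure and another success give delays of 10, 20, 20, 5 and 10 seconds.
Note that this only models how often paths are checked; it says nothing
about how long the SCSI layer takes to return an error on a failing path.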