From: Hannes Reinecke <hare@suse.de>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Vu Pham <vuhuong@mellanox.com>, Roland Dreier <roland@kernel.org>,
David Dillow <dillowda@ornl.gov>,
Sebastian Riemer <sebastian.riemer@profitbricks.com>,
linux-rdma <linux-rdma@vger.kernel.org>,
linux-scsi <linux-scsi@vger.kernel.org>,
James Bottomley <jbottomley@parallels.com>
Subject: Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
Date: Mon, 17 Jun 2013 09:14:56 +0200
Message-ID: <51BEB770.9030305@suse.de>
In-Reply-To: <51BEB4FF.9000607@acm.org>
On 06/17/2013 09:04 AM, Bart Van Assche wrote:
> On 06/17/13 08:18, Hannes Reinecke wrote:
>> On 06/15/2013 11:52 AM, Bart Van Assche wrote:
[ .. ]
>>>
>>> I think the advantage of multipathd recognizing the SDEV_BLOCK state
>>> before the fast_io_fail_tmo timer has expired is important.
>>> Multipathd does not queue I/O to paths that are in the SDEV_BLOCK
>>> state so setting that state helps I/O to fail over more quickly,
>>> especially for large values of fast_io_fail_tmo.
>>>
>> Sadly it doesn't work that way.
>>
>> SDEV_BLOCK will instruct multipath to not queue _new_ I/Os to the
>> path, but there still will be I/O queued on that path.
>> For these multipath _has_ to wait for I/O completion.
>> And as it turns out, in most cases the application itself will wait
>> for completion of this I/O before it continues sending more I/O.
>> So in effect multipath would queue new I/O to other paths, but won't
>> _receive_ new I/O as the upper layers are still waiting for
>> completion of the queued I/O.
>>
>> The only way to achieve fast failover with multipathing is to set
>> fast_io_fail to a _LOW_ value (eg 5 seconds), as this will terminate
>> the outstanding I/Os.
>>
>> Large values of fast_io_fail will almost guarantee sluggish I/O
>> failover.
>
> Hello Hannes,
>
> I agree that the value of fast_io_fail_tmo should be kept small.
> Although as you explained changing the SCSI device state into
> SDEV_BLOCK doesn't help for I/O that has already been queued on a
> failed path, I think it's still useful for I/O that is queued after
> the fast_io_fail timer has been started and before that timer has
> expired.
>
Why, but of course.
The typical scenario would be:

-> detect link-loss
-> call scsi_block_request()
-> start dev_loss_tmo and fast_io_fail_tmo
-> When fast_io_fail_tmo triggers:
   -> Abort all outstanding requests
-> When dev_loss_tmo triggers:
   -> Abort all outstanding requests
   -> Remove/disable the I_T nexus
   -> call scsi_unblock_request()
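
The scenario above can be sketched as a small user-space model. This is
only an illustration, not kernel code: struct port_model, the PORT_*
states and the on_link_loss()/on_tick() helpers are hypothetical names
invented for this sketch, and time is a plain integer count of seconds.

```c
#include <assert.h>

/* Hypothetical states for this sketch; not the kernel's rport states. */
enum port_state { PORT_RUNNING, PORT_BLOCKED, PORT_FAIL_FAST, PORT_LOST };

struct port_model {
	enum port_state state;
	int fast_io_fail_tmo;	/* seconds after link loss */
	int dev_loss_tmo;	/* seconds after link loss */
	int link_loss_time;	/* time at which link loss was detected */
	int outstanding;	/* requests queued on the failed path */
	int aborted;		/* requests terminated by a timer */
};

/* Link loss detected: block the device and start both timers. */
void on_link_loss(struct port_model *p, int now)
{
	assert(p->state == PORT_RUNNING);
	p->state = PORT_BLOCKED;
	p->link_loss_time = now;
}

/* Terminate everything still queued on the failed path. */
void abort_outstanding(struct port_model *p)
{
	p->aborted += p->outstanding;
	p->outstanding = 0;
}

/* Advance the model to time 'now' and fire any timer that has expired. */
void on_tick(struct port_model *p, int now)
{
	int elapsed;

	if (p->state == PORT_RUNNING || p->state == PORT_LOST)
		return;

	elapsed = now - p->link_loss_time;

	/* fast_io_fail_tmo: abort outstanding requests, keep the port. */
	if (p->state == PORT_BLOCKED && elapsed >= p->fast_io_fail_tmo) {
		abort_outstanding(p);
		p->state = PORT_FAIL_FAST;
	}

	/* dev_loss_tmo: abort again, drop the I_T nexus, unblock. */
	if (elapsed >= p->dev_loss_tmo) {
		abort_outstanding(p);
		p->state = PORT_LOST;
	}
}
```

In this model, requests that were outstanding at link loss stay queued
until fast_io_fail_tmo has elapsed, which is why a small value matters.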
However, whether or not multipath detects SDEV_BLOCK, that alone doesn't
guarantee a fast failover; in fact the check was only added rather recently,
as it's not a big win in most cases.
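
To put rough numbers on this, here is a deliberately simplified sketch.
The function reroute_time() and its parameters are hypothetical, and the
model assumes that a request sitting on the blocked path is only released
when fast_io_fail_tmo expires; it ignores path checker polling intervals
and retries.

```c
#include <assert.h>

/*
 * Returns the time, in seconds relative to link loss, at which a request
 * becomes available to another path.
 *
 * submit_offset: submission time relative to link loss; a negative value
 *                means the request was already outstanding at link loss.
 * sees_block:    nonzero if multipath notices SDEV_BLOCK and stops
 *                queueing new I/O to the failed path.
 */
int reroute_time(int submit_offset, int fast_io_fail_tmo, int sees_block)
{
	assert(fast_io_fail_tmo >= 0);

	/* Already queued on the failed path: wait for the timer. */
	if (submit_offset < 0)
		return fast_io_fail_tmo;

	/* Submitted after the timer fired: the path fails fast. */
	if (submit_offset >= fast_io_fail_tmo)
		return submit_offset;

	/* Submitted while blocked: only SDEV_BLOCK detection helps here. */
	return sees_block ? submit_offset : fast_io_fail_tmo;
}
```

Under these assumptions, with fast_io_fail_tmo set to 5, a request that
was outstanding at link loss is released after 5 seconds whether or not
SDEV_BLOCK is detected; with a value of 120 it waits 120 seconds. Only
requests submitted while the path is blocked benefit from the detection.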
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)