From: Bart Van Assche <bvanassche@acm.org>
To: Hannes Reinecke <hare@suse.de>
Cc: Vu Pham <vuhuong@mellanox.com>, Roland Dreier <roland@kernel.org>,
David Dillow <dillowda@ornl.gov>,
Sebastian Riemer <sebastian.riemer@profitbricks.com>,
linux-rdma <linux-rdma@vger.kernel.org>,
linux-scsi <linux-scsi@vger.kernel.org>,
James Bottomley <jbottomley@parallels.com>
Subject: Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
Date: Mon, 17 Jun 2013 09:04:31 +0200
Message-ID: <51BEB4FF.9000607@acm.org>
In-Reply-To: <51BEAA40.9070908@suse.de>
On 06/17/13 08:18, Hannes Reinecke wrote:
> On 06/15/2013 11:52 AM, Bart Van Assche wrote:
>> On 06/14/13 19:59, Vu Pham wrote:
>>>> On 06/13/13 21:43, Vu Pham wrote:
>>>>>> +/**
>>>>>> + * srp_tmo_valid() - check timeout combination validity
>>>>>> + *
>>>>>> + * If no fast I/O fail timeout has been configured then the device loss timeout
>>>>>> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail timeout has
>>>>>> + * been configured then it must be below the device loss timeout.
>>>>>> + */
>>>>>> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
>>>>>> +{
>>>>>> + return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
>>>>>> + dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
>>>>>> + || (0 <= fast_io_fail_tmo &&
>>>>>> + (dev_loss_tmo < 0 ||
>>>>>> + (fast_io_fail_tmo < dev_loss_tmo &&
>>>>>> + dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
>>>>>> +}
>>>>>> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
>>>>> fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with negative value
>>>>> dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with negative value
>>>>
>>>> OK, will update the documentation such that it correctly refers to
>>>> "off" instead of a negative value and I will also mention that
>>>> dev_loss_tmo can now be disabled.
>>>>
>>> It's not only the documentation but also the code logic, you cannot turn
>>> dev_loss_tmo off if fast_io_fail_tmo already turned off and vice versa
>>> with the return statement above.
>>
>> Does this mean that you think it would be useful to disable both the
>> fast_io_fail and the dev_loss mechanisms, and hence rely on the user
>> to remove remote ports that have disappeared and on the SCSI command
>> timeout to detect path failures ? I'll start testing this to see
>> whether that combination does not trigger any adverse behavior.
>>
>>>>> If rport's state is already SRP_RPORT_BLOCKED, I don't think we need
>>>>> to do extra block with scsi_block_requests()
>>>>
>>>> Please keep in mind that srp_reconnect_rport() can be called from two
>>>> different contexts: that function can not only be called from inside
>>>> the SRP transport layer but also from inside the SCSI error handler
>>>> (see also the srp_reset_device() modifications in a later patch in
>>>> this series). If this function is invoked from the context of the SCSI
>>>> error handler the chance is high that the SCSI device will have
>>>> another state than SDEV_BLOCK. Hence the scsi_block_requests() call in
>>>> this function.
>>> Yes, srp_reconnect_rport() can be called from two contexts; however, it
>>> deals with same rport & rport's state.
>>> I'm thinking something like this:
>>>
>>> if (rport->state != SRP_RPORT_BLOCKED) {
>>> scsi_block_requests(shost);
>>
>> Sorry but I'm afraid that that approach would still allow the user
>> to unblock one or more SCSI devices via sysfs during the
>> i->f->reconnect(rport) call, something we do not want.
>>
>>> I think that we can use only the pair
>>> scsi_block_requests()/scsi_unblock_requests() unless the advantage of
>>> multipathd recognizing the SDEV_BLOCK is noticeable.
>>
>> I think the advantage of multipathd recognizing the SDEV_BLOCK state
>> before the fast_io_fail_tmo timer has expired is important.
>> Multipathd does not queue I/O to paths that are in the SDEV_BLOCK
>> state so setting that state helps I/O to fail over more quickly,
>> especially for large values of fast_io_fail_tmo.
>>
> Sadly it doesn't work that way.
>
> SDEV_BLOCK will instruct multipath to not queue _new_ I/Os to the
> path, but there still will be I/O queued on that path.
> For these multipath _has_ to wait for I/O completion.
> And as it turns out, in most cases the application itself will wait
> for completion on these I/O before continue sending more I/O.
> So in effect multipath would queue new I/O to other paths, but won't
> _receive_ new I/O as the upper layers are still waiting for
> completion of the queued I/O.
>
> The only way to excite fast failover with multipathing is to set
> fast_io_fail to a _LOW_ value (eg 5 seconds), as this will terminate
> the outstanding I/Os.
>
> Large values of fast_io_fail will almost guarantee sluggish I/O
> failover.

Hello Hannes,

I agree that the value of fast_io_fail_tmo should be kept small.
Although, as you explained, changing the SCSI device state to SDEV_BLOCK
doesn't help for I/O that has already been queued on a failed path, I
think it's still useful for I/O that is queued after the fast_io_fail
timer has been started and before that timer has expired.

Bart.
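
For reference, the srp_tmo_valid() check quoted above can be exercised
outside the kernel. What follows is only a minimal userspace sketch, not
the patch itself: tmo_valid() and tmo_valid_examples() are hypothetical
names, and the HZ and SCSI_DEVICE_BLOCK_MAX_TIMEOUT values are assumed
stand-ins for the kernel definitions. It illustrates the point raised by
Vu Pham: when both timeouts are negative ("off"), the check as posted
returns -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Assumed stand-ins for kernel constants; the real values come from the
 * kernel headers and configuration, not from this sketch. */
#define HZ 1000
#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT 600

/*
 * Userspace rendering of the timeout check quoted in this thread.
 * A negative timeout means "off". Returns 0 if the combination is
 * accepted and -EINVAL otherwise.
 */
int tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
{
	/* fast_io_fail off: dev_loss must lie in 1..SCSI_DEVICE_BLOCK_MAX_TIMEOUT. */
	if (fast_io_fail_tmo < 0)
		return (1 <= dev_loss_tmo &&
			dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT) ? 0 : -EINVAL;

	/* fast_io_fail on, dev_loss off: accepted. */
	if (dev_loss_tmo < 0)
		return 0;

	/* Both on: fast_io_fail must be strictly below dev_loss. */
	return (fast_io_fail_tmo < dev_loss_tmo &&
		dev_loss_tmo < LONG_MAX / HZ) ? 0 : -EINVAL;
}

/* A few combinations discussed in the thread. */
void tmo_valid_examples(void)
{
	assert(tmo_valid(-1, -1) == -EINVAL);	/* both off: rejected */
	assert(tmo_valid(5, -1) == 0);		/* only dev_loss off: accepted */
	assert(tmo_valid(5, 60) == 0);		/* small fast_io_fail: accepted */
}
```

Calling tmo_valid_examples() passes silently. Accepting the "both off"
combination would require an additional case in the first branch, which
is the behavior change under discussion rather than something this
sketch decides.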