From: Laurence Oberman <loberman@redhat.com>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
	Mike Snitzer <snitzer@redhat.com>,
	linux-block@vger.kernel.org, lsf@lists.linux-foundation.org,
	device-mapper development <dm-devel@redhat.com>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [Lsf] Notes from the four separate IO track sessions at LSF/MM
Date: Thu, 28 Apr 2016 12:23:44 -0400 (EDT)
Message-ID: <610090691.32303585.1461860624844.JavaMail.zimbra@redhat.com>
In-Reply-To: <5722320E.5080202@sandisk.com>

Hello Folks,

We still receive periodic complaints from our large customer base about long recovery times for dm-multipath.
Most of the time this happens with something like a switch backplane issue, or when RSCNs are blocked coming back up the fabric.
Corner cases still bite us often.

Most of the complaints come from customers who see, for example, Oracle cluster evictions: while we wait on the mid-layer, all mpath I/O is blocked until recovery completes.

We have to tune eh_deadline, eh_timeout and fast_io_fail_tmo, but even with those tuned and the timeouts set low we still have to wait on serial recovery.

Lately we have been living with
eh_deadline=10
eh_timeout=5
fast_io_fail_tmo=10
leaving the default sd timeout at 30s
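
For anyone wanting to try the same values, here is a minimal sketch of how they could be applied through sysfs. It assumes the usual Linux sysfs layout for SCSI hosts, SCSI disks and FC remote ports; the host, disk and rport names are hypothetical examples, and the helper names are made up for illustration. It only prints what it would write unless dry_run is turned off.

```python
# Values quoted in the message above. The sysfs locations are assumptions
# based on the usual Linux SCSI/FC sysfs layout; host, disk and rport names
# are hypothetical examples.
SETTINGS = {"eh_deadline": 10, "eh_timeout": 5, "fast_io_fail_tmo": 10}


def planned_writes(host, disk, rport, sysfs="/sys"):
    """Return (path, value) pairs for one SCSI host, one disk and one FC rport."""
    return [
        (f"{sysfs}/class/scsi_host/{host}/eh_deadline",
         SETTINGS["eh_deadline"]),
        (f"{sysfs}/block/{disk}/device/eh_timeout",
         SETTINGS["eh_timeout"]),
        (f"{sysfs}/class/fc_remote_ports/{rport}/fast_io_fail_tmo",
         SETTINGS["fast_io_fail_tmo"]),
    ]


def apply_writes(writes, dry_run=True):
    """Print each planned write; only touch the files when dry_run is False."""
    for path, value in writes:
        print(f"echo {value} > {path}")
        if not dry_run:
            with open(path, "w") as f:
                f.write(f"{value}\n")


if __name__ == "__main__":
    # Dry run for a hypothetical host1 / sdb / rport-1:0-0.
    apply_writes(planned_writes("host1", "sdb", "rport-1:0-0"))
```

The sd timeout is deliberately left out, since it stays at its 30s default. On a multipath setup the writes would need repeating for every host, path device and rport, and fast_io_fail_tmo may instead be managed by multipathd.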

So this continues to be an issue, and I can provide specific examples, captured using the jammer, that show the serial recovery times.

Thanks

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Bart Van Assche" <bart.vanassche@sandisk.com>
To: "James Bottomley" <James.Bottomley@HansenPartnership.com>, "Mike Snitzer" <snitzer@redhat.com>
Cc: linux-block@vger.kernel.org, lsf@lists.linux-foundation.org, "device-mapper development" <dm-devel@redhat.com>, "linux-scsi" <linux-scsi@vger.kernel.org>
Sent: Thursday, April 28, 2016 11:53:50 AM
Subject: Re: [Lsf] Notes from the four separate IO track sessions at LSF/MM

On 04/28/2016 08:40 AM, James Bottomley wrote:
> Well, the entire room, that's vendors, users and implementors
> complained that path failover takes far too long.  I think in their
> minds this is enough substance to go on.

The only complaints I heard about path failover taking too long came
from people working on FC drivers. Aren't SCSI transport layer
implementations expected to fail I/O once fast_io_fail_tmo has expired
instead of waiting until the SCSI error handler has finished? If so, why
is it considered an issue that error handling for the FC protocol can
take a very long time (hours)?

Thanks,

Bart.

