From: Laurence Oberman <loberman@redhat.com>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: linux-block@vger.kernel.org,
linux-scsi <linux-scsi@vger.kernel.org>,
Mike Snitzer <snitzer@redhat.com>,
James Bottomley <James.Bottomley@HansenPartnership.com>,
device-mapper development <dm-devel@redhat.com>,
lsf@lists.linux-foundation.org
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
Date: Thu, 28 Apr 2016 12:47:24 -0400 (EDT)
Message-ID: <74308856.32308210.1461862044976.JavaMail.zimbra@redhat.com>
In-Reply-To: <57223D36.60304@sandisk.com>
Hello Bart,
This is when a subset of the paths fails.
As you know, the remaining path won't be used until the eh_handler is either done or is short-circuited.
What I will do is set this up via my jammer and capture a test using the latest upstream.
Of course, my customer pain points are all in the RHEL kernels, so I need to capture a recovery trace
on the latest upstream kernel.
When the SCSI commands for a path are black-holed and remain that way, even with eh_deadline and the short-circuited adapter resets,
we simply try again and get back into the wait loop until we finally declare the device offline.
This can take a while and differs depending on QLogic, Emulex, fnic, etc.
First thing tomorrow I will set this up and show you what I mean.
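For anyone comparing the timeout settings discussed in this thread across hosts, here is a minimal sketch (not part of the planned jammer test) that collects the current values from sysfs. The sysfs locations in TUNABLES are assumptions about a typical FC setup and may differ by kernel version and transport; the function name read_tunables is purely illustrative.

```python
import glob
import os

# Assumed sysfs locations (relative to the sysfs root) of the tunables
# named in this thread; adjust for the kernel and transport in use.
TUNABLES = {
    "eh_deadline": "class/scsi_host/host*/eh_deadline",
    "eh_timeout": "class/scsi_device/*/device/eh_timeout",
    "sd_timeout": "class/scsi_device/*/device/timeout",
    "fast_io_fail_tmo": "class/fc_remote_ports/rport-*/fast_io_fail_tmo",
}


def read_tunables(sysfs_root="/sys"):
    """Return {tunable: {path: value}} for every matching sysfs file.

    A value of None means the file exists but could not be read.
    """
    result = {}
    for name, pattern in TUNABLES.items():
        values = {}
        for path in sorted(glob.glob(os.path.join(sysfs_root, pattern))):
            try:
                with open(path) as f:
                    values[path] = f.read().strip()
            except OSError:
                values[path] = None
        result[name] = values
    return result
```

Calling read_tunables() with no argument walks the live /sys tree; passing a different root makes it easy to run against a copied sysfs snapshot from a customer system.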
Thanks!!
Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services
----- Original Message -----
From: "Bart Van Assche" <bart.vanassche@sandisk.com>
To: "Laurence Oberman" <loberman@redhat.com>
Cc: linux-block@vger.kernel.org, "linux-scsi" <linux-scsi@vger.kernel.org>, "Mike Snitzer" <snitzer@redhat.com>, "James Bottomley" <James.Bottomley@HansenPartnership.com>, "device-mapper development" <dm-devel@redhat.com>, lsf@lists.linux-foundation.org
Sent: Thursday, April 28, 2016 12:41:26 PM
Subject: Re: [dm-devel] [Lsf] Notes from the four separate IO track sessions at LSF/MM
On 04/28/2016 09:23 AM, Laurence Oberman wrote:
> We still suffer from periodic complaints in our large customer base
> regarding the long recovery times for dm-multipath.
> Most of the time this is when we have something like a switch
> back-plane issue or an issue where RSCNs are blocked coming back up
> the fabric. Corner cases still bite us often.
>
> Most of the complaints originate from customers for example seeing
> Oracle cluster evictions where during the waiting on the mid-layer
> all mpath I/O is blocked until recovery.
>
> We have to tune eh_deadline, eh_timeout and fast_io_fail_tmo but
> even tuning those we have to wait on serial recovery even if we
> set the timeouts low.
>
> Lately we have been living with
> eh_deadline=10
> eh_timeout=5
> fast_io_fail_tmo=10
> leaving default sd timeout at 30s
>
> So this continues to be an issue and I have specific examples using
> the jammer I can provide showing the serial recovery times here.
Hello Laurence,
The long recovery times you refer to, is that for a scenario where all
paths failed or for a scenario where some paths failed and other paths
are still working? In the latter case, how long does it take before
dm-multipath fails over to another path?
Thanks,
Bart.