From: Mike Christie <michaelc@cs.wisc.edu>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Hannes Reinecke <hare@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
Gabriel C <nix.or.die@googlemail.com>,
linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: Multipath failover handling (Was: Re: 2.6.24-rc3-mm1)
Date: Mon, 07 Jan 2008 12:24:35 -0600
Message-ID: <47826E63.9070008@cs.wisc.edu>
In-Reply-To: <1199728667.3126.32.camel@localhost.localdomain>
James Bottomley wrote:
>>> However, there's still devloss_tmo to consider ... even in
>>> multipath, I don't think you want to signal path failure until
>>> devloss_tmo has fired otherwise you'll get too many transient up/down
>>> events which damage performance if the array has an expensive failover
>>> model.
>>>
>> Yes. But currently we have a very high failover latency as we always have
>> to wait for the requeued commands to time-out.
>> Hence we're damaging performance on arrays with inexpensive failover.
>
> If it's an either/or choice between the two that's showing our current
> approach to multi-path is broken.
>
>>> The other problem is what to do with in-flight commands at the time the
>>> link went down. With your current patch, they're still stuck until they
>>> time out ... surely there needs to be some type of recovery mechanism
>>> for these?
>>>
>> Well, the in-flight commands are owned by the HBA driver, which should
>> have the proper code to terminate / return those commands with the
>> appropriate codes. They will then be rescheduled and will be caught
>> like 'normal' IO requests.
>
> But my point is that if a driver goes blocked, those commands will be
> forced to wait the blocked timeout anyway, so your proposed patch does
> nothing to improve the case for dm anyway ... you only avoid commands
> stuck when a device goes blocked if by chance its request queue was
> empty.

How about my patches to use new transport error values and make
iSCSI and FC behave the same?
The problem I think Hannes and I are both trying to solve is this:

1. We do not want to wait dev_loss_tmo seconds for failover.
2. The FC drivers can hook into the fast_io_fail_tmo related callouts, and
when multipath is in use that tmo can be set to a very low value, like a
couple of seconds, so failovers are fast. However, there is a bug: when
fast_io_fail_tmo fires, requests that made it to the driver get failed and
returned to the multipath layer, but commands in the blocked request queue
are stuck there until dev_loss_tmo fires.
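
To make the timing in point 2 concrete, here is a toy model in plain Python. It is not kernel code: the function name, the two location strings, and the 5 s / 60 s timeout values are made up purely for illustration, and it assumes dev_loss_tmo is larger than fast_io_fail_tmo.

```python
def time_returned_to_multipath(location, fast_io_fail_tmo, dev_loss_tmo):
    """Seconds after link loss at which a command is failed back to multipath.

    Toy model of the current (buggy) behaviour, not kernel code.
    location is "driver" (the command already reached the HBA driver) or
    "blocked_queue" (the command still sits in the blocked request queue).
    """
    if location == "driver":
        # fast_io_fail_tmo fires and the driver fails its outstanding commands.
        return fast_io_fail_tmo
    if location == "blocked_queue":
        # Nothing unblocks the queue until dev_loss_tmo fires.
        return dev_loss_tmo
    raise ValueError(f"unknown location: {location!r}")


# Hypothetical timeout values, chosen only to show the gap.
for where in ("driver", "blocked_queue"):
    print(where, time_returned_to_multipath(where, fast_io_fail_tmo=5, dev_loss_tmo=60))
```

So under these made-up numbers, a command that happened to reach the driver fails over after a few seconds, while one that was still queued waits the full dev_loss_tmo.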

With my patches here (they need to be rediffed, and for FC I need to handle
JamesS's comments about not using a new field for the fast_fail_timeout
state bit):
http://marc.info/?l=linux-scsi&m=117399843216280&w=2
http://marc.info/?l=linux-scsi&m=117399544112073&w=2
http://marc.info/?l=linux-scsi&m=117399844316771&w=2
http://marc.info/?l=linux-scsi&m=117400203324693&w=2
http://marc.info/?l=linux-scsi&m=117400203324690&w=2
For FC we can use fast_io_fail_tmo for fast failovers, and commands
will not get stuck in a blocked queue for dev_loss_tmo seconds, because
when fast_io_fail_tmo fires the target's queues are unblocked and
fc_remote_port_chkready() kicks in (iSCSI does the same with the
patches in the links). And with the patches, if multipath-tools is
sending its path-testing IO, it will get a DID_TRANSPORT_* error code
that it can use to make a decent path-failing decision.
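
As a rough sketch of what such a path-failing decision could look like, here is a small classifier in plain Python. Everything in it is an assumption for illustration: the text above only says DID_TRANSPORT_*, so the two constant names, their values, their meanings, and the verdict strings are placeholders, and this is not how multipath-tools is actually written.

```python
# Placeholder codes: names, values and meanings are assumptions, not taken
# from the patches linked above.
DID_OK = 0x00
DID_TRANSPORT_DISRUPTED = 0x0E  # assumed: transport blocked, may still recover
DID_TRANSPORT_FAILFAST = 0x0F   # assumed: fast_io_fail_tmo has already fired


def path_decision(host_status):
    """Map the host status of a path-test IO to a hypothetical checker verdict."""
    if host_status == DID_OK:
        return "up"
    if host_status == DID_TRANSPORT_FAILFAST:
        # The transport gave up quickly, so fail the path now.
        return "fail"
    if host_status == DID_TRANSPORT_DISRUPTED:
        # Possibly transient: hold off so the path does not flap.
        return "pending"
    # Any other error is treated as a failed path in this sketch.
    return "fail"
```

The point of the separate "pending" verdict is that a transient disruption need not be reported as a path failure straight away, which is the up/down churn concern raised earlier in the thread for arrays with an expensive failover model.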