From: Hannes Reinecke <hare@suse.de>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: device-mapper development <dm-devel@redhat.com>,
	SCSI Mailing List <linux-scsi@vger.kernel.org>
Subject: Re: mpath: don't fail paths on first error
Date: Fri, 06 Jun 2008 16:18:05 +0200
Message-ID: <4849471D.4060104@suse.de>
In-Reply-To: <4848218B.3090404@cs.wisc.edu>

Hi Mike,

Mike Christie wrote:
> The problem we see a lot at Red Hat is that if drivers fail a command 
> with DID_BUS_BUSY or DID_ERROR for something like underrun or even for 
> transient path problems, we can normally recover from this pretty 
> quickly and we do not need to switch path groups.
> 
Yeah, I thought about this, too.
> queue_if_no_path/no_path_retry will prevent IO from being failed upwards,
> but just switching paths can cause a lot of strain on the target, so we 
> might want to prevent path switching when we do not need to. If we are 
> using a box that requires manual failover or a box that does not use 
> manual failover but still has to shift resources between storage 
> controllers when switching paths, we most likely do not want to mark 
> paths failed for these transient errors.
> 
Well, the original design idea was that it would always be quicker and
less error-prone to just move the I/O to the next path.
Seeing that this is not always the case, this approach is probably
better.

> The attached patch allows us to wait X seconds before marking a path as
> failed. If, within X seconds of seeing the first IO error, we do not
> see an IO complete successfully, then we mark the path as failed. This
> patch works best with the fail fast enhancements, where for a lot of
> path problems the fast io fail / recovery timeout will fail IO quickly
> to us and the test IOs do not get stuck, and where some errors like
> DID_ERROR are not even failed fast.
> 
> The patch should apply over linus's tree or scsi-misc.
> 
Thanks for this, Mike.
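
For anyone following the thread, the grace-period behaviour described
above could be sketched roughly like this. This is not the attached
patch: struct path_state, path_io_error(), path_io_success() and the
grace parameter are hypothetical names, time is a plain seconds counter
supplied by the caller, and the check only runs when an error arrives
(a real implementation would more likely use a timer or delayed work).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-path state; not the structures used by dm-mpath. */
struct path_state {
	bool     failed;        /* path has been marked failed */
	bool     error_pending; /* error seen, no successful IO since */
	uint64_t first_error;   /* time (s) of first error in this window */
};

/*
 * Record an IO error seen at time 'now'. The path is only marked failed
 * once 'grace' seconds have passed since the first error without any
 * successful completion in between. Returns true if the path is failed.
 */
static bool path_io_error(struct path_state *p, uint64_t now, uint64_t grace)
{
	if (p->failed)
		return true;

	if (!p->error_pending) {
		/* First error of a new window: start the clock. */
		p->error_pending = true;
		p->first_error = now;
	}

	if (now - p->first_error >= grace)
		p->failed = true;

	return p->failed;
}

/* A successful completion closes the current error window. */
static void path_io_success(struct path_state *p)
{
	p->error_pending = false;
}
```

With grace set to 0 the first error fails the path immediately, which
matches the behaviour before the change; a non-zero value gives
transient errors such as DID_BUS_BUSY a chance to clear before a path
group switch is triggered.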

Signed-off-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

Thread overview:
2008-06-05 17:25 mpath: don't fail paths on first error Mike Christie
2008-06-06 14:18 ` Hannes Reinecke [this message]
