From: Mike Christie <michaelc@cs.wisc.edu>
To: Tomohiro Kusumi <kusumi.tomohiro@jp.fujitsu.com>
Cc: linux-scsi@vger.kernel.org, James.Bottomley@suse.de,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] scsi_transport_fc: handle transient error on multipath environment
Date: Fri, 12 Feb 2010 12:03:32 -0600
Message-ID: <4B7597F4.6070403@cs.wisc.edu>
In-Reply-To: <4B7593F4.2050102@cs.wisc.edu>

On 02/12/2010 11:46 AM, Mike Christie wrote:
> - Maybe you want to instead hook something into dm-multipath's
> request (no more bios like in 2004 :)). Can you set a timer on that
> level of request? If that times out, then dm-multipath could do
> something like call blk_abort_queue.

Some more detail. I was thinking maybe you could stack the timeout
handlers, as is done for request_fn handlers, or maybe the scsi cmd
could use the upper layer's timer somehow. Not sure... but at the least
I think we would not want both a scsi request timer and a dm request
timer running at the same time.
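
To make the stacking idea a bit more concrete, here is a rough userspace
model, not kernel code. Every name in it (to_layer, stacked_timed_out,
pass_down, claim) is made up for illustration. The assumption is that only
the top layer arms a timer, and when it fires the timeout is offered to
each layer in turn until one of them claims it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one timeout handler per layer, chained upper -> lower. */
enum to_result { TO_HANDLED, TO_NOT_HANDLED };

struct to_layer {
	const char *name;
	/* Returns TO_HANDLED to stop propagation to lower layers. */
	enum to_result (*timed_out)(struct to_layer *layer, int req_id);
	struct to_layer *lower;	/* next layer down, or NULL */
	int calls;		/* how many times this layer was offered a timeout */
};

/*
 * Called once when the single (upper-layer) timer fires. Walks down the
 * stack and returns the layer that claimed the timeout, or NULL if none did.
 */
struct to_layer *stacked_timed_out(struct to_layer *top, int req_id)
{
	struct to_layer *l;

	assert(top != NULL);
	for (l = top; l != NULL; l = l->lower) {
		l->calls++;
		if (l->timed_out && l->timed_out(l, req_id) == TO_HANDLED)
			return l;
	}
	return NULL;
}

/* Example handler: this layer has nothing to do, let the next one decide. */
enum to_result pass_down(struct to_layer *layer, int req_id)
{
	(void)layer;
	(void)req_id;
	return TO_NOT_HANDLED;
}

/* Example handler: this layer takes ownership of the timeout. */
enum to_result claim(struct to_layer *layer, int req_id)
{
	(void)layer;
	(void)req_id;
	return TO_HANDLED;
}
```

With a dm -> scsi -> fc chain where only the fc layer uses claim(), a single
call to stacked_timed_out() on the dm layer ends up at the fc layer, and no
lower layer needs a timer of its own.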

Then for the error handling and timeout handling, most FC drivers have a
terminate_rport_io callback which works without having to block the
entire host. Those drivers could implement a newer eh where, instead of
firing the code in scsi_error.c when a cmd times out, it would run
terminate_rport_io from some workqueue.

new dm request timed out()
	-> scsi_timed_out
		-> fc_timed_out()
			{
				run new eh from workqueue();
			}


new_eh()
	/* no new cmds should be started until we figure out what is going on */
	block rport()
	/*
	 * releases cmds upwards so they can run while we try to figure out
	 * what is going on
	 */
	terminate_rport_io()
	/* check if devices are ok */
	send_tur()
	if (tur failed)
		start old scsi_error.c code to unjam us.
	else
		/* everything looks ok so let IO run to this path again */
		unblock rport()
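
The new_eh() flow above can be modelled in plain C along these lines. This
is only a userspace sketch to show the ordering of the steps. struct
model_rport and all model_* functions are hypothetical stand-ins, not the
kernel's fc_rport or any real transport API. Leaving the rport blocked when
the TUR fails is also my assumption; the pseudocode does not say either way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a remote port (one path). */
struct model_rport {
	bool blocked;		/* true while no new cmds may be started */
	int outstanding;	/* cmds currently in flight on this path */
	bool device_ok;		/* what a TEST UNIT READY would report */
	bool escalated;		/* set when we fall back to the old eh */
};

static void model_block_rport(struct model_rport *r)
{
	r->blocked = true;
}

static void model_unblock_rport(struct model_rport *r)
{
	r->blocked = false;
}

/* Fail all in-flight cmds upwards; returns how many were released. */
static int model_terminate_rport_io(struct model_rport *r)
{
	int n = r->outstanding;

	r->outstanding = 0;
	return n;
}

/* Stand-in for sending a TUR: true means the device answered ok. */
static bool model_send_tur(const struct model_rport *r)
{
	return r->device_ok;
}

/* Returns true if the path recovered without the host-wide eh. */
bool model_new_eh(struct model_rport *r)
{
	assert(r != NULL);

	/* no new cmds until we figure out what is going on */
	model_block_rport(r);
	/* release cmds upwards so they can be retried elsewhere */
	model_terminate_rport_io(r);

	if (!model_send_tur(r)) {
		/* stand-in for starting the old scsi_error.c code */
		r->escalated = true;
		return false;	/* assumption: rport stays blocked here */
	}

	/* everything looks ok so let IO run to this path again */
	model_unblock_rport(r);
	return true;
}
```

The point of the ordering is that cmds are handed back upwards before the
TUR is sent, so the upper layer is not stuck waiting while the path is
being checked.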


> 
> I think the problem with blk_abort_queue might be that it stops all IO
> to the entire host where you probably just want to work on the remote
> port/path. For that you could call something like
> recover_transient_error. Maybe it could just be a call to
> terminate_rport_io from a workqueue though.

